Back-to-Basics Series --- Java Network Programming --- Using the Network Connection Component (URLConnection)

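The URL class is mainly for reading data from a server or some other target address; when you need to inspect the metadata of a request or response, it cannot help. That is where URLConnection comes in. Besides reading the data, it can read the metadata, including the headers (headers matter a lot: in web development the data we want is often carried in a header), and it can send data to the server using the various HTTP methods (POST/GET/OPTIONS/PUT/DELETE). I will not cover too much in this chapter and will keep it as short as I can.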

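I. Opening and Reading

The basic steps for using URLConnection are as follows (not every program needs all of them):

  1. Construct a URL object
  2. Call the URL's openConnection() to get the corresponding URLConnection object
  3. Configure the URLConnection
  4. Read the header fields
  5. Get the input stream and read data
  6. Get the output stream and write data
  7. Close the connection

A basic code skeleton: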

try {
    URL u = new URL("http://www.baidu.com");
    URLConnection conn = u.openConnection();
    // read from the URL
} catch (MalformedURLException ex) {
    System.err.println(ex);
} catch (IOException ex) {
    System.err.println(ex);
}

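1. A little about the internals

  • URLConnection is an abstract class with exactly one abstract method: public abstract void connect() throws IOException
  • Some common implementation classes:
    • sun.net.www.protocol.file.FileURLConnection: for file URLs
    • sun.net.www.protocol.http.HttpURLConnection: for http URLs
  • Creating a URLConnection does not call connect(). The connection is made lazily, the first time a method that needs to talk to the other side is called, for example getInputStream, getContent or getHeaderField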
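
To make the two points above concrete, here is a minimal sketch of a custom subclass. Everything in it is made up for illustration: the class name InMemoryConnection, the connectCalls counter and the example.invalid URL (which is never contacted). It only shows that connect() is the one method a subclass must supply, and that the connection can be deferred until data is first requested.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory connection, for illustration only.
class InMemoryConnection extends URLConnection {
    private final byte[] data;
    int connectCalls = 0; // how many times a "real" connection was made

    InMemoryConnection(URL url, byte[] data) {
        super(url);
        this.data = data;
    }

    @Override
    public void connect() {
        if (!connected) {   // "connected" is the protected flag inherited from URLConnection
            connectCalls++;
            connected = true;
        }
    }

    @Override
    public InputStream getInputStream() throws IOException {
        connect();          // connect lazily, on first use
        return new ByteArrayInputStream(data);
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create("http://example.invalid/").toURL(); // placeholder, never contacted
        InMemoryConnection conn =
                new InMemoryConnection(url, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("connect calls after construction: " + conn.connectCalls);
        try (InputStream in = conn.getInputStream()) {
            System.out.println("connect calls after getInputStream: " + conn.connectCalls);
        }
    }
}
```

The real JDK implementations follow the same pattern: the methods that need data call connect() themselves if it has not been called yet.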

2. Reading data from the server

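import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class NetworkMain {

    public static void main(String[] args) {
        try {
            URL url = new URL("https://www.baidu.com");
            URLConnection urlConnection = url.openConnection();

            // getInputStream() connects implicitly; try-with-resources closes the stream
            try (InputStream inputStream = urlConnection.getInputStream()) {
                InputStream buffer = new BufferedInputStream(inputStream);
                // no charset is given, so the platform default is used to decode the bytes
                InputStreamReader reader = new InputStreamReader(buffer);
                int c;
                while ((c = reader.read()) != -1) {
                    System.out.print((char) c);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}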

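Differences between URL and URLConnection:

  • URLConnection gives access to the HTTP headers
  • URLConnection can configure the request that is sent to the server
  • URLConnection can write data to the server as well as read from it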
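
As a rough illustration of the first difference, the sketch below reads the same resource both ways. To stay runnable without a network it uses a temporary local file as a stand-in for a remote resource; the class and method names are made up, and readAllBytes assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OpenStreamVsConnection {

    // Via URL alone: only the body is available.
    static String readWithUrl(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Via URLConnection: metadata can be inspected before the body is read.
    static String readWithConnection(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        System.out.println("content length: " + conn.getContentLengthLong());
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("urlconn", ".txt"); // stand-in for a remote resource
        try {
            Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
            URL url = file.toUri().toURL();
            System.out.println(readWithUrl(url));
            System.out.println(readWithConnection(url));
        } finally {
            Files.delete(file);
        }
    }
}
```

URL.openStream() is simply shorthand for openConnection().getInputStream(), which is why the two methods return the same text.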

3. Reading the headers

Here are the headers returned for the Baidu home page (the entry with the null key is the status line):

Accept-Ranges:[bytes]
null:[HTTP/1.1 200 OK]
Server:[bfe/1.0.8.18]
Etag:["58860402-98b"]
Cache-Control:[private, no-cache, no-store, proxy-revalidate, no-transform]
Connection:[Keep-Alive]
Set-Cookie:[BDORZ=27315; max-age=86400; domain=.baidu.com; path=/]
Pragma:[no-cache]
Last-Modified:[Mon, 23 Jan 2017 13:24:18 GMT]
Content-Length:[2443]
Date:[Thu, 13 Sep 2018 09:51:05 GMT]
Content-Type:[text/html]

The code that fetched them:

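import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.List;
import java.util.Map;

public class NetworkMain {

    public static void main(String[] args) {
        try {
            URL url = new URL("https://www.baidu.com");
            URLConnection urlConnection = url.openConnection();
            // getHeaderFields() connects implicitly; the status line is stored under the null key
            Map<String, List<String>> headerFields = urlConnection.getHeaderFields();
            for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
                System.out.println(entry.getKey() + ":" + entry.getValue().toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}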
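
a. Content-Type

Gives the MIME (Multipurpose Internet Mail Extensions) type of the response body, that is, the kind of content and, optionally, its character encoding. Read it with getContentType().

  • If the header is missing, no exception is thrown; getContentType() simply returns null
  • If text/html comes without a charset parameter, ISO-8859-1 is traditionally assumed, since it was the default charset in the original HTTP/1.1 specification
  • Other common types include text/plain, image/gif, application/xml and image/jpeg
  • getContentEncoding() returns the Content-Encoding header, which names a content coding such as gzip rather than a character set, or null if the header is missing; the character set itself comes from the charset parameter of Content-Type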
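
Since the character set has to be dug out of the Content-Type value, here is a small sketch of one way to do that. The helper name charsetOf and the choice of fallback are my own; in real use the first argument would come from getContentType().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {

    // Returns the charset named in a Content-Type value, or the fallback when the
    // header is missing, has no charset parameter, or names an unknown charset.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8", StandardCharsets.ISO_8859_1));
        System.out.println(charsetOf("text/html", StandardCharsets.ISO_8859_1));
    }
}
```

The resulting Charset can be passed straight to an InputStreamReader wrapped around the connection's input stream.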
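
b. Content-Length

The total size of the content in bytes. If there is no Content-Length header, getContentLength() returns -1.

  • Java 7 added getContentLengthLong(), which returns a long so that sizes above the int maximum can be represented
  • When downloading a binary file over HTTP, it is best to use the content length to know how many bytes to read from the InputStream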
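
The second bullet can be sketched like this. It is only an outline under my own assumptions: the class name BinaryDownload is made up, a temporary local file stands in for the remote binary so the code runs offline, and when no length is reported the code falls back to reading until end of stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

class BinaryDownload {

    static byte[] download(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        int length = conn.getContentLength();
        try (InputStream in = new BufferedInputStream(conn.getInputStream())) {
            if (length < 0) {
                // no usable Content-Length: read until end of stream
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
                return out.toByteArray();
            }
            // known length: keep reading until exactly that many bytes have arrived
            byte[] data = new byte[length];
            int offset = 0;
            while (offset < length) {
                int n = in.read(data, offset, length - offset);
                if (n == -1) {
                    throw new EOFException("expected " + length + " bytes, got " + offset);
                }
                offset += n;
            }
            return data;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("download", ".bin"); // local stand-in for a remote binary
        try {
            Files.write(file, new byte[] {0, 1, 2, (byte) 0xFF});
            byte[] data = download(file.toUri().toURL());
            System.out.println(data.length + " bytes read");
        } finally {
            Files.delete(file);
        }
    }
}
```

A single read() call may return fewer bytes than requested, which is why the loop keeps going until the full length has been read.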
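
c. Date

Indicates when the document was sent.

d. Expires

Indicates when the document should be removed from the cache. If the header is missing, getExpiration() returns 0, which is taken to mean the document never expires.

e. Last-Modified

The time the document was last modified. If the header is missing, getLastModified() returns 0.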
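
getDate(), getExpiration() and getLastModified() all return a long holding milliseconds since the epoch, with 0 standing in for a missing header. A tiny helper such as the following (the name toInstant is my own) keeps that special case from leaking into the rest of the code; the sample value is the Last-Modified date from the header dump earlier.

```java
import java.time.Instant;
import java.util.Optional;

class HeaderDates {

    // Converts a header date returned by URLConnection into an Instant,
    // mapping the special value 0 ("header not present") to an empty Optional.
    static Optional<Instant> toInstant(long headerMillis) {
        return headerMillis == 0
                ? Optional.empty()
                : Optional.of(Instant.ofEpochMilli(headerMillis));
    }

    public static void main(String[] args) {
        long lastModified = 1485177858000L; // Mon, 23 Jan 2017 13:24:18 GMT
        System.out.println(toInstant(lastModified)); // in real code: toInstant(conn.getLastModified())
        System.out.println(toInstant(0L));
    }
}
```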

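II. Caching

Caching is an evergreen topic, and the web browser cache is a powerful feature in its own right. This section covers how web caching is controlled and the classes Java provides for setting up a cache.

1. Setting headers so that a response can be cached

Generally speaking, responses to GET requests are cached, and should be, whereas responses to POST requests should not be. All of this can be adjusted through headers:

  • Expires header (HTTP 1.0): the resource may be cached until the specified time
  • Cache-Control header (HTTP 1.1): fine-grained cache control. If both this and Expires are present, this header takes precedence. Multiple Cache-Control directives are allowed:
    • max-age=[seconds]: the number of seconds from now until the cached entry expires
    • s-maxage=[seconds]: the number of seconds from now until the cached entry expires in a shared cache; a private cache may keep the entry longer
    • public: an authenticated response may be cached; otherwise authenticated responses cannot be cached
    • private: only a single-user cache may store the response; a shared cache should not
    • no-cache: the entry may still be cached, but the client must revalidate the response on every access using an ETag or Last-Modified header
    • no-store: do not cache, no matter what
  • Last-Modified: the date of the last modification. The client can send a HEAD request to check this date and perform the real GET only when its locally cached copy is older than this value
  • ETag: a unique identifier for a version of the resource. The client can use a HEAD request to read the server's current ETag and issue a GET only when it differs from the local ETag, which means the cached copy is stale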
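
The last two bullets describe revalidation with a HEAD request. The sketch below uses the closely related conditional GET instead (sending the cached ETag in If-None-Match), which lets the server answer 304 Not Modified in a single round trip. To stay runnable offline it starts a throwaway server on 127.0.0.1 with the JDK's com.sun.net.httpserver.HttpServer; the class name, the ETag value "v1" and the body are all made up for illustration, and readAllBytes assumes Java 9 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ConditionalGetDemo {
    static final String ETAG = "\"v1\""; // hypothetical version tag

    // Starts a throwaway local server that honours If-None-Match.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ifNoneMatch = exchange.getRequestHeaders().getFirst("If-None-Match");
            if (ETAG.equals(ifNoneMatch)) {
                exchange.sendResponseHeaders(304, -1); // not modified, no body
            } else {
                byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("ETag", ETAG);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    // Performs a GET, conditional on cachedEtag when it is non-null; returns the status code.
    static int fetch(URL url, String cachedEtag) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
        if (cachedEtag != null) {
            conn.setRequestProperty("If-None-Match", cachedEtag);
        }
        int status = conn.getResponseCode();
        if (status == HttpURLConnection.HTTP_OK) {
            try (InputStream in = conn.getInputStream()) {
                in.readAllBytes(); // drain the body
            }
        }
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();
            System.out.println("first request:  " + fetch(url, null));
            // a real client would remember the ETag header of the first response
            System.out.println("revalidation:   " + fetch(url, ETAG));
        } finally {
            server.stop(0);
        }
    }
}
```

If-Modified-Since works the same way with the Last-Modified date, and URLConnection has a dedicated setIfModifiedSince(long) setter for it.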

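2. Web caching in Java

By default, Java does not cache anything when resources are requested through a URL. To add caching of web requests you have to implement a few classes yourself:

ResponseCache // the cache itself, installed as the default caching policy
CacheRequest // the object used to write a response into the cache
CacheResponse // the object that hands a cached response back

Below is a very simple implementation. It is a little long but not hard, and it includes a small class that parses the Cache-Control header, so it makes a decent starting example: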

import java.io.*;
import java.net.*;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

public class NetworkMain {

    public static class CacheControl {

        private Date maxAge = null;
        private Date sMaxAge = null;
        private boolean mustRevalidate = false;
        private boolean noCache = false;
        private boolean noStore = false;
        private boolean proxyRevalidate = false;
        private boolean publicCache = false;
        private boolean privateCache = false;

        public CacheControl(String s) {
            if (s == null) {
                return; // default policy
            }

            // getHeaderField("Cache-Control") returns only the value, but accept
            // a full "Cache-Control: ..." header line as well
            String value = s.trim();
            if (value.regionMatches(true, 0, "Cache-Control:", 0, 14)) {
                value = value.substring(14).trim();
            }
            String[] components = value.split(",");

            Date now = new Date();
            for (String component : components) {
                try {
                    component = component.trim().toLowerCase(Locale.US);
                    if (component.startsWith("max-age=")) {
                        int secondsInTheFuture = Integer.parseInt(component.substring(8));
                        // 1000L avoids int overflow for large max-age values
                        maxAge = new Date(now.getTime() + 1000L * secondsInTheFuture);
                    } else if (component.startsWith("s-maxage=")) {
                        // "s-maxage=" is 9 characters long
                        int secondsInTheFuture = Integer.parseInt(component.substring(9));
                        sMaxAge = new Date(now.getTime() + 1000L * secondsInTheFuture);
                    } else if (component.equals("must-revalidate")) {
                        mustRevalidate = true;
                    } else if (component.equals("proxy-revalidate")) {
                        proxyRevalidate = true;
                    } else if (component.equals("no-cache")) {
                        noCache = true;
                    } else if (component.equals("no-store")) {
                        noStore = true;
                    } else if (component.equals("public")) {
                        publicCache = true;
                    } else if (component.equals("private")) {
                        privateCache = true;
                    }
                } catch (RuntimeException ex) {
                    continue;
                }
            }
        }

        public Date getMaxAge() {
            return maxAge;
        }

        public Date getSharedMaxAge() {
            return sMaxAge;
        }

        public boolean mustRevalidate() {
            return mustRevalidate;
        }

        public boolean proxyRevalidate() {
            return proxyRevalidate;
        }

        public boolean noStore() {
            return noStore;
        }

        public boolean noCache() {
            return noCache;
        }

        public boolean publicCache() {
            return publicCache;
        }

        public boolean privateCache() {
            return privateCache;
        }
    }

    public static class SimpleCacheRequest extends CacheRequest {
        private ByteArrayOutputStream out = new ByteArrayOutputStream();


        @Override
        public OutputStream getBody() throws IOException {

            return out;
        }

        @Override
        public void abort() {
            out.reset();
        }

        public byte[] getData() {
            if (out.size() == 0) {
                return null;
            } else {
                return out.toByteArray();
            }
        }
    }

    public static class SimleCacheResponse extends CacheResponse {
        private final Map<String, List<String>> headers;
        private final SimpleCacheRequest request;
        private final Date expires;
        private final CacheControl control;

        public SimleCacheResponse(SimpleCacheRequest request, URLConnection uc, CacheControl control) throws IOException {
            this.request = request;
            this.control = control;
            this.expires = new Date(uc.getExpiration());
            this.headers = Collections.unmodifiableMap(uc.getHeaderFields());
        }

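        @Override
        public InputStream getBody() {
            byte[] data = request.getData();
            // getData() returns null when nothing has been written to the cache entry
            return new ByteArrayInputStream(data == null ? new byte[0] : data);
        }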

        @Override
        public Map<String, List<String>> getHeaders()
                throws IOException {
            return headers;
        }

        public CacheControl getControl() {
            return control;
        }

        public boolean isExpired() {
            Date now = new Date();
            Date maxAge = control.getMaxAge();
            if (maxAge != null) {
                // Cache-Control max-age takes precedence over Expires
                return maxAge.before(now);
            }
            if (expires.getTime() != 0) { // 0 means there was no Expires header
                return expires.before(now);
            }
            return false;
        }
    }
    public static class MemoryCache extends ResponseCache {

        private final Map<URI, SimleCacheResponse> responses
                = new ConcurrentHashMap<URI, SimleCacheResponse>();
        private final int maxEntries;

        public MemoryCache() {
            this(100);
        }

        public MemoryCache(int maxEntries) {
            this.maxEntries = maxEntries;
        }

        @Override
        public CacheRequest put(URI uri, URLConnection conn)
                throws IOException {

            if (responses.size() >= maxEntries) return null;

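            CacheControl control = new CacheControl(conn.getHeaderField("Cache-Control"));
            if (control.noStore()) {
                return null;
            } else if (!(conn instanceof HttpURLConnection)
                    || !"GET".equals(((HttpURLConnection) conn).getRequestMethod())) {
                // only cache GET; header field 0 is the response status line,
                // so the request method has to come from the connection itself
                return null;
            }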

            SimpleCacheRequest request = new SimpleCacheRequest();
            SimleCacheResponse response = new SimleCacheResponse(request, conn, control);

            responses.put(uri, response);
            return request;
        }

        @Override
        public CacheResponse get(URI uri, String requestMethod,
                                 Map<String, List<String>> requestHeaders)
                throws IOException {

            if ("GET".equals(requestMethod)) {
                SimleCacheResponse response = responses.get(uri);
                // check expiration date
                if (response != null && response.isExpired()) {
                    responses.remove(uri); // the map is keyed by URI, not by response
                    response = null;
                }
                return response;
            } else {
                return null;
            }
        }
    }

    public static void main(String[] args) {
        ResponseCache.setDefault(new MemoryCache());
        try {
            URL url = new URL("https://www.baidu.com");
            URLConnection urlConnection = url.openConnection();
            Map<String, List<String>> headerFields = urlConnection.getHeaderFields();
            for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
                System.out.println(entry.getKey() + ":" + entry.getValue().toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}


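III. Connection configuration options

The URLConnection class has seven protected fields that define how the client makes its request to the server. The JDK source documents them well and the English is easy to read, so I will not add much: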

	/**
     * The URL represents the remote object on the World Wide Web to
     * which this connection is opened.
     * <p>
     * The value of this field can be accessed by the
     * {@code getURL} method.
     * <p>
     * The default value of this variable is the value of the URL
     * argument in the {@code URLConnection} constructor.
     *
     * @see     java.net.URLConnection#getURL()
     * @see     java.net.URLConnection#url
     */
	protected URL url;

   	/**
     * This variable is set by the {@code setDoInput} method. Its
     * value is returned by the {@code getDoInput} method.
     * <p>
     * A URL connection can be used for input and/or output. Setting the
     * {@code doInput} flag to {@code true} indicates that
     * the application intends to read data from the URL connection.
     * <p>
     * The default value of this field is {@code true}.
     *
     * @see     java.net.URLConnection#getDoInput()
     * @see     java.net.URLConnection#setDoInput(boolean)
     */
    protected boolean doInput = true;

   /**
     * This variable is set by the {@code setDoOutput} method. Its
     * value is returned by the {@code getDoOutput} method.
     * <p>
     * A URL connection can be used for input and/or output. Setting the
     * {@code doOutput} flag to {@code true} indicates
     * that the application intends to write data to the URL connection.
     * <p>
     * The default value of this field is {@code false}.
     *
     * @see     java.net.URLConnection#getDoOutput()
     * @see     java.net.URLConnection#setDoOutput(boolean)
     */
    protected boolean doOutput = false;
   	/**
     * If {@code true}, this {@code URL} is being examined in
     * a context in which it makes sense to allow user interactions such
     * as popping up an authentication dialog. If {@code false},
     * then no user interaction is allowed.
     * <p>
     * The value of this field can be set by the
     * {@code setAllowUserInteraction} method.
     * Its value is returned by the
     * {@code getAllowUserInteraction} method.
     * Its default value is the value of the argument in the last invocation
     * of the {@code setDefaultAllowUserInteraction} method.
     *
     * @see     java.net.URLConnection#getAllowUserInteraction()
     * @see     java.net.URLConnection#setAllowUserInteraction(boolean)
     * @see     java.net.URLConnection#setDefaultAllowUserInteraction(boolean)
     */
    protected boolean allowUserInteraction = defaultAllowUserInteraction;
	/**
     * If {@code true}, the protocol is allowed to use caching
     * whenever it can. If {@code false}, the protocol must always
     * try to get a fresh copy of the object.
     * <p>
     * This field is set by the {@code setUseCaches} method. Its
     * value is returned by the {@code getUseCaches} method.
     * <p>
     * Its default value is the value given in the last invocation of the
     * {@code setDefaultUseCaches} method.
     *
     * @see     java.net.URLConnection#setUseCaches(boolean)
     * @see     java.net.URLConnection#getUseCaches()
     * @see     java.net.URLConnection#setDefaultUseCaches(boolean)
     */
	protected boolean useCaches = defaultUseCaches;

   	/**
     * Some protocols support skipping the fetching of the object unless
     * the object has been modified more recently than a certain time.
     * <p>
     * A nonzero value gives a time as the number of milliseconds since
     * January 1, 1970, GMT. The object is fetched only if it has been
     * modified more recently than that time.
     * <p>
     * This variable is set by the {@code setIfModifiedSince}
     * method. Its value is returned by the
     * {@code getIfModifiedSince} method.
     * <p>
     * The default value of this field is {@code 0}, indicating
     * that the fetching must always occur.
     *
     * @see     java.net.URLConnection#getIfModifiedSince()
     * @see     java.net.URLConnection#setIfModifiedSince(long)
     */
    protected long ifModifiedSince = 0;

   	/**
     * If {@code false}, this connection object has not created a
     * communications link to the specified URL. If {@code true},
     * the communications link has been established.
     */
    protected boolean connected = false;

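The class has matching set and get methods for these options. The setters must be called before the connection is made: calling one after the connection has been established (after connect(), or after a method such as getInputStream() that connects implicitly) throws IllegalStateException.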
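
A quick way to see this rule in action, sketched with a temporary local file so that it runs without a network (the class and method names are made up):

```java
import java.io.IOException;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

class ConfigureBeforeConnect {

    // Returns true if changing a setting after connecting is rejected.
    static boolean setterFailsAfterConnect(URLConnection conn) throws IOException {
        conn.setUseCaches(false);      // fine: not connected yet
        conn.connect();
        try {
            conn.setUseCaches(true);   // too late
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            conn.getInputStream().close(); // release the underlying file handle
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("config", ".txt");
        try {
            URLConnection conn = file.toUri().toURL().openConnection();
            System.out.println("rejected after connect: " + setterFailsAfterConnect(conn));
        } finally {
            Files.delete(file);
        }
    }
}
```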

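IV. Writing data to the server

This comes in two parts: writing headers and writing the body.

1. Setting request headers

Setting headers here is different from what we did earlier. Earlier we read the headers of the data the server sent back; here we add headers to the request before it is sent to the server. The main methods are:

public void setRequestProperty(String key, String value); // sets the value for a key, replacing any existing value; several values can be given as one comma-separated string
public void addRequestProperty(String key, String value); // adds another value for a key, keeping the existing ones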
    

Interestingly, the source of setRequestProperty is not hard to follow, so it is worth a look to get more comfortable reading JDK code:

public abstract class URLConnection {
    ...
	public void setRequestProperty(String key, String value) {
        if (connected)
            throw new IllegalStateException("Already connected");
        if (key == null)
            throw new NullPointerException ("key is null");

        if (requests == null)
            requests = new MessageHeader();

        requests.set(key, value);
    }
    
    ...
}
public class MessageHeader {
    private String[] keys;
    private String[] values;
    private int nkeys;

	public synchronized void set(String var1, String var2) {
        int var3 = this.nkeys;

        do {
            --var3;
            if (var3 < 0) {
                this.add(var1, var2);
                return;
            }
        } while(!var1.equalsIgnoreCase(this.keys[var3]));

        this.values[var3] = var2;
    }
    public synchronized void add(String var1, String var2) {
        this.grow();
        this.keys[this.nkeys] = var1;
        this.values[this.nkeys] = var2;
        ++this.nkeys;
    }
    private void grow() {
        if (this.keys == null || this.nkeys >= this.keys.length) {
            String[] var1 = new String[this.nkeys + 4];
            String[] var2 = new String[this.nkeys + 4];
            if (this.keys != null) {
                // the JDK uses System.arraycopy for low-level copies like this because it is fast
                System.arraycopy(this.keys, 0, var1, 0, this.nkeys);
            }

            if (this.values != null) {
                System.arraycopy(this.values, 0, var2, 0, this.nkeys);
            }

            this.keys = var1;
            this.values = var2;
        }

    }
}
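
To see the difference between set and add from the caller's side, here is a small sketch. The header name X-Demo and the example.invalid URL are placeholders; no connection is made, since getRequestProperties() can only be called before connecting. The order of the values in the returned list is an implementation detail, so the example does not rely on it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.List;

class RequestHeaderDemo {

    // Returns the values recorded for the hypothetical X-Demo request header.
    static List<String> demoHeaderValues() throws IOException {
        URLConnection conn = URI.create("http://example.invalid/").toURL().openConnection();
        conn.setRequestProperty("X-Demo", "a");
        conn.setRequestProperty("X-Demo", "b"); // replaces "a"
        conn.addRequestProperty("X-Demo", "c"); // kept alongside "b"
        return conn.getRequestProperties().get("X-Demo");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demoHeaderValues());
    }
}
```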

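2. Writing data with POST

When it comes to choosing between GET and POST, Java's URLConnection more or less decides automatically:

  • The default is GET
  • If doOutput is set to true and data is written through the OutputStream, the request becomes a POST and the relevant headers are set automatically
  • There are, of course, other ways to set the request method explicitly

Here is a small example that submits a POST request: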

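public static void main(String[] args) {
    try {
        URL url = new URL("https://www.baidu.com");
        URLConnection urlConnection = url.openConnection();
        // enabling output turns the request into a POST
        urlConnection.setDoOutput(true);
        // try-with-resources flushes and closes the writer even if write() fails
        try (BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(urlConnection.getOutputStream(), "8859_1"))) {
            bw.write("lalalalallala");
        }
        // HttpURLConnection buffers the body; the request goes out when the response is first asked for
        System.out.println(urlConnection.getHeaderField(0)); // status line
    } catch (IOException e) {
        e.printStackTrace();
    }
}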
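
To check that setDoOutput(true) really produces a POST, the following sketch sends a body to a throwaway server on 127.0.0.1 (again the JDK's com.sun.net.httpserver.HttpServer) and reports what the server saw. The class name, the Content-Type and the 204 reply are my own choices for illustration; readAllBytes assumes Java 9 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

class PostDemo {

    // Sends body to a local test server and returns "<method> <body>" as seen by the server.
    static String sendAndReport(String body) throws IOException {
        AtomicReference<String> seen = new AtomicReference<>();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String received = new String(exchange.getRequestBody().readAllBytes(),
                    StandardCharsets.UTF_8);
            seen.set(exchange.getRequestMethod() + " " + received);
            exchange.sendResponseHeaders(204, -1); // no response body
            exchange.close();
        });
        server.start();
        try {
            URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setDoOutput(true); // never calls setRequestMethod("POST")
            conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
            try (Writer w = new OutputStreamWriter(conn.getOutputStream(), StandardCharsets.UTF_8)) {
                w.write(body);
            }
            conn.getResponseCode(); // sends the buffered request and waits for the reply
            conn.disconnect();
            return seen.get();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndReport("hello"));
    }
}
```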

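V. HttpURLConnection

When the URL uses the http protocol, openConnection() returns an HttpURLConnection, which is an abstract subclass of URLConnection. Use public void setRequestMethod(String method) throws ProtocolException to choose which HTTP request method to use. The commonly used methods are:

  • GET
  • POST
  • HEAD: often used to fetch the last-modified time in order to invalidate a cache
  • PUT
  • DELETE
  • OPTIONS: used for cross-origin requests (important); asks the server which HTTP methods it supports
  • TRACE: shows what changes the proxy servers between client and server made to the request; useful, for example, when checking a proxy setup such as nginx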
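
setRequestMethod only accepts the method names in this list, written in upper case; anything else, including PATCH, is rejected with a ProtocolException. The sketch below checks a few names without making any connection (the class name and the example.invalid URL are placeholders):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URI;

class RequestMethodCheck {

    // Returns true if HttpURLConnection accepts the given method name.
    static boolean accepts(String method) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://example.invalid/").toURL().openConnection();
        try {
            conn.setRequestMethod(method);
            return true;
        } catch (ProtocolException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String method : new String[] {"GET", "HEAD", "OPTIONS", "PATCH", "get"}) {
            System.out.println(method + " accepted: " + accepts(method));
        }
    }
}
```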

Reposted from: https://my.oschina.net/UBW/blog/2051115
