Apache HttpClient 4.5: Setting the TLS Protocol

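While crawling a site with a crawler built on webmagic, I found that some images could not be fetched. Comparing the failures showed that every image that could not be fetched reported the same error: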

Encrypted HTTPS traffic flows through this CONNECT tunnel. HTTPS Decryption is enabled in Fiddler, so decrypted sessions running in this tunnel will be shown in the Web Sessions list. Secure Protocol: Tls12 Cipher: Aes256 256bits Hash Algorithm: Sha384 ?bits Key Exchange: RsaKeyX 2048bits == Server Certificate ==========

It looks like the transport-layer SSL settings are not quite right. A look at the source first: webmagic 0.7.3 uses HttpClient 4.5.2, and webmagic builds its HttpClient as follows:

 private CloseableHttpClient generateClient(Site site) {
        HttpClientBuilder httpClientBuilder = HttpClients.custom();

        httpClientBuilder.setConnectionManager(connectionManager);
        if (site.getUserAgent() != null) {
            httpClientBuilder.setUserAgent(site.getUserAgent());
        } else {
            httpClientBuilder.setUserAgent("");
        }
        if (site.isUseGzip()) {
            httpClientBuilder.addInterceptorFirst(new HttpRequestInterceptor() {

                public void process(
                        final HttpRequest request,
                        final HttpContext context) throws HttpException, IOException {
                    if (!request.containsHeader("Accept-Encoding")) {
                        request.addHeader("Accept-Encoding", "gzip");
                    }
                }
            });
        }
        // Work around the POST/redirect/POST 302 redirect problem
        httpClientBuilder.setRedirectStrategy(new CustomRedirectStrategy());

        SocketConfig.Builder socketConfigBuilder = SocketConfig.custom();
        socketConfigBuilder.setSoKeepAlive(true).setTcpNoDelay(true);
        socketConfigBuilder.setSoTimeout(site.getTimeOut());
        SocketConfig socketConfig = socketConfigBuilder.build();
        httpClientBuilder.setDefaultSocketConfig(socketConfig);
        connectionManager.setDefaultSocketConfig(socketConfig);
        httpClientBuilder.setRetryHandler(new DefaultHttpRequestRetryHandler(site.getRetryTimes(), true));
        generateCookie(httpClientBuilder, site);
        return httpClientBuilder.build();
    }

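It builds a customized HttpClient object through the custom() method, and nothing here explicitly sets the security-layer protocol.
Debugging turned up the security-related setup. The socket factory is built as follows: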

    private SSLConnectionSocketFactory buildSSLConnectionSocketFactory() {
        try {
            return new SSLConnectionSocketFactory(createIgnoreVerifySSL()); // Prefer bypassing certificate verification
        } catch (KeyManagementException e) {
            logger.error("ssl connection fail", e);
        } catch (NoSuchAlgorithmException e) {
            logger.error("ssl connection fail", e);
        }
        return SSLConnectionSocketFactory.getSocketFactory();
    }

The createIgnoreVerifySSL method is the key part. Its code is:

    private SSLContext createIgnoreVerifySSL() throws NoSuchAlgorithmException, KeyManagementException {
        // Implement the X509TrustManager interface to bypass verification; the method bodies are intentionally left empty
        X509TrustManager trustManager = new X509TrustManager() {

            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException {
            }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
            }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return null;
            }

        };

        SSLContext sc = SSLContext.getInstance("SSLv3");
        sc.init(null, new TrustManager[] { trustManager }, null);
        return sc;
    }

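The certificate-bypass logic is a separate topic and is not discussed here. What matters is that the SSLContext is created for "SSLv3", so it presumably does not support the TLS version the server requires.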
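
Rather than guessing, the protocols that a given SSLContext actually enables can be listed with the plain JDK API. The following is a minimal sketch and not webmagic code: the class name `SslContextProbe` and the helper `enabledProtocols` are made up for illustration, and the result depends on the JDK version and its security configuration.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;

final class SslContextProbe {

    // Returns the protocol versions that a context created under the given
    // name enables by default on this JVM.
    static String[] enabledProtocols(String contextName) {
        try {
            SSLContext context = SSLContext.getInstance(contextName);
            // null arguments select the default key managers, trust managers and SecureRandom
            context.init(null, null, null);
            return context.getDefaultSSLParameters().getProtocols();
        } catch (GeneralSecurityException e) {
            // Thrown when the JVM has no context under that name or init fails
            throw new IllegalStateException("Cannot create SSLContext " + contextName, e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] { "SSLv3", "TLS", "TLSv1.2" }) {
            try {
                System.out.println(name + " -> " + Arrays.toString(enabledProtocols(name)));
            } catch (IllegalStateException e) {
                System.out.println(name + " -> unavailable: " + e.getMessage());
            }
        }
    }
}
```

Running this on the same JVM as the crawler shows which protocol versions each context name enables, which makes it easier to compare against the protocol reported by Fiddler.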

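TLS and SSL are both security protocols layered on top of TCP/IP that provide data confidentiality and integrity. TLS is the successor to SSL and improves on it.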

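So how can HttpClient be made to support TLSv1? Starting from the SSL configuration example in the official documentation and modifying it slightly gives: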

       // Trust own CA and all self-signed certs
        SSLContext sslcontext = SSLContexts.custom()
//                .loadTrustMaterial(new File("my.keystore"), "nopassword".toCharArray(),
//                        new TrustSelfSignedStrategy())
                .build();
        // Allow TLSv1 protocol only
        SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(
                sslcontext,
                new String[] { "TLSv1" },
                null,
                SSLConnectionSocketFactory.getDefaultHostnameVerifier());
        CloseableHttpClient httpclient = HttpClients.custom()
                .setSSLSocketFactory(sslsf)
                .build();
        try {

            HttpGet httpget = new HttpGet("");

            System.out.println("Executing request " + httpget.getRequestLine());

            CloseableHttpResponse response = httpclient.execute(httpget);
            try {
                HttpEntity entity = response.getEntity();

                System.out.println("----------------------------------------");
                System.out.println(response.getStatusLine());
                byte[] bytes = EntityUtils.toByteArray(entity);
                FileUtils.writeByteArrayToFile(new File("E:/test"), bytes);
            } finally {
                response.close();
            }
        } finally {
            httpclient.close();
        }

Setting the TLSv1 protocol explicitly on the socket factory worked in testing.

The demo downloads an image and saves it to the local disk. In testing, the downloaded image opened normally.
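
The protocol array passed to SSLConnectionSocketFactory above is, as far as I can tell, applied to each connection through the standard `SSLSocket.setEnabledProtocols` call. The JDK-only sketch below illustrates that mechanism without HttpClient. The class `EnabledProtocolsDemo` and the method `restrictProtocols` are hypothetical names, and listing "TLSv1.2" next to "TLSv1" is an assumption based on the Fiddler output reporting Tls12, not something the test above covered.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

final class EnabledProtocolsDemo {

    // Enables only those of the wanted protocols that this JVM supports and
    // returns what the socket reports as enabled afterwards.
    static List<String> restrictProtocols(String... wanted) {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        // An unconnected socket is enough to inspect and change the settings
        try (SSLSocket socket = (SSLSocket) factory.createSocket()) {
            List<String> supported = Arrays.asList(socket.getSupportedProtocols());
            List<String> usable = new ArrayList<>();
            for (String protocol : wanted) {
                if (supported.contains(protocol)) {
                    usable.add(protocol);
                }
            }
            socket.setEnabledProtocols(usable.toArray(new String[0]));
            return Arrays.asList(socket.getEnabledProtocols());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(restrictProtocols("TLSv1.2", "TLSv1.1", "TLSv1"));
    }
}
```

With HttpClient, the equivalent change would be to pass a longer array such as `new String[] { "TLSv1.2", "TLSv1" }` as the second constructor argument in the demo, so the client can offer whichever version the server accepts.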

