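Analyzing why SSLSocket getInputStream() blocks
This analysis grew out of a bug. After our automated customer-service SDK was integrated into the JD Finance app, the first launch of the customer-service feature from JD Finance worked normally. However, if the user was on the customer-service chat screen, removed JD Finance from the recent-apps list (force-killing the app's process), then immediately relaunched JD Finance and re-entered customer service, the Socket object's getInputStream() method blocked for 7-12 seconds. Under normal conditions that call takes less than 100 milliseconds.
0x01. Creating the socket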
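// encrypt is true in our case
private Socket createSocket(boolean encrypt) {
    if (encrypt) {
        try {
            return CustomSSLSocketFactory.getSocketFactory().createSocket();
        } catch (Exception e) {
            // Note: the exception is swallowed here, so a failure
            // silently falls back to a plain (unencrypted) Socket.
        }
    }
    return new Socket();
}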
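0x02. The wrapper class for the encrypted socket factory
Only part of this class is shown here, mainly to make clear which classes it imports and from which packages.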
import org.apache.http.conn.ssl.SSLSocketFactory;
import java.io.IOException;
import java.net.Socket;
import java.security.KeyManagementException;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.security.NoSuchAlgorithmException;
import java.security.UnrecoverableKeyException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import jd.xxx.LogUtils;
/**
* trust all certs
*/
public class CustomSSLSocketFactory extends SSLSocketFactory {
//...
}
0x03. SSLSocketFactory
Source tree: android-7.0.0_r1
Search for the file SSLSocketFactory.java:
find . -name SSLSocketFactory.java
Search results:
./frameworks/base/core/java/org/apache/http/conn/ssl/SSLSocketFactory.java
./libcore/ojluni/src/main/java/javax/net/ssl/SSLSocketFactory.java
./packages/apps/Email/emailcommon/src/com/android/emailcommon/utility/SSLSocketFactory.java
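From the import in the code above (import org.apache.http.conn.ssl.SSLSocketFactory;), it is immediately clear that the first search result is the one in use.
However, this class is now marked @Deprecated; its replacement is the second search result, javax.net.ssl.SSLSocketFactory. Why the class was deprecated is something to look into another time.
Although the class is deprecated, it can still be used; it just has to be pulled in through build.gradle: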
android {
useLibrary 'org.apache.http.legacy'
}
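Now let's continue analyzing the socket creation flow. The packet capture discussed in the following sections has these columns: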
No. | Time | Source | Destination | Protocol | Length | Info |
---|---|---|---|---|---|---|
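Packet number | Packet time | Source address | Destination address | Protocol type | Packet length | Packet info |
Packet number: packets are listed in the order they were sent or received.
Packet time: in milliseconds.
Source address: the IP address that sent the packet.
Destination address: the IP address that receives the packet.
Protocol type: TCP, HTTP, DNS, UDP, and so on.
Packet length: the length of the data packet.
Packet info: the human-readable information the packet contains.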
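0x04. A complete TCP connection and SSL handshake (the problematic case)
The packet data is omitted because it is internal company data, so here is a summary: the capture shows that every time the code reaches getInputStream(), a reverse DNS lookup takes place.
Reading the capture requires some familiarity with analyzing and filtering network packets.
0x05. A report of the same kind of problem found online
These problems all show the same symptom, because each of them triggers a reverse DNS lookup, even though the code paths that trigger it differ.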
http://lihuanghe.github.io/2016/05/19/jdkbug-SSLSocket-slow-when-InetAddress.host-is-null.html
The content is excerpted below, so that the stack trace in the email shows where the reverse DNS lookup is invoked.
For info
---------- Forwarded message ----------
From: Philippe Mouawad <philippe.mouawad@gmail.com>
Date: Tue, Aug 26, 2014 at 9:47 PM
Subject: Re: HTTPClient 4 : Request hangs for 4-5 seconds when using IP's
is used without reverse DNS only on Windows
To: HttpClient User Discussion <httpclient-users@hc.apache.org>
Hello Oleg,
Getting back to this old thread which is cause of a JMeter issue:
https://issues.apache.org/bugzilla/show_bug.cgi?id=54449
I think it could be related to a JDK bug or feature:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6450279
JDK7 has introduced a new method which could fix this issue:
-
http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#getHostString()
But it would make HTTPClient require Java 7 and JMeter also as a
consequence.
What's your opinion.
Regards
Philippe M.
@philmdot
On Tue, Feb 5, 2013 at 10:50 AM, Oleg Kalnichevski <olegk@apache.org> wrote:
> On Mon, 2013-02-04 at 22:25 +0100, Philippe Mouawad wrote:
> > Hello,
> >
> > We had an issue reported in JMerer related to HttpClient version 4.X.X
> > which does not happen in version 3.1.
> >
> > Thread dump shows thread hangs within InetAddress$1.getHostByAddr:
> >
> > "Thread Group 1-1" prio=6 tid=0x038f3c00 nid=0xd80 runnable [0x03b7f000]
> > java.lang.Thread.State: RUNNABLE
> > at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
> > at java.net.InetAddress$1.getHostByAddr(Unknown Source)
> > at java.net.InetAddress.getHostFromNameService(Unknown Source)
> > at java.net.InetAddress.getHostName(Unknown Source)
> > at java.net.InetAddress.getHostName(Unknown Source)
> > at sun.security.ssl.SSLSocketImpl.getHost(Unknown Source)
> > - locked <0x1349be48> (a sun.security.ssl.SSLSocketImpl)
> > at sun.security.ssl.Handshaker.getHostSE(Unknown Source)
> > at sun.security.ssl.ClientHandshaker.getKickstartMessage(Unknown
> Source)
> > at sun.security.ssl.Handshaker.kickstart(Unknown Source)
> > at sun.security.ssl.SSLSocketImpl.kickstartHandshake(Unknown
> Source)
> > - locked <0x1349be48> (a sun.security.ssl.SSLSocketImpl)
> > at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown
> > Source)
> > - locked <0x1349c038> (a java.lang.Object)
> > at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
> > at sun.security.ssl.SSLSocketImpl.getSession(Unknown Source)
> > at
> org.apache.http.conn.ssl.AbstractVerifier.verify(AbstractVerifier.java:91)
> > at
> org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:572)
> > at
> org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
> > at
> org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
> > at
> org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:640)
> > at
> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
> > at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
> > at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
> > at
> org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:284)
> > at
> org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:62)
> > at
> org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1075)
> > at
> org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1064)
> > at
> org.apache.jmeter.threads.JMeterThread.process_sampler(JMeterThread.java:426)
> > at
> org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:255)
> > at java.lang.Thread.run(Unknown Source)
> >
> >
> >
> > Do you remember fixing this kind of issue within version 3.X ?, something
> > like this:
> > -
> http://www.velocityreviews.com/forums/showpost.php?p=2959030&postcount=8
> >
>
> Caching of resolved addresses also has downsides. For instance, it
> breaks simple load distribution schemes based on DNS round-robin.
>
> I am pretty certain HC 3.x does not use InetAddress caching. However, HC
> 4.x socket initialization logic is significantly different from that of
> 3.x.
>
> Oleg
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
> For additional commands, e-mail: httpclient-users-help@hc.apache.org
>
>
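Having read all this and compared it with my own situation: my code blocks in SSLSocket.getInputStream(), so that call must be what triggers the handshake!
0x06. Tracing the SSLSocket source
Source locations: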
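The getHostString() method mentioned in the quoted email can be tried in isolation. The following is a minimal sketch, not code from the SDK: the class name ReverseDnsSketch, its helper names, the documentation-only address 192.0.2.1, and the host name example.com are all illustrative assumptions. It shows two standard java.net calls that return a host string without querying the name service, whereas InetAddress.getHostName() on an address built from raw bytes is the call that performs the reverse lookup seen in the stack trace.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; all names here are illustrative only.
class ReverseDnsSketch {

    // Build an InetAddress from raw IPv4 bytes. No host name is attached,
    // so a later getHostName() call would need a reverse DNS lookup.
    static InetAddress fromBytes(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(
                    new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            // Only thrown when the byte array has an illegal length.
            throw new IllegalArgumentException(e);
        }
    }

    // Same bytes, but with the host name supplied up front.
    // getByAddress(String, byte[]) does not consult the name service.
    static InetAddress fromBytesWithName(String host, int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(
                    host, new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // getHostString() (Java 7+) returns the known host name, or the literal
    // IP string when no name is attached, without a reverse lookup.
    static String hostStringOf(InetAddress address, int port) {
        return new InetSocketAddress(address, port).getHostString();
    }

    public static void main(String[] args) {
        InetAddress bare = fromBytes(192, 0, 2, 1);
        InetAddress named = fromBytesWithName("example.com", 192, 0, 2, 1);
        System.out.println(hostStringOf(bare, 443));   // prints "192.0.2.1"
        System.out.println(named.getHostName());       // prints "example.com"
        // bare.getHostName() is deliberately not called here:
        // that is the call that may block on a reverse DNS lookup.
    }
}
```

Whether either approach removes the delay in a particular SSL stack depends on how that stack obtains the peer host, so this is only a way to experiment with the behavior, not a confirmed fix.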
libcore/ojluni/src/main/java/java/net/Socket.java
libcore/ojluni/src/main/java/javax/net/ssl/SSLSocket.java
libcore/ojluni/src/main/java/sun/security/ssl/SSLSocketImpl.java
Socket.java
^
|
SSLSocket.java
^
|
BaseSSLSocketImpl.java
^
|
SSLSocketImpl.java
First, look at getInputStream() in Socket.java.
/**
* Returns an input stream for this socket.
*
* <p> If this socket has an associated channel then the resulting input
* stream delegates all of its operations to the channel. If the channel
* is in non-blocking mode then the input stream's <tt>read</tt> operations
* will throw an {@link java.nio.channels.IllegalBlockingModeException}.
*
* <p>Under abnormal conditions the underlying connection may be
* broken by the remote host or the network software (for example
* a connection reset in the case of TCP connections). When a
* broken connection is detected by the network software the
* following applies to the returned input stream :-
*
* <ul>
*
* <li><p>The network software may discard bytes that are buffered
* by the socket. Bytes that aren't discarded by the network
* software can be read using {@link java.io.InputStream#read read}.
*
* <li><p>If there are no bytes buffered on the socket, or all
* buffered bytes have been consumed by
* {@link java.io.InputStream#read read}, then all subsequent
* calls to {@link java.io.InputStream#read read} will throw an
* {@link java.io.IOException IOException}.
*
* <li><p>If there are no bytes buffered on the socket, and the
* socket has not been closed using {@link #close close}, then
* {@link java.io.InputStream#available available} will
* return <code>0</code>.
*
* </ul>
*
* <p> Closing the returned {@link java.io.InputStream InputStream}
* will close the associated socket.
*
* @return an input stream for reading bytes from this socket.
* @exception IOException if an I/O error occurs when creating the
* input stream, the socket is closed, the socket is
* not connected, or the socket input has been shutdown
* using {@link #shutdownInput()}
*
* @revised 1.4
* @spec JSR-51
*/
public InputStream getInputStream() throws IOException {
    if (isClosed())
        throw new SocketException("Socket is closed");
    if (!isConnected())
        throw new SocketException("Socket is not connected");
    if (isInputShutdown())
        throw new SocketException("Socket input is shutdown");
    final Socket s = this;
    InputStream is = null;
    try {
        is = AccessController.doPrivileged(
            new PrivilegedExceptionAction<InputStream>() {
                public InputStream run() throws IOException {
                    return impl.getInputStream(); // this calls the method of the concrete implementation class
                }
            });
    } catch (java.security.PrivilegedActionException e) {
        throw (IOException) e.getException();
    }
    return is;
}
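The fog is about to lift :). Let's open getInputStream() in each of the implementation classes listed above.
Only SSLSocketImpl.java overrides this method, but unfortunately it does not call any SSL handshake method.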
/**
* Gets an input stream to read from the peer on the other side.
* Data read from this stream was always integrity protected in
* transit, and will usually have been confidentiality protected.
*/
synchronized public InputStream getInputStream() throws IOException {
    if (isClosed()) {
        throw new SocketException("Socket is closed");
    }
    /*
     * Can't call isConnected() here, because the Handshakers
     * do some initialization before we actually connect.
     */
    if (connectionState == cs_START) {
        throw new SocketException("Socket is not connected");
    }
    return input;
}
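Yet some documentation says otherwise (note that the page below documents the SSLSocket class of IAIK's iSaSiLk library, a third-party implementation, not the JDK class):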
http://javadoc.iaik.tugraz.at/isasilk/current/iaik/security/ssl/SSLSocket.html
getInputStream
public java.io.InputStream getInputStream()
throws java.io.IOException
Description copied from interface: SSLCommunication
Returns an input stream for this socket. Invoking this method starts the SSL handshake if setAutoHandshake() is true and it has not been started already.
Specified by:
getInputStream in interface SSLCommunication
Overrides:
getInputStream in class java.net.Socket
Returns:
an input stream for reading bytes from this socket.
Throws:
java.io.IOException - if an error occurs when creating the input stream.
According to this documentation that is indeed how it works, but the source code above says something different!
Now look at what Oracle says.
http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLSocket.html
The initial handshake on this connection can be initiated in one of three ways:
- calling startHandshake which explicitly begins handshakes, or
- any attempt to read or write application data on this socket causes an implicit handshake, or
- a call to getSession tries to set up a session if there is no currently valid session, and an implicit handshake is done.
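To see which of these entry points actually pays for the handshake in a given runtime, each step can be timed separately. The sketch below is an illustration under stated assumptions rather than the SDK's code: HandshakeTiming, IoStep, timeMillis, and measure are hypothetical names. It calls the standard SSLSocket.startHandshake() explicitly before getInputStream(), so the handshake cost (including any reverse DNS lookup it triggers) shows up in the first measurement instead of being hidden inside a later call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.net.ssl.SSLSocket;

// Hypothetical helper; the names are illustrative only.
class HandshakeTiming {

    // A step that may fail with an IOException, like most socket calls.
    interface IoStep {
        void run() throws IOException;
    }

    // Run one step and return the elapsed wall-clock time in milliseconds.
    static long timeMillis(IoStep step) {
        long start = System.nanoTime();
        try {
            step.run();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Expects an already connected SSLSocket.
    // Returns {handshake time, getInputStream time} in milliseconds.
    static long[] measure(SSLSocket socket) {
        long handshakeMs = timeMillis(socket::startHandshake);
        long streamMs = timeMillis(socket::getInputStream);
        return new long[] {handshakeMs, streamMs};
    }

    public static void main(String[] args) {
        // Stand-in step so the helper can be run without a server.
        long ms = timeMillis(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                throw new IOException(e);
            }
        });
        System.out.println("step took about " + ms + " ms");
    }
}
```

If the first number returned by measure() is large and the second is small, the delay belongs to the handshake rather than to creating the stream itself.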
This is confusing!!! Was there a mistake in the earlier analysis?
Yes, there was. Take a look at OpenSSLSocketImpl.java.
external/conscrypt/src/main/java/org/conscrypt/OpenSSLSocketImpl.java
@Override
public InputStream getInputStream() throws IOException {
    checkOpen();
    InputStream returnVal;
    synchronized (stateLock) {
        if (state == STATE_CLOSED) {
            throw new SocketException("Socket is closed.");
        }
        if (is == null) {
            is = new SSLInputStream();
        }
        returnVal = is;
    }
    // Block waiting for a handshake without a lock held. It's possible that the socket
    // is closed at this point. If that happens, we'll still return the input stream but
    // all reads on it will throw.
    waitForHandshake(); // the handshake happens here!
    return returnVal;
}
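Judging from the earlier code analysis, this class should not be the one in use, should it? What we do know is that the SSLSocket is returned by the socket factory; the details are left for a follow-up analysis. At this point, the reason getInputStream() blocks is clear: it waits for the SSL handshake, and the handshake is what triggers the slow reverse DNS lookup.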
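The open question of which implementation class the factory really returns can be checked at runtime instead of by reading the source tree. Here is a minimal sketch, assuming the default factory from javax.net.ssl.SSLSocketFactory.getDefault() stands in for the SDK's CustomSSLSocketFactory (whose body is not shown in this article); SocketClassProbe, hierarchyOf, and createUnconnectedSslSocket are hypothetical names. It prints the superclass chain of the unconnected socket that the factory creates. The output depends on the runtime and its security provider, so none is listed here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical probe; the names are illustrative only.
class SocketClassProbe {

    // Collect the class names from the object's runtime class up to Object.
    static List<String> hierarchyOf(Object o) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            names.add(c.getName());
        }
        return names;
    }

    // Ask the default SSL socket factory for an unconnected socket.
    // Assumption: in the SDK, CustomSSLSocketFactory would be used instead.
    static Socket createUnconnectedSslSocket() {
        try {
            return SSLSocketFactory.getDefault().createSocket();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (Socket socket = createUnconnectedSslSocket()) {
            for (String name : hierarchyOf(socket)) {
                System.out.println(name);
            }
        }
    }
}
```

Running the same probe inside the app against the socket returned by the SDK's own factory would show directly whether getInputStream() resolves to the handshaking implementation.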