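As is well known, Androidpn has the following bug: when the Server restarts, the Client does not reconnect automatically.
I traced through the code today and am recording the approach to the fix here:
The Androidpn Client actually already has a ReconnectionThread class. This thread calls xmppManager.connect() to reconnect. The problem is that the original author's code does not use the reconnection thread correctly.
First, let's sort out the Client's code logic:
XmppManager contains a LoginTask login thread, which the Client starts every time the service starts. The run method of LoginTask contains the following code: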
// will_awoke: start the keep-alive heartbeat, which triggers ReconnectionThread when it fails
getConnection().startKeepAliveThread(xmppManager);
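In other words, it calls the startKeepAliveThread method of XMPPConnection. You may ask: XMPPConnection lives in asmack.jar, so how can I change its code? For that reason I strongly recommend pulling in asmack as source code rather than as a jar, because you will later find that this communication library has many problems that can only be fixed by modifying its source.
Continuing with the code: inside XMPPConnection's startKeepAliveThread method, you will find that it actually calls packetWriter.startKeepAliveProcess(xmppManager). PacketWriter#startKeepAliveProcess looks like this: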
/**
 * Starts the keep alive process. A white space (aka heartbeat) is going to be
 * sent to the server every 30 seconds (by default) since the last stanza was sent
 * to the server.
 * @param xmppManager
 * @throws Exception
 */
void startKeepAliveProcess(XmppManager xmppManager) throws Exception {
    // Schedule a keep-alive task to run if the feature is enabled. will write
    // out a space character each time it runs to keep the TCP/IP connection open.
    int keepAliveInterval = SmackConfiguration.getKeepAliveInterval();
    if (keepAliveInterval > 0) {
        KeepAliveTask task = new KeepAliveTask(keepAliveInterval, xmppManager);
        keepAliveThread = new Thread(task);
        task.setThread(keepAliveThread);
        keepAliveThread.setDaemon(true);
        keepAliveThread.setName("Smack Keep Alive (" + connection.connectionCounterValue + ")");
        keepAliveThread.start();
    }
}
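From this code and its comment, we can see that to keep the TCP connection to the server open, the client sends a single space to the server as a heartbeat whenever no message has been sent for 30 seconds. (30 seconds is the default; it can be configured in the SmackConfiguration class.) (PS: the English comment is already clear enough, so paraphrasing it here is somewhat redundant.)
The heartbeat is implemented in the KeepAliveTask thread. The run method of KeepAliveTask is as follows: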
public void run() {
    try {
        // Sleep 15 seconds before sending first heartbeat. This will give time to
        // properly finish TLS negotiation and then start sending heartbeats.
        Thread.sleep(15000);
    }
    catch (InterruptedException ie) {
        // Do nothing
    }
    while (!done && keepAliveThread == thread) {
        // Send heartbeat if no packet has been sent to the server for a given time
        if (System.currentTimeMillis() - lastActive >= delay) {
            try {
                synchronized (writer) {
                    writer.write(" ");
                    writer.flush();
                }
            }
            // bug fixed
            // @will_awoke When the server restarts, writer.flush() throws an
            // SSLException while the client is sending the heartbeat.
            // Catch that exception and start ReconnectionThread.
            catch (SSLException ssl) {
                Log.e("SSLException", ssl.toString());
                connection.disconnect();
                xmppManager.startReconnectionThread();
            } catch (SocketException se) {
                Log.e("SocketException", se.toString());
                connection.disconnect();
                xmppManager.startReconnectionThread();
            } catch (IOException io) {
                Log.e("IOException", io.toString());
                connection.disconnect();
                xmppManager.startReconnectionThread();
            } catch (Exception e) {
                Log.e("Exception", e.toString());
                connection.disconnect();
                xmppManager.startReconnectionThread();
            }
        }
        try {
            // Sleep until we should write the next keep-alive.
            Thread.sleep(delay);
        }
        catch (InterruptedException ie) {
            // Do nothing
        }
    }
}
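At its core, the run method just sends a space to the server; the logic is simple and the comments are clear. Back to the question this post started with: why doesn't the client reconnect automatically when the server restarts? Look at the catch blocks. As my comment in the code notes, when the server restarts, writer.flush() throws javax.net.ssl.SSLException: Write error: ssl=0x2a181060: I/O error during system call, Broken pipe while the client is sending the heartbeat. The original author's code caught only SocketException and IOException here, and it disconnected the old connection and started ReconnectionThread only in the SocketException branch. SSLException was not handled explicitly; since it is a subclass of IOException, it fell into the IOException branch, which did not reconnect. That is the cause of the no-reconnect bug. The fix, as shown in the code above, is to catch SSLException and handle it. I strongly recommend calling disconnect and startReconnectionThread in every catch block.
In fact, when the client shuts down its service directly, the Server side also sees an SSLException, as follows: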
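Because SSLException and SocketException both extend IOException, the same recovery could also be written with a single handler. The following is only a minimal sketch of that idea, not the Androidpn or asmack code: HeartbeatSketch, FailureHandler and brokenWriter are hypothetical names introduced for illustration, and the handler stands in for the connection.disconnect() plus xmppManager.startReconnectionThread() pair.

```java
import java.io.IOException;
import java.io.Writer;
import java.net.SocketException;
import javax.net.ssl.SSLException;

public class HeartbeatSketch {

    /** Hypothetical callback standing in for disconnect() + startReconnectionThread(). */
    public interface FailureHandler {
        void onHeartbeatFailed(IOException cause);
    }

    /**
     * Writes one whitespace heartbeat. Returns true on success. On any I/O failure
     * (SSLException and SocketException included, since both extend IOException)
     * it notifies the handler and returns false.
     */
    public static boolean sendHeartbeat(Writer writer, FailureHandler handler) {
        try {
            synchronized (writer) {
                writer.write(" ");
                writer.flush();
            }
            return true;
        } catch (IOException e) {
            handler.onHeartbeatFailed(e);
            return false;
        }
    }

    /** A writer whose flush() always fails, imitating a socket the server has dropped. */
    public static Writer brokenWriter(final IOException failure) {
        return new Writer() {
            @Override public void write(char[] buf, int off, int len) { }
            @Override public void flush() throws IOException { throw failure; }
            @Override public void close() { }
        };
    }

    public static void main(String[] args) {
        final int[] reconnects = {0};
        FailureHandler handler = cause -> reconnects[0]++;
        sendHeartbeat(brokenWriter(new SSLException("Broken pipe")), handler);
        sendHeartbeat(brokenWriter(new SocketException("Connection reset")), handler);
        System.out.println("reconnect requests: " + reconnects[0]);
    }
}
```

Separate catch blocks, as in the patched KeepAliveTask, are still useful when each exception type should be logged under its own tag; the single handler only removes the risk of forgetting the recovery calls in one branch.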
javax.net.ssl.SSLException: Inbound closed before receiving peer's close_notify: possible truncation attack?
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1619)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1587)
at sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:1517)
at org.apache.mina.filter.ssl.SslHandler.destroy(SslHandler.java:168)
at org.apache.mina.filter.ssl.SslFilter.sessionClosed(SslFilter.java:393)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextSessionClosed(DefaultIoFilterChain.java:395)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$900(DefaultIoFilterChain.java:46)
at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.sessionClosed(DefaultIoFilterChain.java:778)
at org.apache.mina.core.filterchain.IoFilterAdapter.sessionClosed(IoFilterAdapter.java:95)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextSessionClosed(DefaultIoFilterChain.java:395)
at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireSessionClosed(DefaultIoFilterChain.java:388)
at org.apache.mina.core.service.IoServiceListenerSupport.fireSessionDestroyed(IoServiceListenerSupport.java:210)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.removeNow(AbstractPollingIoProcessor.java:535)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.removeSessions(AbstractPollingIoProcessor.java:497)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:61)
at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:974)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
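This happens because when the Client shuts down directly, the Server cannot send the stream end tag </stream:stream> to the connection belonging to that Client, which results in the exception.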
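That concludes this issue.
Update: 2014-04-11 16:17
Bug fixed: after a single disconnect, the client keeps reconnecting endlessly, even after it has reconnected and logged in successfully.
This is because ReconnectionThread only checks isInterrupted(), then sleeps and calls xmppManager.connect() again. Nothing ends the loop after a successful connection, so it keeps reconnecting.
The fix is as follows.
ReconnectionThread run method: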
while (!isInterrupted()) {
    Log.d(LOGTAG, "Trying to reconnect in " + waiting()
            + " seconds" + " thread : " + hashCode());
    Thread.sleep((long) waiting() * 1000L);
    xmppManager.connect();
    waiting++;
    Log.d(LOGTAG, "xmppManager is connected : " + xmppManager.isConnected());
    // will_awoke
    // bug fixed: after one disconnect the thread kept reconnecting,
    // even after it had connected and logged in successfully
    if (xmppManager.isConnected()) {
        interrupt();
    }
}
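The stop-on-success idea in the loop above can be tried in isolation. The sketch below is an illustration under assumptions, not the Androidpn class: Connector is a hypothetical stand-in for the two XmppManager methods the loop needs, the delay is fixed rather than taken from waiting(), and maxAttempts is an extra safety bound that the original does not have. It leaves the loop with break instead of calling interrupt() on itself.

```java
public class ReconnectLoopSketch {

    /** Hypothetical stand-in for the parts of XmppManager the loop needs. */
    public interface Connector {
        void connect();
        boolean isConnected();
    }

    /**
     * Retries connect() until the connector reports a live connection, the thread
     * is interrupted, or maxAttempts is reached. Returns the number of attempts made.
     */
    public static int reconnectUntilConnected(Connector connector, long delayMillis, int maxAttempts) {
        int attempts = 0;
        while (!Thread.currentThread().isInterrupted() && attempts < maxAttempts) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set and stop
                break;
            }
            connector.connect();
            attempts++;
            if (connector.isConnected()) {
                break; // connected: leave the loop instead of reconnecting forever
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Fake connector that only reports success from the third attempt on.
        Connector flaky = new Connector() {
            private int calls = 0;
            @Override public void connect() { calls++; }
            @Override public boolean isConnected() { return calls >= 3; }
        };
        System.out.println("attempts: " + reconnectUntilConnected(flaky, 10L, 10));
    }
}
```

With the fake connector above, the loop stops after the third attempt rather than running until the thread is interrupted from outside.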
Change XmppManager's isConnected() to public. Also modify the startReconnectionThread() method:
public void startReconnectionThread() {
    Thread reconnection = new ReconnectionThread(this);
    reconnection.setName("Xmpp Reconnection Thread");
    reconnection.start();
    /*synchronized (reconnection) {
        if (!reconnection.isAlive()) {
            reconnection.setName("Xmpp Reconnection Thread");
            reconnection.start();
        }
    }*/
}
In PersistentConnectionListener's connectionClosedOnError() method, comment out xmppManager.startReconnectionThread:
public void connectionClosedOnError(Exception e) {
    Log.d(LOGTAG, "connectionClosedOnError()..." + e);
    if (xmppManager.getConnection() != null
            && xmppManager.getConnection().isConnected()) {
        xmppManager.getConnection().disconnect();
    }
    //xmppManager.startReconnectionThread();
}
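The reason: on an SSLException, running the closeConnect method fires connectionClosedOnError, which in turn calls xmppManager.startReconnectionThread, and that path would also perform a reconnect. A single reconnection thread instance is enough to do the job, so the call is commented out here.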
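An alternative to commenting out one of the two call sites would be to let both request a reconnect and ignore the request when a reconnection is already running. The following is a hedged sketch of that alternative, not what this post or Androidpn does: SingleReconnectSketch and startReconnection are hypothetical names, and the Runnable stands in for the body of ReconnectionThread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class SingleReconnectSketch {

    // True while a reconnection thread is running.
    private final AtomicBoolean reconnecting = new AtomicBoolean(false);

    /**
     * Starts the reconnection work on a new thread unless one is already running.
     * Returns the started thread, or null if the request was ignored.
     */
    public Thread startReconnection(final Runnable reconnectWork) {
        if (!reconnecting.compareAndSet(false, true)) {
            return null; // another reconnection is already in flight
        }
        Thread t = new Thread(() -> {
            try {
                reconnectWork.run();
            } finally {
                reconnecting.set(false); // allow a later disconnect to reconnect again
            }
        });
        t.setName("Xmpp Reconnection Thread");
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        SingleReconnectSketch guard = new SingleReconnectSketch();
        final CountDownLatch release = new CountDownLatch(1);
        // First request: the work blocks until released, imitating a slow reconnect.
        Thread first = guard.startReconnection(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Second request arrives while the first is still running and is ignored.
        Thread second = guard.startReconnection(() -> { });
        System.out.println("second request started a thread: " + (second != null));
        release.countDown();
        first.join();
    }
}
```

The compareAndSet call makes the check and the claim one atomic step, so two callers racing from the keep-alive thread and the connection listener cannot both start a thread.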