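I recently fixed a problem: when the mobile data connection drops suddenly, or the phone enters airplane mode, YouTube's online video playback does not tell the user that the network is unavailable. Most Android applications receive an intent announcing the data state change after the data connection drops or the phone enters airplane mode; they then see that the data connection is unavailable and notify the user. Some applications, however, exchange data over plain sockets only, and such an application has no way of knowing that its socket is now a dead connection. After all, a network can legitimately go for a while without sending any reply packets.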
Later I ran a socket communication experiment on Linux as well and found that stock Linux sockets behave in exactly this way.
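One common application-level mitigation for a read() that blocks indefinitely is a receive timeout. The following is a minimal sketch, assuming a POSIX socket; it is not what was done on the phone in this post, and the helper names set_recv_timeout_ms and read_or_timeout are made up for illustration. A timeout only shows that no data arrived in time; it does not prove that the connection is dead.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: give a socket a receive timeout so that
 * read()/recv() cannot block forever.
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Hypothetical helper: read once.
 * Returns the number of bytes read, 0 on orderly close, -1 on error,
 * or -2 if the receive timeout expired with no data. */
ssize_t read_or_timeout(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A caller would set the timeout once after connecting and treat a run of -2 results as a hint to check connectivity or reconnect.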
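So how do I solve my problem? Given that Android's Linux kernel has already addressed this issue, why does my phone still show it? Moreover, YouTube misbehaves only when mobile data drops; when WIFI is turned off it does report that the network is unavailable. Testing with a native program, I found that when mobile data drops my read() clearly blocks, whereas when WIFI drops it returns the error "socket time out" right away.
The libraries are identical from top to bottom; only the mobile driver and the WIFI driver differ, so I assumed the driver had to be at fault. Some analysis turned up no error in the driver. However, logging added around the kernel socket buffer showed that the reader entered the sleep queue when the data connection dropped. In the WIFI case it was woken up later, but in the mobile case it was never woken by ConnectivityService, so I could conclude that the difference probably lay in how ConnectivityService handled the two cases.
Further analysis showed that when the framework calls NetworkUtils.resetConnections(iface, resetMask); the native code's result = ioctl(ifc_ctl_sock, SIOCKILLADDR, &ifr); immediately fails with a "device does not exist" error.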
Android: when disabling the data call, the Android framework calls the function ifc_reset_connections(), which issues the SIOCKILLADDR ioctl.
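Now it was finally clear. Comparing against a Galaxy Nexus, I found that on my phone the network interface is removed once the data connection drops, whereas Android's design only requires the interface state to be set to down. This is evidently a design flaw in the mobile network interface driver.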
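Solutions:
1. Workaround: change the order of operations so that, for mobile, the socket reset that clears the sockets bound to the address runs before the connection is torn down.
2. Fix the network interface driver for the mobile connection.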
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
From material I found online, Android modified the Linux kernel to solve this problem. The following is useful background:
http://oss.sgi.com/projects/netdev/archive/2001-01/txt4wEDN17dER.txt
diff -u --exclude *~ --recursive linux-2.4.0-orig/include/linux/sockios.h linux-hacked-dynip/include/linux/sockios.h
--- linux-2.4.0-orig/include/linux/sockios.h Sat Dec 30 00:20:32 2000
+++ linux-hacked-dynip/include/linux/sockios.h Sat Jan 27 17:04:34 2001
@@ -65,6 +65,7 @@
#define SIOCDIFADDR 0x8936 /* delete PA address */
#define SIOCSIFHWBROADCAST 0x8937 /* set hardware broadcast addr */
#define SIOCGIFCOUNT 0x8938 /* get number of devices */
+#define SIOCKILLADDR 0x8939 /* kill all connections with this local address */
#define SIOCGIFBR 0x8940 /* Bridging support */
#define SIOCSIFBR 0x8941 /* Set bridging options */
diff -u --exclude *~ --recursive linux-2.4.0-orig/include/net/tcp.h linux-hacked-dynip/include/net/tcp.h
--- linux-2.4.0-orig/include/net/tcp.h Fri Jan 5 21:41:37 2001
+++ linux-hacked-dynip/include/net/tcp.h Sat Jan 27 18:02:21 2001
@@ -787,9 +787,8 @@
extern int tcp_disconnect(struct sock *sk, int flags);
extern void tcp_unhash(struct sock *sk);
-
extern int tcp_v4_hash_connecting(struct sock *sk);
-
+extern void tcp_v4_zap_saddr(u32 saddr);
/* From syncookies.c */
extern struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
diff -u --exclude *~ --recursive linux-2.4.0-orig/net/ipv4/af_inet.c linux-hacked-dynip/net/ipv4/af_inet.c
--- linux-2.4.0-orig/net/ipv4/af_inet.c Tue Jan 2 09:26:19 2001
+++ linux-hacked-dynip/net/ipv4/af_inet.c Sat Jan 27 18:27:38 2001
@@ -854,6 +854,7 @@
case SIOCSIFPFLAGS:
case SIOCGIFPFLAGS:
case SIOCSIFFLAGS:
+ case SIOCKILLADDR:
return(devinet_ioctl(cmd,(void *) arg));
case SIOCGIFBR:
case SIOCSIFBR:
diff -u --exclude *~ --recursive linux-2.4.0-orig/net/ipv4/devinet.c linux-hacked-dynip/net/ipv4/devinet.c
--- linux-2.4.0-orig/net/ipv4/devinet.c Sat Dec 30 00:22:05 2000
+++ linux-hacked-dynip/net/ipv4/devinet.c Sat Jan 27 21:09:48 2001
@@ -510,6 +510,7 @@
case SIOCSIFBRDADDR: /* Set the broadcast address */
case SIOCSIFDSTADDR: /* Set the destination address */
case SIOCSIFNETMASK: /* Set the netmask for the interface */
+ case SIOCKILLADDR: /* Kill all connections with this local address */
if (!capable(CAP_NET_ADMIN))
return -EACCES;
if (sin->sin_family != AF_INET)
@@ -536,7 +537,10 @@
break;
}
- if (ifa == NULL && cmd != SIOCSIFADDR && cmd != SIOCSIFFLAGS) {
+ if (ifa == NULL
+ && cmd != SIOCSIFADDR
+ && cmd != SIOCSIFFLAGS
+ && cmd != SIOCKILLADDR) {
ret = -EADDRNOTAVAIL;
goto done;
}
@@ -646,6 +650,9 @@
ifa->ifa_prefixlen = inet_mask_len(ifa->ifa_mask);
inet_insert_ifa(ifa);
}
+ break;
+ case SIOCKILLADDR: /* Kill all connections with this local address */
+ tcp_v4_zap_saddr(sin->sin_addr.s_addr);
break;
}
done:
diff -u --exclude *~ --recursive linux-2.4.0-orig/net/ipv4/tcp_ipv4.c linux-hacked-dynip/net/ipv4/tcp_ipv4.c
--- linux-2.4.0-orig/net/ipv4/tcp_ipv4.c Fri Jan 5 21:17:42 2001
+++ linux-hacked-dynip/net/ipv4/tcp_ipv4.c Sat Jan 27 18:07:25 2001
@@ -390,6 +390,38 @@
wake_up(&tcp_lhash_wait);
}
+/* Terminate all active connections with a local address equal to
+ * SADDR. If sysctl_ip_dynaddr is set, connections in the SYN_SENT
+ * state are not closed, because their source address will presumably
+ * be rewritten.
+ */
+void tcp_v4_zap_saddr(u32 saddr)
+{
+ int i;
+ rwlock_t *lock;
+ struct sock *sk;
+
+ for (i = 0; i < (tcp_ehash_size<<1); i++) {
+ lock = &tcp_ehash[i].lock;
+
+ read_lock(lock);
+
+ for(sk = tcp_ehash[i].chain; sk; sk = sk->next)
+ if(sk->rcv_saddr == saddr)
+ {
+ if(sysctl_ip_dynaddr && sk->state == TCP_SYN_SENT)
+ continue;
+
+ sk->err = ENETRESET;
+ sk->error_report(sk);
+
+ tcp_done(sk);
+ }
+
+ read_unlock(lock);
+ }
+}
+
/* Don't inline this cruft. Here are some nice properties to
* exploit here. The BSD API does not allow a listening TCP
* to specify the remote port nor the remote address for the
Others have found that the same problem arises when handling dynamic IP addresses:
When the IP address of an interface changes, TCP connections with the
old source address are useless. Applications are not notified of this
and time out ordinarily, just as if nothing had happened. This
behaviour isn't very helpful when you have a dynamic IP and know
you're probably not going to get the old one back. In that case, you
want processes to get errors when they try to use one of the dead
connections, so they can handle the disconnect more cleanly. Otherwise
fetchmail, etc. can just hang waiting for ages. Andi Kleen implemented
this functionality with a per interface flag in 2.2. See
ftp.suse.com:/pub/people/ak/v2.2/iff-dynamic*.
The following patch against 2.4.0 does it a different way. It
introduces a new ioctl, called SIOCKILLADDR. When this ioctl is
called, it makes all IPv4 sockets with the specified source address
return -ENETRESET when they are used.
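From the application's side, a socket killed this way shows up as a failed read() with errno set. The sketch below shows how a caller might tell that case apart from other outcomes. The names classify_read and conn_status are hypothetical, and the exact errno is an assumption based on the patch description above; a given kernel may report something else (the WIFI case earlier in this post surfaced as a timeout error).

```c
#include <errno.h>
#include <sys/types.h>

enum conn_status {
    CONN_DATA,          /* read() returned payload bytes */
    CONN_CLOSED,        /* peer closed the connection in an orderly way */
    CONN_RETRY,         /* transient condition, try the read again */
    CONN_ADDR_KILLED,   /* local address went away, connection was reset */
    CONN_FAILED         /* any other error */
};

/* Hypothetical helper: interpret the return value of read() together
 * with the errno value captured right after the call. */
enum conn_status classify_read(ssize_t n, int err)
{
    if (n > 0)
        return CONN_DATA;
    if (n == 0)
        return CONN_CLOSED;
    if (err == EINTR || err == EAGAIN || err == EWOULDBLOCK)
        return CONN_RETRY;
    if (err == ENETRESET)
        return CONN_ADDR_KILLED;
    return CONN_FAILED;
}
```

A caller would invoke it as classify_read(n, errno) immediately after n = read(fd, buf, len), and on CONN_ADDR_KILLED close the socket and tell the user that the network is unavailable.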