● The game server occasionally shows the following symptom:
2018-11-10 13:51:43,689 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 4, Type: DEFAULT, Logged: No, IP: 192.168.0.104:55975 }
2018-11-10 13:52:19,338 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 5, Type: DEFAULT, Logged: No, IP: 192.168.0.104:55989 } on Server port: 9988 <---> 55989
2018-11-10 13:53:08,704 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 5, Type: DEFAULT, Logged: No, IP: 192.168.0.104:55989 }
2018-11-10 13:53:10,609 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 6, Type: DEFAULT, Logged: No, IP: 192.168.0.104:55994 } on Server port: 9988 <---> 55994
2018-11-10 13:53:55,742 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 6, Type: DEFAULT, Logged: No, IP: 192.168.0.104:55994 }
2018-11-10 13:54:04,833 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 7, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56014 } on Server port: 9988 <---> 56014
2018-11-10 13:54:14,219 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 7, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56014 }
2018-11-10 13:54:27,805 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 8, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56021 } on Server port: 9988 <---> 56021
2018-11-10 13:55:13,444 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 8, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56021 }
2018-11-10 13:55:20,640 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 9, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56028 } on Server port: 9988 <---> 56028
2018-11-10 13:55:31,303 | INFO | pool-2-thread-5 | c.s.v2.util.stats.CCULoggerTask | CCU stats: { Zone: --=={{{ AdminZone }}}==-- }, CCU: 0/0
2018-11-10 13:55:31,304 | INFO | pool-2-thread-5 | c.s.v2.util.stats.CCULoggerTask | CCU stats: { Zone: COK1 }, CCU: 0/0
2018-11-10 13:55:31,304 | INFO | pool-2-thread-5 | c.s.v2.util.stats.CCULoggerTask | CCU stats: CCU: 0/0
2018-11-10 13:56:05,317 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 9, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56028 }
2018-11-10 13:56:12,304 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 10, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56036 } on Server port: 9988 <---> 56036
2018-11-10 13:57:43,277 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 10, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56036 }
2018-11-10 13:58:13,233 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 11, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56050 } on Server port: 9988 <---> 56050
2018-11-10 13:58:18,775 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 11, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56050 }
2018-11-10 13:58:36,421 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 12, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56058 } on Server port: 9988 <---> 56058
2018-11-10 13:58:57,724 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session removed: { Id: 12, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56058 }
2018-11-10 13:59:06,673 | INFO | SocketReader | c.s.b.sessions.DefaultSessionManager | Session created: { Id: 13, Type: DEFAULT, Logged: No, IP: 192.168.0.104:56062 } on Server port: 9988 <---> 56062
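● Once this bug appears, no player can log in any more, while players who are already online are unaffected. Every player who goes through the login flow gets stuck on the 50% loading screen and cannot get in, no matter what.
● After a long investigation, we finally found that the cause was threading. Let's go through it step by step.
First, the SmartFox source uses a thread pool to run the login flow, and this thread pool hides a nasty trap. Look at the following source: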
public final class SFSEventManager extends BaseCoreService implements ISFSEventManager {
    private int corePoolSize = 4;
    private int maxPoolSize = 5;
    private int threadKeepAliveTime = 60;
    private final ThreadPoolExecutor threadPool;
    private final Map<SFSEventType, Set<ISFSEventListener>> listenersByEvent;
    private final Logger logger;

    public SFSEventManager() {
        this.setName("SFSEventManager");
        this.logger = LoggerFactory.getLogger(SFSEventManager.class);
        this.threadPool = new ThreadPoolExecutor(this.corePoolSize, this.maxPoolSize, (long)this.threadKeepAliveTime, TimeUnit.SECONDS, new LinkedBlockingQueue());
        this.listenersByEvent = new ConcurrentHashMap();
    }
    // ... (rest of the class omitted)
}
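As the code above shows, this thread pool has only 4 core threads. Only 4. It is worth repeating: only 4. The maximum pool size is 5, but that makes no difference: the pool will never hold more than 4 threads. Why? Look at the blocking queue the pool uses, LinkedBlockingQueue, and at the capacity its default constructor sets: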
public LinkedBlockingQueue() {
    this(Integer.MAX_VALUE);
}
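See that? The queue capacity is Integer.MAX_VALUE, which is more than 2.1 billion. What does that mean in practice? ThreadPoolExecutor only starts a thread beyond the core size when the queue is full and refuses a new task. For a small game company like ours, the backlog of pending login tasks will never get anywhere near that number, so the extra thread is never created. Put another way, the maximum pool size of 5 never takes effect, and the pool stays at its core size of 4 forever. That is the first pitfall!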
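The behaviour described above can be reproduced outside SmartFox with a small standalone sketch. It is not SmartFox code: the class `PoolGrowthDemo` and its methods are hypothetical names made up for illustration, and only the pool parameters (core 4, max 5, keep-alive 60 seconds) are taken from the source quoted above. It fills the pool with blocking tasks and reports how many threads were created, once with the default unbounded queue and once with a queue capped at 10 entries.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {

    // Builds a pool with the same sizes as SFSEventManager (core 4, max 5, keep-alive 60 s),
    // submits `tasks` jobs that all block, and returns how many threads the pool created.
    static int poolSizeWith(BlockingQueue<Runnable> queue, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(4, 5, 60L, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until we have measured
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let every task finish
            pool.shutdown();
        }
    }

    // Default LinkedBlockingQueue: capacity Integer.MAX_VALUE, so it never fills up.
    static int unboundedQueuePoolSize() {
        return poolSizeWith(new LinkedBlockingQueue<>(), 100);
    }

    // Capacity 10: tasks 1-4 start the core threads, tasks 5-14 fill the queue,
    // and task 15 is refused by the queue, which makes the pool start a fifth thread.
    static int boundedQueuePoolSize() {
        return poolSizeWith(new LinkedBlockingQueue<>(10), 15);
    }

    public static void main(String[] args) {
        System.out.println("unbounded queue: " + unboundedQueuePoolSize() + " threads");
        System.out.println("bounded queue:   " + boundedQueuePoolSize() + " threads");
    }
}
```

With the unbounded queue the pool stays at 4 threads however many tasks are waiting. With the bounded queue it grows to 5, but only once the queue is full.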
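● Pitfall 2
As said above, only 4 threads run the login flow. If there are few players and concurrency is low, things still look more or less normal. But in some special situations, such as right after the game launches or right after maintenance ends, a large batch of players logs in at once, and that is when the trouble starts.
The login flow loads a lot of data. Suppose one player takes 5 seconds. If 1000 players log in at the same moment, the later ones can only wait until the earlier ones have finished. With 4 threads and 5 seconds per player, the 1000th player finishes about 1000 × 5 / 4 = 1250 seconds later, roughly 21 minutes!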
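● Pitfall 3
Pitfall 3 is the worst one. You can grit your teeth and live with the previous ones, but this one kills the game outright. Why?
As said before, only 4 threads run logins. That means you have to guarantee that none of these 4 threads ever gets into trouble, for example by blocking. Once threads start to block, it is game over: all 4 end up blocked (as soon as one blocks, the other three quickly follow), and players who log in afterwards never get a thread to run their login. That is why the log shows sessions being created and removed over and over.
We ran an experiment: we made the thread sleep inside the login logic and logged in with one account. One thread is now occupied and stays asleep there. Repeat this four times and all threads are used up. At that point the login thread pool is helpless. It cannot create a new thread to run logins, because the blocking queue never fills up. In other words, the maximum pool size of 5 has no effect.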
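The experiment can be approximated without a SmartFox server. The sketch below is only an illustration under assumptions: `LoginStarvationDemo` and `fifthLoginRuns` are hypothetical names, the "login" tasks are plain placeholders, and only the pool configuration matches the SFSEventManager source quoted earlier. Four tasks block forever, then a fifth task is submitted and given 500 ms to start.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class LoginStarvationDemo {

    // Returns true if a fifth "login" gets to run while four earlier ones are stuck.
    static boolean fifthLoginRuns() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 5, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch unblock = new CountDownLatch(1);
        try {
            // Four logins that block, standing in for the sleep added to the login logic.
            for (int i = 0; i < 4; i++) {
                pool.execute(() -> {
                    try {
                        unblock.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The fifth login lands in the unbounded queue; no fifth thread is created for it.
            Future<?> fifth = pool.submit(() -> { });
            try {
                fifth.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // still waiting in the queue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            } catch (ExecutionException e) {
                return false;
            }
        } finally {
            unblock.countDown(); // release the stuck tasks so the JVM can exit
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fifth login ran: " + fifthLoginRuns());
    }
}
```

The fifth task times out while the four blocked tasks hold all the threads, which matches what stuck players see: a session is created, the login never runs, and the session is removed.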
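● Solutions
1. If you are able to, modify the source.
2. Find out why the threads block.
3. In my opinion, the most important thing is working out how to get the blocked threads released.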
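For point 3, one possible approach, shown here only as a sketch and not as anything SmartFox provides, is to run the slow part of the login on a separate worker pool with a time limit and to interrupt it when the limit is exceeded. All names below (`TimedLoginDemo`, `runWithTimeout`, `stuckStepIsReleased`) are hypothetical. Note that interruption only helps when the blocked code responds to interrupts (for example `Thread.sleep`, `Object.wait`, or a blocking queue). A thread stuck waiting for a `synchronized` lock or in a plain blocking socket read will not be released this way.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedLoginDemo {

    // Runs `step` on `workers` and waits at most `timeoutMs` milliseconds for it.
    // On timeout the step is cancelled, which interrupts the worker thread running it.
    // Returns true only if the step finished normally in time.
    static boolean runWithTimeout(ExecutorService workers, Runnable step, long timeoutMs) {
        Future<?> future = workers.submit(step);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck step so its thread is freed
            return false;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false; // the step itself threw an exception
        }
    }

    // A stand-in for a login step that hangs; it reacts to interruption by returning.
    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Checks that a single worker thread is usable again after a stuck step timed out.
    static boolean stuckStepIsReleased() {
        ExecutorService workers = Executors.newFixedThreadPool(1);
        try {
            boolean finished = runWithTimeout(workers, () -> sleepQuietly(10_000), 200);
            boolean nextRan = runWithTimeout(workers, () -> { }, 2_000);
            return !finished && nextRan;
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker released after timeout: " + stuckStepIsReleased());
    }
}
```

The calling thread is then held for at most the timeout instead of forever, so a single hanging login cannot pin one of the 4 login threads permanently. The timeout value has to be chosen with the real login duration in mind.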