Java socketRead0 problem

I'm developing a web crawler with HtmlUnit and have set all the required timeouts, but the app hangs when the server of a site being crawled stops responding. This is what I see when I take a thread dump with Java VisualVM:

java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.net.SocksSocketImpl.readSocksReply(SocksSocketImpl.java:88)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:429)
    at java.net.Socket.connect(Socket.java:525)
    at com.gargoylesoftware.htmlunit.SocksSocketFactory.connectSocket(SocksSocketFactory.java:89)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
    at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:149)
    at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:573)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:425)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:776)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:152)
    at app.plugin.core.net.QHttpWebConnection.getResponse(QHttpWebConnection.java:30)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)

This is really frustrating since I have no control over those servers, and the issue is seriously affecting the performance of my application.

Question:

How can I solve this issue?

Is there a way to get a list of the socket connections opened by a Java app and use it to terminate a socket, for example to simulate the server closing the connection?

Solution

I believe that when you are in a Java native method, the stack trace will say RUNNABLE even if the call is actually blocked waiting for some event. In essence, I don't believe Java has any way of knowing what a native method is actually doing, so it flags these calls as RUNNABLE. I have seen this with socketRead0() and socketAccept() -- both of which typically block.
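One way to check this for yourself is a small experiment using only the JDK: connect to a local server that never sends anything, block a thread in read(), and print the state the JVM reports for it. This is a sketch, not part of the original setup; the class and method names (BlockedReadState, observeBlockedReader) are made up for illustration, and the 500 ms pause is an arbitrary guess at "long enough for the reader to enter the read".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockedReadState {

    /** Blocks a reader thread on a silent local server and returns its reported state. */
    static Thread.State observeBlockedReader() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read(); // blocks: the server never writes
                } catch (IOException expected) {
                    // the sockets are closed when the try block exits, ending the read
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            Thread.sleep(500);        // arbitrary pause so the reader is inside read()
            return reader.getState(); // sampled while the sockets are still open
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Reader state while blocked: " + observeBlockedReader());
    }
}
```

If the reasoning above holds on your JVM, the printed state should be RUNNABLE even though the thread is doing nothing but waiting. Treat the result as something to observe on your own setup rather than a guarantee.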

You need to set your timeout to a reasonable length: short enough that a request is abandoned when the server is not responding, but not so short that it fails when the server is simply busy. Your application should also be written to use multiple threads. I would try running a dozen or more threads and have each thread wait up to five or ten seconds for a response. There is virtually no overhead in having a handful of threads waiting. When writing a web spider, you should also be mindful of not bombarding a server with lots of requests.
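The advice above can be sketched with plain java.net sockets rather than HtmlUnit, so the example stays self-contained. It sets a connect timeout and a read timeout on each socket and runs the requests on a small fixed pool. The names (TimeoutFetchSketch, fetch, fetchAll, demo), the pool size of 12, and the millisecond values are illustrative assumptions, and the "silent server" is just a local ServerSocket that never replies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TimeoutFetchSketch {

    /** Tries to read one byte; reports "timeout" if a timeout expires first. */
    static String fetch(InetSocketAddress target, int connectMs, int readMs) {
        try (Socket socket = new Socket()) {
            socket.connect(target, connectMs); // upper bound on connecting
            socket.setSoTimeout(readMs);       // upper bound on each blocking read
            int b = socket.getInputStream().read();
            return b < 0 ? "closed" : "data";
        } catch (SocketTimeoutException e) {
            return "timeout";
        } catch (IOException e) {
            return "error: " + e.getMessage();
        }
    }

    /** Runs one fetch per target on a fixed pool so a slow server stalls only its own worker. */
    static List<String> fetchAll(List<InetSocketAddress> targets, int workers,
                                 int connectMs, int readMs) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (InetSocketAddress target : targets) {
                futures.add(pool.submit(() -> fetch(target, connectMs, readMs)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add("error: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    results.add("interrupted");
                    break;
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }

    /** Fetches three times from a local server that accepts connections but never replies. */
    static List<String> demo() {
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            InetSocketAddress addr =
                    new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort());
            return fetchAll(Arrays.asList(addr, addr, addr), 12, 2000, 500);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In an HtmlUnit crawler the equivalent knob is the timeout setting on WebClient; the exact call depends on the HtmlUnit version. Note that the thread dump in the question shows the thread blocked in SocksSocketImpl.readSocksReply, that is, waiting for the SOCKS proxy's reply during connect. Whether the configured timeout covers that step on a given JDK and library version is worth verifying separately.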
