Java svnClient timeouts on large files: the HttpClient timeout mechanism (security handling: controlling access to very large files)

Latest recommended article published on 2023-04-26 16:16:29

吕诺OK镜

Tags: Java svnClient timeouts on large files

Article link: https://blog.csdn.net/weixin_30209121/article/details/114834190


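Background

I have been working on a project in which one feature fetches an external web page and returns its content as a string. The URL is supplied entirely by the user, so it is a potential security risk.

Testing showed that if the URL points at something like a movie file, the connection cannot be closed until the whole data stream has been read. A few dozen such requests would probably be enough to bring the application down.

Note: the project uses HttpClient 3.0.1.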

Test

A typical HttpClient usage example:

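    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
    import org.apache.commons.httpclient.methods.GetMethod;

    MultiThreadedHttpConnectionManager manager = new MultiThreadedHttpConnectionManager();
    HttpClient client = new HttpClient(manager);
    client.setConnectionTimeout(30000); // connect timeout, in ms
    client.setTimeout(30000);           // socket read timeout, in ms
    GetMethod get = new GetMethod("http://download.jboss.org/jbossas/7.0/jboss-7.0.0.Alpha1/jboss-7.0.0.Alpha1.zip");
    try {
        client.executeMethod(get);                      // send the request
        String result = get.getResponseBodyAsString();  // read the body
    } catch (Exception e) {
        e.printStackTrace(); // do not swallow the failure silently
    } finally {
        get.releaseConnection(); // release the connection
    }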

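The URL used here is a download of nearly 20 MB, and it quickly becomes clear that the thread waits for a long time. So a timeout mechanism is needed. A thread dump taken while it waits: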

    "main" prio=10 tid=0x0899e800 nid=0x4010 runnable [0xb7618000..0xb761a1c8]
       java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        - locked <0xb23a4c30> (a java.io.BufferedInputStream)
        at org.apache.commons.httpclient.ContentLengthInputStream.read(ContentLengthInputStream.java:156)
        at org.apache.commons.httpclient.ContentLengthInputStream.read(ContentLengthInputStream.java:170)
        at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(ChunkedInputStream.java:338)
        at org.apache.commons.httpclient.ContentLengthInputStream.close(ContentLengthInputStream.java:104)
        at java.io.FilterInputStream.close(FilterInputStream.java:155)
        at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(AutoCloseInputStream.java:179)
        at org.apache.commons.httpclient.AutoCloseInputStream.close(AutoCloseInputStream.java:143)
        at org.apache.commons.httpclient.HttpMethodBase.releaseConnection(HttpMethodBase.java:1341)

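Analysis

HttpClient 3.x currently supports only three timeout settings:

connectionTimeout : the timeout for establishing the socket connection. HttpClient creates the socket on a separate thread and applies this timeout to it.

timeoutInMilliseconds : the timeout for a socket read, i.e. socket.setSoTimeout(timeout);

httpConnectionTimeout : when MultiThreadedHttpConnectionManager is used, the timeout for obtaining a connection from the connection pool.

None of these covers our case. What we need is a timeout for the whole HttpClient exchange: the total time spent sending the request, parsing the HTTP headers and reading the response stream.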
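To see why the read timeout alone does not bound the total time, here is a minimal JDK-only sketch. It does not use HttpClient at all; the class name SoTimeoutDemo, the local drip-feeding server and the parameter names are made up for illustration. Socket.setSoTimeout limits each individual read(), so a server that sends one byte every 300 ms never trips a 1000 ms read timeout, even though the whole read takes longer than that.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SoTimeoutDemo {

    /** Reads a slow local response to the end and returns the total read time in ms. */
    public static long readSlowResponse(int soTimeoutMs, int chunks, long pauseMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Hypothetical slow server: one byte, then a pause, repeated `chunks` times.
            Thread slowServer = new Thread(() -> {
                try (Socket s = server.accept(); OutputStream out = s.getOutputStream()) {
                    for (int i = 0; i < chunks; i++) {
                        out.write('x');
                        out.flush();
                        Thread.sleep(pauseMs);
                    }
                } catch (IOException | InterruptedException ignored) {
                    // demo server: nothing to recover
                }
            }, "slow-server");
            slowServer.setDaemon(true);
            slowServer.start();

            long start = System.nanoTime();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMs); // applies to each read(), not to the whole body
                InputStream in = client.getInputStream();
                while (in.read() >= 0) {
                    // every single read returns well within soTimeoutMs
                }
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            // a SocketTimeoutException would end up here
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long elapsed = readSlowResponse(1000, 5, 300);
        System.out.println("read timeout 1000 ms, total read time " + elapsed + " ms");
    }
}
```

With these numbers the read loop should finish without a timeout exception after roughly 1.5 s, which is longer than the configured read timeout. That gap is what an overall deadline has to close.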

With the goal clear, here is the revised test code:

    final MultiThreadedHttpConnectionManager manager = new MultiThreadedHttpConnectionManager();
    final HttpClient client = new HttpClient(manager);
    client.setConnectionTimeout(30000);
    client.setTimeout(30000);
    final GetMethod get = new GetMethod(
            "http://download.jboss.org/jbossas/7.0/jboss-7.0.0.Alpha1/jboss-7.0.0.Alpha1.zip");

    Thread t = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                client.executeMethod(get);
                String result = get.getResponseBodyAsString();
            } catch (Exception e) {
                // ignore
            }
        }
    }, "Timeout guard");
    t.setDaemon(true);
    t.start();
    try {
        t.join(5000L); // wait at most 5 s for the request to finish
    } catch (InterruptedException e) {
        System.out.println("out finally start");
        manager.shutdown();
        System.out.println("out finally end");
    }
    if (t.isAlive()) {
        // still downloading after 5 s: force-close the underlying connections
        System.out.println("out finally start");
        manager.shutdown();
        System.out.println("out finally end");
        t.interrupt();
        // throw new TimeoutException();
    }
    System.out.println("done");

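Here Thread.join is used to set a timeout of 5000 ms, which is a fairly old-fashioned approach. If you are familiar with the java.util.concurrent package, you can use Future and ThreadPoolExecutor for the asynchronous part instead, which also caches the worker threads.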

    ExecutorService service = Executors.newCachedThreadPool();
    Future<String> future = service.submit(new Callable<String>() {
        @Override
        public String call() throws Exception {
            try {
                client.executeMethod(get);
                return get.getResponseBodyAsString();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                System.out.println("future finally start");
                manager.shutdown();
                System.out.println("future finally end");
            }
            return "";
        }
    });

    try {
        future.get(5000, TimeUnit.MILLISECONDS); // wait at most 5 s for the result
    } catch (Exception e) {
        System.out.println("out finally");
        e.printStackTrace();
        manager.shutdown(); // force-close the underlying connections
        System.out.println("out finally end");
    }

    service.shutdown();
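The pattern above (wait on a Future with a deadline, and on timeout release the resource the task is stuck on) can be pulled into a small helper. The following is only a sketch under my own assumptions: TimeoutCall, callWithTimeout, the onTimeout hook and the fallback value are hypothetical names, and a CountDownLatch stands in for the download that never finishes. With HttpClient, the onTimeout hook would be the place to call manager.shutdown().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutCall {

    /**
     * Runs task on a worker thread and waits at most timeoutMs for its result.
     * On timeout, onTimeout is run (for example to close the resource the task
     * is blocked on) and fallback is returned.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMs,
                                        Runnable onTimeout, T fallback) {
        ExecutorService service = Executors.newSingleThreadExecutor();
        Future<T> future = service.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            onTimeout.run();     // release whatever the task is stuck on
            future.cancel(true); // interrupts too, which only helps interruptible waits
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the caller's interrupt status
            return fallback;
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        final CountDownLatch neverReleased = new CountDownLatch(1);
        String result = callWithTimeout(() -> {
            neverReleased.await(); // stands in for a download that never ends
            return "body";
        }, 200, neverReleased::countDown, "");
        System.out.println("result=[" + result + "]");
    }
}
```

Creating one executor per call keeps the sketch short; a real application would more likely share a pool, as the listing above does with newCachedThreadPool.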

Note: why is get.releaseConnection() not used here to release the connection?

Look at the implementation of releaseConnection:

    public void releaseConnection() {
        if (responseStream != null) {
            try {
                // FYI - this may indirectly invoke responseBodyConsumed.
                responseStream.close(); // closes the stream first
            } catch (IOException e) {
                // the connection may not have been released, let's make sure
                ensureConnectionRelease();
            }
        } else {
            // Make sure the connection has been released. If the response
            // stream has not been set, this is the only way to release the
            // connection.
            ensureConnectionRelease();
        }
    }

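It closes responseStream first, and that is exactly the problem.

The responseStream is created in readResponseBody(HttpConnection conn). For an ordinary HTML page it is a ContentLengthInputStream.

When its close method is called, ContentLengthInputStream uses ChunkedInputStream.exhaustInputStream to read the stream all the way to the end: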

    public void close() throws IOException {
        if (!closed) {
            try {
                ChunkedInputStream.exhaustInputStream(this);
            } finally {
                // close after above so that we don't throw an exception trying
                // to read after closed!
                closed = true;
            }
        }
    }

The code of ChunkedInputStream.exhaustInputStream:

    static void exhaustInputStream(InputStream inStream) throws IOException {
        // read and discard the remainder of the message
        byte buffer[] = new byte[1024];
        while (inStream.read(buffer) >= 0) {
            ;
        }
    }
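exhaustInputStream reads with no upper bound, which is what makes a huge response expensive. A complementary guard, which is my own suggestion and not something the original code does, is to read the body through a size cap and give up once the cap is exceeded. The class BoundedReader and the method readAtMost below are hypothetical names. With HttpClient 3.x the input could come from get.getResponseBodyAsStream(), but note that releaseConnection() would still drain the rest of the body afterwards, so the cap only limits memory use and does not replace the forced shutdown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedReader {

    /** Reads the stream to the end, but throws IOException once more than maxBytes arrive. */
    public static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int n;
        while ((n = in.read(buffer)) >= 0) {
            if (out.size() + n > maxBytes) {
                // stop early instead of buffering an arbitrarily large body
                throw new IOException("response body exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] page = readAtMost(new ByteArrayInputStream(new byte[100]), 1024);
        System.out.println("small body accepted: " + page.length + " bytes");
        try {
            readAtMost(new ByteArrayInputStream(new byte[5000]), 1024);
        } catch (IOException e) {
            System.out.println("large body rejected: " + e.getMessage());
        }
    }
}
```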

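Notes:

Methods other than sleep- and park-style blocking calls do not respond to interruption, and a blocking socket read is one of them. So the Thread.interrupt() that an ordinary Future issues on timeout has no effect here.

The default SimpleHttpConnectionManager does not support this kind of forced close, so MultiThreadedHttpConnectionManager.shutdown() is used to forcibly close the input and output streams of the socket underlying each HttpConnection.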
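The difference between interrupting and closing can be reproduced with plain JDK sockets. This is a sketch, not HttpClient code: CloseUnblocksRead and its local silent server are invented for the demonstration, and it assumes ordinary (platform) threads. The reader thread blocks in read(), stays blocked after interrupt(), and returns only after the socket is closed from another thread, which is the same lever that shutdown() pulls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseUnblocksRead {

    /** Returns {aliveAfterInterrupt, aliveAfterClose} for a thread blocked in a socket read. */
    public static boolean[] run() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
             Socket accepted = server.accept()) { // the server side never writes anything
            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read(); // blocks: no data ever arrives
                } catch (IOException e) {
                    // reached once the socket is closed from another thread
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();
            Thread.sleep(200); // give the reader time to block in read()

            reader.interrupt(); // only sets the interrupt flag
            reader.join(500);
            boolean aliveAfterInterrupt = reader.isAlive();

            client.close(); // makes the blocked read fail with a SocketException
            reader.join(2000);
            boolean aliveAfterClose = reader.isAlive();
            return new boolean[] {aliveAfterInterrupt, aliveAfterClose};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] alive = run();
        System.out.println("alive after interrupt: " + alive[0]);
        System.out.println("alive after close:     " + alive[1]);
    }
}
```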

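Summary

It helps to understand why HttpClient is designed this way: socket reuse, keep-alive support and so on require that leftover data from one response cannot affect the next request.

Thread.interrupt() only wakes a thread that is in a sleep or wait state (as described in the API docs). Calling it does not automatically raise an InterruptedException. Usually the code has to check Thread.isInterrupted() itself and then throw. Some of the JDK concurrent classes we use, such as future.cancel, work in a similar way.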
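The two behaviours of interrupt() described above can be observed side by side. In this small sketch (InterruptFlagDemo and observe are names I made up), a sleeping thread receives an InterruptedException, while a busy loop only ever sees the flag and has to poll it itself.

```java
public class InterruptFlagDemo {

    /** Interrupts a sleeping or a busy-looping thread and reports what that thread observed. */
    public static String observe(boolean sleeping) {
        final String[] outcome = new String[1];
        Thread worker = new Thread(() -> {
            if (sleeping) {
                try {
                    Thread.sleep(10_000);
                    outcome[0] = "not interrupted";
                } catch (InterruptedException e) {
                    outcome[0] = "InterruptedException"; // sleep reacts to the interrupt
                }
            } else {
                // A busy loop never throws InterruptedException; it must check the flag.
                while (!Thread.currentThread().isInterrupted()) {
                    // busy work
                }
                outcome[0] = "flag seen";
            }
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join(); // join also makes the worker's write to outcome visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(observe(true));  // prints "InterruptedException"
        System.out.println(observe(false)); // prints "flag seen"
    }
}
```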