1、概述
大家好,我是欧阳方超,可以关注我的公众号“欧阳方超”,后续内容将在公众号首发。
先来看一个现象吧,某接口A通过feign调用另外一个服务上的接口B,接口A实现的是下载zip文件的功能,线上出现了什么问题呢,zip文件下载下来了,但是解压时却报错——文件已损坏。
2、问题排查
2.1、错误日志
查看接口A所在服务有如下报错:
2023-07-17 21:25:20.799 INFO 114643 --- [io-30140-exec-7] c.c.h.a.c.DeclareTemplateController : startAttachmentAndFullTextapplicationCode:2023SQBC000090
2023-07-17 21:25:21.376 INFO 114643 --- [io-30140-exec-7] c.c.h.a.c.DeclareTemplateController : finishAttachmentAndFullTextapplicationCode:2023SQBC000090
org.apache.catalina.connector.ClientAbortException: java.io.IOException: Connection reset by peer
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:351)
at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:776)
at org.apache.catalina.connector.OutputBuffer.append(OutputBuffer.java:681)
at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:386)
at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:364)
at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:96)
at com.XXXXController.getXXXXAndFullText(XXXXController.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:105)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:878)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:792)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:909)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:652)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:93)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:888)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1597)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:469)
at org.apache.tomcat.util.net.NioChannel.write(NioChannel.java:138)
at org.apache.tomcat.util.net.NioBlockingSelector.write(NioBlockingSelector.java:101)
at org.apache.tomcat.util.net.NioSelectorPool.write(NioSelectorPool.java:152)
at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.doWrite(NioEndpoint.java:1253)
at org.apache.tomcat.util.net.SocketWrapperBase.doWrite(SocketWrapperBase.java:764)
at org.apache.tomcat.util.net.SocketWrapperBase.writeBlocking(SocketWrapperBase.java:584)
at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:528)
at org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.doWrite(Http11OutputBuffer.java:546)
at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:112)
at org.apache.coyote.http11.Http11OutputBuffer.doWrite(Http11OutputBuffer.java:193)
at org.apache.coyote.Response.doWrite(Response.java:601)
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:339)
... 60 more
2023-07-17 21:25:21.383 INFO 114643 --- [io-30140-exec-7] c.c.h.a.c.DeclareTemplateController : endAttachmentAndFullTextapplicationCode:2023SQBC000090
我贴的日志内容有点多,就都放这吧,哈哈,Connection reset by peer,这是什么意思,即便翻译成中文也不太好理解。由于看不懂这个报错的内容一时不知如何下手,那我现在本地复现一下这个问题吧,于是把本地的网关服务及各微服务都启动起来,数据库与线上共用一个数据库,奇怪的事情发生了,该下载接口在本地运行一切正常,把zip文件下载下来了,能解压,里面的pdf文件清晰可见。
不知道线上发生了啥,没有思路,于是我去接了一杯水。
2.2、思考
回来后想了想,本地与上线在环境上有啥差异吗,有!本地测试时没有使用nginx,而线上请求接口是通过nginx进行的,那我就去线上环境看看nginx发生什么了吧,打开error.log,使用tail -f跟踪最新日志,再次点击线上的下载接口,下面的错误信息映入眼帘:
2023/07/17 21:57:38 [crit] 124351#0: *10084 open() "/usr/local/nginx/proxy_temp/1/11/0000000111" failed (13: Permission denied) while reading upstream, client: 10.10.100.11, server: localhost, request: "POST /api/apply/declareTemplate/attachmentAndFullText HTTP/1.1", upstream: "http://192.168.84.201:30120/xxxx/declareTemplateXXX/XXXXattachmentAndFullText", host: "192.168.84.201:30110", referrer: "http://192.168.84.201:30110/xxxx/collect-approval"
可以看到是没有proxy_temp目录的权限导致的。
2.3、真相
好吧,此时查看proxy_temp目录的所有者为nobody,这个现象如果跟最近公司发生的事情联系起来,真相就能大白于天下了,最近公司机房搬迁了,服务器上的nginx被人以非普通用户重启了,而微服务中jar包的启动用的是普通用户,这就是事情的真相。
2.4、延伸
/usr/local/nginx/proxy_temp 目录是 Nginx Web服务器在处理反向代理请求时使用的一个临时目录,在反向代理模式中,Nginx接收到客户端的请求,然后将这些请求转发到后端服务器。当后端服务器响应这些请求时,Nginx会将响应内容暂时保存在 /usr/local/nginx/proxy_temp 目录中。
据说你不经意间执行了nginx的其他命令(比如nginx -t),如果执行时使用的用户不是nginx目录的所有者也会导致proxy_temp目录的权限被改变;
这可能会导致接口产生206、304等问题,不过我的接口返回的依然是200;
3、总结
技术人在技术之外还是要有一些约束的,比如不能滥用root用户进行一些操作,可能有的团队在使用Linux服务器时只有一个root用户(有没有躺枪);遇到问题要懂得分段排查,比如我们通过本地与线上的环境差异把问题定位到nginx上;发生问题时多看日志。
我是欧阳方超,把事情做好了自然就有兴趣了,如果你喜欢我的文章,欢迎点赞、转发、评论加关注。我们下次见。
今天夕阳: