记一次 .NET 某WMS仓储打单系统 内存暴涨分析

一:背景

1. 讲故事

七月中旬有一位朋友加wx求助,他的程序在生产上跑着跑着内存就飙起来了,貌似没有回头的趋势,询问如何解决,截图如下:

和这位朋友聊下来,感觉像是自己在小县城当了个小老板,规律的生活,有当地资源,各种小关系,有一股财务自由的味道,这也是我一直向往的生活方式 😄😄😄。

既然朋友找到我了,我得想办法给他解决问题,既然是内存暴涨,我就赌一把在托管层面吧,嘿嘿,上windbg说话。

二: windbg 分析

1. 托管还是非托管

一直在追这个系列的朋友应该知道,我无数次的用 !address -summary!eeheap -gc 这两个命令来判断当前内存属于托管层还是非托管层。


0:000> !address -summary

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                393     7dfe`f2105000 ( 125.996 TB)           98.43%
MEM_RESERVE                            1691      200`0f1e4000 (   2.000 TB)  99.81%    1.56%
MEM_COMMIT                             6191        0`fed07000 (   3.981 GB)   0.19%    0.00%


0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x000001D2E572BBC8
generation 1 starts at 0x000001D2E54F70E0
generation 2 starts at 0x000001D252051000
ephemeral segment allocation context: none
         segment             begin         allocated              size
000001D252050000  000001D252051000  000001D26204FFE0  0xfffefe0(268431328)
Large object heap starts at 0x000001D262051000
         segment             begin         allocated              size
000001D262050000  000001D262051000  000001D2655F3F80  0x35a2f80(56242048)
Total Size:              Size: 0xbf4dbf80 (3209543552) bytes.
------------------------------
GC Heap Size:    Size: 0xbf4dbf80 (3209543552) bytes.

卦象上进程指标为 3.98G ,GC堆指标为 3209543552 = 3G ,很显然,本次事故属于 托管层面

2. 寻找托管层上的大对象

我们都知道C#是托管语言,所以甭管有用没用的对象都逃不出GC堆,言外之意就是看GC堆准没错,挑几个大对象看看。


0:000> !dumpheap -stat
Statistics:
              MT    Count    TotalSize Class Name
00007ff98a68f090   391475     43869284 System.Int32[]
00007ff98b6adfa0  1902760     45666240 System.Collections.ObjectModel.ReadOnlyCollection`1[[System.Linq.Expressions.Expression, System.Linq.Expressions]]
00007ff98b6ac3c0  1951470     46835280 System.Linq.Expressions.ConstantExpression
00007ff98bc452e0  1681178     53797696 System.Linq.Expressions.TypedConstantExpression
00007ff98eacb6b8  1902708     60886656 System.Dynamic.Utils.ListArgumentProvider
00007ff98f236518  1774982     70999280 Microsoft.EntityFrameworkCore.Query.Expressions.ColumnExpression
00007ff98c650c58  1681142     80694816 System.Linq.Expressions.MethodCallExpression3
00007ff98a82bc38  3414094     81938256 System.RuntimeMethodHandle
00007ff98fd96fc0    17750     83936016 System.Collections.Generic.Dictionary`2+Entry[[System.Reflection.MemberInfo, System.Private.CoreLib],[System.Linq.Expressions.Expression, System.Linq.Expressions]][]
00007ff98e5ed5d8    35493    101740504 System.Collections.Generic.Dictionary`2+Entry[[Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceCacheKey, Microsoft.Extensions.DependencyInjection],[System.Object, System.Private.CoreLib]][]
00007ff98bcff6a8  3639389    116460448 System.Linq.Expressions.PropertyExpression
00007ff98b85cf00  5028347    160907104 System.Reflection.Emit.GenericFieldInfo
00007ff98a671e18  2178117    168395994 System.String
00007ff98a5b6610   160565    171498416 System.Object[]
00007ff98eaa8ab0  4981589    199263560 System.Linq.Expressions.MemberAssignment
00007ff98a672360   398740    391928469 System.Byte[]
00007ff98a746d68   181886    486150592 System.Char[] 

从托管堆上看,System.Linq.Expressions.MemberAssignment 对象高达 498w ,很明显有问题,从类名看可能和 ExpressionTree 有关,那就抽几个对象看看它的引用链上是否有过大的对象。


0:000> !gcroot 000001d25399f690
HandleTable:
    000001D251B715A8 (pinned handle)
    -> 000001D262068CF0 System.Object[]
    -> 000001D2531C3B78 Microsoft.EntityFrameworkCore.Internal.ServiceProviderCache
    -> 000001D25399E3D0 Remotion.Linq.QueryModel
    -> 000001D25399E3B8 Remotion.Linq.Clauses.SelectClause
    -> 000001D25442C068 System.Linq.Expressions.MemberInitExpression
    -> 000001D25442C050 System.Runtime.CompilerServices.TrueReadOnlyCollection`1[[System.Linq.Expressions.MemberBinding, System.Linq.Expressions]]
    -> 000001D2539A0290 System.Linq.Expressions.MemberBinding[]
    -> 000001D25399F690 System.Linq.Expressions.MemberAssignment

引用链特别长,这里我就截取一下,经过一顿排查,我发现大对象居然是 Remotion.Linq.Clauses.SelectClause,objsize 这个对象直接爆掉了,真的很奇葩,如下代码所示:


0:000> !objsize 000001D25399E3B8
sizeof(000001D25399E3B8) = -1187378032 (0xb93a0c90) bytes (Remotion.Linq.Clauses.SelectClause)

有点懵,这个对象居然是罪魁祸首,从引用链看它是 EF 下的一个构建表达式树的小部件,可以肯定的是,朋友在用EF的时候出了什么问题,不过还得硬着头皮继续挖 SelectClause,经过深挖,我发现这个类有大量这样的 char[] 数组,导出来后大概是下面这样。


Logistics.Text30), 
|           Text31 = string TryReadValue(t1.Outer.Outer, 42, WmsOutboundConfirmLogistics.Text31), 
|           Text32 = string TryReadValue(t1.Outer.Outer, 43, WmsOutboundConfirmLogistics.Text32), 
|           Text33 = string TryReadValue(t1.Outer.Outer, 44, WmsOutboundConfirmLogistics.Text33), 
|           Text34 = string TryReadValue(t1.Outer.Outer, 45, WmsOutboundConfirmLogistics.Text34), 
|           Text35 = string TryReadValue(t1.Outer.Outer, 46, WmsOutboundConfirmLogistics.Text35), 
|           IsQueue = Nullable<bool> TryReadValue(t1.Outer.Outer, 47, WmsOutboundConfirmLogistics.IsQueue), 
|           IsStop = Nullable<bool> TryReadValue(t1.Outer.Outer, 48, WmsOutboundConfirmLogistics.IsStop), 
|           CheckCode = string TryReadValue(t1.Outer.Outer, 49, WmsOutboundConfirmLogistics.CheckCode), 
|           ClientCode = string TryReadValue(t1.Outer.Inner, 50, WmsOutboundOrder.ClientCode), 
|           WarehouseCode = string TryReadValue(t1.Outer.Inner, 51, WmsOutboundOrder.WarehouseCode), 
|           ErpNumber = string TryReadValue(t1.Outer.Inner, 52, WmsOutboundOrder.ErpNumber), 
|           OrderCategory = string TryReadValue(t1.Outer.Inner, 53, WmsOutboundOrder.OrderCategory), 
|           OrderStatus = string TryReadValue(t1.Outer.Inner, 54, WmsOutboundOrder.OrderStatus), 
|           OrderType = string TryReadValue(t1.Outer.Inner, 55, WmsOutboundOrder.OrderType), 
|           SendCompany = string TryReadValue(t1.Outer.Inner, 56, WmsOutboundOrder.SendCompany), 
|           SendName = string TryReadValue(t1.Outer.Inner, 57, WmsOutboundOrder.SendName), 
|           SendTel = string TryReadValue(t1.Outer.Inner, 58, WmsOutboundOrder.SendTel), 
|           SendMobile = string TryReadValue(t1.Outer.Inner, 59, WmsOutboundOrder.SendMobile), 
|           SendProvince = string TryReadValue(t1.Outer.Inner, 60, WmsOutboundOrder.SendProvince), 
|           SendCity = string TryReadValue(t1.Outer.Inner, 61, WmsOutboundOrder.SendCity), 
|           SendArea = string TryReadValue(t1.Outer.Inner, 62, WmsOutboundOrder.SendArea), 
|           ...
|           CategoryName = string TryReadValue(t1.Outer.Inner, 88, WmsOutboundOrder.CategoryName), 
|           SourcePlatformCode = string TryReadValue(t1.Outer.Inner, 89, WmsOutboundOrder.SourcePlatformCode), 
|           PayMode = (string)string TryReadValue(t1.Outer.Outer, 90, null), 
|           List = List<WmsOutboundConfirmLogisticsLinesDTO> WmsOutboundConfirmLogisticsBusiness.GetOrderLines(string TryReadValue(t1.Outer.Outer, 5, WmsOutboundConfirmLogistics.OrderNumber)), 
|           ConfirmTime = DateTime TryReadValue(t1.Inner, 91, WmsOutboundOrderConfirmation.CreateTime), 
|           ReturnUrl = (string)string TryReadValue(t1.Outer.Outer, 92, null) 
|       }
|__ ), 
|__ contextType: Core.DataRepository.BaseDbContext, 
|__ logger: DiagnosticsLogger<Query>, 
|__ queryContext: Unhandled parameter: queryContext)                                                      

从内容看,应该是 select 语句的ExpressionTree表示,问了下朋友,说大概是报表业务,不过这些信息给他貌似也没有多大帮助,说实话到这里我其实也不知道怎么继续往下排查了,陷入了绝望。

3. 从绝望中寻找希望

我在想,既然EF构建了大量这样的 ExpressionTree,肯定有问题,但也想不出是什么问题,隔了半天,我突然灵光一现,EF既然构建了树,有可能sql也出来了,对,我何不直接在 heap 上搜索 select 的sql语句。。。。


0:000> !strings /m:*select*
Address            Gen    Length   Value
000001d2e4de64e0    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e11e78    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e3d1f0    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e673c8    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e91760    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4ebb2e8    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4ee54f8    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4f10758    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4f398d0    2       1964   SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...

---------------------------------------
18128 matching strings

果然发现了大量重复的 select 语句,而且从最左边的内存地址看都是非常接近的,也就说明他们是在某一个操作中同时生成的,然后我们导出几个sql语句。


SELECT a."Id", ....
FROM "WmsOutboundConfirmLogistics" AS a
INNER JOIN "WmsOutboundOrder" AS b ON a."OrderNumber" = b."OrderNumber"
INNER JOIN "WmsOutboundOrderConfirmation" AS c ON a."OrderNumber" = c."OrderNumber"
WHERE (a."OrderNumber" = @__pagination_OrderNumber_0) AND (b."FreezeStatus" = FALSE)
ORDER BY a."Id"";

SELECT a."Id", ....
FROM "WmsOutboundConfirmLogistics" AS a
INNER JOIN "WmsOutboundOrder" AS b ON a."OrderNumber" = b."OrderNumber"
INNER JOIN "WmsOutboundOrderConfirmation" AS c ON a."OrderNumber" = c."OrderNumber"
WHERE (a."OrderNumber" = @__pagination_OrderNumber_0) AND (b."FreezeStatus" = FALSE)
ORDER BY a."Id""

拿到这 1.8w 重复的sql 给朋友看,朋友说这是查询报表的sql。

4. 所有线索整合打通

那这里就存在着很大问题,既然是查询报表,为什么会有 1.8w 相同的sql,唯一不同的就是 a."OrderNumber" = @__pagination_OrderNumber_0 中的订单号,难道不应该是 a.OrderNumber in (xxxx) 或者是表关联查询吗??? 整理一下就是下面这样的猜想:


-- 理想
select * from a where a.id in (1,2,3)

-- 现实
select * from a where a.id=1;
select * from a where a.id=2;
select * from a where a.id=3;

加上每个sql内存地址相近,再结合爆表的 Remotion.Linq.Clauses.SelectClause 对象,整个流程大概就是:本该表关联或者in操作,结果变成了无数个单条sql语句查询,导致EF底层出现内存爆炸式增长。

三:总结

看了下朋友查询ef的写法,猜测大多都是人肉构建 ExpressionTree 去查询数据库,大写的🐂👃,比如下面的这张图:

解决方案就是让朋友检查下表示式树的写法问题,或者直接灌写好的sql得了,说实话这个dump还是费了九牛二虎之力,本以为很简单,实操起来还是碰到了一点小困难,就当历练成长吧!

更多高质量干货:参见我的 GitHub: dotnetfly

  • 2
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
C#开发,可解析WMS服务,并进行浏览,功能强大,GIS开发人员可参考文件列表 Get Capabilities Example ........................\AssemblyInfo.cs ........................\Get Capabilities Example.csproj ........................\GetCapabilitiesExample.cs Get Map Asynch Example ......................\AssemblyInfo.cs ......................\Get Map Asynch Example.csproj ......................\GetMapAsynchExample.cs ......................\GetMapAsynchExample.resx Get Map Example ...............\AssemblyInfo.cs ...............\Get Map Example.csproj ...............\GetMapExample.cs GLOBECapabilities20040423.xml MapAnimation ............\AssemblyInfo.cs ............\Map Animation.csproj ............\MapAnimation.cs ............\MapAnimation.resx WMS Overview Article.html WMS Overview.sln Wms.Client ..........\AssemblyInfo.cs ..........\Capabilities.cs ..........\CLSDFOLD.BMP ..........\DefaultServerDescriptors.xml ..........\DownloadCache.cs ..........\ENTIRNET.BMP ..........\ENTIRNETX.bmp ..........\ENTIRNETX.ICO ..........\ExtensionMap.cs ..........\Layer.cs ..........\OPENFOLD.BMP ..........\PreviewDialog.cs ..........\PreviewDialog.resx ..........\RequestBuilder.cs ..........\Retriever.cs ..........\Server.cs ..........\ServerAddDialog.cs ..........\ServerAddDialog.resx ..........\ServerDescriptor.cs ..........\Wms.Client.csproj ..........\WmsDialog.cs ..........\WmsDialog.resx ..........\WmsException.cs ..........\WmsServerDescriptors.cs ..........\WmsServerDescriptors.xsd WMSBrowser ..........\AssemblyInfo.cs ..........\MapForm.cs ..........\MapForm.resx ..........\WMSBrowser.cs ..........\WMSBrowser.csproj ..........\WMSBrowser.resx

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值