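I have to process about 1M entities to build facts. There should be roughly the same number of resulting facts (1 million).
The first issue I had was bulk insert: it was slow with Entity Framework. So I used the pattern from Fastest Way of Inserting in Entity Framework (the answer from Slauma). Now I can insert entities quickly, about 100K in one minute.
Another issue I ran into was not having enough memory to process everything at once. So I "paged" the processing, to avoid the out-of-memory exception I would get if I built a single list of all 1 million resulting facts.
My problem is that memory keeps growing even with the paging, and I don't understand why. No memory is released after each batch. I find this odd because on each iteration of the loop I fetch recons, build facts, and store them in the DB. Once an iteration completes, those objects should be released from memory. But apparently they are not, because memory is not released after each iteration.
Before I dig further, can you tell me whether you see anything wrong? More specifically, why is no memory released after an iteration of the while loop?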
    static void Main(string[] args)
    {
        ReceiptsItemCodeAnalysisContext db = new ReceiptsItemCodeAnalysisContext();

        var recon = db.Recons
            .Where(r => r.Transacs.Where(t => t.ItemCodeDetails.Count > 0).Count() > 0)
            .OrderBy(r => r.ReconNum);

        // used for "paging" the processing
        var processed = 0;
        var total = recon.Count();
        var batchSize = 1000; //100000;
        var batch = 1;
        var skip = 0;
        var doBatch = true;

        while (doBatch)
        {
            // list to store facts processed during the batch
            List<ReconFact> facts = new List<ReconFact>();

            // get the Recon items to process in this batch and put them in a list
            List<Recon> toProcess = recon.Skip(skip).Take(batchSize)
                .Include(r => r.Transacs.Select(t => t.ItemCodeDetails))
                .ToList();

            // to process real fast
            Parallel.ForEach(toProcess, r =>
            {
                // processing a recon and adding the facts to the list
                var thisReconFacts = ReconFactGenerator.Generate(r);

                // List<T> is not thread-safe, so guard the shared list
                lock (facts)
                {
                    facts.AddRange(thisReconFacts);
                }

                // atomic increment; "processed += 1" can lose updates across threads
                Console.WriteLine(System.Threading.Interlocked.Increment(ref processed));
            });

            // saving the facts using pattern provided by Slauma
            using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, new System.TimeSpan(0, 15, 0)))
            {
                ReceiptsItemCodeAnalysisContext context = null;
                try
                {
                    context = new ReceiptsItemCodeAnalysisContext();
                    context.Configuration.AutoDetectChangesEnabled = false;
                    int count = 0;

                    foreach (var fact in facts.Where(f => f != null))
                    {
                        count++;
                        Console.WriteLine(count);
                        context = ContextHelper.AddToContext(context, fact, count, 250, true);
                    }
                    context.SaveChanges();
                }
                finally
                {
                    if (context != null)
                        context.Dispose();
                }
                scope.Complete();
            }
            Console.WriteLine("batch {0} finished continuing", batch);

            // continuing the batch
            batch++;
            skip = batchSize * (batch - 1);
            doBatch = skip < total;

            // AFTER THIS facts AND toProcess SHOULD BE RESET
            // BUT IT LOOKS LIKE THEY ARE NOT OR AT LEAST SOMETHING
            // IS GROWING IN MEMORY
        }
        Console.WriteLine("Processing is done {0} recons processed", processed);
    }
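The parallel step in the loop collects facts from many threads into one list and bumps a shared counter. As a side note to the memory question, here is a minimal, self-contained sketch of that step using a concurrent collection and an atomic counter. The `Generate` method and the string "facts" are hypothetical stand-ins for `ReconFactGenerator.Generate` and `ReconFact`, which are not shown in this post.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelCollectSketch
{
    // Hypothetical stand-in for ReconFactGenerator.Generate: two "facts" per input.
    static List<string> Generate(int reconNum)
    {
        return new List<string> { "fact-a-" + reconNum, "fact-b-" + reconNum };
    }

    public static (int Processed, int FactCount) Run(int reconCount)
    {
        var toProcess = Enumerable.Range(1, reconCount).ToList();
        var facts = new ConcurrentBag<string>(); // safe for concurrent Add calls
        var processed = 0;

        Parallel.ForEach(toProcess, r =>
        {
            foreach (var f in Generate(r))
                facts.Add(f);

            // Atomic update of the shared counter
            Interlocked.Increment(ref processed);
        });

        return (processed, facts.Count);
    }

    static void Main()
    {
        var result = Run(1000);
        // prints "1000 recons processed, 2000 facts collected"
        Console.WriteLine("{0} recons processed, {1} facts collected",
            result.Processed, result.FactCount);
    }
}
```

`ConcurrentBag<T>` does not preserve insertion order, which should not matter here since the facts are only inserted afterwards.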
The method provided by Slauma to optimize bulk inserts with Entity Framework:
    class ContextHelper
    {
        public static ReceiptsItemCodeAnalysisContext AddToContext(ReceiptsItemCodeAnalysisContext context,
            ReconFact entity, int count, int commitCount, bool recreateContext)
        {
            context.Set<ReconFact>().Add(entity);

            // commit every commitCount entities
            if (count % commitCount == 0)
            {
                context.SaveChanges();
                if (recreateContext)
                {
                    // start over with a fresh context so tracked entities do not pile up
                    context.Dispose();
                    context = new ReceiptsItemCodeAnalysisContext();
                    context.Configuration.AutoDetectChangesEnabled = false;
                }
            }
            return context;
        }
    }
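To make the commit cadence of this helper concrete, here is a minimal sketch that keeps the same control flow but swaps the real context for a hypothetical `FakeContext` that only counts calls. `FakeContext` and `CommitCadenceSketch` are illustrative names, not part of Entity Framework or of my project.

```csharp
using System;

// Hypothetical stand-in for the EF context: it only counts what happens to it.
class FakeContext : IDisposable
{
    public static int Created, Saves, Disposed;

    public FakeContext() { Created++; }
    public void Add(object entity) { }
    public void SaveChanges() { Saves++; }
    public void Dispose() { Disposed++; }
}

static class CommitCadenceSketch
{
    // Same control flow as AddToContext, with FakeContext in place of the EF context.
    static FakeContext AddToContext(FakeContext context, object entity,
        int count, int commitCount, bool recreateContext)
    {
        context.Add(entity);
        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                context.Dispose();
                context = new FakeContext();
            }
        }
        return context;
    }

    public static void Run(int factCount, int commitCount)
    {
        FakeContext.Created = FakeContext.Saves = FakeContext.Disposed = 0;

        var context = new FakeContext();
        try
        {
            for (int count = 1; count <= factCount; count++)
                context = AddToContext(context, new object(), count, commitCount, true);

            // final save for whatever is left since the last commit
            context.SaveChanges();
        }
        finally
        {
            context.Dispose();
        }
    }

    static void Main()
    {
        Run(1000, 250);
        // prints "created: 5, saves: 5, disposed: 5"
        Console.WriteLine("created: {0}, saves: {1}, disposed: {2}",
            FakeContext.Created, FakeContext.Saves, FakeContext.Disposed);
    }
}
```

With 1000 facts and a commit count of 250, the helper saves and recreates the context four times, and the caller does one last save and dispose, so no single write context ever holds more than 250 added entities.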