Author: Tony Qu
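In NPOI 2.0 beta 2, 95% of the serialization code no longer uses XmlSerializer, and in my experiments this made things roughly 10 to 20 times faster. Performance was not the only reason for dropping XmlSerializer. First, XmlSerializer has many limitations: the XmlSerializer assembly for NPOI.OpenXmlFormats, NPOI's base library, cannot be pre-generated (pre-generation is the performance optimization Microsoft officially recommends), because the generator fails with conflict errors. I no longer remember the exact conflicts, only that there were a great many of them, so that route was a dead end. Second, XmlSerializer handles some scenarios poorly, such as inheritance between XML objects and support for interfaces, so the generated Open XML files were actually corrupt, and Excel or Word would report that xxx.xml is damaged and ask whether to repair it. Finally, there is performance and memory consumption: XmlSerializer has to generate and compile code at run time, so its memory consumption is frightening. Even with assembly caching, the process often does not survive the first run and throws an OutOfMemory exception. I have received this report from several users, both in China and abroad.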
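To make the difference concrete, here is a minimal sketch of the two approaches: reflection-based serialization with XmlSerializer versus a hand-written Write method that emits the markup directly. The DemoCell type, its members, and the element names are hypothetical and exist only for illustration; this is not NPOI's actual code, just the general idea under those assumptions.

```csharp
using System;
using System.IO;
using System.Security;
using System.Xml.Serialization;

// Hypothetical element type for illustration only; not an NPOI class.
[XmlRoot("c")]
public class DemoCell
{
    [XmlAttribute("r")]
    public string Reference { get; set; }

    [XmlElement("v")]
    public string Value { get; set; }

    // Hand-written serialization: emit the markup directly,
    // with no reflection and no run-time code generation.
    public void Write(TextWriter writer)
    {
        writer.Write("<c r=\"");
        writer.Write(SecurityElement.Escape(Reference));
        writer.Write("\"><v>");
        writer.Write(SecurityElement.Escape(Value));
        writer.Write("</v></c>");
    }
}

public static class SerializationDemo
{
    public static void Main()
    {
        var cell = new DemoCell { Reference = "A1", Value = "42" };

        // XmlSerializer path: on .NET Framework, constructing the serializer
        // generates and compiles serialization code for the type at run time.
        var serializer = new XmlSerializer(typeof(DemoCell));
        var viaSerializer = new StringWriter();
        serializer.Serialize(viaSerializer, cell);
        Console.WriteLine(viaSerializer.ToString());

        // Hand-written path: plain string output.
        var manual = new StringWriter();
        cell.Write(manual);
        Console.WriteLine(manual.ToString());
    }
}
```

The hand-written path costs more code per element type, but it avoids the temporary serialization assembly entirely, which is where the start-up time and memory described above go.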
Let the numbers speak. First, here is feedback from an NPOI user abroad, which shows how much performance has actually improved.
    using System;
    using System.IO;
    using NPOI.SS.UserModel;
    using NPOI.XSSF.UserModel;

    class Program
    {
        static void Main(string[] args)
        {
            IWorkbook workbook = new XSSFWorkbook();
            ICell cell;
            ISheet sheet = workbook.CreateSheet("StressTest");
            int i = 0;
            int rowLimit = 100000;
            DateTime originalTime = DateTime.Now;
            System.Console.WriteLine("Start time: " + originalTime);
            for (i = 0; i <= rowLimit; i++)
            {
                // One row with a single string cell per iteration
                cell = sheet.CreateRow(i).CreateCell(0);
                cell.SetCellValue("ZOMG PLEASE SURVIVE THIS STRESS TEST");
                // Report the elapsed time every 10,000 rows
                if (i % 10000 == 0)
                {
                    System.Console.WriteLine("[" + (DateTime.Now - originalTime) + "]" + " " + i + " rows written");
                }
            }
            // Save the workbook to disk
            FileStream sw = File.Create("test.xlsx");
            workbook.Write(sw);
            sw.Close();
            System.Console.Read();
        }
    }
Results with NPOI 2.0.1 (beta):
Start time: 5/14/2013 5:05:39 PM
[00:00:00.0170017] 0 rows written
[00:00:02.7792779] 10000 rows written
[00:00:11.1951194] 20000 rows written
[00:00:27.7817779] 30000 rows written
[00:00:53.5283523] 40000 rows written
[00:01:30.1910182] 50000 rows written
[00:02:16.2836270] 60000 rows written
[00:03:14.6894670] 70000 rows written
[00:04:21.9641938] 80000 rows written
[00:05:38.7868753] 90000 rows written
[00:07:05.4055363] 100000 rows written
As you can see, writing 100,000 rows takes about 7 minutes.
Results with NPOI 2.0.5 (beta 2):
Start time: 2013/11/24 21:47:55
[00:00:00.0439453] 0 rows written
[00:00:00.1054687] 10000 rows written
[00:00:00.1826171] 20000 rows written
[00:00:00.2646484] 30000 rows written
[00:00:00.3271484] 40000 rows written
[00:00:00.4130859] 50000 rows written
[00:00:00.4892578] 60000 rows written
[00:00:00.5498046] 70000 rows written
[00:00:00.6513671] 80000 rows written
[00:00:00.7246093] 90000 rows written
[00:00:00.8037109] 100000 rows written
As you can see, 2.0.5 writes all 100,000 rows in under 1 second.
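As a rough check on the two timing logs above, the arithmetic below looks at how the elapsed time responds when the row count doubles, and computes the overall speedup at 100,000 rows. All times are read directly from the logs and rounded to milliseconds; the symbols $t_{\text{old}}(n)$ and $t_{\text{new}}(n)$ (elapsed seconds after $n$ rows for 2.0.1 and 2.0.5) are notation introduced here for illustration.

```latex
\begin{align*}
\frac{t_{\text{old}}(20000)}{t_{\text{old}}(10000)} &= \frac{11.195}{2.779} \approx 4.03, &
\frac{t_{\text{new}}(20000)}{t_{\text{new}}(10000)} &= \frac{0.183}{0.105} \approx 1.7, \\
\frac{t_{\text{old}}(40000)}{t_{\text{old}}(20000)} &= \frac{53.528}{11.195} \approx 4.78, &
\frac{t_{\text{new}}(40000)}{t_{\text{new}}(20000)} &= \frac{0.327}{0.183} \approx 1.8, \\
\frac{t_{\text{old}}(80000)}{t_{\text{old}}(40000)} &= \frac{261.964}{53.528} \approx 4.89, &
\frac{t_{\text{new}}(80000)}{t_{\text{new}}(40000)} &= \frac{0.651}{0.327} \approx 2.0, \\
\frac{t_{\text{old}}(100000)}{t_{\text{new}}(100000)} &= \frac{425.406}{0.804} \approx 529.
\end{align*}
```

Doubling the row count multiplies the 2.0.1 time by 4 to 5, which suggests growth that is at least quadratic in the number of rows, while the 2.0.5 time grows by a factor of 2 or less, i.e. roughly linearly. This is only an estimate from a single run of each version.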
Let's look at another example, also feedback from an NPOI user abroad.
    using System;
    using System.Diagnostics;
    using System.IO;
    using NPOI.SS.UserModel;
    using NPOI.XSSF.UserModel;

    class Program
    {
        static void Main(string[] args)
        {
            IWorkbook workbook = new XSSFWorkbook();
            ISheet sheet1 = workbook.CreateSheet("Sheet1");
            sheet1.CreateRow(0).CreateCell(0).SetCellValue("This is a Sample");
            int x = 1;
            Debug.WriteLine("Start at " + DateTime.Now.ToString());
            // 30,000 rows of 15 numeric cells each
            for (int i = 1; i <= 30000; i++)
            {
                IRow row = sheet1.CreateRow(i);
                for (int j = 0; j < 15; j++)
                {
                    row.CreateCell(j).SetCellValue(x++);
                }
            }
            Debug.WriteLine("End at " + DateTime.Now.ToString());
            // XSSFWorkbook writes the .xlsx (Open XML) format
            FileStream sw = File.Create("test.xlsx");
            workbook.Write(sw);
            sw.Close();
        }
    }
Results with NPOI 2.0.1 (beta):
30,000 rows: 25 seconds
70,000 rows: 4 minutes
300,000 rows: 80 minutes
In that user's words, the growth really is geometric.
Results with NPOI 2.0.5 (beta 2):
30,000 rows: 2 seconds
70,000 rows: 5 seconds
300,000 rows: 37 seconds
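For a sense of scale, the speedup at each size can be worked out from the two result lists above, with minutes converted to seconds. The symbol $S(n)$ (old time divided by new time at $n$ rows) is notation introduced here, and the inputs are the rounded figures reported by the user, so the ratios are approximate.

```latex
\begin{align*}
S(30000)  &= \frac{25}{2} = 12.5, \\
S(70000)  &= \frac{4 \times 60}{5} = \frac{240}{5} = 48, \\
S(300000) &= \frac{80 \times 60}{37} = \frac{4800}{37} \approx 130.
\end{align*}
```

The ratio widens as the sheet grows, which is what one would expect when the old version's running time grows faster than linearly with the row count.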
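The results speak for themselves, so I will not say more. Memory usage in NPOI 2.0.5 is still somewhat high: in the 300,000-row case just mentioned, peak memory is around 1.3 GB. Memory usage will be optimized in the final NPOI 2.0 release.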