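The software designer exam tutorial is too dry and I got bored, so I decided to write and run some basic Java code instead. I recently saw an interview question about querying large data sets, so I'll try splitting a large file.
First, generate a data file: a txt of about 2 GB, filled with the character "a":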
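import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class GenerateFile {
    public static void main(String[] args) throws IOException {
        // Single-threaded code, so StringBuilder is enough (no synchronization needed)
        StringBuilder sb = new StringBuilder(1024 * 1025);
        System.out.println("Begin...");
        long start = System.currentTimeMillis();
        // try-with-resources flushes and closes the writer even if a write fails
        try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("performace.txt"))) {
            for (int i = 0; i < 2048; ++i) {            // 2048 blocks of about 1 MB
                for (int j = 0; j < 1024; ++j) {        // 1024 lines per block
                    for (int k = 0; k < 1024; ++k) {    // 1024 characters per line
                        sb.append('a');
                    }
                    sb.append('\n');
                }
                bufferedWriter.write(sb.toString());
                sb.setLength(0); // empty the builder for the next block
            }
        }
        long end = System.currentTimeMillis();
        System.out.println((end - start) + "ms elapsed.");
    }
}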
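As a quick sanity check on the generator, the expected file size follows from the loop bounds: 2048 blocks of 1024 lines, each line being 1024 characters plus a newline. The sketch below assumes one byte per character, which holds for "a" and "\n" in common encodings. The names `FileSizeCheck` and `expectedBytes` are made up for illustration.

```java
import java.io.File;

final class FileSizeCheck {
    // Hypothetical helper: bytes written by the generator loops,
    // assuming every character (including the newline) takes one byte.
    static long expectedBytes(long blocks, long linesPerBlock, long charsPerLine) {
        return blocks * linesPerBlock * (charsPerLine + 1);
    }

    public static void main(String[] args) {
        long expected = expectedBytes(2048, 1024, 1024);
        // File.length() returns 0 if the file does not exist
        long actual = new File("performace.txt").length();
        System.out.println("expected " + expected + " bytes, found " + actual);
    }
}
```

With these bounds the expected size is 2,149,580,800 bytes, that is 2 GiB of "a" plus 2 MiB of newlines.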
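Now for the splitting. How big should each output file be? Reading the data through a byte buffer, I found that 12 MB works, while 13 MB runs out of heap space: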
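import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class SplitWithChannel {
    public static void main(String[] args) throws IOException {
        int bufSize = 1024 * 1024; // read up to 1 MB per call
        int readsPerFile = 12;     // 12 reads, so about 12 MB per output file
        StringBuffer sb = new StringBuffer();
        byte[] bs = new byte[bufSize];
        ByteBuffer byteBuf = ByteBuffer.allocate(bufSize);
        System.out.println("Begin...");
        long start = System.currentTimeMillis();
        try (RandomAccessFile file = new RandomAccessFile("performace.txt", "r");
             FileChannel channel = file.getChannel()) {
            int size;
            long count = 1, fileIndex = 0;
            while ((size = channel.read(byteBuf)) != -1) {
                byteBuf.flip();           // limit = number of bytes just read
                byteBuf.get(bs, 0, size); // copy only those bytes, not stale data
                // Platform default charset; safe here because the file is pure ASCII
                sb.append(new String(bs, 0, size));
                if (count % readsPerFile == 0) {
                    System.out.println(count);
                    writeChunk(sb, ++fileIndex);
                }
                byteBuf.clear();
                count++;
            }
            if (sb.length() > 0) {
                writeChunk(sb, ++fileIndex); // leftover data after the last full chunk
            }
        }
        long end = System.currentTimeMillis();
        System.out.println((end - start) + "ms elapsed.");
    }

    // Writes the buffered text to 文件<index>.txt and empties the buffer
    private static void writeChunk(StringBuffer sb, long index) throws IOException {
        try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("文件" + index + ".txt"))) {
            bufferedWriter.write(sb.toString());
        }
        sb.setLength(0);
    }
}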
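The chunk-size ceiling comes from the heap rather than from the file itself. The post does not state the JVM version or heap settings, so the following is only a rough sketch under assumptions: text buffered as UTF-16 costs two bytes per character, and `toString()` makes a second copy of the chunk before it is written. `HeapEstimate` and `bufferedChunkBytes` are hypothetical names.

```java
final class HeapEstimate {
    // Hypothetical lower bound: the buffer's characters plus the String copy
    // made by toString(). Ignores spare buffer capacity and other objects.
    static long bufferedChunkBytes(long chars, int bytesPerChar) {
        return 2 * chars * bytesPerChar;
    }

    public static void main(String[] args) {
        // Upper limit the JVM will try to use, set by -Xmx or the JVM default
        long maxHeap = Runtime.getRuntime().maxMemory();
        long chunkChars = 13L * 1024 * 1024;
        // 2 bytes per char is an assumption (UTF-16 storage, as in JVMs before Java 9)
        long needed = bufferedChunkBytes(chunkChars, 2);
        System.out.println("max heap: " + maxHeap + " bytes");
        System.out.println("13 MB chunk needs at least: " + needed + " bytes");
    }
}
```

Under these assumptions a 13 MB chunk needs at least 54,525,952 bytes of heap at the moment of writing, so a larger `-Xmx` value (for example `java -Xmx512m`) should allow larger chunks.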
However, when reading directly with BufferedReader, still buffering the data in a StringBuffer, the file can be split into 13 MB pieces; 14 MB overflows:
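import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Sketch of the BufferedReader variant described above. The 1M-character read
// buffer and the 13 reads per file are assumptions taken from the text.
public class SplitWithReader {
    public static void main(String[] args) throws IOException {
        int bufSize = 1024 * 1024; // read up to 1M characters per call
        int readsPerFile = 13;     // 13 reads, so about 13 MB per output file
        StringBuffer sb = new StringBuffer();
        char[] cs = new char[bufSize];
        System.out.println("Begin...");
        long start = System.currentTimeMillis();
        try (BufferedReader reader = new BufferedReader(new FileReader("performace.txt"))) {
            int size;
            long count = 1, fileIndex = 0;
            while ((size = reader.read(cs, 0, bufSize)) != -1) {
                sb.append(cs, 0, size); // append only the characters just read
                if (count % readsPerFile == 0) {
                    System.out.println(count);
                    writeChunk(sb, ++fileIndex);
                }
                count++;
            }
            if (sb.length() > 0) {
                writeChunk(sb, ++fileIndex); // leftover data after the last full chunk
            }
        }
        long end = System.currentTimeMillis();
        System.out.println((end - start) + "ms elapsed.");
    }

    // Writes the buffered text to 文件<index>.txt and empties the buffer
    private static void writeChunk(StringBuffer sb, long index) throws IOException {
        try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("文件" + index + ".txt"))) {
            bufferedWriter.write(sb.toString());
        }
        sb.setLength(0);
    }
}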
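Both approaches hold a whole chunk in a StringBuffer before writing it, which is what ties the chunk size to the heap size. If the goal is only to cut the file into fixed-size pieces, one alternative (a different technique from the one in this post) is to copy byte ranges with `FileChannel.transferTo`, so that no chunk is ever held on the heap. This is a minimal sketch: the class name `ChannelSplitter`, the method `split`, and the `part` file-name prefix are made up for illustration, and the cuts fall at byte offsets, so a piece may end in the middle of a line.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class ChannelSplitter {
    // Hypothetical helper: copies source into files of at most chunkBytes each
    // (part1.txt, part2.txt, ...) and returns how many files were written.
    static int split(Path source, Path outDir, long chunkBytes) throws IOException {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        int parts = 0;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long remaining = Math.min(chunkBytes, size - position);
                Path target = outDir.resolve("part" + (parts + 1) + ".txt");
                try (FileChannel out = FileChannel.open(target,
                        StandardOpenOption.CREATE,
                        StandardOpenOption.WRITE,
                        StandardOpenOption.TRUNCATE_EXISTING)) {
                    // transferTo may copy fewer bytes than requested, so loop
                    while (remaining > 0) {
                        long copied = in.transferTo(position, remaining, out);
                        if (copied <= 0) {
                            throw new IOException("source ended unexpectedly at byte " + position);
                        }
                        position += copied;
                        remaining -= copied;
                    }
                }
                parts++;
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Any chunk size works here, since nothing is buffered on the heap
        int parts = split(Paths.get("performace.txt"), Paths.get("."), 12L * 1024 * 1024);
        System.out.println(parts + " files written.");
    }
}
```

Because the data never passes through a String, this version also avoids decoding and re-encoding the text, which matters for files that are not pure ASCII.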