Reading disk files in Java: the fastest way to do random reads on a huge disk file

I've got a moderately big set of data, about 800 MB or so, that is basically one big precomputed table I need in order to speed up some computation by several orders of magnitude (creating that file took several multicore computers days, using an optimized and multi-threaded algorithm... I really do need that file).

Now that it has been computed once, that 800MB of data is read only.

I cannot hold it in memory.

As of now it is one big huge 800MB file but splitting it into smaller files ain't a problem if it can help.

I need to read about 32 bits of data here and there in that file, many times over. I don't know beforehand where I'll need to read the data: the reads are uniformly distributed.

What would be the fastest way in Java to do my random reads in such a file or files? Ideally I should be doing these reads from several unrelated threads (but I could queue the reads in a single thread if needed).

Is Java NIO the way to go?

I'm not familiar with memory-mapped files: I think I don't want to map the 800 MB in memory.

All I want is the fastest random reads I can get to access these 800MB of disk-based data.

Btw, in case people wonder, this is not at all the same as the question I asked not long ago:

Solution

800MB is not that much to load up and store in memory. If you can afford to have multicore machines ripping away at a data set for days on end, you can afford an extra GB or two of RAM, no?

That said, read up on Java's MappedByteBuffer. It is apparent from your comment "I think I don't want to map the 800 MB in memory" that the concept is not clear.

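In a nutshell, a mapped byte buffer lets you access the data programmatically as if it were in memory, although it may be on disk or in memory; that is for the OS to decide, as Java's MBB is based on the OS's virtual memory subsystem. It is also nice and fast. You will also be able to read a single mapping from multiple threads, as long as each thread uses absolute reads such as getInt(int index) or works on its own duplicate() of the buffer, since a buffer's position is not safe to share between threads.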
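To make the multi-threaded case concrete, here is a minimal sketch, not a definitive implementation. The class name SharedBufferReads and the method parallelSum are hypothetical; summing the values is just a stand-in for whatever the lookups feed into. It takes a plain ByteBuffer so that a MappedByteBuffer can be passed in, and it assumes the default big-endian byte order.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: several threads reading 32-bit values from one shared buffer. */
final class SharedBufferReads {

    /** Sums the ints found at the given byte offsets, splitting the work across threads. */
    static long parallelSum(ByteBuffer shared, int[] offsets, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int start = t;
                // Each task gets its own view: same bytes, independent position and limit.
                final ByteBuffer view = shared.duplicate();
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (int i = start; i < offsets.length; i += threads) {
                        // Absolute read: does not move the view's position.
                        sum += view.getInt(offsets[i]);
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

The duplicates are created on the submitting thread, so no two threads ever touch the same buffer object. Note that duplicate() returns a big-endian view, so if the data is little-endian, order(ByteOrder.LITTLE_ENDIAN) would have to be set on each view.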

Here are the steps I recommend you take:

Create a MappedByteBuffer that maps your data file (you get one from FileChannel.map). The creation is kinda expensive, so keep it around.

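In your lookup method...

instantiate a byte[4] array

move the buffer to the byte position you want with .position(int), then call .get(byte[] dst, int offset, int length); offset and length index into dst, not into the file

the byte array will now have your data, which you can turn into a value (for a single 32-bit value, the absolute getInt(int index) reads it in one call and leaves the position untouched)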

And presto! You have your data!
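The steps above can be sketched roughly as follows. This is a sketch under assumptions, not the one true implementation: the class name MappedTable is hypothetical, the file is assumed to be a flat array of big-endian 32-bit values (entry i at byte offset 4*i), and it uses getInt instead of the byte[4] detour.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: maps a read-only table of 32-bit values once and serves random lookups. */
final class MappedTable {
    private final MappedByteBuffer buffer;

    MappedTable(Path file) throws IOException {
        // Mapping is the expensive part, so do it once and keep the buffer around.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes; 800 MB fits.
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
        // The mapping stays valid after the channel is closed.
        // Assumption: the file was written big-endian (e.g. with DataOutputStream).
        buffer.order(ByteOrder.BIG_ENDIAN);
    }

    /** Returns the 32-bit value stored at the given entry index. */
    int get(int entryIndex) {
        // Absolute read: does not touch the buffer's position.
        return buffer.getInt(entryIndex * 4);
    }
}
```

Usage would look like `MappedTable table = new MappedTable(Path.of("table.bin"));` followed by `table.get(i)` wherever a lookup is needed; the file name here is made up. Only the pages that actually get touched are brought into memory by the OS, so the JVM heap does not need to hold the 800 MB.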

I'm a big fan of MBBs and have used them successfully for such tasks in the past.
