Java stream data Base64: a buffered Base64 encoder stream in Java

I have lots of PDF files whose content I need to encode as Base64. I have an Akka app that fetches the files as streams and distributes them to many workers, which encode the files and return the Base64 string for each file. I have a basic solution for the encoding:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.commons.codec.binary.Base64InputStream;

...

StringBuilder sb = new StringBuilder();

// Passing true as the second argument makes the stream encode the bytes as they are read.
// Closing the outermost reader also closes the streams it wraps.
try (BufferedReader br = new BufferedReader(new InputStreamReader(
        new Base64InputStream(input, true), StandardCharsets.US_ASCII))) {
    String line;
    while ((line = br.readLine()) != null) {
        // readLine() drops the line separators that the encoder inserts
        sb.append(line);
    }
}

It works, but I would like to know the best way to encode the files using a buffer, and whether there is a faster alternative.

I tested some other approaches such as:

Base64.getEncoder
sun.misc.BASE64Encoder
Base64.encodeBase64
javax.xml.bind.DatatypeConverter.printBase64Binary
com.google.common.io.BaseEncoding.base64

They are faster, but they need the entire file in memory, correct? Also, I do not want to block other threads while encoding a single PDF file.

Any input is really helpful. Thank you!

Solution

Fun fact about Base64: it takes three bytes and converts them into four characters. This means that if you read binary data in chunks whose length is divisible by three, you can feed the chunks one by one to any Base64 encoder, and the concatenated output will be the same as if you had fed it the entire file at once.
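To make the three-bytes-to-four-characters property concrete, here is a minimal sketch that compares chunked encoding against encoding everything at once. The class name ChunkedBase64Demo and the helper encodeInChunks are made up for this illustration; only java.util.Base64 and java.util.Arrays come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class ChunkedBase64Demo {

    // Hypothetical helper: encodes data in pieces of chunkSize bytes
    // and concatenates the encoded pieces.
    static String encodeInChunks(byte[] data, int chunkSize) {
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder out = new StringBuilder();
        for (int from = 0; from < data.length; from += chunkSize) {
            int to = Math.min(from + chunkSize, data.length);
            out.append(encoder.encodeToString(Arrays.copyOfRange(data, from, to)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 30 bytes of sample input
        byte[] data = "Any binary content works here.".getBytes(StandardCharsets.UTF_8);
        String whole = Base64.getEncoder().encodeToString(data);

        // Chunk size divisible by three: same result as encoding everything at once.
        System.out.println(whole.equals(encodeInChunks(data, 6))); // prints true
        // Chunk size not divisible by three: padding ends up in the middle of the output.
        System.out.println(whole.equals(encodeInChunks(data, 4))); // prints false
    }
}
```

With a chunk size of 4, every chunk ends in "==" padding, so the concatenation is no longer a valid encoding of the whole input.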

Now, if you want your output stream to just be one long line of Base64 data - which is perfectly legal - then all you need to do is something along the lines of:

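import java.io.BufferedInputStream;
import java.util.Arrays;
import java.util.Base64;

private static final int BUFFER_SIZE = 3 * 1024;

try (BufferedInputStream in = new BufferedInputStream(input, BUFFER_SIZE)) {
    Base64.Encoder encoder = Base64.getEncoder();
    StringBuilder result = new StringBuilder();
    byte[] chunk = new byte[BUFFER_SIZE];
    int filled = 0;
    int n;
    // read() may return fewer bytes than requested even before the end of the
    // stream, so keep filling the buffer and encode only when it is full.
    while ((n = in.read(chunk, filled, BUFFER_SIZE - filled)) != -1) {
        filled += n;
        if (filled == BUFFER_SIZE) {
            result.append(encoder.encodeToString(chunk));
            filled = 0;
        }
    }
    // Whatever is left at the end of the stream is the last, possibly shorter, chunk.
    if (filled > 0) {
        chunk = Arrays.copyOf(chunk, filled);
        result.append(encoder.encodeToString(chunk));
    }
}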

This means that only the last chunk may have a length that is not divisible by three and will therefore contain the padding characters.

The above example is with Java 8 Base64, but you can really use any encoder that takes a byte array of an arbitrary length and returns the base64 string of that byte array.

This means that you can play around with the buffer size as you wish, as long as it stays a multiple of three.

If you want your output to be MIME compatible, however, you need to have the output separated into lines. In this case, I would set the chunk size in the above example to something that, when multiplied by 4/3, gives you a whole number of lines. For example, if you want to have 64 characters per line, each line encodes 64 / 4 * 3, which is 48 bytes. If you encode 48 bytes, you'll get one line. If you encode 480 bytes, you'll get 10 full lines.
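The arithmetic above can be written down as a small sketch. MimeChunkSize, bytesPerLine and chunkSize are hypothetical names used only for this illustration; they are not part of any library.

```java
public class MimeChunkSize {

    // Number of input bytes that encode to exactly one line of lineLength characters.
    // Base64 emits 4 characters per 3 bytes, so lineLength must be a multiple of 4.
    static int bytesPerLine(int lineLength) {
        if (lineLength <= 0 || lineLength % 4 != 0) {
            throw new IllegalArgumentException("line length must be a positive multiple of 4");
        }
        return lineLength / 4 * 3;
    }

    // Buffer size that produces exactly linesPerChunk full lines for each chunk.
    static int chunkSize(int lineLength, int linesPerChunk) {
        return bytesPerLine(lineLength) * linesPerChunk;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerLine(64));   // prints 48
        System.out.println(chunkSize(64, 10));  // prints 480
        System.out.println(chunkSize(64, 100)); // prints 4800
    }
}
```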

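So change BUFFER_SIZE in the example above to something like 4800, and use Base64.getMimeEncoder(64, new byte[] { 13, 10 }) instead of Base64.getEncoder(). Each chunk except the last then encodes to 100 full-length lines. The MIME encoder does not add a line separator at the end of its output, so you also need a result.append("\r\n") in the while loop after each encoded chunk; otherwise the last line of one chunk and the first line of the next run together.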
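Putting the MIME variant together, a minimal sketch could look like the following. It assumes 64-character lines and a 4800-byte buffer as in the text; the class MimeChunkedEncoder and its methods are hypothetical names. One deliberate difference from the description: the separator is inserted before every chunk except the first rather than after every chunk, which avoids a trailing line break when the input length is an exact multiple of the buffer size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

public class MimeChunkedEncoder {

    private static final int LINE_LENGTH = 64;
    private static final int BUFFER_SIZE = 4800; // 100 lines of 48 bytes each
    private static final String CRLF = "\r\n";

    static String encode(InputStream in) throws IOException {
        Base64.Encoder encoder = Base64.getMimeEncoder(LINE_LENGTH, new byte[] {13, 10});
        StringBuilder result = new StringBuilder();
        byte[] chunk = new byte[BUFFER_SIZE];
        int filled = 0;
        int n;
        // Fill the buffer completely before encoding, since read() may return short counts.
        while ((n = in.read(chunk, filled, BUFFER_SIZE - filled)) != -1) {
            filled += n;
            if (filled == BUFFER_SIZE) {
                appendChunk(result, encoder.encodeToString(chunk));
                filled = 0;
            }
        }
        if (filled > 0) {
            appendChunk(result, encoder.encodeToString(Arrays.copyOf(chunk, filled)));
        }
        return result.toString();
    }

    // The MIME encoder writes no separator after its last line, so one is
    // added between consecutive chunks (never after the final chunk).
    private static void appendChunk(StringBuilder result, String encoded) {
        if (result.length() > 0) {
            result.append(CRLF);
        }
        result.append(encoded);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000]; // stand-in for file content
        String encoded = encode(new ByteArrayInputStream(data));
        System.out.println(encoded.split(CRLF)[0].length()); // prints 64
    }
}
```

Because the chunk size is a whole number of lines, the result should match what the same MIME encoder produces for the entire input in one call.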
