Java stream data Base64: a buffered Base64 encoder stream in Java


weixin_39836876


I have lots of PDF files whose content I need to encode as Base64. I have an Akka app that fetches the files as streams and distributes them to many workers, which encode the files and return the Base64 string for each file. I have a basic solution for the encoding:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.commons.codec.binary.Base64InputStream;

...

StringBuilder sb = new StringBuilder();

// doEncode = true: the wrapped stream yields Base64 text as it is read.
// try-with-resources closes the reader and the streams it wraps.
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new Base64InputStream(input, true)))) {
    String line;
    // readLine() drops the line breaks, so sb ends up as one long string
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
}

It works, but I would like to know the best way to encode the files using a buffer, and whether there is a faster alternative.

I tested some other approaches such as:

java.util.Base64.getEncoder

sun.misc.BASE64Encoder

org.apache.commons.codec.binary.Base64.encodeBase64

javax.xml.bind.DatatypeConverter.printBase64Binary

com.google.common.io.BaseEncoding.base64

They are faster, but they need the entire file in memory, correct? Also, I do not want to block other threads while encoding a single PDF file.

Any input is really helpful. Thank you!

Solution

Fun fact about Base64: it takes three bytes and converts them into four characters. This means that if you read binary data in chunks whose length is divisible by three, you can feed the chunks to any Base64 encoder one at a time, and the concatenated output will be the same as if you had fed it the entire file.
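To make that property concrete, here is a minimal sketch that encodes an in-memory byte array chunk by chunk and compares the result with a one-shot encoding. The class name ChunkedBase64Demo, the method encodeInChunks, and the sample data are made up for illustration; only java.util.Base64 and java.util.Arrays come from the JDK.

```java
import java.util.Arrays;
import java.util.Base64;

class ChunkedBase64Demo {

    // Hypothetical helper: encodes data chunk by chunk.
    // chunkSize must be a multiple of 3 so that no chunk except the last is padded.
    static String encodeInChunks(byte[] data, int chunkSize) {
        if (chunkSize <= 0 || chunkSize % 3 != 0) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 3");
        }
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder result = new StringBuilder();
        for (int from = 0; from < data.length; from += chunkSize) {
            int to = Math.min(from + chunkSize, data.length);
            result.append(encoder.encodeToString(Arrays.copyOfRange(data, from, to)));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // 1000 is not a multiple of 3, so the last chunk carries the padding
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31);
        }
        String whole = Base64.getEncoder().encodeToString(data);
        String chunked = encodeInChunks(data, 300);
        // Prints whether the two encodings are identical
        System.out.println(whole.equals(chunked));
    }
}
```

With a chunk size that is not a multiple of three, padding characters would appear in the middle of the output, which is why the sketch rejects such sizes.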

Now, if you want your output to be just one long line of Base64 data, which is perfectly legal, then all you need to do is something along the lines of:

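import java.io.BufferedInputStream;
import java.util.Arrays;
import java.util.Base64;

private static final int BUFFER_SIZE = 3 * 1024; // must be a multiple of 3

try (BufferedInputStream in = new BufferedInputStream(input, BUFFER_SIZE)) {
    Base64.Encoder encoder = Base64.getEncoder();
    StringBuilder result = new StringBuilder();
    byte[] chunk = new byte[BUFFER_SIZE];
    int len;
    // readNBytes (Java 9+) keeps reading until the chunk is full or the stream ends.
    // A plain read(chunk) may return fewer bytes in mid-stream, which would end the
    // loop early; on Java 8, call read() in a loop until the chunk is full instead.
    while ((len = in.readNBytes(chunk, 0, BUFFER_SIZE)) == BUFFER_SIZE) {
        result.append(encoder.encodeToString(chunk));
    }
    if (len > 0) {
        // Final, shorter chunk: the only one that may produce padding
        result.append(encoder.encodeToString(Arrays.copyOf(chunk, len)));
    }
}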

Because BUFFER_SIZE is a multiple of three, only the last chunk can have a length that is not divisible by three, so it is the only one that can contain padding characters.

The above example is with Java 8 Base64, but you can really use any encoder that takes a byte array of an arbitrary length and returns the base64 string of that byte array.

This also means that you can play around with the buffer size as you wish, as long as it remains a multiple of three.

If you want your output to be MIME compatible, however, you need to have the output separated into lines. In this case, I would set the chunk size in the above example to something that, when multiplied by 4/3, gives you a whole number of lines. For example, if you want to have 64 characters per line, each line encodes 64 / 4 * 3 = 48 bytes. If you encode 48 bytes, you'll get one line. If you encode 480 bytes, you'll get 10 full lines.

So change the above BUFFER_SIZE to something like 4800, and instead of Base64.getEncoder() use Base64.getMimeEncoder(64, new byte[] { 13, 10 }). Then, when it encodes, you'll get 100 full-sized lines from each chunk except the last. The Java MIME encoder does not add a line separator at the end of its output, so you also need to add a result.append("\r\n") between chunks, for example inside the while loop.
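The MIME variant described above could be sketched as follows. The class name MimeChunkedBase64 and the helpers encodeMime and fill are hypothetical names chosen for this example, and the 64-character line length and 4800-byte buffer are simply the figures used in the text. The sketch puts the "\r\n" before every chunk except the first rather than after each chunk, so the result has no trailing separator even when the input length is an exact multiple of the buffer size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

class MimeChunkedBase64 {

    static final int LINE_LENGTH = 64;                      // characters per output line
    static final int BYTES_PER_LINE = LINE_LENGTH / 4 * 3;  // 48 input bytes per line
    static final int BUFFER_SIZE = BYTES_PER_LINE * 100;    // 4800 bytes = 100 full lines
    static final byte[] CRLF = {13, 10};

    // Hypothetical helper: reads the stream in 4800-byte chunks and
    // returns MIME-style Base64 with 64-character lines.
    static String encodeMime(InputStream input) throws IOException {
        Base64.Encoder encoder = Base64.getMimeEncoder(LINE_LENGTH, CRLF);
        StringBuilder result = new StringBuilder();
        byte[] chunk = new byte[BUFFER_SIZE];
        int len;
        while ((len = fill(input, chunk)) > 0) {
            // The encoder adds no separator after its last line,
            // so join consecutive chunks explicitly.
            if (result.length() > 0) {
                result.append("\r\n");
            }
            byte[] toEncode = (len == BUFFER_SIZE) ? chunk : Arrays.copyOf(chunk, len);
            result.append(encoder.encodeToString(toEncode));
            if (len < BUFFER_SIZE) {
                break; // a short chunk means the stream has ended
            }
        }
        return result.toString();
    }

    // Reads until buf is full or the stream ends; returns the number of bytes read.
    // A single read() call may return fewer bytes than requested.
    static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        int n;
        while (total < buf.length
                && (n = in.read(buf, total, buf.length - total)) != -1) {
            total += n;
        }
        return total;
    }
}
```

A caller would pass any InputStream, for example `MimeChunkedBase64.encodeMime(new ByteArrayInputStream(bytes))`. The fill helper uses only InputStream.read, so the sketch does not depend on Java 9's readNBytes.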
