Java Apache FileUtils readFileToString and writeStringToFile problems

Question

I need to read a Java File (actually a .pdf) into a String and then write it back to a file. In between I'll apply some patches to the string, but that is not important here.

I've developed the following JUnit test case:

    String f1String = FileUtils.readFileToString(f1);
    File temp = File.createTempFile("deleteme", "deleteme");
    FileUtils.writeStringToFile(temp, f1String);
    assertTrue(FileUtils.contentEquals(f1, temp));

This test converts a file to a string and writes it back. However, the test is failing.

I think it may be because of the encodings, but the FileUtils documentation doesn't give much detail about this.

Can anyone help?

Thanks!

Added for further understanding:

Why I need this?

I have very large PDFs on one machine that are replicated on another. The first machine is in charge of creating those PDFs. Because the second machine has poor connectivity and the PDFs are big, I don't want to sync whole PDFs, only the changes made to them.

To create and apply patches, I'm using the Google library DiffMatchPatch. This library creates patches between two strings, so I need to load a PDF into a String, apply a generated patch, and write it back to a file.

Answer 1:

A PDF is not a text file. Decoding (into Java characters) and re-encoding of binary files that are not encoded text is asymmetrical.  For example, if the input bytestream is invalid for the current encoding, you can be assured that it won't re-encode correctly.  In short - don't do that.  Use readFileToByteArray and writeByteArrayToFile instead.
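
The byte-level round trip recommended here can be sketched as follows. This is only a minimal sketch: it assumes the plain JDK calls `Files.readAllBytes` and `Files.write` as stand-ins for Commons IO's `readFileToByteArray` and `writeByteArrayToFile`, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: the names are illustrative, not part of Commons IO.
final class ByteRoundTrip {

    // Copies src to dst as raw bytes; no charset is involved, so nothing is lost.
    static void copyAsBytes(Path src, Path dst) throws IOException {
        byte[] data = Files.readAllBytes(src);
        Files.write(dst, data);
    }

    // Returns true when both files hold exactly the same bytes.
    static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("original", ".bin");
        Path dst = Files.createTempFile("copy", ".bin");
        // Sample bytes that are not valid UTF-8, as binary PDF streams often are.
        Files.write(src, new byte[] {0x25, 0x50, (byte) 0xFF, (byte) 0x80, 0x00});
        copyAsBytes(src, dst);
        System.out.println(sameContent(src, dst));
    }
}
```

Because the data never passes through a `String`, the platform encoding cannot alter it.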

Answer 2:

Just a few thoughts:

There might actually be some BOM (byte order mark) bytes in one of the files that either get stripped when reading or added during writing. Is there a difference in the file size (if it is the BOM, the difference should be 2 or 3 bytes)?

The line breaks might not match, depending which system the files are created on, i.e. one might have CR LF while the other only has LF or CR. (1 byte difference per line break)

According to the JavaDoc both methods should use the default encoding of the JVM, which should be the same for both operations. However, try and test with an explicitly set encoding (JVM's default encoding would be queried using System.getProperty("file.encoding")).
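
To see how the choice of encoding affects a decode and re-encode cycle, a small experiment along these lines may help. It is a sketch using only JDK classes, and the class and method names are hypothetical. Commons IO also has overloads of `readFileToString` and `writeStringToFile` that take an encoding argument, which would be the place to pass an explicit charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
final class CharsetRoundTrip {

    // Decodes the bytes with the given charset, re-encodes them, and reports
    // whether the result is byte-for-byte identical to the input.
    static boolean roundTrips(byte[] data, Charset charset) {
        String decoded = new String(data, charset);
        return Arrays.equals(decoded.getBytes(charset), data);
    }

    // Builds an array holding every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] data = allByteValues();
        System.out.println("JVM default charset:   " + Charset.defaultCharset());
        System.out.println("UTF-8 round trip:      " + roundTrips(data, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 round trip: " + roundTrips(data, StandardCharsets.ISO_8859_1));
    }
}
```

UTF-8 does not survive this test because byte sequences that are invalid in UTF-8 are replaced with U+FFFD during decoding. ISO-8859-1 maps each of the 256 byte values to exactly one character, so it does.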

Answer 3:

Ed Staub's answer points out why my solution was not working, and he suggested using bytes instead of Strings. In my case I need a String, so the final working solution I've found is the following:

    @Test
    public void testFileRWAsArray() throws IOException {
        String f1String = "";
        byte[] bytes = FileUtils.readFileToByteArray(f1);
        // Widen each byte to a char so the String holds one char per byte
        for (byte b : bytes) {
            f1String = f1String + ((char) b);
        }
        File temp = File.createTempFile("deleteme", "deleteme");
        byte[] newBytes = new byte[f1String.length()];
        // Narrow each char back to the byte it came from
        for (int i = 0; i < f1String.length(); i++) {
            char c = f1String.charAt(i);
            newBytes[i] = (byte) c;
        }
        FileUtils.writeByteArrayToFile(temp, newBytes);
        assertTrue(FileUtils.contentEquals(f1, temp));
    }

By using a cast between byte-char, I have the symmetry on conversion.
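
A variant of the same cast-based idea, offered only as a sketch with hypothetical names, could look like this. It differs in two ways: it uses a `StringBuilder`, since repeated String concatenation in a loop gets slow on very large files, and it masks each byte with `0xFF`, so negative bytes become U+0080..U+00FF rather than the sign-extended U+FF80..U+FFFF that a bare `(char) b` cast produces.

```java
import java.util.Arrays;

// Hypothetical helper; the names are illustrative only.
final class ByteCharCast {

    // Maps each byte to the char with the same unsigned value (0..255).
    static String toCharString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Inverse mapping: keeps only the low 8 bits of every char.
    static byte[] toBytes(String s) {
        byte[] bytes = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            bytes[i] = (byte) s.charAt(i);
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] original = {0x25, (byte) 0xFF, (byte) 0x80, 0x00, 0x7F};
        byte[] restored = toBytes(toCharString(original));
        System.out.println(Arrays.equals(original, restored));
    }
}
```

One caveat applies to both versions: if a patch ever introduces a character above U+00FF, the cast back to `byte` silently drops its upper bits.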

Thank you all!

Answer 4:

Try this code...

    public static String fetchBase64binaryEncodedString(String path) {
        File inboundDoc = new File(path);
        byte[] pdfData;
        try {
            pdfData = FileUtils.readFileToByteArray(inboundDoc);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        // Base64 output is plain ASCII, so it is safe to hold in a String
        byte[] encodedPdfData = Base64.encodeBase64(pdfData);
        String attachment = new String(encodedPdfData);
        return attachment;
    }

    // How to decode it
    public void testConversionPDFtoBase64() throws IOException {
        String path = "C:/Documents and Settings/kantab/Desktop/GTR_SDR/MSDOC.pdf";
        File origFile = new File(path);
        String encodedString = CreditOneMLParserUtil.fetchBase64binaryEncodedString(path);
        // now decode it
        byte[] decodeData = Base64.decodeBase64(encodedString.getBytes());
        String decodedString = new String(decodeData);
        // or actually give the path to pdf file.
        File decodedfile = File.createTempFile("DECODED", ".pdf");
        FileUtils.writeByteArrayToFile(decodedfile, decodeData);
        Assert.assertTrue(FileUtils.contentEquals(origFile, decodedfile));
        // Frame frame = new Frame("PDF Viewer");
        // frame.setLayout(new BorderLayout());
    }
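
The same Base64 round trip can be tried without a codec library. The sketch below assumes the JDK's `java.util.Base64` (Java 8 and later) in place of the `Base64` class used above, and its class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; the names are illustrative only.
final class Base64RoundTrip {

    // Encodes arbitrary bytes as an ASCII-only Base64 String.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decodes a Base64 String back into the original bytes.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] original = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, (byte) 0x80, 0x00};
        String text = encode(original);
        System.out.println(text);
        System.out.println(Arrays.equals(original, decode(text)));
    }
}
```

Base64 represents every 3 input bytes as 4 characters, so the String is about a third longer than the file it encodes.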

Source: https://stackoverflow.com/questions/7502825/java-apache-fileutils-readfiletostring-and-writestringtofile-problems
