java nul 字符,Java字符串替换和NUL(NULL,ASCII 0)字符?

"本文讨论了在Java中使用空字符('')替换字符串中的特定字符时的行为。虽然空字符可以成功替换,但它不会从字符串中移除该字符,而是一个有效的Unicode字符。此外,字符串的渲染可能因字体和编码而显得'怪异',而不是由于空字符本身。正确移除字符的简单方法是使用`replace()`方法。"
摘要由CSDN通过智能技术生成

Testing out someone elses code, I noticed a few JSP pages printing funky non-ASCII characters. Taking a dip into the source I found this tidbit:

// remove any periods from first name e.g. Mr. John --> Mr John

firstName = firstName.trim().replace('.','\0');

Does replacing a character in a String with a null character even work in Java? I know that '\0' will terminate a C-string. Would this be the culprit to the funky characters?

解决方案

Does replacing a character in a String with a null character even work in Java? I know that '\0' will terminate a c-string.

That depends on how you define what is working. Does it replace all occurrences of the target character with '\0'? Absolutely!

String s = "food".replace('o', '\0');

System.out.println(s.indexOf('\0')); // "1"

System.out.println(s.indexOf('d')); // "3"

System.out.println(s.length()); // "4"

System.out.println(s.hashCode() == 'f'*31*31*31 + 'd'); // "true"

Everything seems to work fine to me! indexOf can find it, it counts as part of the length, and its value for hash code calculation is 0; everything is as specified by the JLS/API.

It DOESN'T work if you expect replacing a character with the null character would somehow remove that character from the string. Of course it doesn't work like that. A null character is still a character!

String s = Character.toString('\0');

System.out.println(s.length()); // "1"

assert s.charAt(0) == 0;

It also DOESN'T work if you expect the null character to terminate a string. It's evident from the snippets above, but it's also clearly specified in JLS (10.9. An Array of Characters is Not a String):

In the Java programming language, unlike C, an array of char is not a String, and neither a String nor an array of char is terminated by '\u0000' (the NUL character).

Would this be the culprit to the funky characters?

Now we're talking about an entirely different thing, i.e. how the string is rendered on screen. Truth is, even "Hello world!" will look funky if you use dingbats font. A unicode string may look funky in one locale but not the other. Even a properly rendered unicode string containing, say, Chinese characters, may still look funky to someone from, say, Greenland.

That said, the null character probably will look funky regardless; usually it's not a character that you want to display. That said, since null character is not the string terminator, Java is more than capable of handling it one way or another.

Now to address what we assume is the intended effect, i.e. remove all period from a string, the simplest solution is to use the replace(CharSequence, CharSequence) overload.

System.out.println("A.E.I.O.U".replace(".", "")); // AEIOU

The replaceAll solution is mentioned here too, but that works with regular expression, which is why you need to escape the dot meta character, and is likely to be slower.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值