The sentence:
"Hallo, I'm a dog. The end. Someone said: \"Earth is Earth\". Is it good? I like it! 'He is right' said I."
Current regular expression:
\\s+|(?<=[\\p{Punct}&&[^']])|(?=[\\p{Punct}&&[^']])
Current result:
{"Hallo", ",", "I'm", "a", "dog", ".", "The", "end", ".", "Someone",
"said", ":", **""**, """ , "Earth", "is", "Earth", """, ".", "Is", "it",
"good", "?", "I", "like", "it", "!", **"'He"**, "is", **"right'"**,
"said", "I", "."}
I get an extra "" before the first quotation mark, and the regex does not split ' from the words.
The result I want:
{"Hallo", ",", "I'm", "a", "dog", ".", "The", "end", ".", "Someone",
"said", ":", """ , "Earth", "is", "Earth", """, ".", "Is", "it",
"good", "?", "I", "like", "it", "!", "'" , "He", "is", "right", "'",
"said", "I", "."}
Edit: Sorry! Here is more of the code:
String toTest = "Hallo, I'm a dog. The end. Someone said: \"Earth is Earth\". Is it good? I like it! 'He is right' said I.";
String [] words = toTest.split("\\s+|(?<=[\\p{Punct}&&[^']])|(?=[\\p{Punct}&&[^']])");
and it produces this list of words:
words = {"Hallo", ",", "I'm", "a", "dog", ".", "The", "end", ".", "Someone", "said", ":", "", """, "Earth", "is", "Earth", """, ".", "Is", "it", "good", "?", "I", "like", "it", "!", "'He", "is", "right'", "said", "I", "."}
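For reference, here is a minimal sketch of one possible way to get the desired list. It is an assumption on my part, not something taken from the code above: instead of splitting, it matches the tokens directly with `Matcher.find()`. The class name `TokenizerSketch`, the method `tokenize`, and the pattern are all hypothetical. The idea is that a token is either a word that may contain apostrophes between word characters (so `I'm` stays whole), or a single punctuation character (so a leading or trailing `'` becomes its own token). Because nothing is split on zero-width positions, no empty string should appear before the opening quotation mark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizerSketch {
    // Hypothetical pattern: a run of word characters, optionally continued by
    // apostrophe + word characters (e.g. I'm), OR exactly one punctuation character.
    private static final Pattern TOKEN =
            Pattern.compile("\\w+(?:'\\w+)*|\\p{Punct}");

    // Collects every match in order; whitespace is never matched, so it is skipped.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String toTest = "Hallo, I'm a dog. The end. Someone said: \"Earth is Earth\". "
                + "Is it good? I like it! 'He is right' said I.";
        System.out.println(tokenize(toTest));
    }
}
```

Note that `\w` only covers ASCII letters, digits, and the underscore by default, so text with other alphabets would need a different word class (for example `\p{L}`); that is outside what this sketch tries to handle.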
I don't see any Java code in your question. –
2014-11-21 13:04:29
@LutzHorn The regular expression is Java code. –
2014-11-21 13:05:46
@RealSkeptic Why not Perl, Python, or Ruby? –
2014-11-21 13:10:21