I want to split a String into a String array along non-alphabetic characters. For example:
"Here is an ex@mple" => "Here", "is", "an", "ex", "mple"
I tried using the String.split(String regex) method with the regular expression "(?![\\p{Alpha}])". However, this splits the string into
"Here", "_is", "_an", "_ex", "@mple"
(the underscores stand in for spaces). I guess this is because the ?! regex operator is "zero-width": it splits at the empty position just before each non-alphabetic character, so the character itself is never consumed and stays in the output.
How can I remove the actual non-alphabetic characters while splitting the string? Is there a non-zero-width negation operator?
Solution
You could try \P{Alpha}+:
"Here is an ex@mple".split("\\P{Alpha}+")
["Here", "is", "an", "ex", "mple"]
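\P{Alpha} matches any non-alphabetic character (as opposed to \p{Alpha}, which matches any alphabetic character). Unlike the lookahead, it consumes the characters it matches, so they are removed from the result. The + means we split on any contiguous run of such characters, so several separators in a row do not produce empty strings. For example: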
"a!@#$%^&*b".split("\\P{Alpha}+")
["a", "b"]
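To see both patterns side by side, here is a minimal runnable sketch. The class name SplitOnNonAlpha and the helper splitWords are hypothetical names introduced for illustration; they are not part of the original answer. The helper also strips a leading run of separators, because split() otherwise returns an empty first element when the input starts with a non-alphabetic character.

```java
import java.util.Arrays;

class SplitOnNonAlpha {

    // Hypothetical helper (not from the original answer): splits on runs of
    // non-alphabetic characters and avoids a leading empty element.
    static String[] splitWords(String input) {
        // Drop a leading run of separators; otherwise split() would return
        // an empty first element for input such as "  leading".
        String trimmed = input.replaceFirst("^\\P{Alpha}+", "");
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\P{Alpha}+");
    }

    public static void main(String[] args) {
        String input = "Here is an ex@mple";

        // Zero-width negative lookahead: splits *before* each non-letter but
        // consumes nothing, so the spaces and the '@' stay in the pieces.
        System.out.println(Arrays.toString(input.split("(?!\\p{Alpha})")));

        // Consuming pattern: each run of non-letters is matched and removed.
        System.out.println(Arrays.toString(input.split("\\P{Alpha}+")));
        // [Here, is, an, ex, mple]

        // A leading separator yields an empty first element with plain split()...
        System.out.println(Arrays.toString("  leading".split("\\P{Alpha}+")));
        // [, leading]

        // ...which the helper above avoids.
        System.out.println(Arrays.toString(splitWords("  leading")));
        // [leading]
    }
}
```

One thing to keep in mind: in java.util.regex, \p{Alpha} covers only ASCII letters unless the UNICODE_CHARACTER_CLASS flag is set, so accented letters would also act as separators here. If that matters for your input, splitting on \P{L}+ is one Unicode-aware alternative.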