Java regex for punctuation: regular expressions that match punctuation marks

I'm completely new to regular expressions, and I'm trying to use Java's java.util.regex to find punctuation in input strings. I won't know ahead of time what kind of punctuation I might get, except that (1) !, ?, . and ... are all valid punctuation, and (2) double quotes ("") mean something special and don't count as punctuation.

The program itself builds phrases pseudo-randomly, and I want to strip off the punctuation at the end of a sentence before it goes through the random process.

I can match entire words with any punctuation, but the matcher just gives me indexes for that word. In other words:

Pattern p = Pattern.compile("(.*\\!)*?");

Matcher m = p.matcher([some input string]);

will grab any words with a "!" on the end. For example:

String inputString = "It is a warm Summer day!";

Pattern p = Pattern.compile("(.*\\!)*?");

Matcher m = p.matcher(inputString);

String match = inputString.substring(m.start(), m.end());

results in --> String match ~ "day!"

But I want the Matcher to give me the index of just the "!", so I can split it off.

I could probably handle each case separately, calling String.substring(...) for every kind of punctuation I might get, but I'm hoping I've simply made a mistake in my regular expression and that a regex can do this directly.

Solution

I would try a character class regex similar to

"[.!?\\-]"

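Add whatever characters you wish to match inside the brackets. Most regex metacharacters (including ., ! and ?) are literal inside a character class, but a few, such as \, ], ^ and -, can still have a special meaning there and should be escaped, which is why the hyphen above is written as \\-.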

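You then have to iterate through the matches by calling Matcher.find() until it returns false. After each successful find(), m.start() and m.end() give the position of the matched punctuation itself rather than the whole word.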
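As a rough illustration of the approach above, here is a minimal sketch. The class name PunctuationDemo and both helper methods are made up for this example, and the pattern assumes the only punctuation of interest is ., !, ? and the three-dot ellipsis; double quotes are deliberately left out of the character class, as the question requires. Treat it as one possible way to do it rather than the definitive one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative and not from the original post.
class PunctuationDemo {
    // Assumption: "..." should be reported as a single match, so it is tried
    // before the single-character class.
    private static final Pattern PUNCTUATION = Pattern.compile("\\.{3}|[.!?]");

    // One or more punctuation characters at the very end of the input.
    private static final Pattern TRAILING = Pattern.compile("[.!?]+$");

    // Collects {start, end} index pairs for every punctuation match.
    static List<int[]> findPunctuation(String input) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = PUNCTUATION.matcher(input);
        while (m.find()) {              // find() returns false when no matches remain
            spans.add(new int[] {m.start(), m.end()});
        }
        return spans;
    }

    // Returns the input with any trailing punctuation removed.
    static String stripTrailingPunctuation(String input) {
        return TRAILING.matcher(input).replaceFirst("");
    }

    public static void main(String[] args) {
        String inputString = "It is a warm Summer day!";
        for (int[] span : findPunctuation(inputString)) {
            System.out.println(inputString.substring(span[0], span[1])
                    + " at index " + span[0]);
        }
        System.out.println(stripTrailingPunctuation(inputString));
    }
}
```

Since the stated goal is only to strip punctuation from the end of a sentence, the anchored pattern in stripTrailingPunctuation may be all that is needed; the find() loop is useful when the positions of every punctuation mark matter.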
