How should I write this regex in Python?
I have this string:
import re
st = "12345 hai how r u @3456? Awer12345 7890"
re.findall('([0-9]+)', st)
The result should not be:
['12345', '3456', '12345', '7890']
I should get:
['12345', '7890']
I want only the values that are purely numeric. A match should not contain, or be attached to, any other characters such as letters or special characters.
Source: https://stackoverflow.com/questions/9216382
Updated: 2019-12-16 16:50
Best answer
In [21]: re.findall(r'(?:^|\s)(\d+)(?=$|\s)', st)
Out[21]: ['12345', '7890']
Here,
(?:^|\s) is a non-capturing group that matches the start of the string or a whitespace character.
(\d+) is a capturing group that matches one or more digits.
(?=$|\s) is a lookahead assertion that matches the end of the string or a whitespace character, without consuming it.
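The explanation above can be tried as a plain script. This is a minimal sketch: the first pattern is the one from the answer, while the lookaround-only pattern and the split()/isdigit() variant are assumptions added here as alternative ways to express "digits delimited by whitespace", not something the answer proposes.

```python
import re

st = "12345 hai how r u @3456? Awer12345 7890"

# Pattern from the answer: start-of-string or whitespace, then digits,
# then a lookahead for end-of-string or whitespace.
answer_pattern = re.compile(r'(?:^|\s)(\d+)(?=$|\s)')
from_answer = answer_pattern.findall(st)

# Alternative (assumption, not from the answer): "not preceded by a
# non-whitespace character" and "not followed by one". No group is needed,
# so findall returns the whole match.
lookaround_pattern = re.compile(r'(?<!\S)\d+(?!\S)')
from_lookarounds = lookaround_pattern.findall(st)

# Non-regex alternative (assumption): split on whitespace and keep the
# tokens that consist only of digits.
from_split = [token for token in st.split() if token.isdigit()]

print(from_answer)       # ['12345', '7890']
print(from_lookarounds)  # ['12345', '7890']
print(from_split)        # ['12345', '7890']
```

All three reject "@3456?" and "Awer12345" because the digits there touch a non-whitespace character.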
2012-02-09
Related Q&A
>>> import re
>>> regex = re.compile('(?:red,(?P\w+)|(?P\w+),red)')
>>> string = "blue,red red,yellow blue,yellow red,green purple red, ..."
>>> for matches in regex.finditer(string):
... if matches.group('redf
...
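If you are using Python 2.7, use Unicode strings. I assume your "what I need" example is incorrect, or do you really want AAAAA for طس? If you read the strings from a file, decode them to Unicode first.
#!python2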
#coding: utf8
import re
# Note leading u
data = u'TX 35-L|М-21|A 1 طس|US-50|yeni sinop-erfelek yolu çevre yolu|Av Antônio Ribeiro'.split('|')
...
>>> import re
>>> strs = 'The output is\n1) python\nA dynamic language\neasy to learn\n2) C++\ndifficult to learn\n3244) PHP\neay to learn\n'
>>> re.findall(r'\d+\)\s[^\d]+',strs)
['1) python\nA dynamic language\neasy to learn\n',
'2) C++\ndifficult t
...
(Technically, multiline strings != multiline comments. But that's beside the point.) The regex (['"])\1\1(.*?)\1{3} should work, but make sure to use re.DOTALL. (['"]) finds ' or " and captures it in \1. \1\1 finds two more of the same quote. (.*?) grabs everything until... \1{3} finds three of the same quote.
...
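A short sketch of the triple-quote regex quoted above, to show what re.DOTALL changes. The sample text and the variable names are hypothetical; only the pattern itself comes from the snippet.

```python
import re

# Pattern from the snippet: one quote character, the same quote twice more,
# a lazy body, then the same quote three times.
PATTERN = r"""(['"])\1\1(.*?)\1{3}"""

# Hypothetical sample text, built from pieces to keep the quoting readable.
dq = '"' * 3
sq = "'" * 3
source = f"x = 1\n{dq}first\nblock{dq}\ny = {sq}second{sq}\n"

# With re.DOTALL, '.' also matches newlines, so multi-line bodies are found.
with_dotall = [m.group(2) for m in re.finditer(PATTERN, source, re.DOTALL)]

# Without it, the body cannot cross a line break.
without_dotall = [m.group(2) for m in re.finditer(PATTERN, source)]

print(with_dotall)     # ['first\nblock', 'second']
print(without_dotall)  # ['second']
```

The backreference \1 is what keeps the opening and closing quotes the same kind, so a block opened with double quotes is not closed by single quotes.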
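You can use date ([a-z]{3} \d{2}) at (\d{2}) ([PA]M)
See the demo. Compare your two alternatives: date ([a-z]{3} [0-9]{2}) at ([0-9]{2}) ([P][M])
date ([a-z]{3} [0-9]{2}) at ([0-9]{2}) ([A][M])
Note how similar they are. We only need to add one alternative for PM or AM. That can be done with the character class [PA], which matches P or A. Instead of [0-9], you can use the shorthand class \d (it is a bit shorter :), and do not forget to put the regex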
...
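The merged pattern from the snippet above can be checked with a small sketch. The sample strings are hypothetical, since the original question's input is not shown here.

```python
import re

# Pattern from the snippet. A raw string keeps "\d" from being read as a
# string escape before the regex engine sees it.
date_pattern = re.compile(r"date ([a-z]{3} \d{2}) at (\d{2}) ([PA]M)")

# Hypothetical inputs, one for each of the two original alternatives.
samples = ["date jan 05 at 11 PM", "date feb 17 at 09 AM"]

parsed = [date_pattern.search(sample).groups() for sample in samples]
print(parsed)  # [('jan 05', '11', 'PM'), ('feb 17', '09', 'AM')]

# [PA]M accepts exactly "PM" or "AM", nothing else.
assert date_pattern.search("date jan 05 at 11 XM") is None
```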
A solution using regex non-capturing groups and the regex.findall function:
import regex
...
fh = open('lines.txt', 'r')  # assuming 'lines.txt' is your initial file
commlines = fh.read()
_sched_wakeup_pattern = regex.compile(r"""
comm=(?P[\S]+?)
\spid=(?P\d+)
\spri
...
[^\W\d]
The negated class throws out non-word characters and throws out digits. What remains is letters and the underscore.
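As a quick illustration of that character class, here is a minimal sketch with a hypothetical sample string. A + quantifier is added so that runs of matching characters come back as whole tokens.

```python
import re

# [^\W\d] = "not a non-word character and not a digit",
# which leaves letters and the underscore.
letters_or_underscore = re.compile(r"[^\W\d]+")

text = "abc123 def_456 !? xyz"  # hypothetical sample
runs = letters_or_underscore.findall(text)
print(runs)  # ['abc', 'def_', 'xyz']
```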
Read the documentation. re.findall returns the groups, if there are any. If you want the entire match, you must group it all, or use re.finditer. See this question.
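The behaviour described above is easy to see side by side. This is a small sketch with a hypothetical input string. The non-capturing variant is an extra option beyond the two the answer names.

```python
import re

text = "ab12 cd34"  # hypothetical sample

# No groups: findall returns the whole match.
no_groups = re.findall(r"[a-z]+\d+", text)

# One capturing group: findall returns only that group.
one_group = re.findall(r"([a-z]+)\d+", text)

# Several groups: findall returns a tuple per match.
two_groups = re.findall(r"([a-z]+)(\d+)", text)

# Ways to get the whole match back while still grouping:
non_capturing = re.findall(r"(?:[a-z]+)\d+", text)
via_finditer = [m.group(0) for m in re.finditer(r"([a-z]+)\d+", text)]

print(no_groups)      # ['ab12', 'cd34']
print(one_group)      # ['ab', 'cd']
print(two_groups)     # [('ab', '12'), ('cd', '34')]
print(non_capturing)  # ['ab12', 'cd34']
print(via_finditer)   # ['ab12', 'cd34']
```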
Try using \[xxxcixxx\[\[_'.*?'\] \[_'.*?'\]\]xxxcixxx\]
Demo: http://regexr.com/3d887