I want to extract the month and year from a string. For example:
Given the string 'From August 2017 - September 2018', I should get 'August 2017' and 'September 2018' as two groups. I tried the following:
import re
regex = r'(\b\d{1,2}\D{0,3}\b-)?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})'
experience = re.findall(regex, 'August 2017 - Sep 2018')
print(experience)
This returns [('', '', '20', '17', ''), ('', '', '20', '18', '')]
I also tried re.search:
import re
regex = r'(\b\d{1,2}\D{0,3}\b-)?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})'
experience = re.search(regex, 'August 2017 - Sep 2018')
print(experience.group())
This returns only 'August 2017'.
Is there a regex that extracts both dates?
Solution
Do you mean like this? Regex demo.
import re
string = "From August 2017 - September 2018"
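# Two named groups (names chosen here for illustration): the date before and the date after the dash.
month = re.search(r"(?P<start>\w+.\d+)\s+\-\s+(?P<end>\w+.\d+)", string)
month = month.groups()
print(month)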
Output:
('August 2017', 'September 2018')
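If the dates are not always a pair separated by a dash, another option is to keep the month alternation from the question but make every group non-capturing, so that re.findall returns the whole matches rather than tuples of groups. The following is only a sketch: the names MONTH, MONTH_YEAR and extract_month_years are made up for this example, and it assumes each date is a month name (short or full) followed by whitespace and a four-digit year.

```python
import re

# Month names, abbreviated or in full. (?:...) keeps the groups non-capturing,
# so re.findall returns whole matches instead of tuples of groups.
MONTH = (
    r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)"
)

# Assumption: a month is followed by whitespace and a four-digit year.
MONTH_YEAR = re.compile(rf"\b{MONTH}\s+\d{{4}}\b")


def extract_month_years(text):
    """Return every 'Month YYYY' substring found in text (hypothetical helper)."""
    return MONTH_YEAR.findall(text)


print(extract_month_years("From August 2017 - September 2018"))
print(extract_month_years("August 2017 - Sep 2018"))
```

The empty strings in the question's findall output come from its capturing groups: when a pattern contains groups, findall returns the groups rather than the full match, which is why this version uses only (?:...) groups.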