text2 = text1.split(' ')
[w for w in text2 if w.endswith('s')]
Find unique words: set(text4)
Find unique words ignoring case: set([w.lower() for w in text4])
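A minimal sketch of the splitting and unique-word steps above. The notes do not show the contents of text1 and text4, so the sample values here are assumptions.

```python
# Hypothetical sample data; the real text1 and text4 are not shown in these notes.
text1 = "Ethics are built right into the ideals and objectives of the United Nations"
text4 = ['To', 'be', 'or', 'not', 'to', 'be']

# Split the sentence on single spaces to get a list of words.
text2 = text1.split(' ')

# Keep only the words that end with the letter 's'.
words_ending_in_s = [w for w in text2 if w.endswith('s')]
print(words_ending_in_s)  # ['Ethics', 'ideals', 'objectives', 'Nations']

# set() removes duplicates but is case-sensitive: 'To' and 'to' both stay.
unique_exact = set(text4)
print(len(unique_exact))  # 5

# Lowercasing first merges 'To' and 'to' into one entry.
unique_lower = set(w.lower() for w in text4)
print(len(unique_lower))  # 4
```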
s.startswith(t)
s.endswith(t)
t in s
s.isupper(); s.islower(); s.istitle()
s.isalpha(); s.isdigit(); s.isalnum()
s.lower(); s.upper(); s.title()
s.split(t); s.splitlines()
s.join(t)
s.strip(); s.rstrip()
s.find(t); s.rfind(t)
s.replace(u, v)
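The string methods listed above, tried on a made-up string. The value of s is an assumption chosen only for illustration.

```python
# Hypothetical sample string with padding on both sides.
s = '  Hello World  '

t = s.strip()                      # remove leading and trailing whitespace
print(repr(t))                     # 'Hello World'
print(repr(s.rstrip()))            # '  Hello World' (only the right side is stripped)

# Tests that return True or False
print(t.startswith('Hello'))       # True
print(t.endswith('World'))         # True
print('lo W' in t)                 # True
print(t.istitle())                 # True
print(t.isalpha())                 # False, because of the space
print('abc123'.isalnum())          # True

# Case conversion
print(t.lower())                   # hello world
print(t.upper())                   # HELLO WORLD
print('hello world'.title())       # Hello World

# Splitting and joining
parts = t.split(' ')
print(parts)                       # ['Hello', 'World']
print('-'.join(parts))             # Hello-World
print('one\ntwo'.splitlines())     # ['one', 'two']

# Searching and replacing
print(t.find('o'))                 # 4  (first 'o')
print(t.rfind('o'))                # 7  (last 'o')
print(t.replace('o', '0'))         # Hell0 W0rld
```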
File operations
f = open(filename, mode)
f.readline(); f.read(); f.read(n)
for line in f: do_something(line)
f.seek(0)  # reset the reading pointer to the start of the file
f.write(message)
f.close()
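A small sketch that runs through the file calls above in order. The file name example.txt and its contents are assumptions; the file is created in a temporary directory so the example runs anywhere.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for this example.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')

# Mode 'w' opens for writing and creates the file if needed.
f = open(path, 'w')
f.write('first line\nsecond line\n')
f.close()

# Mode 'r' opens for reading.
f = open(path, 'r')
first = f.readline()    # one line, newline included: 'first line\n'
rest = f.read()         # everything after the pointer: 'second line\n'

f.seek(0)               # move the reading pointer back to the start
head = f.read(5)        # the first 5 characters: 'first'

f.seek(0)
lines = [line.rstrip() for line in f]   # iterate line by line
f.close()

print(lines)            # ['first line', 'second line']
```

In everyday code a `with open(path) as f:` block is often preferred, since it closes the file automatically.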
How do you remove the trailing newline character?
text14.rstrip()  # removes all trailing whitespace, including the newline
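A quick check of what rstrip() actually removes. The value of text14 is an assumption, standing in for a line read from a file.

```python
# Hypothetical line as f.readline() might return it.
text14 = 'This is the last line. \n'

# With no argument, rstrip() drops every trailing whitespace character.
print(repr(text14.rstrip()))      # 'This is the last line.'

# Passing '\n' drops only trailing newlines and keeps the space.
print(repr(text14.rstrip('\n')))  # 'This is the last line. '
```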
Finding specific words
Hashtags
[w for w in text11 if w.startswith('#')]
Callouts
[w for w in text11 if w.startswith('@')]
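The two filters above, applied to a made-up tweet. The tweet text is an assumption; text11 in these notes is presumably a list of tokens from a real tweet.

```python
# Hypothetical tweet, split on spaces to get the token list.
tweet = '@UN @UN_Women Ethics matter #UNSG @ NY Society'
text11 = tweet.split(' ')

hashtags = [w for w in text11 if w.startswith('#')]
print(hashtags)   # ['#UNSG']

callouts = [w for w in text11 if w.startswith('@')]
print(callouts)   # ['@UN', '@UN_Women', '@']
```

Note that startswith('@') also keeps the bare '@' token, which is not a real callout.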
Import the regular expression module first
>>> import re
>>> [w for w in text11 if re.search('@[A-Za-z0-9_]+', w)]
['@UN', '@UN_Women']
The pattern means:
Starts with @
Followed by any letter (upper or lower case), digit, or underscore, repeated at least once but any number of times.
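A runnable sketch of the regular-expression filter, assuming a made-up token list for text11. It also shows why the underscore belongs inside the character class if the whole callout is to be matched.

```python
import re

# Hypothetical token list standing in for text11.
text11 = ['@UN', '@UN_Women', 'Ethics', 'matter', '#UNSG', '@', 'NY', 'Society']

# '@' followed by one or more letters, digits, or underscores.
callouts = [w for w in text11 if re.search(r'@[A-Za-z0-9_]+', w)]
print(callouts)   # ['@UN', '@UN_Women']  (the bare '@' is rejected)

# Without '_' in the class, the pattern cannot cover all of '@UN_Women'.
no_underscore = re.fullmatch(r'@[A-Za-z0-9]+', '@UN_Women')
with_underscore = re.fullmatch(r'@[A-Za-z0-9_]+', '@UN_Women')
print(no_underscore is None)        # True
print(with_underscore is not None)  # True
```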
text12 = 'ouagadougou'
re.findall(r'[aeiou]', text12)  # find all vowels
re.findall(r'[^aeiou]', text12)  # find everything that is not a vowel
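The two findall calls above, run end to end. A leading ^ inside square brackets negates the character class.

```python
import re

text12 = 'ouagadougou'

# Every character that is one of a, e, i, o, u.
vowels = re.findall(r'[aeiou]', text12)
print(vowels)       # ['o', 'u', 'a', 'a', 'o', 'u', 'o', 'u']

# Every character that is NOT one of a, e, i, o, u.
consonants = re.findall(r'[^aeiou]', text12)
print(consonants)   # ['g', 'd', 'g']
```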