import pandas as pd
df = pd.DataFrame({"name": ["wang","wei","zhao","li","uw"],
"gender": ["boy","girl","girl","boy","girl"],
"score": [56,67,47,87,None]
})
1. Comparison operators
1. Equal to: ==
2. Greater than: >
3. Greater than or equal to: >=
4. Less than: <
5. Less than or equal to: <=
6. Not equal to: !=
df[df['score'] > 60] # rows with a score greater than 60
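As a minimal sketch of the other comparison operators, the snippet below rebuilds the sample DataFrame from the top of these notes and filters it three ways. The variable names (`passed`, `exactly_87`, `not_boys`) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["wang", "wei", "zhao", "li", "uw"],
    "gender": ["boy", "girl", "girl", "boy", "girl"],
    "score": [56, 67, 47, 87, None],
})

# Each comparison returns a boolean Series used as a row mask.
# A missing score compares as False for ==, >, >=, <, <= and as True for !=.
passed = df[df["score"] >= 60]         # wei, li
exactly_87 = df[df["score"] == 87]     # li
not_boys = df[df["gender"] != "boy"]   # wei, zhao, uw
```

Note that the row with the missing score (uw) never satisfies `>=` or `==`, so it is silently dropped by those filters.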
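2. Membership and range queries: isin, between
df[df['score'].isin([87, 88])] # students whose score is 87 or 88
Series.between(left, right[, inclusive]) # True where left <= value <= right; both endpoints are included by default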
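The following sketch shows `between` applied to the same sample data, together with a negated `isin`. It relies only on the default behaviour of `between` (both endpoints included), since the accepted values of `inclusive` differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["wang", "wei", "zhao", "li", "uw"],
    "gender": ["boy", "girl", "girl", "boy", "girl"],
    "score": [56, 67, 47, 87, None],
})

# between() keeps values in the closed interval [60, 90] by default;
# the missing score is excluded.
in_range = df[df["score"].between(60, 90)]     # wei, li

# ~ negates isin() to select rows whose value is NOT in the list.
# The missing score is not in the list, so that row is kept.
not_listed = df[~df["score"].isin([87, 88])]   # wang, wei, zhao, uw
```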
3. Null-value queries
df[df['score'].isnull()] # rows where score is missing
df[df['score'].notnull()] # rows where score is present
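A short sketch of why dedicated null checks are needed: comparing against NaN with `==` matches nothing. It also shows `isna`/`notna`, which are aliases of `isnull`/`notnull`, and `dropna` as an alternative way to discard rows with a missing score.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["wang", "wei", "zhao", "li", "uw"],
    "gender": ["boy", "girl", "girl", "boy", "girl"],
    "score": [56, 67, 47, 87, None],
})

# NaN is never equal to anything, including itself, so this mask is all False.
naive = df[df["score"] == float("nan")]   # empty result

# isna()/notna() are aliases of isnull()/notnull().
missing = df[df["score"].isna()]          # uw
present = df[df["score"].notna()]         # wang, wei, zhao, li

# dropna(subset=...) removes the rows whose score is missing.
cleaned = df.dropna(subset=["score"])
```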
4. Fuzzy (pattern) matching
df[df['name'].str.contains('^w')] # rows whose name starts with w
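For simple prefix or suffix checks, a regular expression is not strictly necessary. The sketch below compares `str.startswith`, `str.contains`, and `str.endswith` on the same sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["wang", "wei", "zhao", "li", "uw"],
    "gender": ["boy", "girl", "girl", "boy", "girl"],
    "score": [56, 67, 47, 87, None],
})

# startswith("w") selects the same rows as contains("^w").
starts_w = df[df["name"].str.startswith("w")]   # wang, wei

# Without the ^ anchor, contains() matches "w" anywhere in the name.
has_w = df[df["name"].str.contains("w")]        # wang, wei, uw

# endswith() checks the suffix.
ends_i = df[df["name"].str.endswith("i")]       # wei, li
```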
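5. Logical operators
1. and, or, not
Python's and, or, and not evaluate the truth of a single value. To combine comparisons on pandas Series element by element, use &, |, and ~ instead.
Element-wise logical operators
1. | for or
2. & for and
3. ~ for not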
df[df['name'].str.contains('^w') & (df['gender'] == 'boy')] # name starts with w and gender equals boy
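The sketch below exercises `|` and `~` on the sample data and shows what happens when Python's `and` is applied to two Series. Each comparison is wrapped in parentheses because `&` and `|` bind more tightly than `==` and `>`.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["wang", "wei", "zhao", "li", "uw"],
    "gender": ["boy", "girl", "girl", "boy", "girl"],
    "score": [56, 67, 47, 87, None],
})

# | : rows where either condition holds.
girls_or_high = df[(df["gender"] == "girl") | (df["score"] > 80)]   # wei, zhao, li, uw

# ~ : invert a mask (names that do NOT start with w).
not_w = df[~df["name"].str.contains("^w")]                          # zhao, li, uw

# Python's `and` asks for the truth value of a whole Series,
# which pandas refuses to guess, so it raises ValueError.
error_message = None
try:
    df[(df["score"] > 60) and (df["gender"] == "boy")]
except ValueError as exc:
    error_message = str(exc)
```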
API
contains
Series.str.contains(pat, case=True, flags=0, na=NaN, regex=True)
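Parameter | Description |
---|---|
pat | str; a character sequence or regular expression |
case | bool, default True; whether matching is case-sensitive |
flags | int, default 0 (no flags); regex flags, e.g. re.IGNORECASE |
na | default NaN; value to return for missing entries |
regex | bool, default True; whether pat is treated as a regular expression (if False, as a literal string) |
Returns | Series or Index of boolean values |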
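To illustrate the parameters above, here is a small sketch using a hypothetical Series `names` (not part of the sample DataFrame) that contains a missing value and a literal dot.

```python
import re

import pandas as pd

# Hypothetical data chosen to exercise each parameter.
names = pd.Series(["Wang", "wei", None, "a.b"])

# case=False ignores letter case; the missing entry stays missing by default.
default_na = names.str.contains("w", case=False)

# na=False turns the missing entry into False, giving a clean boolean mask.
ignore_case = names.str.contains("w", case=False, na=False)

# flags=re.IGNORECASE is another way to ignore case when pat is a regex.
with_flags = names.str.contains("^w", flags=re.IGNORECASE, na=False)

# regex=False treats "." as a literal dot; regex=True treats it as "any character".
literal_dot = names.str.contains(".", regex=False, na=False)
any_char = names.str.contains(".", regex=True, na=False)
```

Passing `na=False` is usually the practical choice when the result is used to index a DataFrame, because a mask containing missing values cannot be used for boolean indexing.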