I. Common pandas string methods
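Method                        Description
startswith()                  Tests whether each string starts with the given prefix
endswith()                    Tests whether each string ends with the given suffix
repeat()                      Repeats each string the given number of times
split()                       Splits each string on the given separator (whitespace by default); each result is a list
replace()                     Replaces the matched part of each string with the given string
find()                        Returns the index of the first occurrence of a substring, or -1 if it is not found
lower()                       Converts all characters to lowercase
upper()                       Converts all characters to uppercase
title()                       Capitalizes the first letter of every word
capitalize()                  Capitalizes the first character and lowercases the rest
strip()/lstrip()/rstrip()     Removes whitespace from both ends / the left end / the right end
contains()                    Tests whether each string contains the given substring or pattern
len()                         Returns the length of each string
zfill()                       Pads each string with "0" up to the given width; padding is added on the left only
pad()                         Pads each string up to the given width with a single fill character; width sets the target length, fillchar sets the fill character, and side sets the direction (left by default)
match()                       Tests whether each string matches a regular expression, starting at the beginning of the string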
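All of the methods in the table are called through the `.str` accessor of a Series, which applies the operation to every element. The sketch below illustrates that on a small made-up Series (`names` is a hypothetical example, not part of the data used later) and shows that missing values are passed through rather than raising an error.

```python
import pandas as pd

# Hypothetical data for illustration only.
names = pd.Series(["tom", None, "JERRY"])

# .str applies the method element by element; missing values stay missing.
upper = names.str.upper()
print(upper)
```

The same pattern, `series.str.method(...)`, applies to every method listed above.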
II. Hands-on examples
1. Preparing the data
import pandas as pd

df = pd.DataFrame({
    "name": ["张三", "李四"],
    "fav": ["篮球,足球,看书", "吉他,健身,刷剧"],
    "language": ["Java/Python/SQL", "C#/Go/Scala"],
    "english_name": ["tom", "JERRY"],
    "slogan": ["nothing is impossible", "keep it going"]
})
print(df)
name fav language english_name slogan
0 张三 篮球,足球,看书 Java/Python/SQL tom nothing is impossible
1 李四 吉他,健身,刷剧 C#/Go/Scala JERRY keep it going
2. Basic usage
print(df["language"].str.startswith("J"))
print(df["language"].str.endswith("a"))
0 True
1 False
Name: language, dtype: bool
0 False
1 True
Name: language, dtype: bool
print(df["english_name"].str.repeat(3))
0 tomtomtom
1 JERRYJERRYJERRY
Name: english_name, dtype: object
# No separator given: split() splits on whitespace, so these strings stay in one piece
print(df["fav"].str.split())
# Split on an explicit separator
print(df["language"].str.split("/"))
0 [篮球,足球,看书]
1 [吉他,健身,刷剧]
Name: fav, dtype: object
0 [Java, Python, SQL]
1 [C#, Go, Scala]
Name: language, dtype: object
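Because `split()` without arguments splits on whitespace, the `fav` values above come back as single-element lists. A minimal sketch of splitting them on the comma instead, assuming the separator in the data is an ASCII comma (the standalone Series `fav` below is redefined for illustration); `expand=True` is an optional parameter that spreads the pieces across columns.

```python
import pandas as pd

# Same values as the "fav" column, redefined so the snippet is self-contained.
fav = pd.Series(["篮球,足球,看书", "吉他,健身,刷剧"])

# Pass the separator explicitly to split on the comma.
as_lists = fav.str.split(",")

# expand=True returns a DataFrame with one column per piece.
as_columns = fav.str.split(",", expand=True)

print(as_lists)
print(as_columns)
```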
print(df["language"].str.replace("C#", "C++"))
0 Java/Python/SQL
1 C++/Go/Scala
Name: language, dtype: object
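`replace()` can treat its first argument either as literal text or as a regular expression, controlled by the `regex` parameter. The default has differed between pandas versions, so passing it explicitly avoids surprises when the text contains characters such as `+`. A small sketch on a made-up Series (`langs` is hypothetical):

```python
import pandas as pd

# Hypothetical data for illustration only.
langs = pd.Series(["C++/Go", "Java/C++"])

# regex=False: "C++" is matched as literal text, not as a pattern.
literal = langs.str.replace("C++", "Cpp", regex=False)

# regex=True: the pattern is a regular expression; this one removes
# everything from the first "/" to the end of the string.
first_only = langs.str.replace(r"/.*$", "", regex=True)

print(literal)
print(first_only)
```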
print(df["fav"].str.find("足球"))
0 3
1 -1
Name: fav, dtype: int64
print(df["english_name"].str.lower())
print(df["english_name"].str.upper())
0 tom
1 jerry
Name: english_name, dtype: object
0 TOM
1 JERRY
Name: english_name, dtype: object
print(df["slogan"].str.title())
0 Nothing Is Impossible
1 Keep It Going
Name: slogan, dtype: object
print(df["slogan"].str.capitalize())
0 Nothing is impossible
1 Keep it going
Name: slogan, dtype: object
print(df["language"].str.contains("thon"))
0 True
1 False
Name: language, dtype: bool
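The boolean Series returned by `contains()` is often used to filter rows. The sketch below rebuilds two columns of the example data so that it runs on its own; it also uses the `case` and `regex` parameters, and relies on the pattern being interpreted as a regular expression by default.

```python
import pandas as pd

# Two columns of the example data, redefined so the snippet is self-contained.
df = pd.DataFrame({
    "name": ["张三", "李四"],
    "language": ["Java/Python/SQL", "C#/Go/Scala"],
})

# The pattern is a regular expression by default, so "|" means "or".
mask = df["language"].str.contains("Python|Go")

# case=False ignores letter case; regex=False matches the text literally.
has_sql = df["language"].str.contains("sql", case=False, regex=False)

# A boolean Series can be used directly to select rows.
print(df[has_sql])
```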
print(df["slogan"].str.len())
0 21
1 13
Name: slogan, dtype: int64
print(df["name"].str.zfill(5))
0 000张三
1 000李四
Name: name, dtype: object
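A more typical use of `zfill()` is bringing numeric strings such as IDs to a fixed width. A minimal sketch with made-up values (`ids` is hypothetical); strings that are already at least as long as the requested width are left unchanged.

```python
import pandas as pd

# Hypothetical ID values stored as strings.
ids = pd.Series(["7", "42", "12345"])

# Pad with zeros on the left up to a width of 4.
padded = ids.str.zfill(4)
print(padded)
```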
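# Default side="left": fill characters are added before the string
print(df["name"].str.pad(width=8, fillchar="*"))
# side="both": fill characters are added on both sides
print(df["name"].str.pad(width=7, fillchar="*", side="both"))
# side="right": fill characters are added after the string
print(df["english_name"].str.pad(width=7, fillchar="-", side="right"))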
0 ******张三
1 ******李四
Name: name, dtype: object
0 ***张三**
1 ***李四**
Name: name, dtype: object
0 tom----
1 JERRY--
Name: english_name, dtype: object
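The table also lists `strip()`/`lstrip()`/`rstrip()` and `match()`, which the examples above do not cover. A minimal sketch on a made-up Series (`raw` is hypothetical, chosen to contain stray spaces):

```python
import pandas as pd

# Hypothetical values with leading and trailing spaces.
raw = pd.Series(["  tom ", "JERRY  ", " spike"])

stripped = raw.str.strip()   # remove whitespace from both ends
left = raw.str.lstrip()      # remove whitespace from the left end only
right = raw.str.rstrip()     # remove whitespace from the right end only

# match() applies the regular expression from the start of each string.
all_lower = stripped.str.match(r"[a-z]+$")

# "om" occurs inside "tom" but not at its start, so match() is False
# where contains() is True.
starts_with_om = stripped.str.match("om")
contains_om = stripped.str.contains("om")

print(stripped)
print(all_lower)
```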