Task01-pandas

ShieldVictory

Category: pandas    Tags: python

Article link: https://blog.csdn.net/mimiguo/article/details/105633110

Today I learned something new: pandas basics

1. Reading a txt file

import numpy as np
import pandas as pd
df_txt = pd.read_table('/table.txt')  # read_table splits on tabs by default
df_txt
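For a version that runs without the file on disk, here is a minimal sketch that reads tab-separated text from an in-memory buffer. The column names and values are invented for illustration; only `pd.read_table` comes from the note above.

```python
import io
import pandas as pd

# Hypothetical tab-separated content standing in for table.txt
raw = "col1\tcol2\n1\ta\n2\tb\n"

# read_table uses a tab as the default separator
df_txt = pd.read_table(io.StringIO(raw))
print(df_txt.shape)
print(list(df_txt.columns))
```

If the real file uses another delimiter, passing `sep=','` (or similar) to `read_table` should handle it.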

2. I didn't really know about Series attributes before, and after reading about them I'm still a bit confused

s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'], name='this is a Series', dtype='float64')

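When I got to item 4 I came back and read this line carefully, and it does make sense. As a complete beginner, I get scared off by long code before I've even read it properly... 👆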

print([attr for attr in dir(s) if not attr.startswith('_')])

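I don't fully understand this line; it seems to list every public attribute and method available on s (all names that don't start with an underscore) 👆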
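To make that list comprehension less mysterious: `dir(s)` returns the names of everything attached to the object, and the filter drops names that begin with an underscore. A small sketch, using an arbitrary Series rather than the random one above:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'], name='demo')

# Keep only public names: attributes and methods without a leading underscore
public = [attr for attr in dir(s) if not attr.startswith('_')]

print('mean' in public, 'index' in public)

# A few attributes that are handy in practice
print(s.name, s.dtype, s.shape)
```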

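3. DataFrame: I use it a bit more often, but mostly only at a surface level
*Renaming rows or columns: I've never used this, so I'm pasting it here as a reminder 👇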

df.rename(index={'一':'one'},columns={'col1':'new_col1'})

Compute the mean of each column 👇

df.mean()
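A small runnable sketch of both calls, assuming a toy frame whose labels match the ones in the rename call above (the data values are invented). One thing worth noticing is that `rename` returns a new DataFrame and leaves `df` unchanged unless the result is reassigned.

```python
import pandas as pd

# Toy data; labels chosen to match the rename call above
df = pd.DataFrame({'col1': [1, 2], 'col2': [3.0, 5.0]}, index=['一', '二'])

renamed = df.rename(index={'一': 'one'}, columns={'col1': 'new_col1'})
print(list(renamed.index))    # ['one', '二']
print(list(renamed.columns))  # ['new_col1', 'col2']
print(list(df.columns))       # the original is untouched: ['col1', 'col2']

# mean() works column by column and returns a Series indexed by column name
means = df.mean()
print(means['col1'], means['col2'])  # 1.5 4.0
```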

4. Index alignment is very important ⭐⭐
assign adds a new column; this one uses the Series covered above, which is where I got a bit lost

df1.assign(C=pd.Series(list('def')))
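The part that is easy to miss is that the new column is matched to the rows by index label, not by position. The sketch below assumes a hypothetical `df1` with a deliberately shifted index (the original `df1` is not shown in these notes) so that the effect is visible.

```python
import pandas as pd

# Hypothetical df1; the index 1..3 is shifted on purpose
df1 = pd.DataFrame({'A': [1, 2, 3]}, index=[1, 2, 3])

# The new Series gets the default index 0, 1, 2
out = df1.assign(C=pd.Series(list('def')))

# Values are aligned on index labels:
# label 1 -> 'e', label 2 -> 'f', label 3 has no match -> NaN
print(out)

# assign returns a new DataFrame; df1 itself has no column C
print('C' in df1.columns)
```

If `df1` had the default index 0, 1, 2, the three letters would line up with the three rows one for one.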

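5. Converting a Series to a DataFrame. This reminds me of an example I worked on yesterday: how to convert a list to a DataFrame. I haven't solved it yet, ♥ leaving a marker here as a reminder ♥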

s = df.mean()
s.name='to_DataFrame'
s
s.to_frame()

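In other words: df.mean() computes the mean of each column of df and returns a Series, which is stored in s. Then the name of s is set to "to_DataFrame". s itself is still a Series; s.to_frame() returns a one-column DataFrame whose column is named to_DataFrame.
♥ This got me thinking: to turn a list into a DataFrame, do I just call to_frame() as well?? Need to try it ♥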
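On the open question: a plain Python list has no `to_frame` method, so calling it directly on a list would raise an AttributeError. Two routes that do work are wrapping the list in a Series first, or passing it straight to the DataFrame constructor. A sketch with an invented list and column name:

```python
import pandas as pd

values = [10, 20, 30]  # example list, not from the original exercise

# Route 1: list -> Series -> DataFrame
via_series = pd.Series(values, name='value').to_frame()

# Route 2: list -> DataFrame directly
direct = pd.DataFrame(values, columns=['value'])

print(via_series['value'].tolist() == direct['value'].tolist())
print(hasattr(values, 'to_frame'))  # False: lists do not have to_frame
```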
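6. Previewing a DataFrame
The code for previewing the first five rows and the last five rows is below 👇, along with specifying how many rows to show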

df.head()
df.tail()
df.head(3)

7. Emmmm, there are a lot of details. Keep going ↖(^ω)↗

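8. ⭐ A misconception I need to correct: I shouldn't assume that what I've used often is what matters most, just because it looks familiar and I'm happy to study it again
⭐ In fact, the things I haven't learned, used, or seen matter even more, and putting what I learn into practice is the key!
⭐ This one is quite important: by default, describe computes summary statistics for the numeric columns only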

df.describe()
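A quick sketch of that default, assuming a toy frame with one numeric and one text column (both invented here). The text column is left out unless `include='all'` is passed.

```python
import pandas as pd

df = pd.DataFrame({'height': [160.0, 170.0, 180.0], 'name': ['a', 'b', 'c']})

summary = df.describe()
print(list(summary.columns))          # ['height']
print(list(summary.index))            # count, mean, std, min, quartiles, max
print(summary.loc['mean', 'height'])  # 170.0

# include='all' also summarizes the non-numeric column
print(list(df.describe(include='all').columns))
```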

9. Applying an operation to every column of a DataFrame
I only half understand this, so I'm noting it down for now

df.apply(lambda x:x.apply(lambda x:str(x)+'!')).head()
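The nested call is easier to read with distinct variable names: the outer `apply` is called once per column and receives that column as a Series, and the inner `apply` is called once per element of it. A sketch on an invented two-column frame:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2], 'y': ['a', 'b']})

# Outer apply: one call per column (col is a Series)
# Inner apply: one call per element of that column (v is a single value)
out = df.apply(lambda col: col.apply(lambda v: str(v) + '!'))

print(out['x'].tolist())  # ['1!', '2!']
print(out['y'].tolist())  # ['a!', 'b!']
```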

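10. What are the common attributes and methods of Series and DataFrame?
: Answer: a Series is an object similar to a one-dimensional array, made up of a set of data and a set of labels (the index). A set of data on its own is also enough to create a simple Series object; the index then defaults to 0, 1, 2, ...
:: A DataFrame is pandas' tabular data structure, with both a row index and a column index; it can be viewed as a dictionary of Series.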
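Both statements can be checked in a few lines. The values and labels below are arbitrary; this is only a sketch of the "data plus index" and "dictionary of Series" ideas.

```python
import pandas as pd

# Data alone is enough; the index defaults to 0, 1, 2
s = pd.Series([10, 20, 30])
print(list(s.index))  # [0, 1, 2]

# A DataFrame built from a dict of Series that share an index
df = pd.DataFrame({
    'a': pd.Series([1, 2], index=['x', 'y']),
    'b': pd.Series([3, 4], index=['x', 'y']),
})
print(df.shape, list(df.columns), list(df.index))

# Selecting one column gives back a Series
print(type(df['a']).__name__)  # Series
```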
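11. Does value_counts count missing values?
:: count itself returns the number of non-missing elements
:: value_counts returns how many times each value occurs, and by default does not include missing values
:: So the answer is no (unless dropna=False is passed)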

df['Name'].value_counts()
df['Name'].value_counts().index[0]
df['Name'].value_counts().index[1]

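The first line counts every distinct value, sorted from most to least frequent
.index[0] shows only the first label, i.e. the most frequent value
.index[1] shows only the second label, i.e. the second most frequent value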
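A sketch that shows the missing-value behavior and the `.index[...]` lookups together. The names are invented; the column name `Name` is taken from the code above.

```python
import numpy as np
import pandas as pd

names = pd.Series(['Ann', 'Bob', 'Ann', np.nan, 'Ann', 'Bob'], name='Name')

counts = names.value_counts()           # NaN is excluded by default
print(counts.index[0], counts.iloc[0])  # Ann 3
print(counts.index[1], counts.iloc[1])  # Bob 2
print(names.count())                    # 5 non-missing values

# dropna=False makes value_counts include the missing value as well
with_nan = names.value_counts(dropna=False)
print(with_nan.sum())                   # 6
```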
12.
20200420 15:15
Still thinking...

pd.Series(list(zip(df['action_type'],df['combined_shot_type']))).value_counts().index[0]

Combining two df columns and then filtering…?
This is how zip is documented
…
Init signature: zip(self, /, *args, **kwargs)
Docstring:
zip(iter1 [,iter2 [...]]) --> zip object
Return a zip object whose .__next__() method returns a tuple where
the i-th element comes from the i-th iterable argument. The .__next__()
method continues until the shortest iterable in the argument sequence
is exhausted and then it raises StopIteration.
Type: type
Subclasses:
…
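Read from the inside out, the one-liner before the zip docstring pairs the two columns row by row and then asks which pair occurs most often. The sketch below splits it into steps on invented rows; only the column names `action_type` and `combined_shot_type` come from the original expression, and the `groupby` alternative is a suggestion, not something from the source.

```python
import pandas as pd

# Invented rows; only the column names come from the expression above
df = pd.DataFrame({
    'action_type': ['Jump Shot', 'Driving Layup', 'Jump Shot', 'Slam Dunk'],
    'combined_shot_type': ['Jump Shot', 'Layup', 'Jump Shot', 'Dunk'],
})

# zip pairs the two columns row by row into tuples
pairs = pd.Series(list(zip(df['action_type'], df['combined_shot_type'])))

# value_counts sorts by frequency, so index[0] is the most common pair
top = pairs.value_counts().index[0]
print(top)  # ('Jump Shot', 'Jump Shot')

# The same question asked with groupby (possible alternative)
alt = df.groupby(['action_type', 'combined_shot_type']).size().idxmax()
print(alt)
```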

pd.Series(list(list(zip(*(pd.Series(list(zip(df['game_id'],df['opponent'])))
                          .unique()).tolist()))[1])).value_counts().index[0]
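This longer expression seems to answer "which opponent was faced in the most games": it pairs `game_id` with `opponent`, drops repeated pairs so each game counts once, uses `zip(*...)` to split the pairs back into two sequences, and counts the opponents. A step-by-step sketch on invented rows (only the column names are from the source; the `drop_duplicates` variant is an assumed equivalent):

```python
import pandas as pd

# Invented rows: game 1 has three shots, games 2 and 3 have one each
df = pd.DataFrame({
    'game_id': [1, 1, 1, 2, 3],
    'opponent': ['POR', 'POR', 'POR', 'UTA', 'UTA'],
})

# Step 1: one (game_id, opponent) tuple per row
pairs = pd.Series(list(zip(df['game_id'], df['opponent'])))

# Step 2: drop repeats so that each game appears once
unique_pairs = pairs.unique().tolist()

# Step 3: zip(*...) transposes the pairs into (game_ids, opponents)
game_ids, opponents = zip(*unique_pairs)

# Step 4: the opponent that appears in the most games
top = pd.Series(list(opponents)).value_counts().index[0]
print(top)  # UTA (2 games), even though POR has more rows

# A shorter route under the same assumptions
alt = df.drop_duplicates(['game_id', 'opponent'])['opponent'].value_counts().index[0]
print(alt)
```

Counting rows directly with `df['opponent'].value_counts()` would give POR here, which is why the deduplication step matters.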

And at this point I'm just...
