Python Data Analysis: pandas Beginner Exercises (Part 6)

Geek_bao

Category: Data Analysis with Python. Tags: python, pandas, data analysis, dataset

Article link: https://blog.csdn.net/baobaobao0000/article/details/118549250

Python Data Analysis Basics

Preparation
Exercise 1-Student Alcohol Consumption
Exercise 2-United States - Crime Rates - 1960 - 2014
Conclusion

Preparation

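These exercises rely on the dataset described below. Download it and work from a local copy if you can, because the links in the exercises may not open.
If you need the dataset, message the author privately or search for it online. Files uploaded to CSDN require a membership to download, so it has not been uploaded there.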

Exercise 1-Student Alcohol Consumption

Introduction:

This time you will download a dataset from the UCI.

Step 1. Import the necessary libraries

import pandas as pd

Step 2. Import the dataset from this address.

Step 3. Assign it to a variable called df.

Code:

df = pd.read_csv("student-mat.csv", sep=',')
df.head()

Output:

	school	sex	age	address	famsize	Pstatus	Medu	Fedu	Mjob	Fjob	...	famrel	freetime	goout	Dalc	Walc	health	absences	G1	G2	G3
0	GP	F	18	U	GT3	A	4	4	at_home	teacher	...	4	3	4	1	1	3	6	5	6	6
1	GP	F	17	U	GT3	T	1	1	at_home	other	...	5	3	3	1	1	3	4	5	5	6
2	GP	F	15	U	LE3	T	1	1	at_home	other	...	4	3	2	2	3	3	10	7	8	10
3	GP	F	15	U	GT3	T	4	2	health	services	...	3	2	2	1	1	5	2	15	14	15
4	GP	F	16	U	GT3	T	3	3	other	other	...	4	3	2	1	2	5	4	6	10	10

5 rows × 33 columns
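As a small sketch of what the `sep` argument does in the call above, the snippet below parses a tiny in-memory CSV instead of the real file. The three sample rows and the name `csv_text` are made up for illustration; only `pd.read_csv` and its `sep` parameter come from the exercise.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for student-mat.csv
csv_text = "school,sex,age\nGP,F,18\nGP,F,17\nMS,M,19\n"

# sep=',' is already the default for read_csv; it is spelled out here
# only to mirror the call used in the exercise
sample = pd.read_csv(io.StringIO(csv_text), sep=",")

print(sample.shape)   # (number of rows, number of columns)
print(sample.dtypes)  # the numeric column is inferred as an integer type
```

If a copy of the file uses a different delimiter, passing that character as `sep` should be the only change needed.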

Step 4. For the purpose of this exercise slice the dataframe from ‘school’ until the ‘guardian’ column

Code:

stud_alcoh = df.loc[:, 'school':'guardian']   # loc slices by row/column labels; iloc slices by integer position
stud_alcoh.head()

Output:

	school	sex	age	address	famsize	Pstatus	Medu	Fedu	Mjob	Fjob	reason	guardian
0	GP	F	18	U	GT3	A	4	4	at_home	teacher	course	mother
1	GP	F	17	U	GT3	T	1	1	at_home	other	course	father
2	GP	F	15	U	LE3	T	1	1	at_home	other	other	mother
3	GP	F	15	U	GT3	T	4	2	health	services	home	mother
4	GP	F	16	U	GT3	T	3	3	other	other	home	father
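To make the difference between `loc` and `iloc` concrete, here is a minimal sketch on a hypothetical two-row frame (`toy` and its values are invented; only the column names are borrowed from the dataset). A label slice with `loc` includes both endpoints, while a positional slice with `iloc` excludes the stop position.

```python
import pandas as pd

# Hypothetical miniature frame; column names borrowed from the dataset
toy = pd.DataFrame({
    "school": ["GP", "MS"],
    "sex": ["F", "M"],
    "age": [18, 19],
    "guardian": ["mother", "father"],
    "G3": [6, 10],
})

# Label-based slice: both 'school' and 'guardian' are included
by_label = toy.loc[:, "school":"guardian"]

# Position-based slice: the stop position is excluded, so 0:4 covers columns 0..3
by_position = toy.iloc[:, 0:4]

print(list(by_label.columns))
print(by_label.equals(by_position))
```

Both slices select the same four columns here, which is why the exercise can use the column names directly without counting positions.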

Step 5. Create a lambda function that capitalizes strings.

Code:

capitalizer = lambda s: s.capitalize()  # capitalize() uppercases the first character and lowercases the rest; upper() uppercases the entire string
print(capitalizer('www'))

Output:

Www
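For comparison, the sketch below runs `capitalize()`, `upper()`, and `title()` on a few sample strings. The list `samples` is an assumption chosen for illustration; the mixed-case entry shows that `capitalize()` also lowercases everything after the first character.

```python
# Hypothetical sample strings; 'at_home' is one of the job values in the dataset
samples = ["www", "at_home", "aT_HOME"]

# Map each string to its (capitalize, upper, title) variants
results = {s: (s.capitalize(), s.upper(), s.title()) for s in samples}

for original, variants in results.items():
    print(original, "->", variants)
```

Note that `title()` treats the underscore as a word boundary, so it would produce `At_Home` rather than the `At_home` that appears in the exercise output.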

Step 6. Capitalize both Mjob and Fjob

Code:

# for i in df['Mjob']:
#    print(capitalizer(i))
stud_alcoh.Mjob.apply(capitalizer)
stud_alcoh.Fjob.apply(capitalizer)

Output (the notebook displays only the last expression, the Fjob result):

0       Teacher
1         Other
2         Other
3      Services
4         Other
5         Other
6         Other
7       Teacher
8         Other
9         Other
10       Health
11        Other
12     Services
13        Other
14        Other
15        Other
16     Services
17        Other
18     Services
19        Other
20        Other
21       Health
22        Other
23        Other
24       Health
25     Services
26        Other
27     Services
28        Other
29      Teacher
         ...   
365       Other
366    Services
367    Services
368    Services
369     Teacher
370    Services
371    Services
372     At_home
373       Other
374       Other
375       Other
376       Other
377    Services
378       Other
379       Other
380     Teacher
381       Other
382    Services
383    Services
384       Other
385       Other
386     At_home
387       Other
388    Services
389       Other
390    Services
391    Services
392       Other
393       Other
394     At_home
Name: Fjob, Length: 395, dtype: object
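The result above is a new Series; `apply` does not modify the column it is called on. The following sketch checks this on a hypothetical three-value Series (`jobs` is invented for illustration) and also shows the built-in `.str.capitalize()` accessor, which should give the same values without a lambda.

```python
import pandas as pd

# Hypothetical sample values standing in for the Fjob column
jobs = pd.Series(["teacher", "other", "at_home"], name="Fjob")

capitalized = jobs.apply(lambda s: s.capitalize())  # returns a new Series
vectorized = jobs.str.capitalize()                  # string accessor, same values here

print(jobs.tolist())         # the original values are untouched
print(capitalized.tolist())
print(capitalized.tolist() == vectorized.tolist())
```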

Step 7. Print the last elements of the data set.

Code:

# df.iloc[394, 32]
stud_alcoh.tail()

Output:

	school	sex	age	address	famsize	Pstatus	Medu	Fedu	Mjob	Fjob	reason	guardian
390	MS	M	20	U	LE3	A	2	2	services	services	course	other
391	MS	M	17	U	LE3	T	3	1	services	services	course	mother
392	MS	M	21	R	GT3	T	1	1	other	other	course	other
393	MS	M	18	R	LE3	T	3	2	services	other	course	mother
394	MS	M	19	U	LE3	T	1	1	other	at_home	course	father
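As a quick sketch of how `tail()` behaves, the snippet below uses a hypothetical ten-row frame named `numbers`. With no argument `tail()` returns the last five rows; passing `n` changes the count, and a negative `iloc` slice selects the same rows by position.

```python
import pandas as pd

# Hypothetical ten-row frame used only to demonstrate tail()
numbers = pd.DataFrame({"value": range(10)})

last_five = numbers.tail()       # default: last 5 rows
last_two = numbers.tail(2)       # last 2 rows
same_as_iloc = numbers.iloc[-2:]  # positional equivalent of tail(2)

print(last_five.index.tolist())
print(last_two.equals(same_as_iloc))
```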

Step 8. Did you notice the original dataframe is still lowercase? Why is that? Fix it and capitalize Mjob and Fjob.

Code:

stud_alcoh.Mjob = stud_alcoh.Mjob.apply(capitalizer)
stud_alcoh.Fjob = stud_alcoh.Fjob.apply(capitalizer)
stud_alcoh

Output:

	school	sex	age	address	famsize	Pstatus	Medu	Fedu	Mjob	Fjob	reason	guardian
0	GP	F	18	U	GT3	A	4	4	At_home	Teacher	course	mother
1	GP	F	17	U	GT3	T	1	1	At_home	Other
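The frame stayed lowercase in Step 6 because the capitalized Series were never stored; assigning them back to the columns is what persists the change. The sketch below repeats the idea on a hypothetical two-row frame (`df_toy` and `subset` are invented names). Calling `.copy()` on the slice is an addition that is not in the original code: it is one way to make the slice an independent frame, which in some pandas versions avoids a `SettingWithCopyWarning` on the assignment.

```python
import pandas as pd

# Hypothetical miniature frame; column names borrowed from the dataset
df_toy = pd.DataFrame({
    "school": ["GP", "MS"],
    "Mjob": ["at_home", "services"],
    "Fjob": ["teacher", "other"],
    "G3": [6, 10],
})

# .copy() (an assumption, not in the original) detaches the slice from df_toy
subset = df_toy.loc[:, "school":"Fjob"].copy()

# Assign the transformed Series back so the change is kept
for col in ["Mjob", "Fjob"]:
    subset[col] = subset[col].str.capitalize()

print(subset)
print(df_toy["Mjob"].tolist())  # the source frame keeps its lowercase values
```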
