Python Data Analysis Basics
- Preparation
- Exercise 1 - Filtering and Sorting Data
- Step 1. Import the necessary libraries
- Step 2. Import the dataset from this [address](https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv).
- Step 3. Assign it to a variable called chipo.
- Step 4. How many products cost more than $10.00?
- Step 5. What is the price of each item?
- Step 6. Sort by the name of the item
- Step 7. What was the quantity of the most expensive item ordered?
- Step 8. How many times was a Veggie Salad Bowl ordered?
- Step 9. How many times did people order more than one Canned Soda?
- Exercise 2 - Filtering and Sorting Data
- Step 1. Import the necessary libraries
- Step 2. Import the dataset from this [address](https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/02_Filtering_%26_Sorting/Euro12/Euro_2012_stats_TEAM.csv).
- Step 3. Assign it to a variable called euro12.
- Step 4. Select only the Goal column.
- Step 5. How many teams participated in Euro 2012?
- Step 6. What is the number of columns in the dataset?
- Step 7. View only the columns Team, Yellow Cards and Red Cards and assign them to a dataframe called discipline
- Step 8. Sort the teams by Red Cards, then by Yellow Cards
- Step 9. Calculate the mean Yellow Cards given per Team
- Step 10. Filter teams that scored more than 6 goals
- Step 11. Select the teams that start with G
- Step 12. Select the first 7 columns
- Step 13. Select all columns except the last 3.
- Step 14. Present only the Shooting Accuracy from England, Italy and Russia
- Fictional Army - Filtering and Sorting
- Introduction:
- Step 1. Import the necessary libraries
- Step 2. This is the data given as a dictionary
- Step 3. Create a dataframe and assign it to a variable called army.
- Step 4. Set the 'origin' column as the index of the dataframe
- Step 5. Print only the column veterans
- Step 6. Print the columns 'veterans' and 'deaths'
- Step 7. Print the name of all the columns.
- Step 8. Select the 'deaths', 'size' and 'deserters' columns from Maine and Alaska
- Step 9. Select the rows 3 to 7 and the columns 3 to 6
- Step 10. Select every row after the fourth row
- Step 11. Select every row up to the 4th row
- Step 12. Select the 3rd column up to the 7th column
- Step 13. Select rows where df.deaths is greater than 50
- Step 14. Select rows where df.deaths is greater than 500 or less than 50
- Step 15. Select all the regiments not named "Dragoons"
- Step 16. Select the rows called Texas and Arizona
- Step 17. Select the third cell in the row named Arizona
- Step 18. Select the third cell down in the column named deaths
- Summary
- Conclusion
Preparation
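The dataset URLs given in the exercises below may not always be reachable. The datasets at the following address have been tested and work. If they become unavailable, look for copies elsewhere online. https://github.com/daacheng/PythonBasic/tree/master/dataset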
Exercise 1 - Filtering and Sorting Data
Step 1. Import the necessary libraries
The code is as follows:
import pandas as pd
Step 2. Import the dataset from this address.
The dataset at this address may not be reachable; you may need a proxy to access it.
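Step 2 has no code of its own in this write-up. As a minimal sketch, assuming the original chipotle.tsv is tab-separated with the columns used below, `pd.read_csv` accepts a URL, a local path, or a file-like object. The helper name `load_orders` and the three sample rows are illustrative only; the in-memory sample stands in for the download so the snippet runs offline.

```python
import io
import pandas as pd

# Hypothetical three-row sample in a tab-separated layout (assumed to match chipotle.tsv)
SAMPLE_TSV = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tChips and Fresh Tomato Salsa\tNULL\t$2.39 \n"
    "1\t1\tIzze\t[Clementine]\t$3.39 \n"
    "2\t2\tChicken Bowl\t[Tomatillo-Red Chili Salsa (Hot), [Black Beans, Rice]]\t$16.98 \n"
)

def load_orders(source):
    """Read a tab-separated orders file from a path, URL, or file-like object."""
    return pd.read_csv(source, sep="\t")

sample = load_orders(io.StringIO(SAMPLE_TSV))
print(sample.shape)
print(sample.columns.tolist())
```

Passing the URL from Step 2 instead of the `StringIO` object would download the full file, provided the address is reachable.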
Step 3. Assign it to a variable called chipo.
The code is as follows:
chipo = pd.read_csv('chipotle.csv', sep=',')
Step 4. How many products cost more than $10.00?
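The code is as follows:
# The task: count the products whose unit price is above $10
# Clean the item_price column and convert it to float
# value[1:-1] drops the first and last characters of each price string (e.g. the leading '$')
prices = [float(value[1:-1]) for value in chipo.item_price]
# Reassign the column with the cleaned prices
chipo.item_price = prices
# Drop rows that duplicate an earlier (item_name, quantity) pair
'''
drop_duplicates(self, subset=None, keep="first", inplace=False)
subset: column label or sequence of labels used to identify duplicate rows. By default, all columns are used.
keep: allowed values are {'first', 'last', False}, default 'first'. With 'first', every duplicate except the first occurrence is dropped.
With 'last', every duplicate except the last occurrence is dropped. With False, all duplicated rows are dropped.
inplace: if True, modify the source DataFrame and return None. By default, the source DataFrame is left unchanged and a new DataFrame is returned.
'''
chipo_filtered = chipo.drop_duplicates(['item_name', 'quantity'])
# Keep only the rows where quantity equals 1
chipo_one_prod = chipo_filtered[chipo_filtered.quantity == 1]
# nunique() returns the number of distinct values in item_name
chipo_one_prod[chipo_one_prod['item_price']>10].item_name.nunique()
The output is as follows: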
12
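The list comprehension above cleans `item_price` by slicing characters off each string. As an alternative sketch (not the author's method), the same conversion can be written with pandas string methods, which avoids depending on the exact position of the `$` sign or any trailing whitespace. The sample values below are made up for illustration.

```python
import pandas as pd

# Hypothetical raw prices in a "$x.xx " text format
raw = pd.DataFrame({"item_price": ["$2.39 ", "$10.98 ", "$11.75 "]})

# Strip surrounding whitespace, drop the leading "$", then convert to float
raw["item_price"] = (
    raw["item_price"].str.strip().str.lstrip("$").astype(float)
)

# Boolean mask: keep only the rows priced above $10
above_ten = raw[raw["item_price"] > 10]
print(above_ten)
```

Because the conversion is vectorized, it can be applied to the whole column in one expression before any filtering.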
Step 5. What is the price of each item?
Print a DataFrame with only two columns, item_name and item_price.
The code is as follows:
# Show the unit price of each product, printing only item_name and item_price
# delete the duplicates in item_name and quantity
chipo_filtered = chipo.drop_duplicates(['item_name','quantity'])
# chipo[(chipo['item_name'] == 'Chicken Bowl') & (chipo['quantity'] == 1)]
# select only the products with quantity equals to 1
chipo_one_prod = chipo_filtered[chipo_filtered.quantity == 1]
# select only the item_name and item_price columns
price_per_item = chipo_one_prod[['item_name', 'item_price']]
print(price_per_item)
# sort the values from most to least expensive
# price_per_item.sort_values(by = "item_price", ascending = False).head(20)
The output is as follows:
item_name item_price
0 Chips and Fresh Tomato Salsa 2.39
1 Izze 3.39
2 Nantucket Nectar 3.39
3 Chips and Tomatillo-Green Chili Salsa 2.39
5 Chicken Bowl 10.98
6 Side of Chips 1.69
7 Steak Burrito 11.75
8 Steak Soft Tacos 9.25
10 Chips and Guacamole 4.45
11 Chicken Crispy Tacos 8.75
12 Chicken Soft Tacos 8.75
16 Chicken Burrito 8.49
21 Barbacoa Burrito 8.99
27 Carnitas Burrito 8.99
28 Canned Soda 1.09
33 Carnitas Bowl 8.99
34 Bottled Water 1.09
38 Chips and Tomatillo Green Chili Salsa 2.95
39 Barbacoa Bowl 11.75
40 Chips 2.15
44 Chicken Salad Bowl 8.75
54 Steak Bowl 8.99
56 Barbacoa Soft Tacos 9.25
57 Veggie Burrito 11.25
62 Veggie Bowl 11.25
92 Steak Crispy Tacos 9.25
111 Chips and Tomatillo Red Chili Salsa 2.95
168 Barbacoa Crispy Tacos 11.75
186 Veggie Salad Bowl 11.25
191 Chips and Roasted Chili-Corn Salsa 2.39
233 Chips and Roasted Chili Corn Salsa 2.95
237 Carnitas Soft Tacos 9.25
250 Chicken Salad 10.98
263 Canned Soft Drink 1.25
298 6 Pack Soft Drink 6.49
300 Chips and Tomatillo-Red Chili Salsa 2.39
510 Burrito 7.40
520 Crispy Tacos 7.40
554 Carnitas Crispy Tacos 9.25
606 Steak Salad Bowl 11.89
664 Steak Salad 8.99
673 Bowl 7.40
674 Chips and Mild Fresh Tomato Salsa 3.00
738 Veggie Soft Tacos 11.25
1132 Carnitas Salad Bowl 11.89
1229 Barbacoa Salad Bowl 11.89
1414 Salad 7.40
1653 Veggie Crispy Tacos 8.49
1694 Veggie Salad 8.49
3750 Carnitas Salad 8.99
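The listing above depends on which row of each duplicate group `drop_duplicates` keeps. The following sketch, on a made-up four-row frame, shows the effect of the `keep` parameter. With the default `keep="first"`, only the first row per (`item_name`, `quantity`) pair survives, so when an item appears at several prices the listed price is simply whichever one came first.

```python
import pandas as pd

# Hypothetical orders: the same item appears at two different prices
orders = pd.DataFrame({
    "item_name": ["Veggie Salad Bowl", "Veggie Salad Bowl", "Izze", "Izze"],
    "quantity": [1, 1, 1, 2],
    "item_price": [11.25, 8.75, 3.39, 6.78],
})

cols = ["item_name", "quantity"]

first = orders.drop_duplicates(subset=cols, keep="first")  # keeps rows 0, 2, 3
last = orders.drop_duplicates(subset=cols, keep="last")    # keeps rows 1, 2, 3
none = orders.drop_duplicates(subset=cols, keep=False)     # keeps rows 2, 3

print(first.index.tolist(), last.index.tolist(), none.index.tolist())
```

If a single representative price per item is needed, an aggregation such as a per-item minimum or maximum may be a more deliberate choice than relying on row order.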
Step 6. Sort by the name of the item
The code is as follows:
chipo.sort_values(by='item_name')
# chipo.item_name.sort_values()
The output is as follows:
| | Unnamed: 0 | order_id | quantity | item_name | choice_description | item_price |
---|---|---|---|---|---|---|
3389 | 3389 | 1360 | 2 | 6 Pack Soft Drink | [Diet Coke] | 12.98 |
341 | 341 | 148 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
1849 | 1849 | 749 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
1860 | 1860 | 754 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
2713 | 2713 | 1076 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
3422 | 3422 | 1373 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
553 | 553 | 230 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
1916 | 1916 | 774 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
1922 | 1922 | 776 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
1937 | 1937 | 784 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
3836 | 3836 | 1537 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
298 | 298 | 129 | 1 | 6 Pack Soft Drink | [Sprite] | 6.49 |
1976 | 1976 | 798 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
1167 | 1167 | 481 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
3875 | 3875 | 1554 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
1124 | 1124 | 465 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
3886 | 3886 | 1558 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
2108 | 2108 | 849 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
3010 | 3010 | 1196 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
4535 | 4535 | 1803 | 1 | 6 Pack Soft Drink | [Lemonade] | 6.49 |
4169 | 4169 | 1664 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
4174 | 4174 | 1666 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
4527 | 4527 | 1800 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
4522 | 4522 | 1798 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
3806 | 3806 | 1525 | 1 | 6 Pack Soft Drink | [Sprite] | 6.49 |
2389 | 2389 | 949 | 1 | 6 Pack Soft Drink | [Coke] | 6.49 |
3132 | 3132 | 1248 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
3141 | 3141 | 1253 | 1 | 6 Pack Soft Drink | [Lemonade] | 6.49 |
639 | 639 | 264 | 1 | 6 Pack Soft Drink | [Diet Coke] | 6.49 |
1026 | 1026 | 422 | 1 | 6 Pack Soft Drink | [Sprite] | 6.49 |
... | ... | ... | ... | ... | ... | ... |
2996 | 2996 | 1192 | 1 | Veggie Salad | [Roasted Chili Corn Salsa (Medium), [Black Bea... | 8.49 |
3163 | 3163 | 1263 | 1 | Veggie Salad | [[Fresh Tomato Salsa (Mild), Roasted Chili Cor... | 8.49 |
4084 | 4084 | 1635 | 1 | Veggie Salad | [[Fresh Tomato Salsa (Mild), Roasted Chili Cor... | 8.49 |
1694 | 1694 | 686 | 1 | Veggie Salad | [[Fresh Tomato Salsa (Mild), Roasted Chili Cor... | 8.49 |
2756 | 2756 | 1094 | 1 | Veggie Salad | [[Tomatillo-Green Chili Salsa (Medium), Roaste... | 8.49 |
4201 | 4201 | 1677 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Black... | 11.25 |
1884 | 1884 | 760 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 11.25 |
455 | 455 | 195 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 11.25 |
3223 | 3223 | 1289 | 1 | Veggie Salad Bowl | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | 11.25 |
2223 | 2223 | 896 | 1 | Veggie Salad Bowl | [Roasted Chili Corn Salsa, Fajita Vegetables] | 8.75 |
2269 | 2269 | 913 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 8.75 |
4541 | 4541 | 1805 | 1 | Veggie Salad Bowl | [Tomatillo Green Chili Salsa, [Fajita Vegetabl... | 8.75 |
3293 | 3293 | 1321 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Rice, Black Beans, Chees... | 8.75 |
186 | 186 | 83 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 11.25 |
960 | 960 | 394 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Lettu... | 8.75 |
1316 | 1316 | 536 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 8.75 |
2156 | 2156 | 869 | 1 | Veggie Salad Bowl | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | 11.25 |
4261 | 4261 | 1700 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 11.25 |
295 | 295 | 128 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Lettu... | 11.25 |
4573 | 4573 | 1818 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Fajita Vegetables, Pinto... | 8.75 |
2683 | 2683 | 1066 | 1 | Veggie Salad Bowl | [Roasted Chili Corn Salsa, [Fajita Vegetables,... | 8.75 |
496 | 496 | 207 | 1 | Veggie Salad Bowl | [Fresh Tomato Salsa, [Rice, Lettuce, Guacamole... | 11.25 |
4109 | 4109 | 1646 | 1 | Veggie Salad Bowl | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | 11.25 |
738 | 738 | 304 | 1 | Veggie Soft Tacos | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | 11.25 |
3889 | 3889 | 1559 | 2 | Veggie Soft Tacos | [Fresh Tomato Salsa (Mild), [Black Beans, Rice... | 16.98 |
2384 | 2384 | 948 | 1 | Veggie Soft Tacos | [Roasted Chili Corn Salsa, [Fajita Vegetables,... | 8.75 |
781 | 781 | 322 | 1 | Veggie Soft Tacos | [Fresh Tomato Salsa, [Black Beans, Cheese, Sou... | 8.75 |
2851 | 2851 | 1132 | 1 | Veggie Soft Tacos | [Roasted Chili Corn Salsa (Medium), [Black Bea... | 8.49 |
1699 | 1699 | 688 | 1 | Veggie Soft Tacos | [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... | 11.25 |
1395 | 1395 | 567 | 1 | Veggie Soft Tacos | [Fresh Tomato Salsa (Mild), [Pinto Beans, Rice... | 8.49 |
4622 rows × 6 columns
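`sort_values` returns a new DataFrame and leaves the original unchanged unless the result is assigned. The sketch below, on a made-up frame, shows a single-key sort, a stable variant that preserves the original order of rows with equal names, and a two-key sort (name ascending, price descending). All sample values are illustrative.

```python
import pandas as pd

items = pd.DataFrame({
    "item_name": ["Izze", "Chips", "Izze", "Bowl"],
    "item_price": [3.39, 2.15, 6.78, 7.40],
})

# Single key; the relative order of tied rows is not guaranteed by default
by_name = items.sort_values(by="item_name")

# kind="stable" keeps tied rows in their original order
by_name_stable = items.sort_values(by="item_name", kind="stable")

# Two keys with a separate direction for each
by_name_then_price = items.sort_values(
    by=["item_name", "item_price"], ascending=[True, False]
)
print(by_name_then_price)
```

The unordered index values within each item group in the output above are consistent with the default, non-stable sort.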