Machine Learning Python Programming

qq_32062101

Tags: python, machine learning

Article link: https://blog.csdn.net/qq_32062101/article/details/81100561

Reading Data with Pandas

1. pandas.read_csv("filename")

in:

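import pandas

# Load the comma-separated file into a DataFrame.
food_info = pandas.read_csv("food_info.csv")
# Show the object's type, then the dtype of each column.
print(type(food_info))
print(food_info.dtypes)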

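pandas.read_csv("filename"): reads a comma-separated (CSV) file and returns a DataFrame.

print(type(food_info)): prints the type of the object; here food_info is a DataFrame (a table). print(food_info.dtypes): prints the data type of each column.

The column types shown here are int64, float64, and object (the dtype pandas uses for the string column in this output).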

out:

NDB_No               int64
Shrt_Desc           object
Water_(g)          float64
Energ_Kcal           int64
dtype: object
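To try the same calls without the food_info.csv file, here is a minimal sketch that reads a CSV from an in-memory string. The variable names csv_text and sample are made up for illustration, and the three rows are copied from the output shown further down; everything else about the real file is left out.

```python
import io

import pandas as pd

# Hypothetical stand-in for food_info.csv: only four columns, three rows.
csv_text = """NDB_No,Shrt_Desc,Water_(g),Energ_Kcal
1001,BUTTER WITH SALT,15.87,717
1002,BUTTER WHIPPED WITH SALT,15.87,717
1003,BUTTER OIL ANHYDROUS,0.24,876
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(type(sample))    # the container type
print(sample.dtypes)   # one dtype per column
```

The exact dtype reported for the text column may depend on the pandas version installed, so treat the printed dtypes as something to check rather than assume.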

2. food_info.head(number of rows)

in:

first_rows = food_info.head()
#print(first_rows)
print(food_info.head(3))
print(food_info.columns)
#print(food_info.shape)

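print(food_info.head(3)): prints the first 3 rows of the table.

food_info.head(): with no argument, returns the first 5 rows (not all rows).

food_info.tail(4): returns the last 4 rows of the table.

print(food_info.columns): prints the column names.

print(food_info.shape): prints the number of rows and columns as a (rows, columns) tuple.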

out:

   NDB_No                 Shrt_Desc  Water_(g)  Energ_Kcal  Protein_(g)  \
0    1001          BUTTER WITH SALT      15.87         717         0.85   
1    1002  BUTTER WHIPPED WITH SALT      15.87         717         0.85   
2    1003      BUTTER OIL ANHYDROUS       0.24         876         0.28   

   Lipid_Tot_(g)  Ash_(g)  Carbohydrt_(g)  Fiber_TD_(g)  Sugar_Tot_(g)  \
0          81.11     2.11            0.06           0.0           0.06   
1          81.11     2.11            0.06           0.0           0.06   
2          99.48     0.00            0.00           0.0           0.00   

        ...        Vit_A_IU  Vit_A_RAE  Vit_E_(mg)  Vit_D_mcg  Vit_D_IU  \
0       ...          2499.0      684.0        2.32        1.5      60.0   
1       ...          2499.0      684.0        2.32        1.5      60.0   
2       ...          3069.0      840.0        2.80        1.8      73.0   

   Vit_K_(mcg)  FA_Sat_(g)  FA_Mono_(g)  FA_Poly_(g)  Cholestrl_(mg)  
0          7.0      51.368       21.021        3.043           215.0  
1          7.0      50.489       23.426        3.012           219.0  
2          8.6      61.924       28.732        3.694           256.0  

[3 rows x 36 columns]
Index(['NDB_No', 'Shrt_Desc', 'Water_(g)', 'Energ_Kcal', 'Protein_(g)',
       'Lipid_Tot_(g)', 'Ash_(g)', 'Carbohydrt_(g)', 'Fiber_TD_(g)',
       'Sugar_Tot_(g)', 'Calcium_(mg)', 'Iron_(mg)', 'Magnesium_(mg)',
       'Phosphorus_(mg)', 'Potassium_(mg)', 'Sodium_(mg)', 'Zinc_(mg)',
       'Copper_(mg)', 'Manganese_(mg)', 'Selenium_(mcg)', 'Vit_C_(mg)',
       'Thiamin_(mg)', 'Riboflavin_(mg)', 'Niacin_(mg)', 'Vit_B6_(mg)',
       'Vit_B12_(mcg)', 'Vit_A_IU', 'Vit_A_RAE', 'Vit_E_(mg)', 'Vit_D_mcg',
       'Vit_D_IU', 'Vit_K_(mcg)', 'FA_Sat_(g)', 'FA_Mono_(g)', 'FA_Poly_(g)',
       'Cholestrl_(mg)'],
      dtype='object')
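The behaviour of head(), tail(), columns and shape is easy to check on a small table. This is a sketch on a hypothetical 10-row DataFrame named toy with made-up values, not the food data.

```python
import pandas as pd

# Hypothetical toy table; the values are made up for illustration.
toy = pd.DataFrame({"id": range(10), "value": [x * 0.5 for x in range(10)]})

print(len(toy.head()))     # head() with no argument returns the first 5 rows
print(toy.head(3))         # first 3 rows
print(toy.tail(4))         # last 4 rows
print(list(toy.columns))   # column names
print(toy.shape)           # (number of rows, number of columns)
```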

3. print(food_info.loc[row label])

in:

#pandas uses zero-indexing
#Series object representing the row at index 0.
print (food_info.loc[0])

# Series object representing the seventh row.
#food_info.loc[6]

# Will throw an error: "KeyError: 'the label [8620] is not in the [index]'"
#food_info.loc[8620]
#The object dtype is equivalent to a string in Python

print(food_info.loc[0]): prints the row whose index label is 0, returned as a Series.

out:

NDB_No                         1001
Shrt_Desc          BUTTER WITH SALT
Water_(g)                     15.87
Energ_Kcal                      717
Protein_(g)                    0.85
Lipid_Tot_(g)                 81.11
FA_Poly_(g)                   3.043
Cholestrl_(mg)                  215
Name: 0, dtype: object
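A short sketch of the same lookup on a hypothetical three-row table (toy is a made-up name; the rows are copied from the output above). It also shows that asking for a label that is not in the index raises KeyError, as the comment in the listing warns.

```python
import pandas as pd

# Hypothetical three-row table with the default index 0, 1, 2.
toy = pd.DataFrame({
    "NDB_No": [1001, 1002, 1003],
    "Shrt_Desc": ["BUTTER WITH SALT", "BUTTER WHIPPED WITH SALT", "BUTTER OIL ANHYDROUS"],
})

first = toy.loc[0]   # a Series whose index is the column names
print(first)

# A label outside the index raises KeyError.
try:
    toy.loc[99]
    missing = False
except KeyError:
    missing = True
print(missing)
```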

in:

# Returns a DataFrame containing the rows at indexes 3, 4, 5, and 6.
#food_info.loc[3:6]

# Returns a DataFrame containing the rows at indexes 2, 5, and 10. Either of the following approaches will work.
# Method 1
#two_five_ten = [2,5,10] 
#food_info.loc[two_five_ten]

# Method 2
#food_info.loc[[2,5,10]]

out:

    NDB_No             Shrt_Desc
2     1003  BUTTER OIL ANHYDROUS
5     1006           CHEESE BRIE
10    1011          CHEESE COLBY
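One detail worth checking by hand is that a label slice with loc includes both endpoints, unlike ordinary Python slicing. The sketch below uses a hypothetical 11-row table named toy, with consecutive codes chosen only for illustration, and compares loc with the positional iloc.

```python
import pandas as pd

# Hypothetical toy table with the default integer index 0..10.
toy = pd.DataFrame({"code": range(1001, 1012)})

by_slice = toy.loc[3:6]         # label slice: rows 3, 4, 5 and 6
by_position = toy.iloc[3:6]     # positional slice: rows 3, 4 and 5
picked = toy.loc[[2, 5, 10]]    # list of labels: rows come back in that order

print(len(by_slice), len(by_position))
print(picked)
```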

in:

# Series object representing the "NDB_No" column.
ndb_col = food_info["NDB_No"]
print (ndb_col)
# Alternatively, you can access a column by passing in a string variable.
#col_name = "NDB_No"
#ndb_col = food_info[col_name]

Prints the "NDB_No" column, which is returned as a Series.

out:

0        1001
1        1002
2        1003
3        1004
4        1005
        ...  
8613    83110
8614    90240
8615    90480
8616    90560
8617    93600
Name: NDB_No, Length: 8618, dtype: int64
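As a quick check of single-column selection, here is a sketch on a hypothetical three-row table (toy and energy_col are made-up names). Both the string literal and the string variable give back a Series.

```python
import pandas as pd

# Hypothetical table; values copied from the first rows shown earlier.
toy = pd.DataFrame({"NDB_No": [1001, 1002, 1003], "Energ_Kcal": [717, 717, 876]})

ndb_col = toy["NDB_No"]      # select by a string literal

col_name = "Energ_Kcal"      # or by a variable holding the column name
energy_col = toy[col_name]

print(type(ndb_col).__name__)
print(energy_col.tolist())
```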

in:

columns = ["Zinc_(mg)", "Copper_(mg)"]
zinc_copper = food_info[columns]
print(zinc_copper)
# Skipping the assignment.
#zinc_copper = food_info[["Zinc_(mg)", "Copper_(mg)"]]

food_info[columns]: returns a DataFrame containing only the two columns Zinc_(mg) and Copper_(mg).

out:

      Zinc_(mg)  Copper_(mg)
0          0.09        0.000
1          0.05        0.016
2          0.01        0.001
3          2.66        0.040
4          2.60        0.024
5          2.38        0.019
[8618 rows x 2 columns]
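Selecting with a list of names always gives a DataFrame, even when the list holds one name, and the result follows the order of the list. A minimal sketch on a hypothetical three-row table (toy, subset and one_col_frame are made-up names; the values are copied from the output above):

```python
import pandas as pd

# Hypothetical table; values copied from the first rows of the output above.
toy = pd.DataFrame({
    "Shrt_Desc": ["BUTTER WITH SALT", "BUTTER WHIPPED WITH SALT", "BUTTER OIL ANHYDROUS"],
    "Zinc_(mg)": [0.09, 0.05, 0.01],
    "Copper_(mg)": [0.000, 0.016, 0.001],
})

columns = ["Copper_(mg)", "Zinc_(mg)"]
subset = toy[columns]               # DataFrame; column order follows the list

one_col_frame = toy[["Zinc_(mg)"]]  # double brackets: still a DataFrame

print(subset)
print(type(one_col_frame).__name__, one_col_frame.shape)
```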
