How to Use a DataFrame in Pandas

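A pandas DataFrame is a table-like data structure made up of rows and columns. Each column holds data of a single type, and you can use the row index to iterate over the elements of a column. This article shows how to create a pandas DataFrame object and how to get its column and row data.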

1. How to Create a Pandas DataFrame Object.

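  1. Call the pandas DataFrame(data, index=index, columns=columns) constructor to create a pandas DataFrame object.
  2. The data parameter holds the DataFrame's data. It can be a two-dimensional array or a Python dictionary.
  3. The index parameter supplies the row index labels of the DataFrame. It is a Python list.
  4. The columns parameter supplies the column labels of the DataFrame. Each column label can be used to get that column's data as a pandas.Series object.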
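
As a minimal sketch of the parameters described above (the variable names and sample values here are made up for illustration and do not come from the article's examples):

```python
import pandas as pd

# Hypothetical sample data: two rows, two columns.
data = [[90, 85], [70, 95]]

# index supplies the row labels, columns supplies the column labels.
scores = pd.DataFrame(data, index=['alice', 'bob'], columns=['math', 'art'])

# Selecting one column label returns a pandas.Series.
math_column = scores['math']
print(type(math_column).__name__)  # Series
print(math_column['bob'])          # 70
```

The row labels passed as index carry over to the Series, which is why the value can be looked up by 'bob'.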

1.1 Create a Pandas DataFrame Object from a Two-Dimensional Array.

  1. The example below creates a pandas DataFrame object from a two-dimensional array.
    import pandas as pd
    
    
    def create_dataframe_from_2_dimension_array():
        '''Create a pandas DataFrame object from a 2-dimensional array.'''

        # Use the Unicode East Asian Width when computing display text width.
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a 2-dimensional array: each element of the outer list is one row
        # holding the position number, programming language and operating system.
        data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'],[8, 'c++', 'macOS']]
        
        # Define the column list, each element in the list is the column label.
        columns = ['Position', 'Programming Language', 'Operating System']
        
        # Define the row index list.
        index = [1, 2, 3]
        
        # Create the python pandas DataFrame object.
        df = pd.DataFrame(data, index=index, columns=columns)
        
        # Print out the DataFrame object data.
        print(df)
        
        # Return the python pandas DataFrame object.
        return df

  2. Running the function above prints the following data to the console.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS

1.2 Create a Pandas DataFrame Object from a Python Dictionary.

  1. The example below creates a pandas DataFrame object from a Python dictionary.
    import pandas as pd
    
    
    
    def create_dataframe_from_dictionary_object():
        '''Create a pandas DataFrame object from a Python dictionary.'''
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a Python dictionary: each key is a column name and each value
        # is a list holding that column's value for every row.
        dict_obj = {'Position': [1, 5, 8], 'Programming Language': ['python', 'java', 'c++'],
                    'Operating System': ['Windows', 'Linux', 'macOS']}
    
        # Create a list object to store the row index number.
        index = [1, 2, 3]
    
        # Create the python pandas DataFrame object 
        df = pd.DataFrame(dict_obj, index=index)
    
        # Print the DataFrame object's data in the console.
        print(df)
    
        # Return the created DataFrame object.
        return df
    
    
    # The function already prints the DataFrame, so call it without another print().
    create_dataframe_from_dictionary_object()
    

  2. Below is the console output of the example function above.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
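
Two variations on the dictionary form may be worth knowing. The sketch below reuses the article's dictionary; the names `default_index_df` and `subset_df` are introduced here only for illustration.

```python
import pandas as pd

dict_obj = {'Position': [1, 5, 8],
            'Programming Language': ['python', 'java', 'c++'],
            'Operating System': ['Windows', 'Linux', 'macOS']}

# Without an index argument, pandas numbers the rows 0, 1, 2, ...
default_index_df = pd.DataFrame(dict_obj)
print(list(default_index_df.index))  # [0, 1, 2]

# With a dictionary, columns keeps only the listed keys, in the listed order.
subset_df = pd.DataFrame(dict_obj, columns=['Operating System', 'Position'])
print(list(subset_df.columns))  # ['Operating System', 'Position']
```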

2. How to Iterate over a Pandas DataFrame Object.

2.1 Iterate over DataFrame Columns.

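  1. The columns attribute of a pandas DataFrame object returns all of the DataFrame's column labels as a pandas Index object.
  2. We can then iterate over the returned column labels and get each column's data as a pandas Series object. Below is an example.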
    def iterate_dataframe_object(dataframe_object):
        '''Iterate over the DataFrame's columns and print each column's data.'''
        
        
        print('=================== iterate_dataframe_object ======================')
        
        
        # Loop the DataFrame object's columns.
        for column in dataframe_object.columns:
            
            # Print out the column name.
            print(column)
            
            # Get the column data in a pandas Series object.
            column_data_series = dataframe_object[column]
            
            # Print out the column data Series object.
            print(column_data_series)
    
            print('=======================================')
    
    
    if __name__ == '__main__':
        
        #create_dataframe_from_2_dimension_array()
        
        df = create_dataframe_from_dictionary_object()
        
        iterate_dataframe_object(df)

  3. Below is the output of the example above.
    =================== iterate_dataframe_object ======================
    Position
    1    1
    2    5
    3    8
    Name: Position, dtype: int64
    =======================================
    Programming Language
    1    python
    2      java
    3       c++
    Name: Programming Language, dtype: object
    =======================================
    Operating System
    1    Windows
    2      Linux
    3      macOS
    Name: Operating System, dtype: object
    =======================================
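
As an alternative sketch, the DataFrame's items() method yields each column label together with its Series, so the separate lookup by label is not needed. The small DataFrame below is a cut-down copy of the article's data, used here only to keep the example self-contained.

```python
import pandas as pd

df = pd.DataFrame({'Position': [1, 5, 8],
                   'Programming Language': ['python', 'java', 'c++']},
                  index=[1, 2, 3])

# items() yields (column label, Series) pairs, one per column.
for label, series in df.items():
    print(label, series.tolist())
```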

2.2 Iterate over DataFrame Rows.

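  1. You can call the iterrows() method of a pandas DataFrame object to get a row iterator for the DataFrame.
  2. You can then call Python's built-in next() function on the iterator to step through its items and get each row of the DataFrame. Each item is a (row index, Series) pair. Below is the example source code.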
    '''
    Created on Oct 23, 2021
    
    @author: songzhao
    '''
    
    import pandas as pd
    
    def create_dataframe_from_dictionary_object():
        '''Create a pandas DataFrame object from a Python dictionary.'''

        pd.set_option('display.unicode.east_asian_width', True)

        # Define a Python dictionary: each key is a column name and each value
        # is a list holding that column's value for every row.
        dict_obj = {'Position':[1, 5, 8], 'Programming Language':['python', 'java', 'c++'], 'Operating System':['Windows', 'Linux', 'macOS']}
        
        # Create a list object to store the row index number.
        index = [1, 2, 3]
        
        # Create the python pandas DataFrame object 
        df = pd.DataFrame(dict_obj, index=index)
        
        # Print the DataFrame object's data in the console.
        print(df)
        
        # Return the created DataFrame object.
        return df
            
        
    def iterate_dataframe_rows(df_obj):
        '''Iterate over the DataFrame's rows and print each row's data.'''
        
        print('=================== iterate_dataframe_rows ======================')
        
        # Call the DataFrame object's iterrows() function to get row iterator.
        iterator = df_obj.iterrows()
        
        # Get the next item in the iterator.
        row = next(iterator, None)
       
        # While there are rows in the iterator.
        while row is not None:

            # Each item is a (row index label, row data Series) tuple.
            row_number = row[0]

            series_obj = row[1]
            
            print('row number = ', row_number)
            
            print(series_obj.index)
            
            print(series_obj.values)
            
            print('\r\n')
            
            # Get the next row from the iterator.
            row = next(iterator, None)
                
    
    if __name__ == '__main__':
        
        df = create_dataframe_from_dictionary_object()
        
        iterate_dataframe_rows(df)

  3. Running the example source code above produces the following output.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
    =================== iterate_dataframe_rows ======================
    row number =  1
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [1 'python' 'Windows']
    
    
    row number =  2
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [5 'java' 'Linux']
    
    
    row number =  3
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [8 'c++' 'macOS']
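
The explicit next() calls can also be replaced with a plain for loop, since iterrows() returns an ordinary Python iterator. This is a minimal sketch on a cut-down copy of the article's data; the names `row_label` and `row_series` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Position': [1, 5, 8],
                   'Operating System': ['Windows', 'Linux', 'macOS']},
                  index=[1, 2, 3])

# iterrows() yields (row label, Series) pairs, so the loop can unpack them directly.
for row_label, row_series in df.iterrows():
    # Values in the row Series are looked up by column label.
    print(row_label, row_series['Position'], row_series['Operating System'])
```

The loop stops on its own when the iterator is exhausted, so no None sentinel is needed.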
