Originally published at:
https://don.easiestsoft.com/pandas-01/
Series
Create Series
import numpy as np
import pandas as pd
# create Series from ndarray
arr = np.array([1,2,3])
ser = pd.Series(arr, index=['a','b','c'])
# create Series from dict
dic = {'a':1, 'b':2, 'c':3}
ser = pd.Series(dic)
# create Series from list
lis = [1,2,3]
ser = pd.Series(lis, index=['a','b','c'])
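The three constructors above build the same Series. A minimal sketch that checks this and shows what happens when no index is passed (the variable names here are illustrative, not from the original note):

```python
import numpy as np
import pandas as pd

# From a dict: the keys become the index labels.
from_dict = pd.Series({'a': 1, 'b': 2, 'c': 3})

# From an ndarray and from a list, with the same explicit index.
from_array = pd.Series(np.array([1, 2, 3]), index=['a', 'b', 'c'])
from_list = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# Without an index argument, pandas uses the positions 0..n-1 as labels.
default_index = pd.Series([1, 2, 3])

print(from_dict.index.tolist())           # ['a', 'b', 'c']
print(default_index.index.tolist())       # [0, 1, 2]
print((from_array == from_list).all())    # True
```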
Access Data in Series
# using position number
s = pd.Series([1,2,3], index=['a','b','c'])
s.iloc[0] # plain s[0] is deprecated when the index holds labels rather than integers
s.iloc[:3]
# using series index
s['a']
s[['a','b','c']]
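Position-based and label-based access differ in how slices end. A small sketch of that difference, using the explicit `.iloc` and `.loc` accessors so the intent is unambiguous:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

first_by_position = s.iloc[0]   # element at position 0
first_by_label = s.loc['a']     # element with label 'a'

by_position = s.iloc[0:2]       # positions 0 and 1; the stop is excluded
by_label = s.loc['a':'b']       # labels 'a' and 'b'; the stop is included

print(first_by_position, first_by_label)  # 1 1
print(by_position.tolist())               # [1, 2]
print(by_label.tolist())                  # [1, 2]
```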
DataFrame
Create Dataframes
# create dataframes from dict (of lists)
data = {'Name':['Tom', 'Jack', 'Steve'],'Age':[28,34,29]}
df = pd.DataFrame(data)
# create dataframes from list (of dicts)
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
# create dataframes from dict (of series)
data = {'one':pd.Series([1,2,3], index=['a','b','c']), 'two':pd.Series([1,2,3,4], index=['a','b','c','d'])}
df = pd.DataFrame(data)
# create dataframes from files
df = pd.read_csv('path', sep=',')
df = pd.read_json('path')
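When the inputs do not line up (a dict missing a key, or a Series missing a label), pandas fills the gap with NaN rather than raising an error. A short sketch using the same data as the examples above (the variable names are illustrative):

```python
import pandas as pd

# List of dicts: each dict is a row; the first dict has no 'c' key.
rows = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df_rows = pd.DataFrame(rows)

# Dict of Series: each Series is a column; 'one' has no label 'd'.
cols = {
    'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
}
df_cols = pd.DataFrame(cols)

print(df_rows.shape)                 # (2, 3)
print(df_rows['c'].isna().tolist())  # [True, False]
print(df_cols.index.tolist())        # ['a', 'b', 'c', 'd']
print(df_cols.loc['d', 'one'])       # nan
```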
Access (select) Data in Dataframes
# access by row (return either a series or a dataframe)
df.loc['first_index']
df.iloc[0]
df.loc['first_index':'third_index'] # will contain 'first_index', 'second_index', 'third_index'
df.iloc[1:3] # will only contain the rows at positions 1 and 2 (the stop position is excluded)
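# note: df[0] looks up a column labeled 0, not the first row; use df.iloc[0] for that
df[0:3] # a slice inside [] selects rows by position (here positions 0, 1, 2)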
# access by column (return either a series or a dataframe)
df['column_1']
df[['column_1','column_2']]
df.column_1
# access both by row and by column
df.loc[['first_row','second_row'], ['first_column', 'second_column']]
df.iloc[0:2, 0:2]
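To make the row and column selectors above concrete, here is a runnable sketch. It reuses the Name/Age data from earlier and assumes the row labels `first_index`, `second_index`, `third_index`, which the original only uses as placeholders:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Tom', 'Jack', 'Steve'], 'Age': [28, 34, 29]},
    index=['first_index', 'second_index', 'third_index'],
)

by_label = df.loc['first_index':'third_index']  # label slice: both ends included
by_position = df.iloc[1:3]                      # position slice: stop excluded
cell = df.loc['second_index', 'Age']            # one row label, one column label
block = df.iloc[0:2, 0:2]                       # first two rows, first two columns

print(len(by_label), len(by_position))  # 3 2
print(cell)                             # 34
print(block.shape)                      # (2, 2)
```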
# access by conditional expression
movies_df[(movies_df['director'] == 'Christopher Nolan') | (movies_df['director'] == 'Ridley Scott')]
movies_df[movies_df['director'].isin(['Christopher Nolan', 'Ridley Scott'])]
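The two filters above should select the same rows. A minimal sketch that checks this on a tiny stand-in for `movies_df` (the three sample rows are an assumption for illustration; the original does not show the dataset):

```python
import pandas as pd

# Small sample frame standing in for movies_df; not the original dataset.
movies_df = pd.DataFrame({
    'title': ['Inception', 'Gladiator', 'Jaws'],
    'director': ['Christopher Nolan', 'Ridley Scott', 'Steven Spielberg'],
})

# Each comparison yields a boolean Series; | combines them element-wise.
# The parentheses are required because | binds tighter than ==.
mask_or = (movies_df['director'] == 'Christopher Nolan') | (movies_df['director'] == 'Ridley Scott')

# isin expresses the same test against a list of allowed values.
mask_isin = movies_df['director'].isin(['Christopher Nolan', 'Ridley Scott'])

print(mask_or.equals(mask_isin))                # True
print(movies_df[mask_isin]['title'].tolist())   # ['Inception', 'Gladiator']
```

With more than two allowed values, `isin` tends to stay readable where chained `|` comparisons do not.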