Import pandas under the name pd. In [1]:
import pandas as pd
import numpy as np
Print the version of pandas that has been imported. In [2]:
pd.__version__
Print the version information of all the libraries that pandas requires. In [3]:
pd.show_versions()
Create a DataFrame df from the dictionary data below, using the list labels as its index.
data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog
'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(data, index=labels)
Display a summary of the basic information about this DataFrame and its data. In [5]:
df.info()
# ...or...
df.describe()
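As a rough illustration of how the two summaries differ, the sketch below rebuilds only rows 'a' through 'e' of the data above (the five-row subset is an assumption made to keep the example short). By default describe() reports on numeric columns only; passing include='all' also summarizes the text columns.

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
        "visits": [1, 3, 2, 3, 2],
        "priority": ["yes", "yes", "no", "yes", "no"],
    },
    index=["a", "b", "c", "d", "e"],
)

# Default: statistics for the numeric columns only.
numeric_summary = df.describe()

# include="all" adds the non-numeric columns to the summary.
full_summary = df.describe(include="all")

print(list(numeric_summary.columns))
print(list(full_summary.columns))
```

Note that the count row of describe() ignores missing values, so the age column reports one fewer entry than visits in this subset.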
Return the first 3 rows of the DataFrame df. In [6]:
df.iloc[:3]
# or equivalently
df.head(3)
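One detail worth seeing alongside these two forms: a label-based slice with loc includes its end label, whereas a positional slice with iloc excludes its end position. The sketch below assumes a five-row subset of the data above (rows 'a' to 'e').

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
        "visits": [1, 3, 2, 3, 2],
        "priority": ["yes", "yes", "no", "yes", "no"],
    },
    index=["a", "b", "c", "d", "e"],
)

first_by_position = df.iloc[:3]  # positions 0, 1, 2 (end excluded)
first_by_head = df.head(3)       # same three rows
first_by_label = df.loc[:"c"]    # labels 'a' through 'c' (end included)
```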
Select just the 'animal' and 'age' columns from the DataFrame df. In [7]:
df.loc[:, ['animal', 'age']]
# or
df[['animal', 'age']]
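The double brackets matter here. As a small sketch (again on an assumed five-row subset of the data), indexing with a single column name returns a Series, while indexing with a list of names returns a DataFrame, even when the list has only one entry.

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
    },
    index=["a", "b", "c", "d", "e"],
)

as_series = df["animal"]   # single label -> Series
as_frame = df[["animal"]]  # list of labels -> DataFrame with one column
```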
Select the data in rows [3, 4, 8] and in columns ['animal', 'age'].
df.loc[df.index[[3, 4, 8]], ['animal', 'age']]
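The expression above mixes positional rows with labelled columns by first translating the positions into labels. A possible alternative, sketched here, goes the other way and translates the column labels into positions for iloc. Because the example uses an assumed five-row subset of the data, it selects positions [1, 3, 4] rather than [3, 4, 8].

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
        "visits": [1, 3, 2, 3, 2],
        "priority": ["yes", "yes", "no", "yes", "no"],
    },
    index=["a", "b", "c", "d", "e"],
)

rows = [1, 3, 4]          # positional row numbers (adapted to the subset)
cols = ["animal", "age"]  # column labels

# Positions -> labels, then label-based selection.
by_label = df.loc[df.index[rows], cols]

# Labels -> positions, then position-based selection.
by_position = df.iloc[rows, df.columns.get_indexer(cols)]
```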
Select only the rows where the number of visits is greater than 3. In [4]:
df[df['visits'] > 3]
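With the visits values listed in the data above, no row has more than 3 visits, so this selection comes back empty. The sketch below uses only the visits column (a simplification) to show the boolean mask and to contrast > with >=.

```python
import pandas as pd

# The 'visits' values and index labels from the exercise data.
df = pd.DataFrame(
    {"visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1]},
    index=list("abcdefghij"),
)

mask = df["visits"] > 3             # boolean Series, one entry per row
strictly_more = df[mask]            # rows where the mask is True
at_least = df[df["visits"] >= 3]    # a looser condition, for comparison

print(len(strictly_more), len(at_least))
```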
Select the rows where the age is missing, i.e. is NaN.
In [5]:
df[df['age'].isnull()]
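A short sketch of why isnull() is needed rather than an equality test: NaN never compares equal to anything, including itself, so a mask built with == np.nan selects nothing. The example assumes a five-row subset of the data; isna() is an alias of isnull(), and notna() gives the complement.

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
    },
    index=["a", "b", "c", "d", "e"],
)

missing = df[df["age"].isna()]     # rows whose age is NaN
present = df[df["age"].notna()]    # rows whose age is known
wrong = df[df["age"] == np.nan]    # always empty, because NaN != NaN
```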
Select the rows where the animal is a cat and the age is less than 3. In [6]:
df[(df['animal'] == 'cat') & (df['age'] < 3)]
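The same filter can be written with named masks, which may make the logic easier to follow. The sketch assumes a five-row subset of the data. Two points it illustrates: conditions are combined with & rather than the keyword and (so each comparison needs its own parentheses when written inline), and a comparison against NaN evaluates to False, so rows with a missing age never pass age < 3.

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
    },
    index=["a", "b", "c", "d", "e"],
)

is_cat = df["animal"] == "cat"
is_young = df["age"] < 3           # False for row 'd', whose age is NaN

young_cats = df[is_cat & is_young]  # element-wise AND of the two masks
```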
Select the rows where the age is between 2 and 4 (inclusive). In [7]:
df[df['age'].between(2, 4)]
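To make the inclusive bounds concrete, the sketch below applies between(2, 4) to the age values from the data above and checks it against the equivalent pair of explicit comparisons. Only the age column is used, which is a simplification.

```python
import numpy as np
import pandas as pd

# The 'age' values and index labels from the exercise data.
ages = pd.Series(
    [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
    index=list("abcdefghij"),
    name="age",
)

in_range = ages.between(2, 4)           # both endpoints included by default
explicit = (ages >= 2) & (ages <= 4)    # the same test written out

selected = ages[in_range]
print(list(selected.index))
```

Row 'f' has an age of exactly 2 and is selected, while the rows with a missing age are not.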
Change the age in row 'f' to 1.5.
df.loc['f', 'age'] = 1.5
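A brief sketch of this assignment on the age and visits columns from the data above (the other columns are omitted for brevity). It uses a single loc call with both the row and the column label; the chained form shown in the comment is not guaranteed to modify df and is best avoided.

```python
import numpy as np
import pandas as pd

# The 'age' and 'visits' values and index labels from the exercise data.
df = pd.DataFrame(
    {
        "age": [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    },
    index=list("abcdefghij"),
)

before = df.at["f", "age"]    # .at reads a single cell by label

# One indexing operation: row label and column label together.
df.loc["f", "age"] = 1.5

# Avoid chained indexing such as df["age"]["f"] = 1.5, which may
# assign to a temporary object instead of df.

after = df.at["f", "age"]
```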
Calculate the sum of all visits (the total number of visits). In [ ]:
df['visits'].sum()
Calculate the mean age for each different animal in df. In [8]:
df.groupby('animal')['age'].mean()
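The group means skip missing ages rather than propagating them. As a sketch on an assumed five-row subset of the data, the example below computes the means and also compares count (non-missing values per group) with size (rows per group) to show where a NaN was ignored.

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
    },
    index=["a", "b", "c", "d", "e"],
)

mean_age = df.groupby("animal")["age"].mean()

# count ignores NaN, size does not; a difference marks a skipped value.
counts = df.groupby("animal")["age"].agg(["count", "size"])

print(mean_age)
print(counts)
```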
Append a new row 'k' to df with your choice of values for each column. Then delete that row to return the original DataFrame.
In [ ]:
# Values must follow the column order: animal, age, visits, priority.
df.loc['k'] = ['dog', 5.5, 2, 'no']
and then d
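The append-and-delete round trip described in this exercise can be sketched as follows, on an assumed five-row subset of the data. Assigning to a label that does not yet exist with loc adds a row, and drop returns a new DataFrame without it. The values chosen for row 'k' are arbitrary.

```python
import numpy as np
import pandas as pd

# Assumption: a five-row subset (rows 'a' to 'e') of the exercise data.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5],
        "visits": [1, 3, 2, 3, 2],
        "priority": ["yes", "yes", "no", "yes", "no"],
    },
    index=["a", "b", "c", "d", "e"],
)
original_index = list(df.index)

# Add row 'k'; values follow the column order animal, age, visits, priority.
df.loc["k"] = ["dog", 5.5, 2, "no"]
had_k = "k" in df.index

# drop returns a new DataFrame, so reassign to remove the row from df.
df = df.drop("k")
```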