1.Loading a CSV File
# Load library
import pandas as pd
# Path to the file on disk
path = 'F:/pycharmFile/input/train_data.csv'
# Load data
dataframe = pd.read_csv(path)
# View the first two rows
dataframe.head(2)
notes:
1. Look at how the dataset is structured beforehand to work out which parameters are needed to load the file.
2. read_csv has many parameters.
e.g. sep for the separator; header=None when the file has no header row ...
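As a minimal sketch of the two parameters mentioned in the notes, the example below reads semicolon-separated text that has no header row. The data, the io.StringIO stand-in for a file on disk, and the column names are all made up for illustration.

```python
import io
import pandas as pd

# Hypothetical data standing in for a file on disk:
# semicolon-separated, with no header row.
raw = "1;alice;3.5\n2;bob;4.0\n"

df_csv = pd.read_csv(
    io.StringIO(raw),
    sep=";",                        # field separator (the default is ",")
    header=None,                    # the data has no header row
    names=["id", "name", "score"],  # so we supply the column names ourselves
)

# View the first two rows
print(df_csv.head(2))
```

Without header=None, the first data row would be consumed as column names.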
2.Loading an Excel File
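# Load library
import pandas as pd
# Load the first sheet; header=1 uses the second row as the column names
path = 'F:/pycharmFile/input/data.xlsx'
dataframe = pd.read_excel(path, sheet_name=0, header=1)
dataframe.head(2)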
The main difference from read_csv is the additional parameter sheet_name, which specifies which sheet in the Excel file to load. It accepts either a sheet name (string) or a zero-based position (integer).
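A small sketch of how sheet_name selects a sheet. It assumes the openpyxl package is installed. To stay self-contained, it writes a made-up two-sheet workbook into memory instead of reading a real file; the sheet names "first" and "second" are hypothetical.

```python
import io
import pandas as pd

# Build a hypothetical two-sheet workbook in memory (requires openpyxl).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"b": [3, 4]}).to_excel(writer, sheet_name="second", index=False)

# Select a sheet by zero-based position
buffer.seek(0)
by_index = pd.read_excel(buffer, sheet_name=1)

# Select the same sheet by name
buffer.seek(0)
by_name = pd.read_excel(buffer, sheet_name="second")

# sheet_name=None loads every sheet into a dict keyed by sheet name
buffer.seek(0)
all_sheets = pd.read_excel(buffer, sheet_name=None)

print(by_index)
print(list(all_sheets.keys()))
```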
3.Loading a JSON File
# Load library
import pandas as pd
# Create URL
url = 'https://tinyurl.com/simulated_json'
# Load data
dataframe = pd.read_json(url, orient='columns')
# View the first two rows
dataframe.head(2)
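The orient parameter tells read_json how the JSON is laid out. As a rough illustration, the sketch below loads the same made-up data from two layouts, using in-memory strings rather than the URL above; the field names are hypothetical.

```python
import io
import pandas as pd

# orient='columns': {column -> {index -> value}}
columns_json = '{"category": {"0": "x", "1": "y"}, "integer": {"0": 5, "1": 7}}'

# orient='records': a list of {column -> value} objects, one per row
records_json = '[{"category": "x", "integer": 5}, {"category": "y", "integer": 7}]'

df_columns = pd.read_json(io.StringIO(columns_json), orient="columns")
df_records = pd.read_json(io.StringIO(records_json), orient="records")

print(df_columns.head(2))
print(df_records.head(2))
```

If the frame comes out with rows and columns swapped or fails to parse, the orient value probably does not match the file's structure.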
4.Querying a SQL Database
# Load libraries
import pandas as pd
from sqlalchemy import create_engine
# Create a connection to the database
database_connection = create_engine('sqlite:///sample.db')
# Load data
dataframe = pd.read_sql_query('SELECT * FROM data', database_connection)
# View first two rows
dataframe.head(2)
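For experimenting without a sample.db file, here is a minimal sketch that swaps in an in-memory SQLite database through the standard-library sqlite3 module (not the SQLAlchemy engine used above). The table contents and column names are made up, and the second query shows the params argument for passing values safely.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory database with a small 'data' table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (first_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("Jason", 42), ("Molly", 52), ("Tina", 25)],
)
conn.commit()

# Load every row of the table
df_all = pd.read_sql_query("SELECT * FROM data", conn)

# Load only matching rows; '?' is the sqlite3 placeholder filled from params
df_older = pd.read_sql_query(
    "SELECT * FROM data WHERE age > ?", conn, params=(40,)
)
conn.close()

print(df_all.head(2))
print(df_older)
```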