Python Machine Learning Basics, Notes 3: Loading Data (Cookbook)

万物琴弦光锥之外

Article link: https://blog.csdn.net/weixin_43702920/article/details/95177310


Loading datasets

# Load scikit-learn's datasets
from sklearn import datasets

# Load the digits dataset (images of handwritten digits)
digits = datasets.load_digits()

# Create features matrix
features = digits.data

# Create target vector
target = digits.target

# View first observation
features[0]

Some of the available datasets:

load_boston
Contains 506 observations on Boston housing prices. It is a good dataset for
exploring regression algorithms. Note that load_boston was deprecated in
scikit-learn 1.0 and removed in 1.2, so it is only available in older versions.
load_iris
Contains 150 observations on the measurements of Iris flowers. It is a good
dataset for exploring classification algorithms.
load_digits
Contains 1,797 observations from images of handwritten digits. It is a good
dataset for teaching image classification.
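
The other loaders follow the same pattern as load_digits. The following is a minimal sketch, assuming scikit-learn is installed, that loads the iris dataset and checks the dimensions described above. The variable names are chosen here for illustration only.

```python
from sklearn import datasets

# Load the iris dataset: 150 observations, 4 measurements each
iris = datasets.load_iris()
iris_features = iris.data
iris_target = iris.target

# Load the digits dataset: 1,797 observations, each an 8x8 image
# flattened into 64 feature values
digits = datasets.load_digits()
digits_features = digits.data
first_image = digits.images[0]

print(iris_features.shape)    # (150, 4)
print(digits_features.shape)  # (1797, 64)
print(first_image.shape)      # (8, 8)
```

Checking the shape right after loading is a quick way to confirm that the features matrix and target vector line up before training anything.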

CSV file

From a URL:

# Load library
import pandas as pd

# Create URL
url = 'https://tinyurl.com/simulated_data'

# Load dataset
dataframe = pd.read_csv(url)

# View first two rows
dataframe.head(2)

From a local file:

dataframe = pd.read_csv(r'path')
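
Real CSV files often differ from the default format, for example by using another separator or by having no header row. The sketch below illustrates the sep, header, and names parameters of read_csv. It reads from in-memory text through io.StringIO in place of a file path, and the column names and values are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical file contents: semicolon-separated, with a header row
csv_with_header = "id;score\n1;0.5\n2;0.75\n"
dataframe = pd.read_csv(io.StringIO(csv_with_header), sep=';')

# Hypothetical file contents: comma-separated, without a header row
csv_without_header = "1,0.5\n2,0.75\n"
dataframe_named = pd.read_csv(
    io.StringIO(csv_without_header),
    header=None,            # the first line is data, not column names
    names=['id', 'score'],  # supply column names explicitly
)

print(dataframe.head(2))
print(dataframe_named.head(2))
```

With a real file, the io.StringIO(...) argument would simply be replaced by the path string.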

Excel file

# Load library
import pandas as pd

# Create URL
url = 'https://tinyurl.com/simulated_excel'

# Load data (older pandas versions spelled this parameter sheetname)
dataframe = pd.read_excel(url, sheet_name=0, header=1)

# View the first two rows
dataframe.head(2)

Note: sheet_name accepts both strings containing the name of the sheet and
integers pointing to sheet positions (zero-indexed). To load multiple sheets,
pass them as a list. For example, sheet_name=[0, 1, 2, "Monthly Sales"] returns
a dictionary of pandas DataFrames containing the first, second, and third
sheets and the sheet named Monthly Sales.

JSON file

# Load library
import pandas as pd

# Create URL
url = 'https://tinyurl.com/simulated_json'

# Load data
dataframe = pd.read_json(url, orient='columns')

# View the first two rows
dataframe.head(2)

Note: the orient parameter tells pandas how the JSON file is structured. It may
take some experimenting to figure out which argument (split, records, index,
columns, or values) is the right one. Another helpful tool pandas offers is
json_normalize, which can help convert semi-structured JSON data into a pandas
DataFrame.
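
To make the orient values less abstract, here is a small sketch that reads the same two-row table from two different JSON layouts and then flattens a nested record with pd.json_normalize (the top-level name in pandas 1.0 and later). The JSON strings are made-up examples, not the contents of the URL above.

```python
import io
import pandas as pd

# orient='records': a list of row objects
records_json = '[{"id": 1, "value": 10}, {"id": 2, "value": 20}]'
df_records = pd.read_json(io.StringIO(records_json), orient='records')

# orient='columns': {column name -> {row label -> value}}
columns_json = '{"id": {"0": 1, "1": 2}, "value": {"0": 10, "1": 20}}'
df_columns = pd.read_json(io.StringIO(columns_json), orient='columns')

# Semi-structured data: nested objects become dotted column names
nested = [
    {"id": 1, "user": {"name": "a", "city": "x"}},
    {"id": 2, "user": {"name": "b", "city": "y"}},
]
df_flat = pd.json_normalize(nested)

print(df_records)
print(df_columns)
print(list(df_flat.columns))  # ['id', 'user.name', 'user.city']
```

If read_json raises an error or produces a transposed table, trying the next orient value is usually the fastest way to find the matching layout.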

SQL database access

# Load libraries
import pandas as pd
from sqlalchemy import create_engine

# Create a connection to the database
database_connection = create_engine('sqlite:///sample.db')

# Load data
dataframe = pd.read_sql_query('SELECT * FROM data', database_connection)

# View first two rows
dataframe.head(2)
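
The query above only works if sample.db already exists and contains a table named data. As a self-contained sketch, the code below builds a small in-memory SQLite database with the standard-library sqlite3 module (used here instead of SQLAlchemy to keep dependencies minimal) and queries it with a parameter. The column names and rows are hypothetical.

```python
import sqlite3
import pandas as pd

# Create an in-memory SQLite database with a table named data
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE data (first_name TEXT, age INTEGER)')
connection.executemany(
    'INSERT INTO data VALUES (?, ?)',
    [('Jason', 42), ('Molly', 52), ('Tina', 36)],
)
connection.commit()

# Load only the rows matching a condition; the ? placeholder is
# filled from params rather than by string formatting
dataframe_sql = pd.read_sql_query(
    'SELECT * FROM data WHERE age > ?',
    connection,
    params=(40,),
)
connection.close()

print(dataframe_sql)
```

Passing values through params rather than building the SQL string by hand avoids quoting mistakes and SQL injection.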
