Data Analysis
Batac_
A pastoral village in a small town in northern China
k-means
k-means: an unsupervised learning method; the data has no target values.
Original post · 2019-12-06 11:31:11 · 99 views · 0 comments
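Because there are no target values to learn from, k-means groups points purely by distance to a set of centroids. The following is a minimal sketch of that idea in plain Python (standard library only). The function name `kmeans`, the toy points, and the fixed starting centroids are assumptions made for illustration and are not taken from the post.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on a list of coordinate tuples."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (hypothetical): two well-separated groups, no target values supplied.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In scikit-learn the same algorithm is available as `sklearn.cluster.KMeans`, which also handles centroid initialization.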
Linear Regression (Machine Learning)
# coding=utf-8
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression, SGDRegressor, Ridge
from sklearn.model_selection import train_test_split
from sklearn.prepro...
Original post · 2019-12-06 11:14:06 · 182 views · 0 comments
Logistic Regression (Machine Learning)
# coding=utf-8
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRe...
Original post · 2019-12-06 11:13:31 · 150 views · 0 comments
Naive Bayes Algorithm
# Naive Bayes algorithm
# coding=utf-8
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn....
Original post · 2019-12-03 16:58:38 · 140 views · 0 comments
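Notes on a Problem with the newsgroups Dataset
Note: I do Python (machine learning) development on a Mac and was running a naive Bayes (probabilistic) analysis on the newsgroups data.
from sklearn.datasets import fetch_20newsgroups
Downloading the data at run time always failed with an error, so I downloaded it myself and placed it under "/Users/Batac/scikit_learn_data/20news_home/".
1. Use Safari to download 20ne...
Original post · 2019-12-03 15:34:35 · 306 views · 0 comments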
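K-Nearest Neighbors Algorithm
Data classification: which class does a test sample belong to? (Discrete data: feature values + target values; supervised learning)
# coding=utf-8
# k-nearest neighbors algorithm
# The data (whose target values are discrete) must be standardized so that no single feature dominates the others; the value of k affects accuracy
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeig...
Original post · 2019-12-03 11:57:08 · 327 views · 0 comments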
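To illustrate why standardization matters for k-nearest neighbors, here is a small plain-Python sketch (standard library only, not the scikit-learn code the post uses). The helper names `standardize` and `knn_predict` and the toy data are hypothetical. In the toy data the class depends on the first feature, while the second feature is on a much larger scale.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def standardize(rows, means=None, stds=None):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    if means is None:
        means = [mean(c) for c in columns]
        stds = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    scaled = [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]
    return scaled, means, stds


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (hypothetical): feature 2 is on a far larger scale than feature 1.
train_x = [(1.0, 1000.0), (1.2, 3000.0), (0.9, 5000.0),
           (3.0, 1200.0), (3.1, 2900.0), (2.9, 5100.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
query = (1.1, 5090.0)

# Without scaling, feature 2 dominates the distance.
raw_prediction = knn_predict(train_x, train_y, query, k=1)

# Fit the scaling on the training data only, then reuse it for the query.
scaled_x, means, stds = standardize(train_x)
scaled_query = standardize([query], means, stds)[0][0]
scaled_prediction = knn_predict(scaled_x, train_y, scaled_query, k=1)

print(raw_prediction, scaled_prediction)
```

On the raw features the nearest neighbor is chosen almost entirely by the large-scale feature; after standardization both features contribute comparably. Changing `k` changes which neighbors vote, which is why the choice of k affects accuracy.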
pandas: Discretizing Strings
# coding=utf-8
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
# Path to the data file
file_path = "./IMDB-Movie-Data.csv"
# Load the data; structure ['', '', '']
df = pd.read_csv(file_path)
# Split the data [[],...
Original post · 2019-11-28 17:28:20 · 393 views · 0 comments
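The excerpt is cut off before the discretization step, so the following is only a sketch of the general technique: split each comma-separated string into a list, build an all-zero indicator table with one column per distinct category, and set a 1 wherever a row contains that category. The small inline DataFrame and the column name `Genre` are assumptions standing in for the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: one comma-separated category string per row.
df = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Genre": ["Action,Sci-Fi", "Drama", "Action,Drama"],
})

# Split each string into a list of categories: [[...], [...], ...]
genre_lists = df["Genre"].str.split(",").tolist()

# Collect the distinct categories; sorted for a stable column order.
genres = sorted({g for row in genre_lists for g in row})

# Start from an all-zero indicator table, one column per category.
indicators = pd.DataFrame(0, index=df.index, columns=genres)

# Set a 1 wherever a row contains that category.
for i, row in enumerate(genre_lists):
    indicators.loc[df.index[i], row] = 1

# Column sums give the number of rows in each category.
counts = indicators.sum(axis=0)
print(counts)
```

pandas also provides `Series.str.get_dummies(sep=",")`, which builds the same kind of indicator table in a single call.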
Matplotlib: Bar Charts 3
# coding=utf-8
# Draw a bar chart with multiple series
from matplotlib import pyplot as plt
from matplotlib import font_manager
# Chinese font support
my_font = font_manager.FontProperties(fname="/System/Library/Fonts/PingFang.ttc")
plt.figur...
Original post · 2019-11-20 16:00:57 · 271 views · 0 comments
Matplotlib: Bar Charts 2
# coding=utf-8
# Draw a horizontal bar chart
from matplotlib import pyplot as plt
from matplotlib import font_manager
# Chinese font support
my_font = font_manager.FontProperties(fname="/System/Library/Fonts/PingFang.ttc")
plt.figu...
Original post · 2019-11-20 15:58:54 · 143 views · 0 comments
Matplotlib: Bar Charts 1
# coding=utf-8
# Bar chart
from matplotlib import pyplot as plt
from matplotlib import font_manager
# Chinese font support
my_font = font_manager.FontProperties(fname="/System/Library/Fonts/PingFang.ttc")
plt.figure(fig...
Original post · 2019-11-20 15:57:33 · 119 views · 0 comments
Matplotlib: Scatter Plots
# coding=utf-8
from matplotlib import pyplot as plt
from matplotlib import font_manager
y_3 = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
y_10 = [26,26,...
Original post · 2019-11-20 14:48:42 · 220 views · 0 comments
Matplotlib: Line Charts
# coding=utf-8
from matplotlib import pyplot as plt
from matplotlib import font_manager
# Chinese font support
my_font = font_manager.FontProperties(fname="/System/Library/Fonts/PingFang.ttc")
# 1. Get the data
y_1 = [1, 0,...
Original post · 2019-11-20 11:04:25 · 131 views · 0 comments