Original Z-score vs. min-max scaling: pros and cons
Min-max normalization: Guarantees all features will have the exact same scale but does not handle outliers well. Z-score normalization: Handles outliers, but do...
2019-07-31 16:58:15
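A minimal sketch of the two scalings compared in the excerpt above, assuming NumPy and a made-up feature column with one outlier (the values are illustrative, not taken from the post). It shows that min-max scaling pins the column to [0, 1] while the outlier squeezes the remaining values near 0, whereas z-score output has zero mean and unit standard deviation but no fixed range.

```python
import numpy as np

# Hypothetical feature column; 100.0 plays the role of the outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Min-max scaling: linearly maps the column onto [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score standardization: subtract the mean, divide by the standard deviation.
x_z = (x - x.mean()) / x.std()

print(x_minmax.round(3))  # the four non-outlier values are crowded near 0
print(x_z.round(3))       # values are centered on 0 with no fixed bounds
```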
Original Datacamp Notes & Code: Unsupervised Learning in Python, Chapter 3: Decorrelating your data and dimension reduction
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 23 (3). Exercise: Correlated data in nature. You are given ...
2019-07-31 05:09:14
Original Datacamp Notes & Code: Unsupervised Learning in Python, Chapter 2: Visualization with hierarchical clustering & t-SNE
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 23 (2). Exercise: Hierarchical clustering of the grain dat...
2019-07-31 05:06:31
Original Datacamp Notes & Code: Unsupervised Learning in Python, Chapter 1: Clustering for dataset exploration
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 23 (1). Exercise: Clustering 2D points. From the scatter pl...
2019-07-31 05:01:10
Original Datacamp Notes & Code: Machine Learning with the Experts: School Budgets, Chapter 4: Learning from the experts
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (4). Exercise: Deciding what’s a word. Before you build ...
2019-07-31 04:56:56
Original Datacamp Notes & Code: Machine Learning with the Experts: School Budgets, Chapter 3: Improving your model
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (3). Exercise: Instantiate pipeline. In order to make yo...
2019-07-31 04:53:50
Repost Difference between Pipeline and make_pipeline in sklearn
Reposted from: https://stackoverflow.com/questions/40708077/what-is-the-difference-between-pipeline-and-make-pipeline-in-scikit The only difference is that make_pipeline generates names for steps automatically....
2019-07-30 04:24:57
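A short sketch of the naming difference described in the excerpt above, assuming scikit-learn is installed. The step names "scaler" and "clf" and the choice of estimators are illustrative assumptions, not taken from the post.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Pipeline: every step is given an explicit (name, estimator) pair.
explicit = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression()),
])

# make_pipeline: names are generated from the lowercased class names.
auto = make_pipeline(StandardScaler(), LogisticRegression())

explicit_names = [name for name, _ in explicit.steps]
auto_names = [name for name, _ in auto.steps]
print(explicit_names)  # ['scaler', 'clf']
print(auto_names)      # ['standardscaler', 'logisticregression']
```

The step names matter mainly when addressing hyperparameters, for example "clf__C" in the first pipeline versus "logisticregression__C" in the second.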
Original Datacamp Notes & Code: Machine Learning with the Experts: School Budgets, Chapter 2: Creating a simple first model
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (2). Exercise: Setting up a train-test split in scikit-...
2019-07-27 01:50:07
Original Datacamp Notes & Code: Machine Learning with the Experts: School Budgets, Chapter 1: Exploring the raw data
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (1). Exercise: Loading the data. Now it’s time to check ...
2019-07-26 15:25:31
Original Datacamp Notes & Code: Supervised Learning with scikit-learn, Chapter 4: Preprocessing and pipelines
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 21 (4). Exercise. Author: JinnyR. Source: CSDN. Original post: https://blog.csdn.n...
2019-07-25 17:54:26
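Original Which machine learning algorithms need feature scaling
Algorithms that compute on distances or similarities (for example the scalar product), such as KNN and SVM, usually need feature scaling. Algorithms based on graphical models, such as Fisher LDA and Naive Bayes, as well as decision trees and tree-based ensemble methods (RF, XGB), are not affected by feature scaling. Reference: https://stats.stackexchang...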
2019-07-25 16:10:33
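A rough illustration of the point in the excerpt above, assuming NumPy and a made-up two-feature dataset whose columns use very different units. The helper `nearest` is hypothetical. It shows that the Euclidean nearest neighbor of a query point changes once the features are standardized, while the ordering of values within a single feature (which is all a threshold split in a tree depends on) is unchanged.

```python
import numpy as np

# Hypothetical data: column 0 is in small units, column 1 in large units.
X = np.array([[1.0, 1000.0],
              [2.0, 1100.0],
              [9.0, 1010.0]])
query = np.array([1.5, 1020.0])

def nearest(points, q):
    """Index of the row in `points` closest to `q` in Euclidean distance."""
    return int(np.argmin(np.linalg.norm(points - q, axis=1)))

# Standardize with the column means and standard deviations of X.
mean, std = X.mean(axis=0), X.std(axis=0)
X_scaled = (X - mean) / std
query_scaled = (query - mean) / std

raw_idx = nearest(X, query)                      # dominated by column 1
scaled_idx = nearest(X_scaled, query_scaled)     # both columns contribute
print(raw_idx, scaled_idx)

# A tree split compares one feature against a threshold, so only the
# ordering of values matters, and scaling preserves that ordering.
same_order = np.array_equal(np.argsort(X[:, 1]), np.argsort(X_scaled[:, 1]))
print(same_order)
```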
Original Datacamp Notes & Code: Supervised Learning with scikit-learn, Chapter 3: Fine-tuning your model
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 21 (3). Exercise: Metrics for classification. In Chapter 1,...
2019-07-24 22:34:17
Original Difference between SciPy randint and NumPy randint
2019-07-24 21:50:58
Original Datacamp Notes & Code: Supervised Learning with scikit-learn, Chapter 2: Regression
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 21 (2). Exercise: Importing data for supervised learning. I...
2019-07-23 23:58:52
Original Datacamp Notes & Code: Supervised Learning with scikit-learn, Chapter 1: Classification
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Exercise: k-Nearest Neighbors: Fit. Having explored the Congressional voting records dataset, it is time now t...
2019-07-23 17:38:08