[Original] z-score vs. min-max scaling: pros and cons
Min-max normalization: guarantees all features will have the exact same scale but does not handle outliers well. Z-score normalization: handles outliers, but do...
2019-07-31 16:58:15 5312
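A minimal sketch of the trade-off summarized in the post above, using only the Python standard library. The `feature` list and the helper names `min_max_scale` and `z_score_scale` are hypothetical, chosen for illustration. It shows that min-max output always spans exactly [0, 1] but a single outlier squeezes the remaining values toward 0, while z-score output has mean 0 and standard deviation 1 but no fixed range.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column with one outlier (1000.0).
feature = [1.0, 2.0, 3.0, 4.0, 1000.0]

mm = min_max_scale(feature)
zs = z_score_scale(feature)

# Min-max: the range is exactly [0, 1], but the four ordinary values
# are all compressed into a tiny interval near 0 by the outlier.
print("min-max:", [round(v, 4) for v in mm])

# Z-score: mean 0 and standard deviation 1, but the range is not fixed.
print("z-score:", [round(v, 4) for v in zs])
```

In practice, `sklearn.preprocessing.MinMaxScaler` and `StandardScaler` apply the same two transforms column by column.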
[Original] Datacamp notes & code: Unsupervised Learning in Python, Chapter 3: Decorrelating your data and dimension reduction
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 23 (3). Exercise: Correlated data in nature. You are given ...
2019-07-31 05:09:14 594
[Original] Datacamp notes & code: Unsupervised Learning in Python, Chapter 2: Visualization with hierarchical clustering & t-SNE
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 23 (2). Exercise: Hierarchical clustering of the grain dat...
2019-07-31 05:06:31 670
[Original] Datacamp notes & code: Unsupervised Learning in Python, Chapter 1: Clustering for dataset exploration
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 23 (1). Exercise: Clustering 2D points. From the scatter pl...
2019-07-31 05:01:10 1202
[Original] Datacamp notes & code: Machine Learning with the Experts: School Budgets, Chapter 4: Learning from the experts
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (4). Exercise: Deciding what’s a word. Before you build ...
2019-07-31 04:56:56 457
[Original] Datacamp notes & code: Machine Learning with the Experts: School Budgets, Chapter 3: Improving your model
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (3). Exercise: Instantiate pipeline. In order to make yo...
2019-07-31 04:53:50 586
[Repost] The difference between Pipeline and make_pipeline in sklearn
Reposted from: https://stackoverflow.com/questions/40708077/what-is-the-difference-between-pipeline-and-make-pipeline-in-scikit The only difference is that make_pipeline generates names for steps automatically....
2019-07-30 04:24:57 5848
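A short sketch of the naming difference described in the repost above, assuming scikit-learn is installed. The step names `"scaler"` and `"clf"` and the choice of `StandardScaler` with `LogisticRegression` are illustrative, not taken from the original post.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Pipeline: each step is an explicit (name, estimator) pair.
explicit = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression()),
])

# make_pipeline: pass only the estimators; names are derived
# from the lowercased class names.
auto = make_pipeline(StandardScaler(), LogisticRegression())

explicit_names = [name for name, _ in explicit.steps]
auto_names = [name for name, _ in auto.steps]

print(explicit_names)  # the names chosen by hand
print(auto_names)      # the automatically generated names
```

The step names matter mostly when addressing hyperparameters, for example `clf__C` versus `logisticregression__C` in a grid search.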
[Original] Datacamp notes & code: Machine Learning with the Experts: School Budgets, Chapter 2: Creating a simple first model
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (2). Exercise: Setting up a train-test split in scikit-...
2019-07-27 01:50:07 522
[Original] Datacamp notes & code: Machine Learning with the Experts: School Budgets, Chapter 1: Exploring the raw data
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 22 (1). Exercise: Loading the data. Now it’s time to check ...
2019-07-26 15:25:31 1035
[Original] Datacamp notes & code: Supervised Learning with scikit-learn, Chapter 4: Preprocessing and pipelines
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 21 (4). Exercise. Author: JinnyR. Source: CSDN. Original post: https://blog.csdn.n...
2019-07-25 17:54:26 1349
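[Original] Which machine learning algorithms need feature scaling
Algorithms that compute with distances or similarities (for example, the scalar product), such as KNN and SVM, usually need feature scaling. Algorithms based on probabilistic graphical models, such as Fisher LDA and Naive Bayes, as well as decision trees and tree-based ensemble methods (RF, XGB), are not affected by feature scaling. Reference: https://stats.stackexchang...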
2019-07-25 16:10:33 738
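A small sketch of why the two groups in the post above behave differently, using only the Python standard library. The two sample points, the query, and the helpers `nearest`, `rescale`, and `rank_order` are hypothetical. Rescaling one feature changes which point is the nearest neighbor under Euclidean distance, but it does not change the ordering of values within that feature, which is all a threshold split in a decision tree depends on.

```python
import math


def nearest(points, query):
    """Return the index of the point closest to query (Euclidean distance)."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], query))


def rescale(point, factor=0.001):
    """Multiply the second feature by a positive constant (a change of units)."""
    return (point[0], point[1] * factor)


def rank_order(values):
    """Return the indices that sort values ascending."""
    return sorted(range(len(values)), key=values.__getitem__)


# Hypothetical data: the second feature is on a much larger scale.
points = [(0.0, 100.0), (1.0, 0.0)]
query = (0.0, 0.0)

# Distance-based view: the nearest neighbor flips after rescaling.
nearest_raw = nearest(points, query)
nearest_scaled = nearest([rescale(p) for p in points], rescale(query))
print("nearest before/after:", nearest_raw, nearest_scaled)

# Tree-like view: the ordering within the feature is unchanged, so any
# threshold split separates the same samples before and after rescaling.
order_raw = rank_order([p[1] for p in points])
order_scaled = rank_order([rescale(p)[1] for p in points])
print("ordering before/after:", order_raw, order_scaled)
```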
[Original] Datacamp notes & code: Supervised Learning with scikit-learn, Chapter 3: Fine-tuning your model
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 21 (3). Exercise: Metrics for classification. In Chapter 1,...
2019-07-24 22:34:17 1606
[Original] The difference between SciPy randint and NumPy randint
scipy.stats.randint: https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.randint.html numpy.random.randint: https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.randin...
2019-07-24 21:50:58 1338
[Original] Datacamp notes & code: Supervised Learning with scikit-learn, Chapter 2: Regression
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Datacamp track: Data Scientist with Python - Course 21 (2). Exercise: Importing data for supervised learning. I...
2019-07-23 23:58:52 2459
[Original] Datacamp notes & code: Supervised Learning with scikit-learn, Chapter 1: Classification
More raw data files and Jupyter Notebooks on GitHub: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python Exercise: k-Nearest Neighbors: Fit. Having explored the Congressional voting records dataset, it is time now t...
2019-07-23 17:38:08 1949