Machine Learning
Adm1rat1on
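Taiji gives rise to the Two Modes, the Two Modes give rise to the Four Images, and the Four Images give rise to the Eight Trigrams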
-
Reading data for building word vectors
# Read the data
def read_data(path_1, path_2, path_3):
    with open(path_1, 'r', encoding='utf-8') as f1, \
            open(path_2, 'r', encoding='utf-8') as f2, \
            open(path_3, 'r', encoding='utf...
Original 2019-11-13 15:29:10 · 321 views · 0 comments -
Reference: preprocessing data for word vectors
Preprocessing
def parse_data(path):
    df = pd.read_csv(path, encoding='utf-8')
    data_x = df.Question.str.cat(df.Dialogue)
    data_y = []
    if 'Report' in df.columns:
        data_y = df.Report
    retur...
Original 2019-11-13 14:51:54 · 293 views · 0 comments -
featuretools example
### Create the data frames
import pandas as pd
# First dataset contains the basic information for databases.
databases_df = pd.DataFrame({"database_id": [2234, 1765, 8796, 2237, 3398],
                             "creation_date": ["2018-02-01",...
Original 2019-06-13 14:06:33 · 717 views · 0 comments -
python-smac-svm example
import logging
import numpy as np
from sklearn import svm, datasets
from sklearn.model_selection import cross_val_score
# Import ConfigSpace and different types of parameters
from smac.configspace im...
Original 2019-06-19 08:35:57 · 605 views · 0 comments -
python-autosklearn-MLPClassifier
from ConfigSpace.configuration_space import ConfigurationSpace
from ConfigSpace.hyperparameters import CategoricalHyperparameter, \
    UniformIntegerHyperparameter, UniformFloatHyperparameter
import...
Original 2019-06-19 15:36:20 · 463 views · 0 comments -
python-autosklearn-RidgeRegression
from ConfigSpace.configuration_space import ConfigurationSpace
from ConfigSpace.hyperparameters import UniformFloatHyperparameter, \
    UniformIntegerHyperparameter, CategoricalHyperparameter
import...
Original 2019-06-19 15:41:45 · 239 views · 0 comments -
python-autosklearn-LDA
from ConfigSpace.configuration_space import ConfigurationSpace
from ConfigSpace.hyperparameters import UniformFloatHyperparameter, \
    UniformIntegerHyperparameter, CategoricalHyperparameter
import...
Original 2019-06-19 15:45:52 · 243 views · 0 comments -
python-featuretools: generating new features
import featuretools as ft
# Define the EntitySet
es = ft.EntitySet(id = 'clients')
# Add an entity to the EntitySet
es = es.entity_from_dataframe(entity_id = 'app', dataframe = app, index = 'SK_ID_CURR', variable_types=a...
Original 2019-06-19 16:55:45 · 966 views · 1 comment -
python-featuretools-feature-selection
import featuretools as ft
# Define the EntitySet
es = ft.EntitySet(id = 'engines')
es = es.entity_from_dataframe(dataframe = train, entity_id = 'obs', ...
Original 2019-06-19 17:16:02 · 424 views · 0 comments -
Automated feature engineering in Python with Featuretools
import featuretools as ft
from featuretools.selection import remove_low_information_features
import pandas as pd
import numpy as np
filename = 'data/ds76_tx_All_Data_74_2018_0912_070949.txt'
def data...
Original 2019-06-06 14:51:27 · 810 views · 0 comments -
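Language translation project, part 2
Implementing the preprocessing function
Text to word ids
First, convert the text to numbers. In the function text_to_ids(), convert the words in source_text and target_text to ids. Note: the <EOS> word id must be appended to the end of every sentence in target_text, so that the model can predict where a sentence should end. Get the <EOS> word id with the following code: target_vocab_to_int['<EOS...
Original 2019-06-06 14:34:10 · 265 views · 0 comments -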
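The preprocessing step described in this entry can be sketched roughly as follows. This is a minimal illustration, not the post's own implementation: the parameter name source_vocab_to_int and the toy vocabularies are assumptions, and sentences are assumed to be separated by newlines and words by whitespace.

```python
def text_to_ids(source_text, target_text, source_vocab_to_int, target_vocab_to_int):
    """Convert newline-separated sentences into lists of word ids.

    Every target sentence gets the <EOS> id appended so that a model
    can learn where a sentence should end.
    """
    eos_id = target_vocab_to_int['<EOS>']
    source_ids = [[source_vocab_to_int[word] for word in line.split()]
                  for line in source_text.split('\n')]
    target_ids = [[target_vocab_to_int[word] for word in line.split()] + [eos_id]
                  for line in target_text.split('\n')]
    return source_ids, target_ids


# Toy vocabularies, made up for this example only.
source_vocab_to_int = {'new': 0, 'jersey': 1, 'is': 2, 'cold': 3}
target_vocab_to_int = {'<EOS>': 0, 'new': 1, 'jersey': 2, 'est': 3, 'froid': 4}

source_ids, target_ids = text_to_ids(
    'new jersey is cold', 'new jersey est froid',
    source_vocab_to_int, target_vocab_to_int)
print(source_ids)  # [[0, 1, 2, 3]]
print(target_ids)  # [[1, 2, 3, 4, 0]]
```

Only the target side receives the <EOS> id; the source sentences are left unchanged.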
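Language translation project, part 1
Machine translation with a neural network. Using a dataset of English and French sentences, train a sequence-to-sequence model that can translate new English sentences into French.
Getting the data
import os
import pickle
import copy
import numpy as np
def load_data(path):
    """
    load Dataset from File...
Original 2019-03-04 10:26:27 · 200 views · 0 comments -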
Predicting stock data with an LSTM: model and parameters
class NetConfig():
    def __init__(self):
        self.rnn_unit = 10
        self.input_size = 7
        self.output_size = 1
        self.lr = 0.0006
        self.time_step = 20
        self.batch_s...
Original 2019-01-25 17:08:07 · 1904 views · 1 comment -
Predicting stock data with an RNN: model and parameters
# Define the model parameters
class NetConfig():
    def __init__(self):
        self.rnn_unit = 10
        self.input_size = 7
        self.output_size = 1
        self.lr = 0.0006
        self.time_step = 20
        self.batch_size = 80
        self.weights = { ...
Original 2019-01-25 17:03:08 · 2402 views · 0 comments -
Predicting stock data with an LSTM: data preprocessing
# Import the required modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
# Get the training set
def get_train_data(data,batch_size = 60 ,time_step = 20,train_begin = 0,train_end = 58...
Original 2019-01-24 15:23:18 · 3208 views · 0 comments -
A CatBoost classifier in Python, with an explanation of some parameters
xxx
Original 2019-01-04 16:18:24 · 6779 views · 2 comments -
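LightGBM in Python, with an explanation of the important parameters
LightGBM parameters explained
max_depth : maximum depth of a tree
num_leaves : maximum number of leaf nodes in a tree
objective : objective function
min_data_in_leaf : minimum amount of data in one leaf
learning_rate : learning rate
feature_fraction : fraction of features sampled at random
bagging_fraction : randomly selects part of the data without resampling
bagging_freq...
Original 2019-01-04 16:11:45 · 8120 views · 0 comments -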
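As a rough illustration of how the parameters listed in this entry fit together, they can be collected into a plain dictionary. The sketch below is an assumption-laden example, not the post's code: every value is illustrative, and check_params is a hypothetical helper written for this example, not part of LightGBM.

```python
# Illustrative values only; none of them come from the original post.
params = {
    "objective": "binary",      # objective function (binary classification here)
    "max_depth": 6,             # maximum depth of a tree
    "num_leaves": 31,           # maximum number of leaves in a tree
    "min_data_in_leaf": 20,     # minimum amount of data in one leaf
    "learning_rate": 0.05,      # shrinkage applied to each tree
    "feature_fraction": 0.8,    # fraction of features sampled per tree
    "bagging_fraction": 0.8,    # fraction of rows sampled without resampling
    "bagging_freq": 5,          # perform bagging every 5 iterations
}


def check_params(p):
    """Return warnings for a few common inconsistencies (hypothetical helper)."""
    warnings = []
    # A binary tree limited to depth d cannot have more than 2 ** d leaves.
    if p["max_depth"] > 0 and p["num_leaves"] > 2 ** p["max_depth"]:
        warnings.append("num_leaves exceeds 2 ** max_depth")
    # Row subsampling is only active when bagging_freq is positive.
    if p["bagging_fraction"] < 1.0 and p["bagging_freq"] <= 0:
        warnings.append("bagging_fraction has no effect unless bagging_freq > 0")
    for key in ("feature_fraction", "bagging_fraction"):
        if not 0.0 < p[key] <= 1.0:
            warnings.append(key + " must be in (0, 1]")
    return warnings


print(check_params(params))
```

With LightGBM installed, a dictionary of this shape is what gets passed as the first argument to lightgbm.train.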
Simple StackingCV classification in Python
from sklearn import datasets
iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.nei...
Original 2019-01-04 14:59:12 · 1128 views · 0 comments -
Stacking classification and GridSearch
Import the required modules
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClas...
Original 2019-01-04 14:55:13 · 669 views · 0 comments -
python-featuretools-advanced-featuretools
from sklearn.cluster import KMeans
import featuretools as ft
import featuretools.variable_types as vtypes
from featuretools.primitives import make_agg_primitive
from tsfresh.feature_extraction.feature...
Original 2019-06-19 17:33:33 · 362 views · 1 comment -
Classification with TPOT in Python
Basic use of TPOT for classification
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier
#### Set the random seed
np.random.seed(10)
#### Load the data
ir...
Original 2019-06-17 10:57:43 · 659 views · 0 comments -
python-smac: automated function optimization example
import logging
from smac.facade.func_facade import fmin_smac
def rosenbrock_2d(x):
    """ The 2 dimensional Rosenbrock function as a toy model
    The Rosenbrock function is well known in the optim...
Original 2019-06-18 19:38:38 · 800 views · 0 comments -
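For reference, the two-dimensional Rosenbrock function named in this entry has the standard form f(x1, x2) = 100 (x2 - x1^2)^2 + (1 - x1)^2, with its global minimum of 0 at (1, 1). The sketch below writes it in plain Python; it assumes x is a sequence of two numbers and makes no claim to match the truncated body of the post's function.

```python
def rosenbrock_2d(x):
    """Standard 2-D Rosenbrock function; x is assumed to be a pair (x1, x2)."""
    x1, x2 = x
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2


print(rosenbrock_2d([1.0, 1.0]))   # 0.0, the global minimum
print(rosenbrock_2d([-3.0, 4.0]))  # 2516.0
```

The narrow curved valley around the minimum is what makes this function a common toy target for optimizers such as the one the post imports.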
python-smac-randomforest: automated hyperparameter tuning
import logging
import os
import inspect
import numpy as np
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestRegresso...
Original 2019-06-18 16:55:02 · 743 views · 3 comments -
python-autosklearn-randomsearch
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
from smac.facade.roar_facade import ROAR
from smac.scenario.scenario import Scenario
import autosklearn.classification...
Original 2019-06-18 15:52:28 · 535 views · 0 comments -
python-autosklearn-metric
import numpy as np
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.classification
import autosklearn.metrics
def accuracy(solution, prediction): ...
Original 2019-06-18 15:08:17 · 241 views · 0 comments -
python-autosklearn-openml: handling continuous and categorical data
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.classification
import openml
def main():
    openml.config.apikey = '610344db6388d9ba34f6db45a3cf71de...
Original 2019-06-18 14:52:25 · 418 views · 0 comments -
python-autosklearn-regression
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.regression
def main():
    X, y = sklearn.datasets.load_boston(return_X_y=True)
    feature_types = ([...
Original 2019-06-18 11:18:58 · 390 views · 0 comments -
python-autosklearn-sequence-usage
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.classification
def main():
    X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
    X_trai...
Original 2019-06-18 11:04:34 · 146 views · 0 comments -
python-autosklearn: multiprocessing
import multiprocessing
import shutil
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
from autosklearn.metrics import accuracy
from autosklearn.classification import AutoS...
Original 2019-06-18 10:55:17 · 737 views · 0 comments -
python-autosklearn: parallelism on a single machine
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.classification
def main():
    X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
    X_trai...
Original 2019-06-18 10:12:33 · 564 views · 0 comments -
python--autosklearn--crossvalidation
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.classification
def main():
    X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
    X_trai...
Original 2019-06-18 09:57:18 · 115 views · 0 comments -
python--autosklearn--holdout
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
import autosklearn.classification
def main():
    X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
    X_trai...
Original 2019-06-18 09:35:54 · 372 views · 0 comments -
autosklearn on the Titanic data
Titanic classification with autosklearn
import autosklearn.classification as autoskcl
import pandas as pd
import numpy as np
import sklearn as sk
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.mode...
Original 2019-06-17 14:22:08 · 240 views · 0 comments -
python_h2o: gradient boosted trees
from h2o.estimators.gbm import H2OGradientBoostingEstimator
# Read the data
data = h2o.import_file(path='data/allyears2k.csv')
# Split the data
train,test = data.split_frame([.9])
# Build x and y
myY = "IsDepDelayed"
myX = ["O...
Original 2019-06-25 10:57:05 · 635 views · 0 comments -
python--h2o: generalized linear models
%matplotlib inline
import random, os, sys
import h2o
import pandas
import pprint
import operator
import matplotlib
from h2o.estimators.glm import H2OGeneralizedLinearEstimator
from h2o.estimators.gbm ...
Original 2019-06-25 10:49:00 · 934 views · 0 comments -
A simple autosklearn implementation
Basic use of autosklearn
import autosklearn.classification
import sklearn.metrics
import sklearn.model_selection
import sklearn.datasets
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y...
Original 2019-06-17 14:12:57 · 286 views · 0 comments -
Automated hyperparameter tuning in Python with RapidML
A simple example of automated hyperparameter tuning with RapidML
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
import RapidML
housing = load_boston()
X_train, X_test, y_train, y_test = train_test_spl...
Original 2019-06-17 11:22:27 · 484 views · 0 comments -
Regression with TPOT in Python
Building a regression model with TPOT
import numpy as np
from tpot import TPOTRegressor
heart_data = np.load('data/heart_preproc.npz')
X_train = heart_data['X_train']
X_test = heart_data['X_test']
y_train = heart_data['y_train...
Original 2019-06-17 11:02:40 · 634 views · 0 comments -
Stacking in Python using probabilities as meta-features
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensem...
Original 2019-01-04 14:52:39 · 392 views · 0 comments -
Simple stacking classification in Python
Import the required modules
from sklearn import datasets
iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from skl...
Original 2019-01-04 14:38:37 · 1840 views · 0 comments