1. Data processing
1.1 Importing the data
I'm using a diabetes.csv dataset downloaded from the internet. First, import the packages and load the data:
import torch
import numpy as np
from torch import nn
from torch.autograd import Variable
import torch.nn.functional as F
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
%matplotlib inline
data = pd.read_csv('diabetes.csv')
Let's see what the data looks like:
As you can see, the 'Outcome' column holds each sample's class label.
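A quick way to inspect a table like this is with `head()`, `shape`, and `value_counts()`. The sketch below uses a tiny made-up DataFrame in place of diabetes.csv: only the `Outcome` column name comes from the text above, while the `Glucose` and `BMI` columns and all values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for diabetes.csv; only 'Outcome' is named in the text.
demo = pd.DataFrame({
    "Glucose": [148, 85, 183, 89],
    "BMI": [33.6, 26.6, 23.3, 28.1],
    "Outcome": [1, 0, 1, 0],
})

print(demo.head())   # first few rows
print(demo.shape)    # (number of rows, number of columns)

# Count how many samples fall into each class.
class_counts = demo["Outcome"].value_counts()
print(class_counts)
```

Checking the class counts early is useful because a heavily imbalanced label column changes how accuracy should be interpreted later on.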
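1.2 Data preprocessing
We separate the features from the labels, split the data into training and validation sets, and then standardize the features: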
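data1 = data.copy()
y = data1.loc[:, ['Outcome']]  # labels
del data1['Outcome']
x = data1  # features
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=2018)  # 70/30 train/validation split, random seed 2018
ss = StandardScaler()
x_train = ss.fit_transform(x_train)  # fit the scaler on the training set and standardize it
x_test = ss.transform(x_test)  # standardize the validation set with the training-set statistics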
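When standardizing, the usual practice is to compute the mean and standard deviation on the training split only and reuse them for the held-out split, so that no information from the held-out data leaks into preprocessing. Here is a minimal sketch on synthetic data; the array names and the random values are assumptions made for illustration and are not from the diabetes dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the training and validation features.
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
test = rng.normal(loc=5.0, scale=2.0, size=(40, 3))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # reuses the training statistics

# Manual check: the validation data is scaled with the *training* mean and std.
expected = (test - train.mean(axis=0)) / train.std(axis=0)
print(np.allclose(test_scaled, expected))
```

With this approach the scaled validation set will generally not have exactly zero mean and unit variance, which is expected: it is measured on the training set's scale.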
1.3 Converting the data to tensors
x_train_tensor=torch.from_numpy(x_train)
x_test_tensor=torch.from_numpy(x_test)
y_train_numpy=np.array(y_train)
y_train_tensor=torch.from_numpy(y_train_numpy)
y_test_numpy=np.array(y_tes