TensorFlow2.0教程-结构化数据分类
Tensorflow 2.0 教程持续更新 :https://blog.csdn.net/qq_31456593/article/details/88606284
完整tensorflow2.0教程代码请看tensorflow2.0:中文教程tensorflow2_tutorials_chinese(欢迎star)
入门教程:
TensorFlow 2.0 教程- Keras 快速入门
TensorFlow 2.0 教程-keras 函数api
TensorFlow 2.0 教程-使用keras训练模型
TensorFlow 2.0 教程-用keras构建自己的网络层
TensorFlow 2.0 教程-keras模型保存和序列化
本教程展示了如何对结构化数据进行分类(例如CSV中的表格数据)。我们使用Keras定义模型,并将csv中各列的特征转化为训练的输入。 本教程包含一下功能代码:
- 使用Pandas加载CSV文件。
- 构建一个输入的pipeline,使用tf.data批处理和打乱数据。
- 从CSV中的列映射到用于训练模型的输入要素。
- 使用Keras构建,训练和评估模型。
from __future__ import absolute_import, division, print_function
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
print(tf.__version__)
2.0.0-alpha0
1.数据集
我们将使用克利夫兰诊所心脏病基金会提供的一个小数据集。 CSV中有几百行。 每行描述一个患者,每列描述一个属性。 我们将使用此信息来预测患者是否患有心脏病,该疾病在该数据集中是二元分类任务。
Column Description Feature Type Data Type Age Age in years Numerical integer Sex (1 = male; 0 = female) Categorical integer CP Chest pain type (0, 1, 2, 3, 4) Categorical integer Trestbpd Resting blood pressure (in mm Hg on admission to the hospital) Numerical integer Chol Serum cholestoral in mg/dl Numerical integer FBS (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false) Categorical integer RestECG Resting electrocardiographic results (0, 1, 2) Categorical integer Thalach Maximum heart rate achieved Numerical integer Exang Exercise induced angina (1 = yes; 0 = no) Categorical integer Oldpeak ST depression induced by exercise relative to rest Numerical integer Slope The slope of the peak exercise ST segment Numerical float CA Number of major vessels (0-3) colored by flourosopy Numerical integer Thal 3 = normal; 6 = fixed defect; 7 = reversable defect Categorical string Target Diagnosis of heart disease (1 = true; 0 = false) Classification integer
2.准备数据
使用pandas读取数据
URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
dataframe = pd.read_csv(URL)
dataframe.head()
age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 63 | 1 | 1 | 145 | 233 | 1 | 2 | 150 | 0 | 2.3 | 3 | 0 | fixed | 0 |
1 | 67 | 1 | 4 | 160 | 286 | 0 | 2 | 108 | 1 | 1.5 | 2 | 3 | normal | 1 |
2 | 67 | 1 | 4 | 120 | 229 | 0 | 2 | 129 | 1 | 2.6 | 2 | 2 | reversible | 0 |
3 | 37 | 1 | 3 | 130 | 250 | 0 | 0 | 187 | 0 | 3.5 | 3 | 0 | normal | 0 |
4 | 41 | 0 | 2 | 130 | 204 | 0 | 2 | 172 | 0 | 1.4 | 1 | 0 | normal | 0 |
划分训练集验证集和测试集
train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train