The table below contains renal cell carcinoma specimen data (source: 《卫生统计学》, 4th edition, chapter 11):
No. | X1 | X2 | X3 | X4 | X5 | y |
1 | 59 | 2 | 43.4 | 2 | 1 | 0 |
2 | 36 | 1 | 57.2 | 1 | 1 | 0 |
3 | 61 | 2 | 190 | 2 | 1 | 0 |
4 | 58 | 3 | 128 | 4 | 3 | 1 |
5 | 55 | 3 | 80 | 4 | 3 | 1 |
6 | 61 | 1 | 94.4 | 4 | 2 | 0 |
7 | 38 | 1 | 76 | 1 | 1 | 0 |
8 | 42 | 1 | 240 | 3 | 2 | 0 |
9 | 50 | 1 | 74 | 1 | 1 | 0 |
10 | 58 | 3 | 68.6 | 2 | 2 | 0 |
11 | 68 | 3 | 132.8 | 4 | 2 | 0 |
12 | 25 | 2 | 94.6 | 4 | 3 | 1 |
13 | 52 | 1 | 56 | 1 | 1 | 0 |
14 | 32 | 1 | 47.8 | 2 | 1 | 0 |
15 | 36 | 3 | 31.6 | 3 | 1 | 1 |
16 | 42 | 1 | 66.2 | 2 | 1 | 0 |
17 | 14 | 3 | 138.6 | 3 | 3 | 1 |
18 | 32 | 1 | 114 | 2 | 3 | 0 |
19 | 35 | 1 | 40.2 | 2 | 1 | 0 |
20 | 70 | 3 | 177.2 | 4 | 3 | 1 |
21 | 65 | 2 | 51.6 | 4 | 4 | 1 |
22 | 45 | 2 | 124 | 2 | 4 | 0 |
23 | 68 | 3 | 127.2 | 3 | 3 | 1 |
24 | 31 | 2 | 124.8 | 2 | 3 | 0 |
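Data description:
y: metastasis status of the renal cell carcinoma (y = 1 with metastasis; y = 0 without metastasis);
X1: patient's age at diagnosis (years);
X2: vascular endothelial growth factor (VEGF) in the renal cell carcinoma, with positive expression graded on 3 levels from low to high;
X3: microvessel count (MVC) in the renal cell carcinoma tissue;
X4: nuclear histological grade of the renal carcinoma cells, 4 grades from low to high;
X5: stage of the renal cell carcinoma, 4 stages from low to high.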
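As a small sketch of how this coding translates into arrays, the snippet below stores the table as a NumPy array and prints the class balance and the range of each predictor. The names `table`, `X` and `y` are chosen here for illustration only. The ranges are worth a look before training, because X1 and X3 span tens to hundreds while X2, X4 and X5 never exceed 4.

```python
import numpy as np

# Rows copied from the table: X1 (age), X2 (VEGF), X3 (MVC), X4 (grade), X5 (stage), y
table = np.array([
    [59, 2, 43.4, 2, 1, 0], [36, 1, 57.2, 1, 1, 0], [61, 2, 190, 2, 1, 0],
    [58, 3, 128, 4, 3, 1], [55, 3, 80, 4, 3, 1], [61, 1, 94.4, 4, 2, 0],
    [38, 1, 76, 1, 1, 0], [42, 1, 240, 3, 2, 0], [50, 1, 74, 1, 1, 0],
    [58, 3, 68.6, 2, 2, 0], [68, 3, 132.8, 4, 2, 0], [25, 2, 94.6, 4, 3, 1],
    [52, 1, 56, 1, 1, 0], [32, 1, 47.8, 2, 1, 0], [36, 3, 31.6, 3, 1, 1],
    [42, 1, 66.2, 2, 1, 0], [14, 3, 138.6, 3, 3, 1], [32, 1, 114, 2, 3, 0],
    [35, 1, 40.2, 2, 1, 0], [70, 3, 177.2, 4, 3, 1], [65, 2, 51.6, 4, 4, 1],
    [45, 2, 124, 2, 4, 0], [68, 3, 127.2, 3, 3, 1], [31, 2, 124.8, 2, 3, 0],
])

# Split into the five predictors and the binary outcome
X, y = table[:, :5], table[:, 5]

# Class balance: how many of the patients had metastasis
print("patients:", len(y), "with metastasis:", int(y.sum()))

# Range and mean of each predictor
for name, col in zip(["X1", "X2", "X3", "X4", "X5"], X.T):
    print(f"{name}: min={col.min():g} max={col.max():g} mean={col.mean():.1f}")
```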
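Task: use a logistic regression model to predict whether the renal carcinoma metastasizes, and make up at least two data points to run predictions on.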
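For reference, the model the task asks for can be written out as below. The symbols are chosen here for illustration: $w_j$ are the five coefficients, $b$ is a single intercept shared by all patients, $x_{ij}$ is predictor $j$ of patient $i$, and $t$ is a decision threshold that the task leaves open.

```latex
\[
z_i = b + \sum_{j=1}^{5} w_j x_{ij}, \qquad
p_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}}
\]
\[
L(w, b) = -\sum_{i=1}^{24} \bigl[\, y_i \log p_i + (1 - y_i) \log (1 - p_i) \,\bigr]
\]
\[
\hat{y}_i =
\begin{cases}
1 & \text{if } p_i > t \\
0 & \text{otherwise}
\end{cases}
\]
```

Training means choosing $w$ and $b$ to minimize the total cross-entropy $L$. A new patient is then classified by computing $p$ from the same $w$ and $b$.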
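import numpy as np
import tensorflow as tf  # written against the TensorFlow 1.x API (tf.Session, tf.train)

# Training features: one row per patient, columns X1..X5 from the table
x_data = np.array([
    [59, 2, 43.4, 2, 1], [36, 1, 57.2, 1, 1], [61, 2, 190, 2, 1], [58, 3, 128, 4, 3],
    [55, 3, 80, 4, 3], [61, 1, 94.4, 4, 2], [38, 1, 76, 1, 1], [42, 1, 240, 3, 2],
    [50, 1, 74, 1, 1], [58, 3, 68.6, 2, 2], [68, 3, 132.8, 4, 2], [25, 2, 94.6, 4, 3],
    [52, 1, 56, 1, 1], [32, 1, 47.8, 2, 1], [36, 3, 31.6, 3, 1], [42, 1, 66.2, 2, 1],
    [14, 3, 138.6, 3, 3], [32, 1, 114, 2, 3], [35, 1, 40.2, 2, 1], [70, 3, 177.2, 4, 3],
    [65, 2, 51.6, 4, 4], [45, 2, 124, 2, 4], [68, 3, 127.2, 3, 3], [31, 2, 124.8, 2, 3],
], dtype=np.float32)
# Training labels as a column vector of shape (24, 1)
y_data = np.array([[0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1,
                    0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]], dtype=np.float32).T

# Test data (these three rows repeat rows 4, 5 and 2 of the table)
yu_x_data = np.array([[58, 3, 128, 4, 3], [55, 3, 80, 4, 3], [36, 1, 57.2, 1, 1]],
                     dtype=np.float32)
# print("x_data: ", x_data)
# print("y_data: ", y_data)

w = tf.Variable(tf.zeros([5, 1]))
# A single bias shared by every row. A per-row bias of shape [24, 1] would let the
# model memorize each training label and could not be reused for the test rows.
b = tf.Variable(tf.zeros([1]))

# Linear scores (logits) for the training rows and the test rows
y1 = tf.matmul(x_data, w) + b
yu_y1 = tf.matmul(yu_x_data, w) + b

# Map the scores into the (0, 1) interval
y = tf.nn.sigmoid(y1)
yu_y = tf.nn.sigmoid(yu_y1)

# Total cross-entropy, computed from the logits so that log(0) cannot occur
loss = tf.reduce_sum(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_data, logits=y1))
train = tf.train.AdamOptimizer(0.02).minimize(loss)

# Label a case as metastatic only when the predicted probability exceeds 0.9
# (a stricter cutoff than the usual 0.5)
rs = tf.cast(y > 0.9, dtype=tf.int8)
yu_rs = tf.cast(yu_y > 0.9, dtype=tf.int8)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        sess.run(train)
        if step % 1000 == 0:
            print('loss:', sess.run([loss]))
    # print('w:', sess.run(w))
    # print('b:', sess.run(b))
    print('Predicted values:', sess.run(yu_rs))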
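Final output as recorded from the author's run (loss printed every 1000 steps, followed by the predictions for the three test rows):
loss: [27.408615]
loss: [0.10465329]
loss: [0.03114378]
loss: [0.013801137]
loss: [0.0071040075]
loss: [0.0039306525]
loss: [0.0022601355]
loss: [0.0013286448]
loss: [0.0007907951]
loss: [0.00047440067]
Predicted values: [[1] [1] [0]]
The three test rows repeat rows 4, 5 and 2 of the table, whose recorded labels are 1, 1 and 0, so the predictions agree with those labels.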
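The task also asks for predictions on made-up data, while the three test rows are copies of table rows. The sketch below is one possible way to do that, not the author's method. It swaps TensorFlow for plain NumPy gradient descent, and it assumes standardized features, a small L2 penalty, a learning rate of 0.1, 5000 iterations and a 0.5 cutoff. The helper `predict_proba` and the two patients in `new_patients` are hypothetical and exist only for illustration.

```python
import numpy as np

# Rows copied from the table: X1..X5 followed by y
table = np.array([
    [59, 2, 43.4, 2, 1, 0], [36, 1, 57.2, 1, 1, 0], [61, 2, 190, 2, 1, 0],
    [58, 3, 128, 4, 3, 1], [55, 3, 80, 4, 3, 1], [61, 1, 94.4, 4, 2, 0],
    [38, 1, 76, 1, 1, 0], [42, 1, 240, 3, 2, 0], [50, 1, 74, 1, 1, 0],
    [58, 3, 68.6, 2, 2, 0], [68, 3, 132.8, 4, 2, 0], [25, 2, 94.6, 4, 3, 1],
    [52, 1, 56, 1, 1, 0], [32, 1, 47.8, 2, 1, 0], [36, 3, 31.6, 3, 1, 1],
    [42, 1, 66.2, 2, 1, 0], [14, 3, 138.6, 3, 3, 1], [32, 1, 114, 2, 3, 0],
    [35, 1, 40.2, 2, 1, 0], [70, 3, 177.2, 4, 3, 1], [65, 2, 51.6, 4, 4, 1],
    [45, 2, 124, 2, 4, 0], [68, 3, 127.2, 3, 3, 1], [31, 2, 124.8, 2, 3, 0],
])
X, y = table[:, :5], table[:, 5]

# Standardize each feature so that age and MVC do not dominate the gradient
mean, std = X.mean(axis=0), X.std(axis=0)
Xs = (X - mean) / std


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# One weight per feature and a single shared intercept
w = np.zeros(X.shape[1])
b = 0.0
lr, l2 = 0.1, 0.01  # assumed learning rate and L2 penalty strength

# Batch gradient descent on the mean cross-entropy plus the L2 penalty
for _ in range(5000):
    p = sigmoid(Xs @ w + b)
    w -= lr * (Xs.T @ (p - y) / len(y) + l2 * w)
    b -= lr * np.mean(p - y)


def predict_proba(rows):
    """Probability of metastasis for rows given in the original units."""
    rows = (np.asarray(rows, dtype=float) - mean) / std
    return sigmoid(rows @ w + b)


# Two made-up patients (hypothetical values, not taken from the source)
new_patients = [[45, 1, 95.0, 1, 1], [45, 3, 95.0, 4, 4]]
proba = predict_proba(new_patients)
print("probabilities:", proba)
print("predicted labels:", (proba > 0.5).astype(int))
```

The two made-up patients share the same age and MVC and differ only in VEGF level, nuclear grade and stage, so any gap between their predicted probabilities comes from those three predictors alone.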