1.4.3 Unsupervised Learning

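A classic unsupervised learning task is to find the "best" representation of the data. "Best" can mean different things, but in general it refers to a representation that preserves as much information about x as possible while obeying some penalty or constraint that keeps the representation simpler or more accessible than x itself.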

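There are many ways to define a simpler representation. Three of the most common are low-dimensional representations, sparse representations, and independent representations.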

Principal Component Analysis (PCA)

As covered in the linear algebra chapter, PCA is a dimensionality-reduction technique.
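As a rough illustration of the low-dimensional idea, here is a minimal NumPy sketch of PCA via the singular value decomposition. The function name `pca_project` and the toy data are assumptions made for this example and do not come from the original text.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components."""
    # Center the data so that the components describe variance around the mean
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    return X_centered @ components.T, components

# Hypothetical toy data: 3-D points that vary almost entirely along one direction
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X_toy = t @ np.array([[2.0, 1.0, 0.5]]) + 0.01 * rng.normal(size=(200, 3))

Z, components = pca_project(X_toy, n_components=1)
print(Z.shape)           # one coordinate per example
print(components.shape)  # one direction in the original 3-D space
```

Here each 3-D point is summarized by a single coordinate along the direction of greatest variance, which is the sense in which the representation is "simpler" while keeping most of the information.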

k-means clustering (k-means)

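The k-means clustering algorithm divides the training set into k distinct clusters of examples that lie close to one another. The algorithm can therefore be seen as providing a k-dimensional one-hot code vector h that represents an input x: if x belongs to cluster i, then h_i = 1 and every other entry of h is 0.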
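To make the one-hot code concrete, the following small sketch turns a list of cluster indices into the corresponding vectors h. The helper name `one_hot_codes` and the sample indices are assumptions chosen for illustration.

```python
import numpy as np

def one_hot_codes(cluster_ids, k):
    """Return an (n, k) array whose row j is the one-hot code of cluster_ids[j]."""
    cluster_ids = np.asarray(cluster_ids)
    h = np.zeros((cluster_ids.size, k), dtype=int)
    # Set h_i = 1 for the cluster i that each example belongs to
    h[np.arange(cluster_ids.size), cluster_ids] = 1
    return h

# Four hypothetical examples assigned to clusters 0, 2, 1 and 2, with k = 3
codes = one_hot_codes([0, 2, 1, 2], k=3)
print(codes)
```

Every row contains exactly one 1, so the code records only which cluster an example falls in and nothing about its position inside that cluster.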

The k-means algorithm starts with k distinct centroids and then alternates between two steps until convergence:

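  1. Assign each training example to cluster i, where i is the index of the nearest centroid.
  2. Update each centroid to the mean of all training examples x assigned to cluster i.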
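
The two steps above can be sketched directly in NumPy. This is a minimal illustration rather than the implementation used in the rest of this post. It rests on a few assumptions: the initial centroids are k randomly chosen training examples, the function name `kmeans` and the toy data are made up, and an empty cluster simply keeps its previous centroid.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: alternate assignment and centroid update until stable."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct training examples chosen at random
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Step 1: assign each example to the cluster of its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: move each centroid to the mean of the examples assigned to it
        new_centroids = centroids.copy()
        for i in range(k):
            members = X[labels == i]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[i] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving: converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: two well-separated blobs of 20 points each
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
                    rng.normal(5.0, 0.1, size=(20, 2))])
labels, centroids = kmeans(X_demo, k=2)
print(labels)
print(centroids)
```

On data this clearly separated, the loop stops after a handful of iterations. The listing that follows instead applies TensorFlow's k-means estimator to the Iris data.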
import numpy as np
from sklearn import datasets
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import tensorflow as tf

def loadData():
    iris = datasets.load_iris()
    X=iris.data
    y=iris.target
    return X,y

def kmeansCluster(X,numClusters):
    get_inputs=lambda: tf.train.limit_epochs(tf.convert_to_tensor(X, dtype=tf.float32), num_epochs=1)
    # Build the k-means estimator (tf.contrib requires TensorFlow 1.x)
    cluster = tf.contrib.factorization.KMeansClustering(num_clusters=numClusters,
                                                      initial_clusters=tf.contrib.factorization.KMeansClustering.KMEANS_PLUS_PLUS_INIT)
    cluster.train(input_fn=get_inputs, steps=2000)  # train
    y_pred=cluster.predict_cluster_index(input_fn=get_inputs)  # predict
    y_pred=np.asarray(list(y_pred))
    return y_pred

def plotFigure(fignum,title, X,y):
    fig = plt.figure(fignum, figsize=(8,6))
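    # Create the 3D axes through the figure so they are attached to it
    # on both older and newer matplotlib releases
    ax = fig.add_axes([0, 0, .95, 1], projection='3d')
    ax.view_init(elev=48, azim=134)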
    ax.scatter(X[:, 3], X[:, 0], X[:, 2],
               c=y.astype(float), edgecolor='k')
    ax.xaxis.set_ticklabels([])
    ax.yaxis.set_ticklabels([])
    ax.zaxis.set_ticklabels([])
    ax.set_xlabel('Petal width')
    ax.set_ylabel('Sepal length')
    ax.set_zlabel('Petal length')
    ax.set_title(title)
    ax.dist = 10
    fig.show()

if __name__ == '__main__':
    X,y = loadData()
    y_pred = kmeansCluster(X,3)
    plotFigure(1,"3 clusters",X,y_pred)
    plotFigure(2,"Ground Truth",X,y)

INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\egbert\AppData\Local\Temp\tmp27idx79j
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\egbert\\AppData\\Local\\Temp\\tmp27idx79j', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001F520FD3BA8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\egbert\AppData\Local\Temp\tmp27idx79j\model.ckpt.
INFO:tensorflow:Saving checkpoints for 1 into C:\Users\egbert\AppData\Local\Temp\tmp27idx79j\model.ckpt.
INFO:tensorflow:Loss for final step: None.
WARNING:tensorflow:Input graph does not use tf.data.Dataset or contain a QueueRunner. That means predict yields forever. This is probably a mistake.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\egbert\AppData\Local\Temp\tmp27idx79j\model.ckpt-1
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


D:\Anaconda\lib\site-packages\matplotlib\figure.py:459: UserWarning: matplotlib is currently using a non-GUI backend, so cannot show the figure
  "matplotlib is currently using a non-GUI backend, "

[Figures: 3D scatter plots of the Iris data, titled "3 clusters" and "Ground Truth"]
