(2) Semi-supervised Learning (李宏毅)

Introduction

Semi-supervised learning uses labeled data $\{(x^r,\hat{y}^r)\}^{R}_{r=1}$ plus unlabeled data $\{x^u\}^{R+U}_{u=R}$, where the unlabeled count $U$ is far larger than $R$.

Accordingly, the features of the test-set data can themselves be used for semi-supervised learning (their labels, of course, cannot); this is called transductive learning.
If the unlabeled data is not the test-set features, it is called inductive learning.
Reference: 李宏毅 2016 Machine Learning, Semi-supervised lecture

Outline

  1. Semi-supervised learning for a generic model
  2. Low-density separation assumption
  3. Smoothness assumption

Semi-supervised Learning for a Generic Model

The figure below shows the compute-and-update procedure in its general form:
[Figure: update procedure for a generic model]
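
The post gives no code for this step, but the figure's procedure for a generative model is EM-like: initialize the model from the labeled data, compute posteriors (soft labels) for the unlabeled data, re-estimate the parameters from both, and repeat until convergence. Below is a minimal NumPy sketch under assumptions of my own (two classes, a shared covariance matrix; the function name and hyperparameters are hypothetical):

import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def semi_supervised_gmm(x_l, y_l, x_u, n_iter=50):
    # EM-style loop for a two-class, shared-covariance Gaussian model
    # (a sketch; the model family is an assumption, not from the post).
    n_classes, d = 2, x_l.shape[1]
    # Initialize priors, means, covariance from the labeled data only.
    prior = np.array([(y_l == c).mean() for c in range(n_classes)])
    mu = np.stack([x_l[y_l == c].mean(axis=0) for c in range(n_classes)])
    cov = np.cov(x_l.T) + 1e-6 * np.eye(d)
    for _ in range(n_iter):
        # E-step: posterior P(C|x^u) of each unlabeled point = its soft label.
        log_joint = np.stack(
            [np.log(prior[c]) + multivariate_normal.logpdf(x_u, mu[c], cov)
             for c in range(n_classes)], axis=1)
        post = np.exp(log_joint - logsumexp(log_joint, axis=1, keepdims=True))
        # M-step: re-estimate priors and means from the hard labeled counts
        # plus the soft counts contributed by the unlabeled data.
        for c in range(n_classes):
            n_c = (y_l == c).sum() + post[:, c].sum()
            prior[c] = n_c / (len(x_l) + len(x_u))
            mu[c] = (x_l[y_l == c].sum(axis=0) + post[:, c] @ x_u) / n_c
    return prior, mu, cov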

Low-density Separation Assumption

The assumption: very little data sits at the class boundary, i.e., the boundary passes through a low-density region.
Self-training
The two label types differ as follows:
self-training uses a hard label (a sample belongs to exactly one class and contributes only to that class);
the generic model of the previous section uses soft labels (a sample contributes to several classes according to its posterior).
[Figure: hard vs. soft labels in self-training]
For a neural network, soft labels cannot drive any learning: if the target is the network's own soft output, the softmax cross-entropy gradient with respect to the logits is (prediction - target) = 0, so nothing updates. The labels must be made hard, and doing so is justified by the low-density separation assumption; a small numeric check follows.
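
A tiny PyTorch check of the zero-gradient claim (toy logits; all values made up for illustration):

import torch
import torch.nn.functional as F

# With softmax + cross-entropy, the gradient w.r.t. the logits is
# (prediction - target).
logits = torch.tensor([0.85, -0.85], requires_grad=True)
probs = torch.softmax(logits, dim=-1)

# Soft label = the network's own output: gradient is probs - probs = 0.
soft_target = probs.detach()
loss_soft = -(soft_target * F.log_softmax(logits, dim=-1)).sum()
loss_soft.backward()
print(logits.grad)   # ~[0, 0]: soft self-labels teach the network nothing

# Hard label = one-hot argmax: the gradient is nonzero and pushes the
# prediction further toward the chosen class, sharpening the boundary.
logits.grad = None
hard_target = F.one_hot(probs.argmax(), 2).float()
loss_hard = -(hard_target * F.log_softmax(logits, dim=-1)).sum()
loss_hard.backward()
print(logits.grad)   # nonzero: hard pseudo-labels do update the model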

Entropy Regularization
Keep the model as accurate as possible on the labeled data, while making the entropy of its predictions on the unlabeled data as small as possible.
[Figure: entropy-regularized loss]

The intuition: the predicted class probabilities should be as concentrated as possible, and the lower the entropy, the more concentrated they are.
[Figure: entropy of concentrated vs. spread-out distributions]
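
In the lecture, the entropy of a prediction on unlabeled data is $E(y^u)=-\sum_{m} y^u_m \ln y^u_m$, and the total loss becomes $L=\sum_{x^r}C(y^r,\hat{y}^r)+\lambda\sum_{x^u}E(y^u)$ (reconstructed here to match the loss notation used later in this post). A minimal PyTorch sketch; the function name and the default lam are my assumptions:

import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits_labeled, targets, logits_unlabeled, lam=0.1):
    # Supervised term: be as accurate as possible on the labeled data.
    ce = F.cross_entropy(logits_labeled, targets)
    # Entropy term: be as confident (low-entropy) as possible on unlabeled data.
    probs = torch.softmax(logits_unlabeled, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1).mean()
    return ce + lam * entropy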

Smoothness Assumption

  • The distribution of x is uneven: very dense in some regions, sparse in others.
  • If $x^1$ and $x^2$ are close within a high-density region (connected by a high-density path), then $\hat{y}^1$ and $\hat{y}^2$ are similar.
    [Figure: two points connected by a high-density path]
    Describing this with a graph:
  • build the graph from k-nearest neighbors, or
  • connect pairs whose similarity exceeds a threshold;
  • class membership then propagates transitively along connected components (see the graph-construction sketch after this list).
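
A minimal sketch of the graph construction (NumPy; k, gamma, and the RBF similarity are assumed choices, the post does not fix them):

import numpy as np

def build_knn_graph(X, k=5, gamma=1.0):
    # Symmetric k-nearest-neighbor graph with RBF edge weights
    # w_ij = exp(-gamma * ||x_i - x_j||^2).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.zeros_like(d2)
    for i in range(len(X)):
        nbrs = np.argsort(d2[i])[1:k + 1]      # k nearest neighbors, skipping self
        W[i, nbrs] = np.exp(-gamma * d2[i, nbrs])
    return np.maximum(W, W.T)                   # symmetrize the adjacency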

Defining smoothness

$S=\frac{1}{2}\sum_{i,j}{w_{i,j}(y^i-y^j)^2}$

The smaller $S$ is, the smoother the labeling. In matrix form this is $S=\mathbf{y}^{\top}L\,\mathbf{y}$, where $L=D-W$ is the graph Laplacian and $D$ is the diagonal degree matrix; the toy check below confirms the two forms agree.
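
A toy numeric check that the pairwise definition and the Laplacian form agree (all values made up):

import numpy as np

W = np.array([[0., 1., 0.],
              [1., 0., 2.],
              [0., 2., 0.]])
y = np.array([1.0, 0.8, 0.1])

S_pairwise = 0.5 * sum(W[i, j] * (y[i] - y[j]) ** 2
                       for i in range(3) for j in range(3))
L = np.diag(W.sum(axis=1)) - W   # graph Laplacian L = D - W
S_matrix = y @ L @ y
print(S_pairwise, S_matrix)      # both ≈ 1.02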

The new loss function:
$L=\sum_{x^r}{C(y^r,\hat{y}^r)} +\lambda S$
The second term depends on the network parameters (through its outputs), so it acts as a smoothness regularizer trained jointly with the supervised loss.
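
A PyTorch sketch of this combined loss; the trace form generalizes $\mathbf{y}^{\top}L\,\mathbf{y}$ to multi-class outputs, and the function name, lam, and the choice to apply it to softmax outputs are my assumptions:

import torch
import torch.nn.functional as F

def graph_regularized_loss(logits_labeled, targets, outputs_all, L, lam=1e-3):
    # Supervised term: cross-entropy over the labeled samples.
    ce = F.cross_entropy(logits_labeled, targets)
    # Smoothness term S = trace(Y^T L Y): the multi-class form of y^T L y,
    # computed on the network outputs for all labeled + unlabeled samples.
    Y = torch.softmax(outputs_all, dim=-1)   # shape (n_samples, n_classes)
    S = torch.trace(Y.t() @ L @ Y)           # L: dense (n_samples, n_samples)
    return ce + lam * S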

Semi-supervised Learning Code

The semi-supervised code sits at the beginning of each epoch loop:

  1. Run the current model over the unlabeled data (inference, not training) to obtain a pseudo-labeled dataset;
  2. Merge train_set and pseudo_set with ConcatDataset;
  3. Feed the merged dataset to a DataLoader.
# Whether to do semi-supervised learning.
do_semi = False

for epoch in range(n_epochs):
    # ---------- TODO ----------
    # In each epoch, relabel the unlabeled dataset for semi-supervised learning.
    # Then you can combine the labeled dataset and pseudo-labeled dataset for the training.
    if do_semi:
        # Obtain pseudo-labels for unlabeled data using trained model.
        pseudo_set = get_pseudo_labels(unlabeled_set, model)

        # Construct a new dataset and a data loader for training.
        # This is used in semi-supervised learning only.
        concat_dataset = ConcatDataset([train_set, pseudo_set])
        train_loader = DataLoader(concat_dataset, batch_size=batch_size, shuffle=True, num_workers=8, pin_memory=True)

    # ---------- Training ----------
    # Make sure the model is in train mode before training.
    model.train()
    # The normal training loop follows from here.

Here get_pseudo_labels is the function that uses the current (latest) model to assign pseudo-labels. Its filtering step was originally left as a TODO; one possible completion is sketched below, with assumptions marked in the comments.

def get_pseudo_labels(dataset, model, threshold=0.65):
    # This function generates pseudo-labels for a dataset using the given model.
    # It returns a dataset of the images whose prediction confidence exceeds
    # the given threshold, paired with their predicted classes.
    # You are NOT allowed to use any models trained on external data for pseudo-labeling.
    # Requires: from torch.utils.data import TensorDataset
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Construct a data loader.
    data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)

    # Make sure the model is in eval mode.
    model.eval()
    # Define softmax function.
    softmax = nn.Softmax(dim=-1)

    # Collected high-confidence images and their pseudo-labels.
    kept_imgs, kept_labels = [], []

    # Iterate over the dataset by batches.
    for batch in tqdm(data_loader):
        img, _ = batch

        # Forward the data.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(img.to(device))

        # Obtain the probability distributions by applying softmax on logits.
        probs = softmax(logits)

        # One possible completion of the original TODO: keep the samples whose
        # top-class probability exceeds the threshold, using the predicted
        # class as the pseudo-label.
        max_probs, preds = probs.max(dim=-1)
        mask = (max_probs >= threshold).cpu()
        kept_imgs.append(img[mask])
        kept_labels.append(preds.cpu()[mask])

    # Turn off the eval mode (back to train mode).
    model.train()

    # Wrap the kept samples as (image, label) pairs; TensorDataset is one
    # simple choice, any dataset yielding such tuples would work.
    return TensorDataset(torch.cat(kept_imgs), torch.cat(kept_labels))
