cs224w-图机器学习-Colab 3

13 篇文章 1 订阅

一、GCN

二、GAT

公式是:
在这里插入图片描述

1.forward函数

下面是库中的实现。

    def forward(self, x: Union[Tensor, OptPairTensor], edge_index: Adj,
                size: Size = None, return_attention_weights=None):
        # type: (Union[Tensor, OptPairTensor], Tensor, Size, NoneType) -> Tensor  # noqa
        # type: (Union[Tensor, OptPairTensor], SparseTensor, Size, NoneType) -> Tensor  # noqa
        # type: (Union[Tensor, OptPairTensor], Tensor, Size, bool) -> Tuple[Tensor, Tuple[Tensor, Tensor]]  # noqa
        # type: (Union[Tensor, OptPairTensor], SparseTensor, Size, bool) -> Tuple[Tensor, SparseTensor]  # noqa
        r"""
        Args:
            return_attention_weights (bool, optional): If set to :obj:`True`,
                will additionally return the tuple
                :obj:`(edge_index, attention_weights)`, holding the computed
                attention weights for each edge. (default: :obj:`None`)
        """
        #分别对应head数(即有几套注意力系数)和下层特征维度
        H, C = self.heads, self.out_channels

        x_l: OptTensor = None
        x_r: OptTensor = None
        alpha_l: OptTensor = None
        alpha_r: OptTensor = None
        if isinstance(x, Tensor):
            assert x.dim() == 2, 'Static graphs not supported in `GATConv`.'
            #这是把本层特征变换到输出空间中。
            x_l = x_r = self.lin_l(x).view(-1, H, C)
            #这是准备计算$e_{i,j}$,把两个向量算好
            alpha_l = (x_l * self.att_l).sum(dim=-1)
            alpha_r = (x_r * self.att_r).sum(dim=-1)
        else:
            x_l, x_r = x[0], x[1]
            assert x[0].dim() == 2, 'Static graphs not supported in `GATConv`.'
            x_l = self.lin_l(x_l).view(-1, H, C)
            alpha_l = (x_l * self.att_l).sum(dim=-1)
            if x_r is not None:
                x_r = self.lin_r(x_r).view(-1, H, C)
                alpha_r = (x_r * self.att_r).sum(dim=-1)

        assert x_l is not None
        assert alpha_l is not None

        if self.add_self_loops:
            if isinstance(edge_index, Tensor):
                num_nodes = x_l.size(0)
                if x_r is not None:
                    num_nodes = min(num_nodes, x_r.size(0))
                if size is not None:
                    num_nodes = min(size[0], size[1])
                edge_index, _ = remove_self_loops(edge_index)
                edge_index, _ = add_self_loops(edge_index, num_nodes=num_nodes)
            elif isinstance(edge_index, SparseTensor):
                edge_index = set_diag(edge_index)

        # propagate_type: (x: OptPairTensor, alpha: OptPairTensor)
        #这个函数很神奇。传进去的alpha=(alpha_l, alpha_r)这俩参数,到了message函数中,就自动变成
        #205*head维度的了。其中205是边数,head是注意力个数。
        #而且,这俩参数自动加上后缀_i和_j,就在message函数中,就自动变成alpha_i和alpha_j了
        out = self.propagate(edge_index, x=(x_l, x_r),
                             alpha=(alpha_l, alpha_r), size=size)

        alpha = self._alpha
        self._alpha = None

        if self.concat:
            out = out.view(-1, self.heads * self.out_channels)
        else:
            out = out.mean(dim=1)

        if self.bias is not None:
            out += self.bias

        if isinstance(return_attention_weights, bool):
            assert alpha is not None
            if isinstance(edge_index, Tensor):
                return out, (edge_index, alpha)
            elif isinstance(edge_index, SparseTensor):
                return out, edge_index.set_value(alpha, layout='coo')
        else:
            return out

2.message函数

 def message(self, x_j: Tensor, alpha_j: Tensor, alpha_i: OptTensor,
                index: Tensor, ptr: OptTensor,
                size_i: Optional[int]) -> Tensor:
        # alpha_j 和 alpha_i都是从propagate函数传来的。分别对应着每个边头结点的alpha和尾结点的alpha
        #这样,相加后,205个数据分别对应着每个边的注意力系数
        alpha = alpha_j if alpha_i is None else alpha_j + alpha_i
        alpha = F.leaky_relu(alpha, self.negative_slope)
        alpha = softmax(alpha, index, ptr, size_i)
        self._alpha = alpha
        alpha = F.dropout(alpha, p=self.dropout, training=self.training)
        #返回乘以注意力系数后的表示
        return x_j * alpha.unsqueeze(-1)

3.aggregate函数

这个函数的作用很单一:接收边序列和边序列对应的节点表示,然后求和。所以在调用这个函数之前,所有参数都应该算好了。因此message函数的负担很大。

4.总结

公式中,除了求和操作是aggregate函数实现的,其他操作都是在forward和message中实现的。因此具体在哪实现应该不太重要。仿照别人写的写就行了。一般都是在forward中进行线性投影,准备好一些向量。其他涉及中心节点、边序列的操作,只能在message中完成。因为调用message之前,propagate函数早就做好了准备,把_i和_j数据准备好了。
在这里插入图片描述
GAT实现中,在forward中实现下面的红色部分:

在这里插入图片描述
在message中实现了如下红色部分:
在这里插入图片描述
在aggregate中实现了如下:在这里插入图片描述

上面几个截图都忽略了自环的部分。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Sure! Here are the steps to fine-tune ViT-S on a custom dataset using Google Colab: 1. Open a new Google Colab notebook and select a GPU runtime environment. 2. Install the necessary libraries: ``` !pip install torch torchvision !pip install timm ``` 3. Download and prepare the custom dataset. You can use any dataset of your choice. Make sure to split it into training and validation sets. 4. Define the data loaders: ``` import torch import torchvision.transforms as transforms from torch.utils.data import DataLoader from torchvision.datasets import ImageFolder # Define the transformations transform_train = transforms.Compose([ transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ToTensor(), transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) ]) transform_val = transforms.Compose([ transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(), transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) ]) # Define the data loaders train_dataset = ImageFolder('path_to_train_data', transform=transform_train) train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=4) val_dataset = ImageFolder('path_to_val_data', transform=transform_val) val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False, num_workers=4) ``` Replace 'path_to_train_data' and 'path_to_val_data' with the paths to your training and validation data folders, respectively. 5. Load the pre-trained ViT-S model: ``` import timm model = timm.create_model('vit_small_patch16_224', pretrained=True) ``` 6. Modify the last layer of the model to fit your custom dataset: ``` import torch.nn as nn num_classes = len(train_dataset.classes) model.head = nn.Sequential( nn.LayerNorm((768,)), nn.Linear(768, num_classes) ) ``` Replace '768' with the hidden size of the model you are using. For ViT-S, it is 768. 7. Define the optimizer and criterion: ``` import torch.optim as optim optimizer = optim.Adam(model.parameters(), lr=1e-4) criterion = nn.CrossEntropyLoss() ``` 8. Fine-tune the model: ``` device = torch.device("cuda" if torch.cuda.is_available() else "cpu") model.to(device) num_epochs = 10 for epoch in range(num_epochs): train_loss = 0.0 val_loss = 0.0 correct = 0 total = 0 # Train the model model.train() for inputs, labels in train_loader: inputs, labels = inputs.to(device), labels.to(device) optimizer.zero_grad() outputs = model(inputs) loss = criterion(outputs, labels) loss.backward() optimizer.step() train_loss += loss.item() * inputs.size(0) # Evaluate the model on validation set model.eval() with torch.no_grad(): for inputs, labels in val_loader: inputs, labels = inputs.to(device), labels.to(device) outputs = model(inputs) loss = criterion(outputs, labels) val_loss += loss.item() * inputs.size(0) _, predicted = torch.max(outputs.data, 1) total += labels.size(0) correct += (predicted == labels).sum().item() train_loss = train_loss / len(train_loader.dataset) val_loss = val_loss / len(val_loader.dataset) accuracy = 100 * correct / total print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f} \tAccuracy: {:.2f}'.format( epoch+1, train_loss, val_loss, accuracy)) ``` 9. Save the model: ``` torch.save(model.state_dict(), 'path_to_save_model') ``` Replace 'path_to_save_model' with the path where you want to save the model. That's it! You have fine-tuned ViT-S on your custom dataset using Google Colab.

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值