Ch4. Example 4-4: Classifying Surnames with a Convolutional Neural Network

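There is no greater calamity than not knowing contentment, and no greater fault than the desire to acquire. Hence the contentment of knowing contentment is lasting contentment.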

 

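This is the second example in Chapter 4. It replaces the perceptron with a convolutional network to solve the surname classification problem.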

Code notes:

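1. What is the structure of the data after vectorization?

Answer: a tensor of shape [batch_size, vocab_size, max_surname_length]. Each surname is a one-hot matrix with one column per character position.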
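
To make that shape concrete, here is a minimal sketch of one-hot vectorization. The toy vocabulary, the `vectorize` helper, and the length of 6 are assumptions for illustration, not the book's actual vectorizer.

```python
import numpy as np

# Hypothetical toy vocabulary; index 0 is reserved here for unknown characters.
char_to_index = {"<unk>": 0, "a": 1, "e": 2, "l": 3, "n": 4}
max_surname_length = 6  # assumed; in practice the longest surname in the data


def vectorize(surname):
    """Return a one-hot matrix of shape (vocab_size, max_surname_length)."""
    matrix = np.zeros((len(char_to_index), max_surname_length), dtype=np.float32)
    for position, char in enumerate(surname[:max_surname_length]):
        matrix[char_to_index.get(char, 0), position] = 1.0
    return matrix


# Stacking the per-surname matrices adds the leading batch dimension.
batch = np.stack([vectorize("allen"), vectorize("lee")])
print(batch.shape)  # (batch_size, vocab_size, max_surname_length)
```

Positions past the end of a short surname stay all-zero, which acts as padding.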

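2. Text is one-dimensional data, so a one-dimensional convolution is used (at the word level the convolution is 1D; although text becomes two-dimensional once it is represented by word vectors, a 2D convolution across the embedding dimension is meaningless). The drawback of 1D convolution is that filters with different kernel_size values have to be designed to obtain receptive fields of different widths.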
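
One common way to get several receptive widths at once is to run parallel Conv1d branches with different kernel sizes and concatenate their pooled outputs. The sketch below illustrates that idea only. It is not what the classifier in this post does, and all sizes are assumed.

```python
import torch
import torch.nn as nn

batch_size, vocab_size, max_surname_length = 4, 30, 17  # assumed sizes
x = torch.randn(batch_size, vocab_size, max_surname_length)

# One Conv1d branch per kernel width; each sees 2, 3, or 4 characters at a time.
kernel_sizes = (2, 3, 4)
branches = nn.ModuleList(
    [nn.Conv1d(vocab_size, 8, kernel_size=k) for k in kernel_sizes]
)

pooled = []
for k, conv in zip(kernel_sizes, branches):
    feature_map = conv(x)  # (batch, 8, max_surname_length - k + 1)
    print(k, tuple(feature_map.shape))
    pooled.append(feature_map.max(dim=2).values)  # max over positions -> (batch, 8)

features = torch.cat(pooled, dim=1)  # (batch, 8 * len(kernel_sizes))
print(tuple(features.shape))
```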

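3. How are the number of layers, kernel size, dilation, and stride of a convolutional network chosen?

Answer: Is it really just guessing and experimenting?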
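
Part of the answer is plain arithmetic: the forward pass of the classifier in this post squeezes the last dimension, so the layer stack has to shrink the sequence length to exactly 1. The sketch below uses the output-length formula from the nn.Conv1d documentation to trace the four layers. The starting length of 17 is an assumption, and `conv1d_output_length` is a helper written for this note.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution (same formula as the nn.Conv1d docs)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# (kernel_size, stride) of the four Conv1d layers in the classifier.
layers = [(3, 1), (3, 2), (3, 2), (3, 1)]

length = 17  # assumed max_surname_length
trace = [length]
for kernel_size, stride in layers:
    length = conv1d_output_length(length, kernel_size, stride)
    trace.append(length)

print(trace)  # [17, 15, 7, 3, 1]
```

With a different maximum length, the same helper shows quickly whether kernel sizes or strides need to change for the final length to stay at 1.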

4. As the number of output channels grows (num_channels defaults to 256), the amount of computation grows with it.
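
A rough way to see this growth is to count parameters, used here as a proxy for computation. The sketch below assumes a vocabulary of 77 characters and 18 classes purely for illustration, and the helper names are hypothetical.

```python
def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of one nn.Conv1d layer."""
    return out_channels * in_channels * kernel_size + out_channels


def classifier_params(initial_num_channels, num_classes, num_channels, kernel_size=3):
    """Parameter count of the four-conv-plus-linear classifier in this post."""
    total = conv1d_params(initial_num_channels, num_channels, kernel_size)
    total += 3 * conv1d_params(num_channels, num_channels, kernel_size)
    total += num_channels * num_classes + num_classes  # final nn.Linear
    return total


# Assumed sizes for illustration: 77 characters, 18 classes.
for num_channels in (128, 256, 512):
    print(num_channels, classifier_params(77, 18, num_channels))
```

Because three of the four convolutions map num_channels to num_channels, the count grows roughly quadratically: doubling the channel size nearly quadruples it.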

The network is built as follows:

import torch.nn as nn
import torch.nn.functional as F


class SurnameClassifier(nn.Module):
    """ A convolutional neural network for classifying surnames """
    def __init__(self,initial_num_channels,num_classes,num_channels):
        """
        Args:
            initial_num_channels (int): size of the incoming feature vector
            num_classes (int): size of the output prediction vector
            num_channels (int): constant channel size to use throughout network        
        """
        super(SurnameClassifier,self).__init__()
        self.convnet=nn.Sequential(
            nn.Conv1d(in_channels=initial_num_channels,out_channels=num_channels,kernel_size=3),
            nn.ELU(),
            nn.Conv1d(in_channels=num_channels,out_channels=num_channels,kernel_size=3,stride=2),
            nn.ELU(),
            nn.Conv1d(in_channels=num_channels,out_channels=num_channels,kernel_size=3,stride=2),
            nn.ELU(),
            nn.Conv1d(in_channels=num_channels,out_channels=num_channels,kernel_size=3),
            nn.ELU()
            )
        self.fc=nn.Linear(num_channels,num_classes)
    
    def forward(self,x_surname,apply_softmax=False):
        """The forward pass of the classifier
        
        Args:
            x_surname (torch.Tensor): an input data tensor
            x_surname.shape should be (batch, initial_num_channels,
                                        max_surname_length)
            apply_softmax (bool): a flag for the softmax activation
                should be false if used with the Cross Entropy losses
        Returns:
            the resulting tensor. tensor.shape should be (batch, num_classes)
        """
        features1=self.convnet(x_surname) # shape (batch, num_channels, 1), e.g. [128,256,1]
        features=features1.squeeze(dim=2) # shape (batch, num_channels), e.g. [128,256]
        prediction_vector=self.fc(features)
    
        if apply_softmax:
            prediction_vector=F.softmax(prediction_vector,dim=1)
        
        return prediction_vector
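
As a quick sanity check of the shapes in the comments above, the sketch below rebuilds the same layer stack on its own and pushes a dummy batch through it. The vocabulary size of 77, length of 17, and 18 classes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed sizes for illustration only.
batch_size, vocab_size, max_surname_length = 128, 77, 17
num_channels, num_classes = 256, 18

# Same layer stack as SurnameClassifier.convnet, rebuilt so the snippet runs alone.
convnet = nn.Sequential(
    nn.Conv1d(vocab_size, num_channels, kernel_size=3), nn.ELU(),
    nn.Conv1d(num_channels, num_channels, kernel_size=3, stride=2), nn.ELU(),
    nn.Conv1d(num_channels, num_channels, kernel_size=3, stride=2), nn.ELU(),
    nn.Conv1d(num_channels, num_channels, kernel_size=3), nn.ELU(),
)
fc = nn.Linear(num_channels, num_classes)

x_surname = torch.zeros(batch_size, vocab_size, max_surname_length)
with torch.no_grad():
    features = convnet(x_surname)             # (128, 256, 1)
    logits = fc(features.squeeze(dim=2))      # (128, 18)
    probabilities = F.softmax(logits, dim=1)  # each row sums to 1

print(tuple(features.shape), tuple(logits.shape))
```

When training with nn.CrossEntropyLoss, the raw logits are passed to the loss, which matches the docstring's advice to leave apply_softmax set to False.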
    

 
