Efficient estimation of word representations in vector space

Sharp tools make good work.


Today I’ll explore the word vectors presented by Mikolov et al. in the paper “Efficient estimation of word representations in vector space”. The paper proposes two novel model architectures for learning vector representations of words that significantly improve the quality of word vectors at a lower computational cost. The quality of the resulting vectors is measured in a word similarity task using a word offset technique, in which simple algebraic operations are performed on the word vectors.

In this paper, Mikolov et al. give a short summary of previously proposed model architectures, including the well-known NNLM and RNNLM, and propose two new log-linear models called CBOW and Skip-gram.

The CBOW architecture is similar to the feedforward NNLM, except that the non-linear hidden layer is removed and the projection layer is shared for all words. The objective of this model is to use words from both the history and the future simultaneously to correctly classify the middle word. Unlike a standard bag-of-words model, it uses a continuous distributed representation of the context.

The Skip-gram architecture is similar to CBOW, but instead of predicting the current word from its context, it tries to maximize classification of a word based on another word in the same sentence. More precisely, each current word is used as input to a log-linear classifier with a continuous projection layer, and the output is used to predict words within a certain range before and after the current word. Note that increasing the range improves the quality of the resulting word vectors, but it also increases the computational cost.
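
To make the notion of a context window concrete, here is a minimal sketch (my own illustration, not code from the paper) of how (center, context) training pairs for Skip-gram can be generated from a tokenized sentence with window size C. For simplicity it uses the full window on both sides; the paper actually samples a smaller range R ≤ C for each word to reduce computation.

def skipgram_pairs(tokens, C):
    """Yield (center, context) index pairs for a window of C words on each side."""
    pairs = []
    for i, center in enumerate(tokens):
        # context positions: up to C words before and C words after the center word
        for j in range(max(0, i - C), min(len(tokens), i + C + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# toy example with word indices; a larger C yields more pairs (and more computation)
print(skipgram_pairs([0, 1, 2, 3], C=1))
# [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]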

The architectures of the two models are shown below.
[Figure: CBOW and Skip-gram model architectures]

Below is my PyTorch code implementing the Skip-gram and CBOW models.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hyperparameters (example values; adjust to your corpus)
MAX_VOCAB_SIZE = 30000              # vocabulary size
EMBEDDING_SIZE = 100                # dimensionality of the word vectors
INIT_RANGE = 0.5 / EMBEDDING_SIZE   # range for uniform weight initialization
C = 2                               # context window size (C words on each side)


class Skip_gram(nn.Module):
    def __init__(self):
        super(Skip_gram, self).__init__()
        # input word embeddings (the word vectors we ultimately want)
        self.embedding = nn.Embedding(MAX_VOCAB_SIZE, EMBEDDING_SIZE)
        self.embedding.weight.data.uniform_(-INIT_RANGE, INIT_RANGE)

        # projects an embedding to a score for every word in the vocabulary
        self.outLayer = nn.Linear(EMBEDDING_SIZE, MAX_VOCAB_SIZE)

    def forward(self, X):
        # X -> B (a batch of center-word indices)
        embedded = self.embedding(X)  # B x EMBEDDING_SIZE

        output = self.outLayer(embedded)  # B x MAX_VOCAB_SIZE

        # log-probabilities over the vocabulary (to be used with nn.NLLLoss)
        return F.log_softmax(output, dim=-1)


class CBOW(nn.Module):
    def __init__(self):
        super(CBOW, self).__init__()
        # input word embeddings, shared by all context positions
        self.embedding = nn.Embedding(MAX_VOCAB_SIZE, EMBEDDING_SIZE)
        self.embedding.weight.data.uniform_(-INIT_RANGE, INIT_RANGE)

        # projects the averaged context embedding to a score for every word
        self.outLayer = nn.Linear(EMBEDDING_SIZE, MAX_VOCAB_SIZE)

    def forward(self, X):
        # X -> B x 2C (a batch of context-word indices, C on each side)
        embedded = self.embedding(X)     # B x 2C x EMBEDDING_SIZE
        embedded = embedded.mean(dim=1)  # B x EMBEDDING_SIZE, average over the 2C context words

        output = self.outLayer(embedded)  # B x MAX_VOCAB_SIZE
        # log-probabilities over the vocabulary (to be used with nn.NLLLoss)
        return F.log_softmax(output, dim=-1)
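
For completeness, here is a minimal training-loop sketch for the models above, feeding in pairs such as those produced by the skipgram_pairs sketch earlier. The batch iterator, learning rate, and epoch count are my own assumptions, not part of the original code; since the models return log-probabilities, nn.NLLLoss is used.

# minimal training-loop sketch; batch_iterator(), lr, and epoch count are assumptions
model = Skip_gram()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.NLLLoss()  # expects the log-probabilities returned by forward()

for epoch in range(5):
    for center, context in batch_iterator():  # hypothetical: yields LongTensors of shape B, B
        optimizer.zero_grad()
        log_probs = model(center)            # B x MAX_VOCAB_SIZE
        loss = loss_fn(log_probs, context)   # context holds the target word indices
        loss.backward()
        optimizer.step()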

To compare the quality of different versions of word vectors, previous papers typically show a table of example words and their most similar words, which are then judged intuitively. However, it has been observed that there can be many different types of similarities between words; for example, big is similar to bigger in the same sense that small is similar to smaller. Mikolov et al. therefore ask how to find a word that is similar to small in the same sense as biggest is similar to big. The question can be answered by simply computing the vector X = vector(“biggest”) - vector(“big”) + vector(“small”) and then searching the vector space for the word closest to X, measured by cosine distance. When the word vectors are well trained, it is possible to find the correct answer (smallest) using this method.
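
Here is a minimal sketch of this word-offset technique, assuming a trained model as above together with word2idx / idx2word vocabulary mappings (both of which are my own assumptions, not defined in this post):

# word-analogy sketch: find the word closest to
# vector("biggest") - vector("big") + vector("small") by cosine similarity
def analogy(model, word2idx, idx2word, a, b, c, topk=1):
    emb = model.embedding.weight.data                 # MAX_VOCAB_SIZE x EMBEDDING_SIZE
    x = emb[word2idx[a]] - emb[word2idx[b]] + emb[word2idx[c]]
    sims = F.cosine_similarity(x.unsqueeze(0), emb)   # cosine similarity to every word vector
    best = sims.topk(topk + 3).indices.tolist()       # a few extra in case a, b, c rank highest
    return [idx2word[i] for i in best if idx2word[i] not in (a, b, c)][:topk]

# expected to return ["smallest"] when the vectors are well trained
# print(analogy(model, word2idx, idx2word, "biggest", "big", "small"))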

From my perspective, the most valuable contribution of this paper is that it proposes two novel and computationally efficient model architectures for obtaining high-quality word vectors. Many existing NLP applications, such as machine translation, information retrieval, and question answering systems, can benefit from these architectures, and they may also enable future applications yet to be invented.
