Introduction to W2V

What is Word2Vec

Machine learning models such as neural networks cannot directly process string data, so we need to convert it into a purely numerical form. During this conversion we want the data to retain as much of the original information as possible.

Word2Vec, like one-hot encoding, is a method for converting text data into vectors, and it is used extensively in natural language processing (NLP). One-hot encoding first collects all V distinct words in the text and assigns each one an index; every word is then represented by a V-dimensional vector whose entry at its own index is 1 and whose other entries are all 0. Although this representation preserves the identity of each word, the dimension becomes very high when the vocabulary is large, and it cannot reflect the relationship between two words. For example, "cat" and "kitten" are semantically much closer than "cat" and "coral", but one-hot vectors cannot express this. Word2Vec instead learns from the text and uses dense word vectors to capture the semantic information of words: it maps words into an "embedding space" in which semantically similar words lie close together. This both reduces the dimensionality and reflects the relationships between words.
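To make the difference concrete, here is a minimal Python sketch (not part of the original experiment) comparing one-hot vectors with dense embeddings; the toy vocabulary and the 2-dimensional embedding values are invented purely for illustration.

# One-hot vectors: every pair of distinct words is equally dissimilar.
# Dense embeddings: related words can be given nearby vectors.
import numpy as np

vocab = ["cat", "kitten", "coral", "dog"]      # V = 4 words
one_hot = np.eye(len(vocab))                   # each row is a V-dimensional one-hot vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(one_hot[0], one_hot[1]))          # cat vs kitten -> 0.0
print(cosine(one_hot[0], one_hot[2]))          # cat vs coral  -> 0.0

# Hypothetical dense embeddings (made-up numbers): similar words are close.
embedding = {"cat": np.array([0.9, 0.1]),
             "kitten": np.array([0.85, 0.2]),
             "coral": np.array([-0.3, 0.8])}
print(cosine(embedding["cat"], embedding["kitten"]))   # high similarity
print(cosine(embedding["cat"], embedding["coral"]))    # low similarity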

An embedding is exactly such a mapping: it maps words from their original space into this new, lower-dimensional space.

 

How do we use W2V

The Word2Vec framework contains two main models, Skip-Gram and CBOW. Intuitively, Skip-Gram predicts the context given an input word, while CBOW predicts the input word given its context. In our data set, we want to predict which other products a consumer needs based on the products they have already selected, so this experiment uses the Skip-Gram model and improves on that basis.
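As a hedged sketch of how the Skip-Gram model could be applied to this kind of order data, the snippet below treats each order as one "sentence" of product names and trains a Skip-Gram model with gensim (assuming gensim 4.x; the orders list is a made-up placeholder for the real data set).

from gensim.models import Word2Vec

# Each inner list is one order; the product names here are invented examples.
orders = [
    ["milk", "bread", "eggs"],
    ["milk", "cereal"],
    ["bread", "butter", "jam"],
]

model = Word2Vec(
    sentences=orders,
    vector_size=100,   # dimension of the learned product vectors
    window=5,          # context window within an order
    min_count=1,       # keep even rarely purchased products
    sg=1,              # sg=1 selects the Skip-Gram model (sg=0 would be CBOW)
)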

 

The Word2Vec pipeline actually has two parts: the first is building and training the model, and the second is obtaining the embedded word vectors from the trained model. The whole modeling process of Word2Vec is similar in spirit to an auto-encoder: a neural network is constructed and fitted on the training data, but once the model is trained we do not use it to process new tasks. What we really need are the parameters the model learned from the data, in particular the weight matrix of the hidden layer - as we will see later, these weights are exactly the word vectors we are trying to learn in Word2Vec.
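A small sketch of this point, continuing the hypothetical gensim model from the previous snippet: after training, the word vectors are simply rows of the learned hidden-layer weight matrix, not predictions produced by the network.

import numpy as np

W = model.wv.vectors                 # hidden-layer weight matrix, shape (V, vector_size)
print(W.shape)

vec_milk = model.wv["milk"]          # the learned vector for "milk"
row = model.wv.key_to_index["milk"]  # its row index in W
print(np.allclose(W[row], vec_milk)) # True: the "embedding" is literally these weights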

 

In the experiment, we follow the bagged-prod2vec method of Grbovic et al. [1] for the product-name attribute. This means that the model is trained at the level of the order (receipt), not at the level of the individual product. The word-vector representation of an item is obtained by maximizing the modified objective function:

The probability P(e_{m+j} | p_{mk}) of observing the products of a neighboring receipt e_{m+j}, where e_{m+j} = (p_{m+j,1}, ..., p_{m+j,T_m}), given the k-th product of the m-th receipt, reduces to a product of probabilities P(e_{m+j} | p_{mk}) = P(p_{m+j,1} | p_{mk}) × ... × P(p_{m+j,T_m} | p_{mk}), each defined using the soft-max function (Eq. 3.2 in [1]).
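In LaTeX form, the decomposition above can be written as follows; each factor uses the standard skip-gram soft-max, where v and v' denote input and output product vectors and P is the product vocabulary (this notation is assumed here; see Eq. 3.2 of [1] for the exact form used in the paper).

\[
P(e_{m+j} \mid p_{mk}) = \prod_{t=1}^{T_m} P(p_{m+j,t} \mid p_{mk}),
\qquad
P(p_{m+j,t} \mid p_{mk}) =
  \frac{\exp\!\big(\mathbf{v}'^{\top}_{p_{m+j,t}} \mathbf{v}_{p_{mk}}\big)}
       {\sum_{p \in \mathcal{P}} \exp\!\big(\mathbf{v}'^{\top}_{p} \mathbf{v}_{p_{mk}}\big)}
\]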

 

By maximizing this modified objective function, we obtain the embedding layer, i.e. the word vector of each product. Finally, product prediction is completed by finding the word vectors that lie closest to the vector of the target product.
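A hedged sketch of this final step, reusing the hypothetical gensim model from the earlier snippets: products whose vectors are closest (by cosine similarity) to a purchased product become the recommendation candidates.

purchased = "milk"

# most_similar ranks all other products by cosine similarity to `purchased`.
for product, similarity in model.wv.most_similar(purchased, topn=3):
    print(product, round(similarity, 3))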

 

Why choose W2V

W2V not only converts words into word vectors, but also learns the relationships between words during training, which lets us find the words closest to a given input word.

 

To what extent does W2V participate in decision making (recommending food)

In addition to W2V, we also recommend products a user might choose by examining how many times and how frequently that same user has purchased each item. Among these factors, (XXXX) is the main decision factor.

 

Reference:

1. Grbovic, M., Radosavljevic, V., Djuric, N., Bhamidipati, N., Savla, J., Bhagwan, V., & Sharp, D. (2015, August). E-commerce in your inbox: Product recommendations at scale. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1809-1818). ACM.

 

