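word2vec is an open-source project from Google for training word vectors; in general, the larger the training corpus, the better the vectors. I ran into a few problems installing it under Python 3 on Windows, so I am recording them here. First, I switched pip's package index to the Tsinghua University mirror: create the directory C:\Users\tk\pip, then create a file named pip.ini in that folder with the following content: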
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
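The same pip.ini can also be generated from Python instead of by hand. This is only a minimal sketch: the function name write_pip_ini is made up for illustration, and the target directory is whatever you pass in (for the setup described here, the pip folder under your user directory).

```python
import configparser
from pathlib import Path

# Mirror URL taken from the pip.ini shown above.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def write_pip_ini(pip_dir, index_url=TSINGHUA_INDEX):
    """Create pip_dir/pip.ini with a [global] index-url entry; return its path.

    Hypothetical helper for illustration, not part of pip itself.
    """
    pip_dir = Path(pip_dir)
    pip_dir.mkdir(parents=True, exist_ok=True)

    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}

    ini_path = pip_dir / "pip.ini"
    with open(ini_path, "w", encoding="utf-8") as handle:
        config.write(handle)
    return ini_path


# Example call (writes into the current user's home directory):
# write_pip_ini(Path.home() / "pip")
```

The example call is left commented out so that running the snippet does not overwrite an existing pip.ini.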
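Next, download Rtools (https://cran.r-project.org/bin/windows/Rtools/). The word2vec package has to be compiled with GCC, so install Rtools, and remember to tick the option that adds it to PATH during installation.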
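Before running pip, it may be worth confirming that gcc really is visible on PATH. A small sketch using only the standard library; the helper names find_compiler and describe_compiler are my own, not from any tool mentioned here:

```python
import shutil


def find_compiler(name="gcc"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


def describe_compiler(name="gcc"):
    """Return a one-line, human-readable status for the given compiler name."""
    path = find_compiler(name)
    if path is None:
        return f"{name} not found on PATH; check that the Rtools PATH option was ticked"
    return f"{name} found at {path}"


print(describe_compiler())
```

If gcc is reported as missing right after installing Rtools, opening a new Anaconda prompt is usually needed so that the updated PATH is picked up.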
Then open the Anaconda prompt and run:
pip install word2vec
The installation fails with the following error:
Failed building wheel for word2vec
Running setup.py clean for word2vec
Failed to build word2vec
Installing collected packages: word2vec
Running setup.py install for word2vec ... error
Complete output from command C:\Users\tk\Anaconda3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\tk\\AppData\\Local\\Temp\\pip-install-kp9dm2wz\\word2vec\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\tk\AppData\Local\Temp\pip-record-dv_jlkn2\install-record.txt --single-version-externally-managed --compile:
running install
C:\Users\tk\AppData\Local\Temp\pip-install-kp9dm2wz\word2vec\word2vec\src\win32/word2vec.c:21:25: fatal error: win32-port.h: No such file or directory
# include "win32-port.h"
^
The compiler cannot find win32-port.h. The source of win32-port.h is as follows:
#if !defined WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#include <Windows.h>
#include <process.h>
#include <assert.h>

typedef struct {
  void *(*pthread_routine)(void *);
  void *pthread_arg;
  HANDLE handle;
} pthread_t;

static unsigned __stdcall win32_start_routine(void *arg) {
  pthread_t *p = (pthread_t *)arg;
  p->pthread_routine(p->pthread_arg);
  return 0;
}

static int pthread_create(pthread_t *id, void *attr,
                          void *(*start_routine)(void *), void *arg) {
  assert(attr == 0);
  id->pthread_routine = start_routine;
  id->pthread_arg = arg;
  id->handle =
      (HANDLE)_beginthreadex(0, 0, win32_start_routine, (void *)id, 0, 0);
  if (id->handle != 0) return 0;
  return -1;
}

static int pthread_join(pthread_t thread, void **retval) {
  WaitForSingleObject(thread.handle, INFINITE);
  if (retval) {
    *retval = 0;
  }
  return 0;
}

static void pthread_exit(void *p) { _endthreadex(0); }

static int posix_memalign(void **memptr, size_t alignment, size_t size) {
  assert(memptr);
  *memptr = _aligned_malloc(size, alignment);
  if (*memptr) {
    return 0;
  } else {
    return -1;
  }
}
Next, download the word2vec source package directly:
https://pypi.tuna.tsinghua.edu.cn/packages/ce/51/5e2782b204015c8aef0ac830297c2f2735143ec90f592b9b3b909bb89757/word2vec-0.10.2.tar.gz
Put the archive in C:\Users\XXX\Anaconda3\pkgs and extract it there. Go into word2vec-0.10.2\word2vec\src\win32, create a file named win32-port.h, and paste in the win32-port.h source shown above. Then, in the Anaconda prompt, change to C:\Users\XXX\Anaconda3\pkgs\word2vec-0.10.2 and run:
python setup.py install
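The manual extract-and-paste steps above can also be scripted. The following is a rough sketch under a few assumptions: the archive is a .tar.gz whose first entry is the top-level folder (as with word2vec-0.10.2.tar.gz), and add_win32_header is a hypothetical helper name. The header text itself is passed in by the caller, for example read from a file where you saved the source shown earlier.

```python
import tarfile
from pathlib import Path


def add_win32_header(archive_path, dest_dir, header_text):
    """Extract a source archive and write win32-port.h into word2vec/src/win32.

    Hypothetical helper: returns the path of the header file it wrote.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
        # Assumes the first member is (or lives under) the top-level folder,
        # e.g. "word2vec-0.10.2".
        top_level = archive.getnames()[0].split("/")[0]

    win32_dir = dest_dir / top_level / "word2vec" / "src" / "win32"
    win32_dir.mkdir(parents=True, exist_ok=True)

    header_path = win32_dir / "win32-port.h"
    header_path.write_text(header_text, encoding="utf-8")
    return header_path
```

After the header is in place, python setup.py install is still run from the extracted top-level folder, exactly as in the manual procedure.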
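Then test whether the installation succeeded. Note that the imports below come from gensim, which is a separate package and must itself be installed (pip install gensim); to check the word2vec package built above, import word2vec is the direct test.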
from gensim.models import word2vec
import gensim
import logging
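A quick way to see which of the two packages is actually available, without triggering a full import, is to ask importlib whether each module can be located. This is just an illustrative sketch; is_installed is a name I made up:

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located by the import system."""
    return importlib.util.find_spec(module_name) is not None


# Report on the package built above and on gensim, which the test imports use.
for name in ("word2vec", "gensim"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Only top-level module names are passed here; for dotted names such as gensim.models, find_spec imports the parent package first and raises an error if the parent is missing.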