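Compiling and installing vearch from source
The official documentation is linked here. This post mostly follows it and only adds explanations; it installs vearch-0.3.1 (or 3.1.0) on CentOS 7.9.2009.
1 Install dependencies
vearch needs the following dependencies (quoted from its GitHub page):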
CentOS, Ubuntu and Mac OS are all OK (recommend CentOS >= 7.2),cmake required
Go >= 1.11.2 required
Gcc >= 5 required
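# faiss is a dependency of the vearch engine and must be installed
Faiss >= v1.6.3
# RocksDB is the storage engine for vearch's on-disk data
RocksDB == 6.2.2 (optional)
# swig is only needed when building the Python SDK from source; I recommend simply using pip install vearch instead
swig >= 3
# GPU support will get its own post, so I skip it here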
CUDA >= 9.0, if you want GPU support.
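The minimum versions above can be checked mechanically. The following is a minimal sketch, not part of the official instructions: `version_ge` and `check_tool` are helper names made up for this post, and the comparison relies on GNU `sort -V`, which is assumed to be available (it is on CentOS 7).

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B.
# Assumption: GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND REQUIRED: print one status line per dependency.
check_tool() {
    if version_ge "$2" "$3"; then
        echo "$1 $2 OK (>= $3)"
    else
        echo "$1 $2 TOO OLD (need >= $3)"
    fi
}

# Example values taken from this post; substitute what your system reports.
check_tool go 1.11.2 1.11.2
check_tool gcc 4.8.5 5
```

On a real machine the "found" value for gcc can come from `gcc -dumpversion`; for Go it has to be cut out of the `go version` output.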
1.1 Install Golang
I installed Go 1.11.2.
# Download the Go tarball
wget https://dl.google.com/go/go1.11.2.linux-amd64.tar.gz
# Extract it to /usr/local; this may require root privileges
# After extraction a go directory appears under /usr/local
sudo tar -C /usr/local -xzf go1.11.2.linux-amd64.tar.gz
# Edit the profile
sudo vim /etc/profile
# Then append the following at the bottom of /etc/profile.
# Strictly speaking export GOPATH=/your_go_WorkPlace belongs here too, but GOPATH changes later when building vearch, so leave it out for now.
export GOROOT=/usr/local/go
export PATH=$PATH:$GOROOT/bin
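# Reload the profile (source is a shell builtin, so it cannot be run through sudo)
source /etc/profile
# Verify the installation
go version
# Finally, because the variables were added to /etc/profile, they may not be set automatically after the next reboot, so:
vim ~/.bashrc
# and add this line at the end
source /etc/profile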
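Editing the profile by hand makes it easy to add the same export twice. A minimal sketch of an idempotent append follows; `append_once` is a hypothetical helper, and it is demonstrated on a scratch file rather than on /etc/profile.

```shell
#!/bin/sh
# append_once FILE LINE: append LINE to FILE unless an identical line already exists.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Scratch file for the demonstration; for real use, pass /etc/profile and run as root.
profile=$(mktemp)
append_once "$profile" 'export GOROOT=/usr/local/go'
append_once "$profile" 'export PATH=$PATH:$GOROOT/bin'
append_once "$profile" 'export GOROOT=/usr/local/go'   # repeated call changes nothing
cat "$profile"
```

The single quotes matter: they keep `$PATH` and `$GOROOT` unexpanded so the literal text lands in the file.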
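1.2 Install GCC
vearch asks for GCC >= 5 (CPU build). I am on CentOS 7.9.2009, where yum installs gcc 4.8.5; in my testing this did not break the build.
# On a fresh system, first install the EPEL third-party repository
yum install epel-release
# Then gcc and gcc-c++
yum install gcc gcc-c++
Then run
gcc -v
# If it reports 4.8.5, the installation succeeded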
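1.3 Install rocksdb
I chose to build rocksdb-6.2.2 from source.
- Create the directories
# First pick a directory for vearch, here /home/vearch; everything below happens under it
sudo mkdir -p /home/vearch && cd /home/vearch
# Create the vearch dependency folder vearch_libs
sudo mkdir vearch_libs && cd vearch_libs
# Create the rocksdb install folder
sudo mkdir -p /home/vearch/vearch_libs/rocksdb-6.2.2-install/
- Download rocksdb
# Clone the source at the required tag; this can be slow, so downloading and unpacking the zip is a good alternative
sudo git clone -b v6.2.2 https://github.com/facebook/rocksdb.git
- Build rocksdb
# Enter the folder
cd rocksdb
# Install cmake and its dependencies before building
yum install cmake3
# Edit the Makefile
sudo vim Makefile
# Around line 1587, change the INSTALL_PATH line to:
#   INSTALL_PATH ?= /home/vearch/vearch_libs/rocksdb-6.2.2-install/
# Start the build
sudo make shared_lib && sudo make install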
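The Makefile edit above can also be scripted. The sketch below is an assumption-laden illustration: `set_install_path` is a made-up helper, it expects the line to start with `INSTALL_PATH ?=` exactly as quoted above, and it is run here against a scratch fragment whose contents are invented, not against the real rocksdb Makefile.

```shell
#!/bin/sh
# set_install_path MAKEFILE PREFIX: rewrite the "INSTALL_PATH ?=" line of MAKEFILE.
# Assumption: PREFIX contains no '|' or '&' (both are special inside this sed command).
set_install_path() {
    tmp=$(mktemp) || return 1
    sed "s|^INSTALL_PATH ?=.*|INSTALL_PATH ?= $2|" "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Scratch Makefile fragment (invented content) for the demonstration.
mk=$(mktemp)
printf 'DEMO_VAR = demo\nINSTALL_PATH ?= /some/default\n' > "$mk"
set_install_path "$mk" /home/vearch/vearch_libs/rocksdb-6.2.2-install/
grep '^INSTALL_PATH' "$mk"
```

Because the variable is assigned with `?=`, GNU make only uses that value when nothing else has set it, so passing `INSTALL_PATH=...` on the `make install` command line should have the same effect without touching the file.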
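1.4 Install Faiss
This installs faiss-1.6.3.
- First create the directory
mkdir -p /home/vearch/vearch_libs/faiss-install
- Download the source; again I recommend picking the matching tag and downloading the zip
sudo git clone -b v1.6.3 https://github.com/facebookresearch/faiss.git
- Install the dependencies
- Install the required libraries
yum install swig3 openblas-devel lapack-devel
- Then fetch the source; note that this is a nested project
# faiss
git clone -b v1.6.3 https://github.com/facebookresearch/faiss.git
# Then run configure (this is where I hit a problem: numpy was missing)
cd faiss && ./configure --without-cuda --prefix=/home/vearch/vearch_libs/faiss-install
The configure check failed for me because numpy was missing, so I installed it:
# CentOS 7.9.2009 ships with Python 2.7.
# Running yum install python-pip directly sends you in circles:
# installing any package asks you to upgrade pip, upgrading pip asks you to update setuptools, and updating setuptools asks you to upgrade pip again.
# So use the method below instead, after uninstalling any existing pip.
cd /usr/local
wget https://files.pythonhosted.org/packages/53/7f/55721ad0501a9076dbc354cc8c63ffc2d6f1ef360f49ad0fbcce19d68538/pip-20.3.4.tar.gz
# Unpack the tarball before entering the directory
tar -xzf pip-20.3.4.tar.gz
cd pip-20.3.4
python2 setup.py build
python2 setup.py install
# Then install numpy
pip install -i http://pypi.douban.com/simple/ numpy --trusted-host pypi.douban.com
- Finally, build
# Make sure you are in the faiss directory
make && make install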
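Before moving on, it is worth confirming that both install prefixes were actually populated, since the vearch build later reads libraries from their `lib` subdirectories. This is a rough sketch under that assumption; `has_libs` and `report` are hypothetical helpers and only check that `lib` exists and is non-empty, not which files are inside.

```shell
#!/bin/sh
# has_libs PREFIX: succeed when PREFIX/lib exists and contains at least one entry.
has_libs() {
    [ -d "$1/lib" ] && [ -n "$(ls -A "$1/lib" 2>/dev/null)" ]
}

# report PREFIX...: print one status line per prefix; non-zero exit if any is missing.
report() {
    rc=0
    for p in "$@"; do
        if has_libs "$p"; then
            echo "ok       $p"
        else
            echo "missing  $p"
            rc=1
        fi
    done
    return $rc
}

# Prefixes used in this post.
report /home/vearch/vearch_libs/faiss-install \
       /home/vearch/vearch_libs/rocksdb-6.2.2-install ||
    echo "at least one install step looks incomplete"
```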
2 Build vearch
- First define the build path
# Run these two commands
mkdir -p /home/vearch/go/src/github.com/vearch
export GOPATH=/home/vearch/go
- Download the source; there is a pitfall here:
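# Enter the directory
cd $GOPATH/src/github.com/vearch
# Clone the vearch source
git clone -b v3.1.0 https://github.com/vearch/vearch.git
# Note that this only fetches part of the code; the rest belongs in engine
git clone https://github.com/vearch/gamma.git
# Then delete the original engine directory inside the vearch checkout and move gamma in as engine
rm -r vearch/engine
mv gamma vearch/engine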
- Start the build:
cd vearch/build
export GOPATH=/home/vearch/go
export FAISS_HOME=/home/vearch/vearch_libs/faiss-install/
export ROCKSDB_HOME=/home/vearch/vearch_libs/rocksdb-6.2.2-install/
export LD_LIBRARY_PATH=$FAISS_HOME/lib:$ROCKSDB_HOME/lib:$LD_LIBRARY_PATH
./build.sh
- Next, write the config file conf.toml and put it in vearch/build/bin
[global]
    # the name will validate join cluster by same name
    name = "vearch"
    # you data save to disk path ,If you are in a production environment, You'd better set absolute paths
    data = ["/home/vearch/Data/baud/datas/"]
    # log path , If you are in a production environment, You'd better set absolute paths
    log = "/home/vearch/Data/baud/logs/"
    # default log type for any model
    level = "debug"
    # master <-> ps <-> router will use this key to send or receive data
    signkey = "vearch"
    skip_auth = true

# if you are master you'd better set all config for router and ps and router and ps use default config it so cool
[[masters]]
    # name machine name for cluster
    name = "master1"
    # ip or domain
    address = "127.0.0.1"
    # api port for http server
    api_port = 8817
    # port for etcd server
    etcd_port = 2378
    # listen_peer_urls List of comma separated URLs to listen on for peer traffic.
    # advertise_peer_urls List of this member's peer URLs to advertise to the rest of the cluster. The URLs needed to be a comma-separated list.
    etcd_peer_port = 2390
    # List of this member's client URLs to advertise to the public.
    # The URLs needed to be a comma-separated list.
    # advertise_client_urls AND listen_client_urls
    etcd_client_port = 2370
    skip_auth = true

[router]
    # port for server
    port = 9001
    # skip auth for client visit data
    skip_auth = true

[ps]
    # port for server
    rpc_port = 8081
    # raft config begin
    raft_heartbeat_port = 8898
    raft_replicate_port = 8899
    heartbeat-interval = 200 #ms
    raft_retain_logs = 10000
    raft_replica_concurrency = 1
    raft_snap_concurrency = 1
- Don't run it yet. First make libgamma.so.0.1 available to the vearch binary in vearch/build/bin through LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=/home/vearch/go/src/github.com/vearch/vearch/build/gamma_build:$LD_LIBRARY_PATH
- Start it:
./vearch -conf conf.toml all
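If startup fails with an address-in-use error, one thing to rule out is two settings in conf.toml sharing a port, because this setup runs every role on the same host. The sketch below is only an illustration: `list_ports` and `dup_ports` are invented helpers, they assume each port sits on its own `..._port = N` or `port = N` line, and they run against a scratch copy holding just the port values from this post.

```shell
#!/bin/sh
# list_ports FILE: print every integer assigned to a key that ends in "port".
# Comment lines are skipped because they start with '#', not with a key name.
list_ports() {
    sed -n 's/^[[:space:]]*[A-Za-z_]*port[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}

# dup_ports FILE: print each port number that is assigned more than once.
dup_ports() {
    list_ports "$1" | sort -n | uniq -d
}

# Scratch config containing only the port settings used in this post.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[[masters]]
    # api port for http server
    api_port = 8817
    etcd_port = 2378
    etcd_peer_port = 2390
    etcd_client_port = 2370
[router]
    port = 9001
[ps]
    rpc_port = 8081
    raft_heartbeat_port = 8898
    raft_replicate_port = 8899
EOF

dups=$(dup_ports "$conf")
if [ -z "$dups" ]; then
    echo "all ports distinct"
else
    echo "duplicate ports: $dups"
fi
```

To check the real file, call `dup_ports conf.toml` from vearch/build/bin. This only detects clashes inside the file, not ports already taken by other programs.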