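Apache MADlib is an open-source library for scalable in-database analytics. The Greenplum MADlib extension makes it possible to run machine learning and deep learning workloads inside Greenplum Database.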
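1. Installing MADlib
1.1 Installing the MADlib package
- Download the MADlib extension package that matches your Greenplum version and operating system from VMware Tanzu.
- Upload the package to the Greenplum master host.
- Extract the archive: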
$ tar xzvf madlib-1.18.0+2-gp6-rhel7-x86_64.tar.gz
- Install the package by running the gppkg command. For example:
[gpadmin@gpmdw opt]$ gppkg -i ./madlib-1.18.0+2-gp6-rhel7-x86_64/madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Starting gppkg with args: -i ./madlib-1.18.0+2-gp6-rhel7-x86_64/madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing package madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix /usr/local/greenplum-db-6.19.1'
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg locally
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix /usr/local/greenplum-db-6.19.1'
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing r