impala docker build

cloudera/impala-dev

Last pushed: 22 days ago

Short Description

Impala development environment

Full Description

Dockerfiles for an Impala development environment

Images

Several images are available, but only two are intended for use:

Complete

The complete image has Impala built and test data loaded. Because of the size of this image, the time required to build it, and the limitations of Docker Hub, the image is not available for download but can be built locally. The built image is about 50 GB, and the build takes 1-4 hours depending on your system.
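Since the built image is about 50 GB, it can help to check free disk space before starting a multi-hour build. The following is a minimal sketch, not part of the image or its Dockerfiles: has_free_gb is a hypothetical helper name, and the directory to check is an assumption (Docker commonly stores images under /var/lib/docker, but your installation may differ).

```shell
# has_free_gb DIR GB
# Succeeds (exit status 0) if the filesystem holding DIR has at least
# GB gigabytes available. Hypothetical helper, not shipped with the image.
has_free_gb() {
  dir=$1
  need_gb=$2
  # df -P forces one POSIX-format line per filesystem; -k reports
  # sizes in 1024-byte blocks. Field 4 of line 2 is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example use (the path is an assumption; adjust to your Docker data dir):
#   has_free_gb /var/lib/docker 50 || echo "less than 50 GB free" >&2
```

The check is only a guard against an obviously full disk; the build may need additional temporary space beyond the final image size.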

To build the complete image:

# Get the latest version of the "minimal" image. Only really needed
# if you already have a minimal image from before Apr 10, 2016. That
# image would be incompatible.
$ docker pull cloudera/impala-dev:minimal

# Build the image and tag it so that the "docker run" commands below can find it.
$ docker build -t cloudera/impala-dev:complete complete

Using the complete image:

$ docker run -i -t cloudera/impala-dev:complete /bin/bash
[container]$ docker-boot   # starts Postgres and SSH, both needed to run Impala
[container]$ . bin/impala-config.sh   # sets the Impala environment variables
[container]$ run-all.sh   # starts dependent services -- HDFS, Hive metastore, etc
[container]$ start-impala-cluster.py
[container]$ impala-shell.sh
[localhost:21000] > select count(*) from tpch.lineitem;

The image can also be run in the background and logged into over SSH:

(Note the "-d" and the lack of the trailing "/bin/bash".)

$ docker run -d -t cloudera/impala-dev:complete

$ docker inspect <container-id> | grep IPAddress

$ ssh dev@<container-ip>   # password is cloudera
[container]$ . bin/impala-config.sh   # sets the Impala environment variables
[container]$ run-all.sh   # starts dependent services -- HDFS, Hive metastore, etc
[container]$ start-impala-cluster.py
[container]$ impala-shell.sh
[localhost:21000] > select count(*) from tpch.lineitem;

This works because the image's default command is "docker-boot", which starts an SSH service.
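The "docker inspect ... | grep IPAddress" step above prints whole matching lines, which still have to be read by eye. As a rough sketch, the address can be pulled out of that output with sed. extract_ip is a hypothetical helper name, and the sketch assumes the inspect output contains lines of the form "IPAddress": "a.b.c.d".

```shell
# extract_ip
# Reads `docker inspect` output on stdin and prints the first non-empty
# "IPAddress" value. Hypothetical helper, not shipped with the image.
extract_ip() {
  # [^"][^"]* requires at least one character, so empty values ("") are
  # skipped; head keeps only the first address found.
  sed -n 's/.*"IPAddress": *"\([^"][^"]*\)".*/\1/p' | head -n 1
}

# Example use (substitute the ID printed by "docker run -d"):
#   ip=$(docker inspect <container-id> | extract_ip)
#   ssh dev@"$ip"
```

docker inspect also accepts a --format option that takes a Go template, which avoids text parsing altogether; the sed approach is shown here only because it mirrors the grep step.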

Minimal

The minimal image has Impala built, but the test data is not loaded. The image is about 5 GB and can be downloaded from Docker Hub or built locally.

Using the minimal image:

$ docker run -i -t cloudera/impala-dev:minimal /bin/bash
[container]$ docker-boot   # starts Postgres and SSH, both needed to run Impala
[container]$ cd Impala
[container]$ . bin/impala-config.sh   # sets the Impala environment variables
[container]$ ./buildall.sh -format -skiptests
[container]$ run-all.sh   # starts dependent services -- HDFS, Hive metastore, etc
[container]$ start-impala-cluster.py
[container]$ impala-shell.sh
[localhost:21000] > create database test;

This image can also be started in the background, in the same way as the complete image (pass "-d" and omit the trailing "/bin/bash").

If you want to load the test data manually inside the minimal instance, see the necessary steps in complete/Dockerfile.

Other Images

The remaining images exist only as workarounds for the limitations of Docker Hub. Compiling code on Docker Hub is slow, and builds have a 2-hour timeout, so the build of the minimal image had to be split into several steps to avoid the timeout.

Prebuilt images are hosted on Docker Hub.

For more information see the Impala wiki or ask a question on the dev user group.

Reposted from: https://my.oschina.net/innovation/blog/789791
