How to install Impala
https://github.com/cloudera/Impala/wiki/How-to-build-Impala
https://github.com/cloudera/Impala/wiki/Build-prerequisites
Prepare
1. Install JDK
a> wget --no-cookies --no-check-certificate --header "Cookie:gpw_e24=http%3A%2F%2Fwww.oracle.com%2F;oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u79-b15/jdk-7u79-linux-x64.rpm"
b> sudo yum localinstall
jdk-7u79-linux-x64.rpm
c>
Now Java should be installed at
/usr/java/jdk1.7.0_79/jre/bin/java,and linked from /usr/bin/java.
d>
Basic command:
Version:java –version
Alternatives:sudo alternatives --config java
e>Add JAVA_HOME to system environment.
sudo sh -c "echo export JAVA_HOME=
/usr/java/jdk1.7.0_79/jre >> /etc/environment"
2. Install dependence tools
a>
sudo yum groupinstall "Development Tools"
b>
sudo yum -y install git ant libevent-devel automake libtool flex bison gcc-c++ openssl-devel make cmake doxygen.x86_64 glib-devel python-devel bzip2-devel svn libevent-devel krb5-workstation openldap-devel db4-devel python-setuptools python-pip cyrus-sasl* postgresql postgresql-server ant-nodeps lzo-devel lzop
c>
install python-pip
wget
https://pypi.python.org/packages/e7/a8/7556133689add8d1a54c0b14aeff0acb03c64707ce100ecd53934da1aa13/pip-8.1.2.tar.gz#md5=87083c0b9867963b29f7aba3613e8f4a
tar zxvf pip-8.1.2.tar.gz
cd pip-8.1.2
sudo python setup.py install
d> sudo pip install allpairs pytest pytest-xdistparamiko texttable prettytable sqlparse psutil==0.7.1 pywebhdfs gitpythonjenkinsapi boto3
3. Configure Postgresql
a> sudo service postgresql initdb
b> vi /var/lib/pgsql/data/pg_hba.conf
In the followinglines at the end of the file, change peer
or ident
to trust
.
c> service postgresql restart
4. create Hive metastore user
a> sudo -u postgres psql postgres
b>
CREATE ROLE hiveuser LOGIN PASSWORD 'password';
c> ALTER ROLE hiveuser WITH CREATEDB;
d> Quit from postgresql:\q
5. Get Maven 3
wget http://mirrors.tuna.tsinghua.edu.cn/apache/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar xvf apache-maven-3.3.9-bin.tar.gz && sudo mv apache-maven-3.3.9 /usr/local
6. Set environment variables
vi ~/.bashrc
export M2_HOME=/usr/local/apache-maven-3.3.9
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
7. Add path for HDFS domainsockets
sudo mkdir /var/lib/hadoop-hdfs/
sudo chown <user> /var/lib/hadoop-hdfs/
8. Start local ssh server
sudo service sshd start
9. Enable password-less SSH forHBase
ssh-keygen -t dsa
# Do not type in any passkey. Just press enter.
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Build Impala
1. Get Impala code
git clone https://github.com/cloudera/Impala.git
2. Source impala configurationfile
source bin/impala-config.sh
3. Build Impala for the firsttime.
a>
${IMPALA_HOME}/buildall.sh -noclean -skiptests -build_shared_libs –format
b>
Build with cmake:
cmake . -DCMAKE_BUILD_TYPE=Debug -DBUILD_SHARED_LIBS=ON -DCMAKE_TOOLCHAIN_FILE=./cmake_modules/toolchain.cmake
4. Build Impala subsequently
${IMPALA_HOME}/buildall.sh -skiptests -build_shared_libs
5. Build backend only
${IMPALA_HOME}/bin/make_debug.sh [-notests]
${IMPALA_HOME}/bin/make_release.sh [-notests]
Build frontend only
cd ${IMPALA_HOME}/fe && mvn clean package dependency:copy-dependencies -DskipTests=true
6. Starting supporting services
${IMPALA_HOME}/testdata/bin/run-all.sh
This script starts a full set of local services including HDFS, HBase, Hive and ZooKeeper, amongst other things. If you have trouble starting this script, check the log files in${IMPALA_HOME}/cluster_logs/
for clues.
7. Starting the impala cluster
${IMPALA_HOME}/bin/start-impala-cluster.py