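Prerequisite: a Hadoop environment is already installed.
(1) Download the matching tar package from https://archive.apache.org/dist/pig/. For Hadoop 0.20.x or earlier, the release can be used as is. For Hadoop 2.x or later, Pig has to be recompiled; run the following command in the Pig root directory: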
ant clean jar-withouthadoop -Dhadoopversion=23
Otherwise, running a MapReduce job fails with the following exception:
2013-10-24 09:35:19,300 [main] WARN org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
at java.lang.Class.getDeclaredField(Class.java:1938)
at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
at org.apache.p
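The version choice in step (1) can be sketched as a small shell helper. This is only an illustration: the function name pig_hadoopversion is hypothetical, and treating Hadoop 1.x like 0.20.x is an assumption that is not stated in the text above.

```shell
#!/bin/sh
# Hypothetical helper: map a Hadoop version string to the value passed
# to ant as -Dhadoopversion when building Pig.
# Assumption: Hadoop 1.x is handled like 0.20.x (value 20).
pig_hadoopversion() {
    case "$1" in
        0.20.*|1.*) echo 20 ;;   # older API line, no rebuild flag change needed
        0.23.*|2.*) echo 23 ;;   # Hadoop 2.x needs the rebuild from step (1)
        *)
            echo "unsupported Hadoop version: $1" >&2
            return 1
            ;;
    esac
}

# Example use from the Pig root directory (not executed here):
#   ant clean jar-withouthadoop -Dhadoopversion="$(pig_hadoopversion 2.6.0)"
```

Keeping the mapping in one function makes it harder to build against the wrong Hadoop line by accident, which is what triggers the NoSuchFieldException shown above.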
(2) Configure the Pig environment variables:
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop
export PATH=/home/search/pig-0.12.1/bin:$PATH
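Before relying on these variables, it can help to confirm that the Hadoop configuration directory really exists. The following is a minimal sketch; the function name check_pig_env is hypothetical, and it assumes the Hadoop 2.x layout where the configuration lives under $HADOOP_HOME/etc/hadoop.

```shell
#!/bin/sh
# Hypothetical helper: verify HADOOP_HOME and export PIG_CLASSPATH
# only when the Hadoop configuration directory is present.
check_pig_env() {
    if [ -z "$HADOOP_HOME" ]; then
        echo "HADOOP_HOME is not set" >&2
        return 1
    fi
    conf_dir="$HADOOP_HOME/etc/hadoop"
    if [ ! -d "$conf_dir" ]; then
        echo "Hadoop configuration directory not found: $conf_dir" >&2
        return 1
    fi
    # Same value as the export in step (2)
    export PIG_CLASSPATH="$conf_dir"
    echo "PIG_CLASSPATH=$PIG_CLASSPATH"
}
```

Calling check_pig_env from a login script, instead of exporting blindly, reports a wrong HADOOP_HOME before Pig is ever started.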
(3) Run the pig command directly in a Linux terminal to enter the grunt shell:
2015-05-01 12:44:58,573 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.2-SNAPSHOT (r: unknown) compiled May 01 2015, 12:28:37
2015-05-01 12:44:58,574 [main] INFO org.apache.pig.Main - Logging error messages to: /home/search/pig-0.12.1/build/pig_1430498698551.log
2015-05-01 12:44:58,602 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/search/.pigbootup not found
2015-05-01 12:44:59,244 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-01 12:44:59,244 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-01 12:44:59,244 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://h1:8020
2015-05-01 12:44:59,247 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
2015-05-01 12:45:00,465 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: h1:8021
2015-05-01 12:45:00,469 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt>
(4) pig -i shows the Pig version
pig --help lists the available command-line options
pig -x local runs Pig in local mode
pig -x mapreduce runs Pig in MapReduce mode
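The two execution modes above can be wrapped in a small helper that rejects anything else before Pig is started. This is a sketch only: the function name pig_command and the script name wordcount.pig are hypothetical, and the helper only prints the command line, it does not run Pig.

```shell
#!/bin/sh
# Hypothetical helper: build the pig command line for one of the two
# execution modes listed in step (4). It prints the command only.
pig_command() {
    mode="$1"
    script="$2"
    case "$mode" in
        local|mapreduce) ;;      # the two modes named in this article
        *)
            echo "unknown mode: $mode (expected local or mapreduce)" >&2
            return 1
            ;;
    esac
    if [ -n "$script" ]; then
        echo "pig -x $mode $script"
    else
        echo "pig -x $mode"      # no script: opens the grunt shell
    fi
}

# Example: print the command for a hypothetical script in local mode
pig_command local wordcount.pig
```

Local mode is convenient for trying a script against files on the local disk before submitting it to the cluster in MapReduce mode.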