Inspecting the Hive and Spark SQL Metastore Database Auto-Configured by Ambari

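I built an HDP cluster with Ambari and installed the Hive and Spark clients along the way (just tick the checkboxes in the installer). When I have time I'll write a post on installing Ambari fully offline and building an HDP cluster with it, so stay tuned~
First, let's find where the Spark configuration files live.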

[root@ws1dn3 ~]# whereis spark
spark: /etc/spark
[root@ws1dn3 ~]# cd /etc/spark/
[root@ws1dn3 spark]# ll
total 8
drwxr-xr-x 3 root root 4096 Oct  8 11:16 2.4.2.0-258
lrwxrwxrwx 1 root root   34 Oct  8 11:16 conf -> /usr/hdp/current/spark-client/conf
drwxr-xr-x 2 root root 4096 Oct  8 11:16 conf.backup
[root@ws1dn3 spark]# cd conf
[root@ws1dn3 conf]# ll
total 60
-rw-r--r-- 1 root  root   987 Apr 14 03:14 docker.properties.template
-rw-r--r-- 1 root  root  1105 Apr 14 03:14 fairscheduler.xml.template
-rw-r--r-- 1 spark spark  172 Oct  8 11:16 hive-site.xml
-rw-r--r-- 1 spark spark  621 Oct  9 09:48 log4j.properties
-rw-r--r-- 1 root  root  1734 Apr 14 03:14 log4j.properties.template
-rw-r--r-- 1 spark spark 4956 Oct  8 11:16 metrics.properties
-rw-r--r-- 1 root  root  6671 Apr 14 03:14 metrics.properties.template
-rw-r--r-- 1 root  root   865 Apr 14 03:14 slaves.template
-rw-r--r-- 1 spark spark  722 Oct  8 11:16 spark-defaults.conf
-rw-r--r-- 1 root  root  1292 Apr 14 03:14 spark-defaults.conf.template
-rw-r--r-- 1 spark spark 1788 Oct  8 11:16 spark-env.sh
-rwxr-xr-x 1 root  root  4209 Apr 14 03:14 spark-env.sh.template

Take a look at hive-site.xml: it contains only the client-side configuration.

[root@ws1dn3 conf]# cat hive-site.xml 
  <configuration>

    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://ws1dn3.wondersoft.cn:9083</value>
    </property>

  </configuration>

When I set up the cluster I chose dn3 as the host for the Hive Metastore, which is why the URI points at ws1dn3.
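
To make it concrete what Spark gets from this file, here is a minimal Python sketch that parses the same XML and extracts the metastore host and port. The XML is copied inline so the snippet is self-contained; the helper name `read_properties` is made up for illustration and is not part of Hive or Spark.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Contents of the Spark client's hive-site.xml, copied from the output above.
SPARK_HIVE_SITE = """
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://ws1dn3.wondersoft.cn:9083</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return the <property> name/value pairs of a Hadoop-style config as a dict.

    Hypothetical helper for this post, not a Hive or Spark API.
    """
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


props = read_properties(SPARK_HIVE_SITE)

# hive.metastore.uris may hold a comma-separated list; this file has one entry.
first_uri = props["hive.metastore.uris"].split(",")[0].strip()
uri = urlparse(first_uri)
print(uri.scheme, uri.hostname, uri.port)
```

On a real node you could pass the text of /etc/spark/conf/hive-site.xml instead of the inline string.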

Next, let's look at the full hive-site.xml that Hive itself uses.

[root@ws1dn3 conf]# whereis hive
hive: /usr/bin/hive /etc/hive
[root@ws1dn3 conf]# cd /etc/hive/conf
[root@ws1dn3 conf]# ll
total 228
-rw-r--r-- 1 root root     1139 Apr 22 08:14 beeline-log4j.properties.template
drwxr-xr-x 2 hive hadoop   4096 Oct  8 15:25 conf.server
-rw-r--r-- 1 hive hadoop 175716 Apr 25 14:47 hive-default.xml.template
-rw-r--r-- 1 hive hadoop   1759 Oct  8 11:13 hive-env.sh
-rw-r--r-- 1 hive hadoop   2378 Apr 22 08:14 hive-env.sh.template
-rw-r--r-- 1 hive hadoop   2652 Oct  8 11:13 hive-exec-log4j.properties
-rw-r--r-- 1 hive hadoop   3050 Oct  8 11:13 hive-log4j.properties
-rw-r--r-- 1 hive hadoop  19199 Oct  8 11:00 hive-site.xml
-rw-r--r-- 1 root root     1593 Apr 22 08:14 ivysettings.xml
-rw-r--r-- 1 hive hadoop   6529 Oct  8 11:13 mapred-site.xml
[root@ws1dn3 conf]# cat hive-site.xml 
  <configuration>

    <property>
      <name>ambari.hive.db.schema.name</name>
      <value>hive</value>
    </property>

    <property>
      <name>atlas.hook.hive.maxThreads</name>
      <value>1</value>
    </property>

    <property>
      <name>atlas.hook.hive.minThreads</name>
      <value>1</value>
    </property>

    <property>
      <name>datanucleus.autoCreateSchema</name>
      <value>false</value>
    </property>

    <property>
      <name>datanucleus.cache.level2.type</name>
      <value>none</value>
    </property>

    <property>
      <name>datanucleus.fixedDatastore</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.join</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.join.noconditionaltask</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.join.noconditionaltask.size</name>
      <value>1073741824</value>
    </property>

    <property>
      <name>hive.auto.convert.sortmerge.join</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.auto.convert.sortmerge.join.to.mapjoin</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.cbo.enable</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.cli.print.header</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.cluster.delegation.token.store.class</name>
      <value>org.apache.hadoop.hive.thrift.ZooKeeperTokenStore</value>
    </property>

    <property>
      <name>hive.cluster.delegation.token.store.zookeeper.connectString</name>
      <value>ws1dn2.wondersoft.cn:2181,ws1dn1.wondersoft.cn:2181,ws1dn3.wondersoft.cn:2181</value>
    </property>

    <property>
      <name>hive.cluster.delegation.token.store.zookeeper.znode</name>
      <value>/hive/cluster/delegation</value>
    </property>

    <property>
      <name>hive.compactor.abortedtxn.threshold</name>
      <value>1000</value>
    </property>

    <property>
      <name>hive.compactor.check.interval</name>
      <value>300L</value>
    </property>

    <property>
      <name>hive.compactor.delta.num.threshold</name>
      <value>10</value>
    </property>

    <property>
      <name>hive.compactor.delta.pct.threshold</name>
      <value>0.1f</value>
    </property>

    <property>
      <name>hive.compactor.initiator.on</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.compactor.worker.threads</name>
      <value>0</value>
    </property>

    <property>
      <name>hive.compactor.worker.timeout</name>
      <value>86400L</value>
    </property>

    <property>
      <name>hive.compute.query.using.stats</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.conf.restricted.list</name>
      <value>hive.security.authenticator.manager,hive.security.authorization.manager,hive.users.in.admin.role</value>
    </property>

    <property>
      <name>hive.convert.join.bucket.mapjoin.tez</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.default.fileformat</name>
      <value>TextFile</value>
    </property>

    <property>
      <name>hive.default.fileformat.managed</name>
      <value>TextFile</value>
    </property>

    <property>
      <name>hive.enforce.bucketing</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.enforce.sorting</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.enforce.sortmergebucketmapjoin</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.exec.compress.intermediate</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.exec.compress.output</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.exec.dynamic.partition</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.exec.dynamic.partition.mode</name>
      <value>strict</value>
    </property>

    <property>
      <name>hive.exec.failure.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>

    <property>
      <name>hive.exec.max.created.files</name>
      <value>100000</value>
    </property>

    <property>
      <name>hive.exec.max.dynamic.partitions</name>
      <value>5000</value>
    </property>

    <property>
      <name>hive.exec.max.dynamic.partitions.pernode</name>
      <value>2000</value>
    </property>

    <property>
      <name>hive.exec.orc.compression.strategy</name>
      <value>SPEED</value>
    </property>

    <property>
      <name>hive.exec.orc.default.compress</name>
      <value>ZLIB</value>
    </property>

    <property>
      <name>hive.exec.orc.default.stripe.size</name>
      <value>67108864</value>
    </property>

    <property>
      <name>hive.exec.orc.encoding.strategy</name>
      <value>SPEED</value>
    </property>

    <property>
      <name>hive.exec.parallel</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.exec.parallel.thread.number</name>
      <value>8</value>
    </property>

    <property>
      <name>hive.exec.post.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>

    <property>
      <name>hive.exec.pre.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>

    <property>
      <name>hive.exec.reducers.bytes.per.reducer</name>
      <value>67108864</value>
    </property>

    <property>
      <name>hive.exec.reducers.max</name>
      <value>1009</value>
    </property>

    <property>
      <name>hive.exec.scratchdir</name>
      <value>/tmp/hive</value>
    </property>

    <property>
      <name>hive.exec.submit.local.task.via.child</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.exec.submitviachild</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.execution.engine</name>
      <value>tez</value>
    </property>

    <property>
      <name>hive.fetch.task.aggr</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.fetch.task.conversion</name>
      <value>more</value>
    </property>

    <property>
      <name>hive.fetch.task.conversion.threshold</name>
      <value>1073741824</value>
    </property>

    <property>
      <name>hive.limit.optimize.enable</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.limit.pushdown.memory.usage</name>
      <value>0.04</value>
    </property>

    <property>
      <name>hive.map.aggr</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.map.aggr.hash.force.flush.memory.threshold</name>
      <value>0.9</value>
    </property>

    <property>
      <name>hive.map.aggr.hash.min.reduction</name>
      <value>0.5</value>
    </property>

    <property>
      <name>hive.map.aggr.hash.percentmemory</name>
      <value>0.5</value>
    </property>

    <property>
      <name>hive.mapjoin.bucket.cache.size</name>
      <value>10000</value>
    </property>

    <property>
      <name>hive.mapjoin.optimized.hashtable</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.mapred.reduce.tasks.speculative.execution</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.merge.mapfiles</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.merge.mapredfiles</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.merge.orcfile.stripe.level</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.merge.rcfile.block.level</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.merge.size.per.task</name>
      <value>256000000</value>
    </property>

    <property>
      <name>hive.merge.smallfiles.avgsize</name>
      <value>16000000</value>
    </property>

    <property>
      <name>hive.merge.tezfiles</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.metastore.authorization.storage.checks</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.metastore.cache.pinobjtypes</name>
      <value>Table,Database,Type,FieldSchema,Order</value>
    </property>

    <property>
      <name>hive.metastore.client.connect.retry.delay</name>
      <value>5s</value>
    </property>

    <property>
      <name>hive.metastore.client.socket.timeout</name>
      <value>1800s</value>
    </property>

    <property>
      <name>hive.metastore.connect.retries</name>
      <value>24</value>
    </property>

    <property>
      <name>hive.metastore.execute.setugi</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.metastore.failure.retries</name>
      <value>24</value>
    </property>

    <property>
      <name>hive.metastore.kerberos.keytab.file</name>
      <value>/etc/security/keytabs/hive.service.keytab</value>
    </property>

    <property>
      <name>hive.metastore.kerberos.principal</name>
      <value>hive/_HOST@EXAMPLE.COM</value>
    </property>

    <property>
      <name>hive.metastore.pre.event.listeners</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
    </property>

    <property>
      <name>hive.metastore.sasl.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.metastore.server.max.threads</name>
      <value>100000</value>
    </property>
    ################### client-side configuration
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://ws1dn3.wondersoft.cn:9083</value>
    </property>

    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/apps/hive/warehouse</value>
    </property>

    <property>
      <name>hive.optimize.bucketmapjoin</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.bucketmapjoin.sortedmerge</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.optimize.constant.propagation</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.index.filter</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.metadataonly</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.null.scan</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.reducededuplication</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.optimize.reducededuplication.min.reducer</name>
      <value>4</value>
    </property>

    <property>
      <name>hive.optimize.sort.dynamic.partition</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.orc.compute.splits.num.threads</name>
      <value>10</value>
    </property>

    <property>
      <name>hive.orc.splits.include.file.footer</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.prewarm.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.prewarm.numcontainers</name>
      <value>3</value>
    </property>

    <property>
      <name>hive.security.authenticator.manager</name>
      <value>org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator</value>
    </property>

    <property>
      <name>hive.security.authorization.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.security.authorization.manager</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
    </property>

    <property>
      <name>hive.security.metastore.authenticator.manager</name>
      <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
    </property>

    <property>
      <name>hive.security.metastore.authorization.auth.reads</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.security.metastore.authorization.manager</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
    </property>

    <property>
      <name>hive.server2.allow.user.substitution</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.authentication</name>
      <value>NONE</value>
    </property>

    <property>
      <name>hive.server2.authentication.spnego.keytab</name>
      <value>HTTP/_HOST@EXAMPLE.COM</value>
    </property>

    <property>
      <name>hive.server2.authentication.spnego.principal</name>
      <value>/etc/security/keytabs/spnego.service.keytab</value>
    </property>

    <property>
      <name>hive.server2.enable.doAs</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.logging.operation.enabled</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.logging.operation.log.location</name>
      <value>/tmp/hive/operation_logs</value>
    </property>

    <property>
      <name>hive.server2.support.dynamic.service.discovery</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.server2.table.type.mapping</name>
      <value>CLASSIC</value>
    </property>

    <property>
      <name>hive.server2.tez.default.queues</name>
      <value>default</value>
    </property>

    <property>
      <name>hive.server2.tez.initialize.default.sessions</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.server2.tez.sessions.per.default.queue</name>
      <value>1</value>
    </property>

    <property>
      <name>hive.server2.thrift.http.path</name>
      <value>cliservice</value>
    </property>

    <property>
      <name>hive.server2.thrift.http.port</name>
      <value>10001</value>
    </property>

    <property>
      <name>hive.server2.thrift.max.worker.threads</name>
      <value>500</value>
    </property>

    <property>
      <name>hive.server2.thrift.port</name>
      <value>10000</value>
    </property>

    <property>
      <name>hive.server2.thrift.sasl.qop</name>
      <value>auth</value>
    </property>

    <property>
      <name>hive.server2.transport.mode</name>
      <value>binary</value>
    </property>

    <property>
      <name>hive.server2.use.SSL</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.server2.zookeeper.namespace</name>
      <value>hiveserver2</value>
    </property>

    <property>
      <name>hive.smbjoin.cache.rows</name>
      <value>10000</value>
    </property>

    <property>
      <name>hive.stats.autogather</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.stats.dbclass</name>
      <value>fs</value>
    </property>

    <property>
      <name>hive.stats.fetch.column.stats</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.stats.fetch.partition.stats</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.support.concurrency</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.tez.auto.reducer.parallelism</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.tez.container.size</name>
      <value>3072</value>
    </property>

    <property>
      <name>hive.tez.cpu.vcores</name>
      <value>-1</value>
    </property>

    <property>
      <name>hive.tez.dynamic.partition.pruning</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.tez.dynamic.partition.pruning.max.data.size</name>
      <value>104857600</value>
    </property>

    <property>
      <name>hive.tez.dynamic.partition.pruning.max.event.size</name>
      <value>1048576</value>
    </property>

    <property>
      <name>hive.tez.input.format</name>
      <value>org.apache.hadoop.hive.ql.io.HiveInputFormat</value>
    </property>

    <property>
      <name>hive.tez.java.opts</name>
      <value>-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps</value>
    </property>

    <property>
      <name>hive.tez.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>hive.tez.max.partition.factor</name>
      <value>2.0</value>
    </property>

    <property>
      <name>hive.tez.min.partition.factor</name>
      <value>0.25</value>
    </property>

    <property>
      <name>hive.tez.smb.number.waves</name>
      <value>0.5</value>
    </property>

    <property>
      <name>hive.txn.manager</name>
      <value>org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager</value>
    </property>

    <property>
      <name>hive.txn.max.open.batch</name>
      <value>1000</value>
    </property>

    <property>
      <name>hive.txn.timeout</name>
      <value>300</value>
    </property>

    <property>
      <name>hive.user.install.directory</name>
      <value>/user/</value>
    </property>

    <property>
      <name>hive.vectorized.execution.enabled</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.vectorized.execution.reduce.enabled</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.vectorized.groupby.checkinterval</name>
      <value>4096</value>
    </property>

    <property>
      <name>hive.vectorized.groupby.flush.percent</name>
      <value>0.1</value>
    </property>

    <property>
      <name>hive.vectorized.groupby.maxentries</name>
      <value>100000</value>
    </property>

    <property>
      <name>hive.zookeeper.client.port</name>
      <value>2181</value>
    </property>

    <property>
      <name>hive.zookeeper.namespace</name>
      <value>hive_zookeeper_namespace</value>
    </property>

    <property>
      <name>hive.zookeeper.quorum</name>
      <value>ws1dn2.wondersoft.cn:2181,ws1dn1.wondersoft.cn:2181,ws1dn3.wondersoft.cn:2181</value>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
  ################### server-side configuration
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://ws1m.wondersoft.cn/hive?createDatabaseIfNotExist=true</value>
    </property>

    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
    </property>

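Note the two places I annotated: hive.metastore.uris is what clients use to reach the metastore service, while the javax.jdo.option.* properties are what the metastore service uses to connect to its backing MySQL database. I'm not sure my "client" and "server" labels are exactly right; I picked the idea up from http://duguyiren3476.iteye.com/blog/1632868 0.0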
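
As a rough illustration of what the server-side setting says, the sketch below splits the javax.jdo.option.ConnectionURL value from the file above into its parts. The function name `describe_metastore_db` is invented for this post, and falling back to 3306 assumes the MySQL server listens on its standard port, since the URL does not name one.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: the URL omits the port, so MySQL's standard port applies.
MYSQL_DEFAULT_PORT = 3306


def describe_metastore_db(jdbc_url):
    """Split a jdbc:mysql:// URL into engine, host, port, database and options.

    Hypothetical helper for illustration only.
    """
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    # Dropping the "jdbc:" prefix leaves an ordinary scheme://host/path?query URL.
    parsed = urlparse(jdbc_url[len("jdbc:"):])
    return {
        "engine": parsed.scheme,
        "host": parsed.hostname,
        "port": parsed.port or MYSQL_DEFAULT_PORT,
        "database": parsed.path.lstrip("/"),
        "options": {key: values[0] for key, values in parse_qs(parsed.query).items()},
    }


info = describe_metastore_db(
    "jdbc:mysql://ws1m.wondersoft.cn/hive?createDatabaseIfNotExist=true"
)
print(info)
```

The database name it reports, hive, matches the ambari.hive.db.schema.name value near the top of the same file.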

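Either way, the upshot is that Hive and Spark SQL share the same metastore database, so you can work with the tables from whichever one you like. Though of course I'd go with Spark SQL 0.0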
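
One way to check this claim without starting either shell is to compare the hive.metastore.uris value in the two configuration files. The sketch below does that on inline excerpts of the two files shown earlier; `metastore_uris` is a made-up helper name, and the Hive excerpt keeps only two of the many properties.

```python
import xml.etree.ElementTree as ET

# Excerpt of /etc/spark/conf/hive-site.xml from the output above.
SPARK_CONF = """
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://ws1dn3.wondersoft.cn:9083</value>
  </property>
</configuration>
"""

# Excerpt of /etc/hive/conf/hive-site.xml (only two of its properties).
HIVE_CONF = """
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://ws1dn3.wondersoft.cn:9083</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/apps/hive/warehouse</value>
  </property>
</configuration>
"""


def metastore_uris(xml_text):
    """Return the set of Thrift URIs listed under hive.metastore.uris.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.strip())
    for prop in root.iter("property"):
        if prop.findtext("name") == "hive.metastore.uris":
            value = prop.findtext("value") or ""
            # The property may hold several comma-separated URIs.
            return {item.strip() for item in value.split(",") if item.strip()}
    return set()


# Both clients talk to the same metastore service if the URI sets match.
same_metastore = metastore_uris(SPARK_CONF) == metastore_uris(HIVE_CONF)
print(same_metastore)
```

Comparing sets rather than raw strings means the check still passes if the two files list several URIs in a different order.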
hive:

[root@ws1dn2 ~]# su hdfs
[hdfs@ws1dn2 root]$ hive
WARNING: Use "yarn jar" to launch YARN applications.

Logging initialized using configuration in file:/etc/hive/2.4.2.0-258/0/hive-log4j.properties
hive> show tables;
OK
t_log_2016
test
Time taken: 0.333 seconds, Fetched: 2 row(s)
hive> select count(1) from t_log_2016;
Query ID = hdfs_20161013143029_f605d459-878f-495f-95bf-3a3960537d00
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.


Status: Running (Executing on YARN cluster with App id application_1475896673093_0012)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED     12         12        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 16.56 s    
--------------------------------------------------------------------------------
OK
550734
Time taken: 25.034 seconds, Fetched: 1 row(s)

spark sql:

[hdfs@ws1dn2 root]$ spark-sql 
...
SET hive.support.sql11.reserved.keywords=false
SET spark.sql.hive.version=1.2.1
SET spark.sql.hive.version=1.2.1
spark-sql> show tables;
t_log_2016  false
test    false
Time taken: 1.964 seconds, Fetched 2 row(s)
spark-sql> select count(*) from t_log_2016;
550734                                                                          
Time taken: 2.715 seconds, Fetched 1 row(s)

One query takes 25 seconds and the other under 3, so which would you pick 0.0? On top of that, Spark SQL can be combined with Scala code…
