Comparing Hive and Impala for querying data

First, note that querying data through Impala feels much like working with a traditional RDBMS such as MySQL.

--1. Connecting to Impala

[root@MASTER01 ~]# impala-shell -islave03;

Starting Impala Shell without Kerberos authentication

Connected to slave03:21000

Server version: impalad version 2.0.0-cdh5 RELEASE (build ecf30af0b4d6e56ea80297df2189367ada6b7da7)

Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Shell build version: Impala Shell v2.0.0-cdh5 (ecf30af) built on Sat Oct 11 13:56:06 PDT 2014)

[slave03:21000] > use tmp;

Query: use tmp

[slave03:21000] > select count(*) from view_0809_02;

Query: select count(*) from view_0809_02

+----------+
| count(*) |
+----------+
| 312923   |
+----------+
Fetched 1 row(s) in 0.52s

[slave03:21000] >

[slave03:21000] > show databases;

Query: show databases

+------------------+
| name             |
+------------------+
| _impala_builtins |
| analyse          |
| default          |
| preparation      |
| result           |
| tmp              |
| trans            |
| unicomidmp       |
| unicomidmptext   |
+------------------+
Fetched 9 row(s) in 0.04s

[slave03:21000] >
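Beyond the interactive session above, impala-shell can also run a query or a script file non-interactively via its -q, -f, and -B (plain, delimited output) flags. The sketch below only echoes the command lines rather than executing them, since running them needs a live impalad at slave03:21000 (this cluster's hostname; adjust for your environment):

```shell
# Non-interactive equivalents of the interactive session above.
# Echoed instead of executed because they require a reachable impalad.
for cmd in \
  'impala-shell -i slave03 -q "select count(*) from tmp.view_0809_02"' \
  'impala-shell -i slave03 -f /root/test -B --output_delimiter=","'
do
  echo "$cmd"
done
```

-B with --output_delimiter is handy when piping results into other tools, since it drops the ASCII-art table borders.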

--2. Querying with Hive

Write your query into a file; here we use a file named test as the example.

[root@MASTER01 ~]# pwd

/root

[root@MASTER01 ~]# more test

select * from tmp.view_0809_02 limit 10;

[root@MASTER01 ~]#
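The query file can be created in one step with a here-document; a minimal sketch (the article uses /root/test, but /tmp/test is used here so it runs anywhere):

```shell
# Write the query to a file, then display it -- same as the more(1) check above
QFILE=/tmp/test
cat > "$QFILE" <<'EOF'
select * from tmp.view_0809_02 limit 10;
EOF
cat "$QFILE"
```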

Method 1: source the file from the interactive shell

[root@MASTER01 ~]# hive

2016-07-10 23:13:23,618 WARN [main] conf.HiveConf (HiveConf.java:initialize(1488)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/hive-common-0.13.1-cdh5.2.0.jar!/hive-log4j.properties

hive> source /root/test;

OK

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".

SLF4J: Defaulting to no-operation (NOP) logger implementation

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

127127010.0235272.00.00.01995.71217179076620.54606802586760131NULL
1271270410597.17747451898315851.00.00.02214.8119594275830.5935170862044721市区
127127019000.21532129144515180.00.00.01635.47218477398880.53046095126247791市区
127127019127.89991113976612088.00.00.01315.33129526608580.40749257449458661市区
127127097248.96671687159712211.00.00.01917.68722227062810.477744807121661731市区
127127017173.39387186049913058.00.00.01618.20317881411140.49416409200137321市区
127127026611.04109938432414520.00.00.02039.50768260972340.51968944533818281市区
127127017141.8042285512545984.00.00.01666.41422109443970.465080583269378341市区
127127017299.3800899960093047.00.00.01652.15920627595760.50773088173840361市区
127127036979.5982123612275697.00.00.01696.50900443295150.46532999164578111市区
Time taken: 1.225 seconds, Fetched: 10 row(s)

hive>

Method 2: hive -f

[root@MASTER01 ~]# hive -f test

2016-07-10 23:14:20,138 WARN [main] conf.HiveConf (HiveConf.java:initialize(1488)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/hive-common-0.13.1-cdh5.2.0.jar!/hive-log4j.properties

OK

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".

SLF4J: Defaulting to no-operation (NOP) logger implementation

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

127127010.0235272.00.00.01995.71217179076620.54606802586760131NULL
1271270410597.17747451898315851.00.00.02214.8119594275830.5935170862044721市区
127127019000.21532129144515180.00.00.01635.47218477398880.53046095126247791市区
127127019127.89991113976612088.00.00.01315.33129526608580.40749257449458661市区
127127097248.96671687159712211.00.00.01917.68722227062810.477744807121661731市区
127127017173.39387186049913058.00.00.01618.20317881411140.49416409200137321市区
127127026611.04109938432414520.00.00.02039.50768260972340.51968944533818281市区
127127017141.8042285512545984.00.00.01666.41422109443970.465080583269378341市区
127127017299.3800899960093047.00.00.01652.15920627595760.50773088173840361市区
127127036979.5982123612275697.00.00.01696.50900443295150.46532999164578111市区
Time taken: 1.423 seconds, Fetched: 10 row(s)

Jul 10, 2016 11:14:23 PM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

Jul 10, 2016 11:14:23 PM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 23383 records.

Jul 10, 2016 11:14:23 PM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block

Jul 10, 2016 11:14:23 PM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 29 ms. row count = 23383

[root@MASTER01 ~]#

Method 3: pass the SQL directly with hive -e

[root@MASTER01 ~]# hive -e 'select * from tmp.view_0809_02 limit 10'

2016-07-10 23:31:35,744 WARN [main] conf.HiveConf (HiveConf.java:initialize(1488)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/hive-common-0.13.1-cdh5.2.0.jar!/hive-log4j.properties

OK

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".

SLF4J: Defaulting to no-operation (NOP) logger implementation

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

127127010.0235272.00.00.01995.71217179076620.54606802586760131NULL
1271270410597.17747451898315851.00.00.02214.8119594275830.5935170862044721市区
127127019000.21532129144515180.00.00.01635.47218477398880.53046095126247791市区
127127019127.89991113976612088.00.00.01315.33129526608580.40749257449458661市区
127127097248.96671687159712211.00.00.01917.68722227062810.477744807121661731市区
127127017173.39387186049913058.00.00.01618.20317881411140.49416409200137321市区
127127026611.04109938432414520.00.00.02039.50768260972340.51968944533818281市区
127127017141.8042285512545984.00.00.01666.41422109443970.465080583269378341市区
127127017299.3800899960093047.00.00.01652.15920627595760.50773088173840361市区
127127036979.5982123612275697.00.00.01696.50900443295150.46532999164578111市区
Time taken: 1.491 seconds, Fetched: 10 row(s)

Jul 10, 2016 11:31:39 PM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

Jul 10, 2016 11:31:39 PM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 23383 records.

Jul 10, 2016 11:31:39 PM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block

Jul 10, 2016 11:31:39 PM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 53 ms. row count = 23383

[root@MASTER01 ~]#
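The -e style is convenient to wrap in a small shell function; adding -S (silent mode, see the help output below) suppresses the WARN/SLF4J/Parquet log chatter seen in the transcripts above. A sketch, stubbed with echo so it runs without a cluster:

```shell
# Sketch: reusable wrapper around the one-liner style.
# Real form is:  hive -S -e "$1"   -- echoed here instead of executed,
# since it needs a working Hive installation and metastore.
hive_query() {
  echo hive -S -e "$1"
}
hive_query 'select * from tmp.view_0809_02 limit 10'
```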

To try the other options, see the official documentation:

[root@MASTER01 ~]# hive -help

2016-07-10 23:34:54,959 WARN [main] conf.HiveConf (HiveConf.java:initialize(1488)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

usage: hive
 -d,--define <key=value>          Variable substitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
 -h <hostname>                    connecting to Hive Server on remote host
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -p <port>                        connecting to Hive Server on port number
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the console)
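One option worth calling out is -d/--define (and --hivevar) variable substitution: Hive replaces ${A} (or ${hivevar:A}) in the script with the value supplied on the command line. A minimal sketch that emulates the substitution with sed, just to show the SQL Hive would actually run (Hive performs this internally; the variable names TBL and N are illustrative):

```shell
# Emulate hive's ${var} substitution for illustration only. The real
# invocation would be:  hive -d TBL=view_0809_02 -d N=10 -f script.sql
SQL='select * from tmp.${TBL} limit ${N};'
TBL=view_0809_02
N=10
echo "$SQL" | sed -e "s/\${TBL}/$TBL/" -e "s/\${N}/$N/"
# -> select * from tmp.view_0809_02 limit 10;
```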
