NO.3 CORRECT TEXT
Problem Scenario 84: In continuation of the previous question (NO.2), please accomplish the following activities.
1. Select all the products whose product code is null.
2. Select all the products whose name starts with Pen; the results should be ordered by price in descending order.
3. Select all the products whose name starts with Pen; the results should be ordered by price in descending order and quantity in ascending order.
4. Select the top 2 products by price.
Answer:
See the explanation for the step-by-step solution and configuration.
Explanation:
Solution:
Step 1: Select all the products whose product code is null.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val results = sqlContext.sql("SELECT * FROM products WHERE code IS NULL")
results.show()
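As an alternative, the same filter can be expressed with the DataFrame API. This is a minimal sketch, assuming the products temp table registered in the previous question (NO.2) is still available in the same spark-shell session:
// DataFrame API equivalent (assumes "products" is a registered temp table)
import sqlContext.implicits._
val productsDF = sqlContext.table("products")
productsDF.filter($"code".isNull).show()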
Step 2: Select all the products whose name starts with Pen; order the results by price in descending order.
val results = sqlContext.sql("SELECT * FROM products WHERE name LIKE 'Pen%' ORDER BY price DESC")
results.show()
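A rough DataFrame API equivalent, reusing productsDF and the implicits from the sketch after Step 1:
// starts-with filter plus a descending sort on price
productsDF.filter($"name".startsWith("Pen")).orderBy($"price".desc).show()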
Step 3: Select all the products whose name starts with Pen; order the results by price in descending order and quantity in ascending order.
val results = sqlContext.sql("SELECT * FROM products WHERE name LIKE 'Pen%' ORDER BY price DESC, quantity ASC")
results.show()
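The same ordering sketched with the DataFrame API, again reusing productsDF from the Step 1 sketch:
// price descending first, quantity ascending as the tie-breaker
productsDF.filter($"name".startsWith("Pen")).orderBy($"price".desc, $"quantity".asc).show()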
Step 4: Select the top 2 products by price.
val results = sqlContext.sql("SELECT * FROM products ORDER BY price DESC LIMIT 2")
results.show()
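The "top 2 by price" result can also be sketched with the DataFrame API (reusing productsDF from the Step 1 sketch):
// sort by price descending and keep only the first two rows
productsDF.orderBy($"price".desc).limit(2).show()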
NO.4 CORRECT TEXT
Problem Scenario 4: You have been given a MySQL DB with the following details.
user=retail_dba
password=cloudera
database=retail_db
table=retail_db.categories
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following activities.
Import the single table categories (subset of data) into a Hive managed table, where category_id is between 1 and 22.
Note: 1. My database username and password differ from those in the exam environment. 2. In the exam the table already exists; here I first create some test data:
create table categories (id int, name varchar(12));
insert into categories values
(1,'yee'),(2,'xun'),(3,'zhang'),
(4,'ya'),(5,'guang'),(6,'zyg'),
(7,'ya'),(8,'guang'),(9,'zyg'),
(10,'ya'),(11,'guang'),(12,'zyg'),
(13,'ya'),(14,'guang'),(15,'zyg'),
(16,'ya'),(17,'guang'),(18,'zyg'),
(19,'ya'),(20,'guang'),(21,'zyg'),
(22,'ya'),(23,'guang'),(24,'zyg'),
(25,'ya'),(26,'guang'),(27,'zyg'),
(28,'ya'),(29,'guang'),(30,'zyg');
Answer:
See the explanation for the step-by-step solution and configuration.
Explanation:
Solution:
Step 1: Import the single table categories (subset of data).
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba \
--password cloudera --table categories --where "category_id between 1 and 22" --hive-import -m 1
The sqoop statement for my test environment:
sqoop import --connect jdbc:mysql://manager:3306/retail_db --username root \
--password yeexun123 --table categories --where "id between 1 and 22" --hive-import -m 1
Note: This command will create a managed table, and its contents will be written to the following directory.
/user/hive/warehouse/categories
The directory here comes from my own HDFS configuration and differs from the directory in the exam environment.
Step 2: Check whether the table has been created (in Hive).
show tables;
select * from categories;
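If you prefer to verify from the spark-shell instead of the Hive CLI, a minimal sketch (assuming hive-site.xml is on the Spark classpath, as on the quickstart VM) would be:
// HiveContext lets spark-shell query the Hive metastore directly
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("SHOW TABLES").show()
hiveContext.sql("SELECT * FROM categories LIMIT 10").show()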