Original **Hadoop Overview (5): Hive, a Data Warehouse Solution**
1. Official site introduction: A data warehouse infrastructure that provides data summarization and ad hoc querying. The Apache Hive™ data warehouse software facilitates reading, writing, and managin...
2019-02-12 20:17:30 741
Original **Hadoop Overview (4): ZooKeeper, a Coordination Service for Distributed Applications**
1. Official site introduction: Welcome to Apache ZooKeeper™. Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination. Apache...
2019-02-12 12:30:18 278
Original **Hadoop Overview (3): MapReduce, a Distributed Computing Framework**
1. Official site introduction: Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of co...
2018-12-27 20:49:16 352
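Original **50 Classic HQL Practice Problems (2)**
26. Count the students enrolled in each course: select course_id, count(stu_id) from t_score group by course_id; Result: 1 6, 2 6, 3 6. 27. List the student ID and name of every student who takes exactly two courses: select a.*, count(a.stu_id) from t_stu_info a join t_score b on a.stu_id...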
2018-12-27 12:17:52 830
Original **50 Classic HQL Practice Problems (1)**
Student table: create table if not exists t_stu_info(stu_id int, stu_name string, birthday string, gender string) row format delimited fields terminated by ' '; load data local inpath '/home/testdata/stu_info....
2018-12-27 11:47:47 3687 3
Original **Hadoop Overview (2): HDFS, the Distributed File System**
1. Official site introduction: The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the diffe...
2018-12-26 16:22:10 259
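Original **Hadoop Overview (1): Introduction to Hadoop and Cluster Setup**
Hadoop's history: its earliest form dates to 2002 and Apache Nutch, an open-source search engine implemented in Java. Nutch provides all the tools needed to run your own search engine, including full-text search and a web crawler. In 2003, Google published a technical paper on the Google File System (GFS), a dedicated file system that Google designed to store its massive volume of search data. In 2004, Nutch founder Do...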
2018-12-26 15:29:15 162