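- Guli Video (谷粒影音) - Requirements analysis
- Guli Video - Mapper
- Guli Video - ETLUtil
- Guli Video - Driver
- Guli Video - Data cleaning
- Guli Video - Creating tables & importing data
- Database code snippet (to run it in IntelliJ IDEA, configure the Hive JDBC driver first)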
-
      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.Statement;
      import org.junit.Test;

      @Test
      public void TestHiveDriverCreateTable() throws Exception {
          Class.forName("org.apache.hive.jdbc.HiveDriver");
          // "default" is the name of the Hive database
          try (Connection conn = DriverManager.getConnection(
                       "jdbc:hive2://10.103.60.233:10000/default", "root", "");
               Statement stmt = conn.createStatement()) {

              // "if exists" keeps the first run from failing when the table is absent
              String deleteTableSql = "drop table if exists gulivideo_ori";
              // execute() returns false for statements that produce no result set
              boolean res = stmt.execute(deleteTableSql);
              System.out.println(res);

              // Columns shared by the raw table and the ORC table
              String videoColumns = "videoId string, "
                      + "uploader string, "
                      + "age int, "
                      + "category array<string>, "
                      + "length int, "
                      + "views int, "
                      + "rate float, "
                      + "ratings int, "
                      + "comments int, "
                      + "relatedId array<string>";
              String rowFormat = " row format delimited fields terminated by \"\\t\" ";
              String collectionFormat = "collection items terminated by \"&\" ";

              String createTableSql = "create table if not exists gulivideo_ori ("
                      + videoColumns + ")"
                      + rowFormat + collectionFormat
                      + "stored as textfile";
              res = stmt.execute(createTableSql);

              String createUserSql = "create table if not exists gulivideo_user_ori ("
                      + "uploader string, "
                      + "videos int, "
                      + "friends int)"
                      + rowFormat
                      + "stored as textfile";
              stmt.execute(createUserSql);

              // Same columns as gulivideo_ori, so that queries on views,
              // category, etc. can run against the ORC table
              String createOrcSql = "create table if not exists gulivideo_orc ("
                      + videoColumns + ")"
                      + rowFormat + collectionFormat
                      + "stored as orc";
              stmt.execute(createOrcSql);

              String createUserOrcSql = "create table if not exists gulivideo_user_orc ("
                      + "uploader string, "
                      + "videos int, "
                      + "friends int)"
                      + rowFormat + collectionFormat
                      + "stored as orc";
              stmt.execute(createUserOrcSql);
              System.out.println(res);
          }
      }
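
The heading above also mentions importing data, but the snippet stops at table creation. The following is only a sketch of what the import step might look like: it builds the HiveQL statements as strings so they can be inspected before being passed to `stmt.execute(...)`. The HDFS paths are hypothetical placeholders, and the `insert ... select *` copy assumes that each ORC table is declared with the same columns as its `_ori` counterpart.

```java
import java.util.ArrayList;
import java.util.List;

public class GuliVideoLoadSketch {
    // Hypothetical HDFS locations of the cleaned data; replace with the real paths.
    static final String VIDEO_PATH = "/gulivideo/output/video";
    static final String USER_PATH = "/gulivideo/user";

    /** Builds a statement that moves a file or directory in HDFS into a text table. */
    static String loadSql(String hdfsPath, String table) {
        return "load data inpath '" + hdfsPath + "' into table " + table;
    }

    /** Builds a statement that copies all rows from one table into another. */
    static String copySql(String from, String to) {
        return "insert into table " + to + " select * from " + from;
    }

    /** Statements in the order they would be executed. */
    static List<String> importStatements() {
        List<String> sql = new ArrayList<>();
        sql.add(loadSql(VIDEO_PATH, "gulivideo_ori"));
        sql.add(loadSql(USER_PATH, "gulivideo_user_ori"));
        sql.add(copySql("gulivideo_ori", "gulivideo_orc"));
        sql.add(copySql("gulivideo_user_ori", "gulivideo_user_orc"));
        return sql;
    }

    public static void main(String[] args) {
        // Print the statements; with a live connection each one could be
        // passed to Statement.execute instead.
        for (String s : importStatements()) {
            System.out.println(s + ";");
        }
    }
}
```

Loading into the text tables first and then copying into the ORC tables lets Hive do the format conversion, since `load data` only moves files and does not rewrite them.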
- Guli Video - Requirement (1)
- select videoId,views from gulivideo_orc order by views desc LIMIT 10;
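
To sanity-check what the query above returns without a Hive cluster, its `order by views desc limit 10` semantics can be mirrored on a few in-memory rows. This is an illustrative sketch only: the `Video` class and the sample values are made up, and it is not part of the original project.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TopViewsSketch {
    /** Minimal stand-in for one table row, holding only the two selected columns. */
    static final class Video {
        final String videoId;
        final int views;

        Video(String videoId, int views) {
            this.videoId = videoId;
            this.views = views;
        }
    }

    /** Mirrors "order by views desc limit n". */
    static List<Video> topByViews(List<Video> rows, int n) {
        return rows.stream()
                .sorted(Comparator.comparingInt((Video v) -> v.views).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample rows
        List<Video> sample = Arrays.asList(
                new Video("a", 5), new Video("b", 42), new Video("c", 17));
        for (Video v : topByViews(sample, 2)) {
            System.out.println(v.videoId + "\t" + v.views);
        }
        // prints:
        // b	42
        // c	17
    }
}
```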
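- Guli Video - Requirement (2)
- Video categories: the more videos a category contains, the hotter the category; output each category together with its video count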
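
One possible way to express this requirement is to explode the `category` array so that each video contributes one row per category, then group and count. The sketch below builds such a HiveQL statement as a string and mirrors the counting logic in memory. The class, method names, alias names (`tmp`, `category_name`, `hot`), and sample data are assumptions for illustration, and the query only works on a table that has the `category array<string>` column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CategoryHeatSketch {
    /** Builds a HiveQL query that counts videos per category, hottest first. */
    static String categoryCountSql(String table, int limit) {
        return "select category_name, count(*) as hot "
                + "from " + table + " lateral view explode(category) tmp as category_name "
                + "group by category_name order by hot desc limit " + limit;
    }

    /** In-memory mirror: one input list per video, holding that video's categories. */
    static List<Map.Entry<String, Integer>> countByCategory(List<List<String>> categoriesPerVideo) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> categories : categoriesPerVideo) {
            for (String category : categories) {
                counts.merge(category, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> result = new ArrayList<>(counts.entrySet());
        // Highest count first; the order among equal counts is unspecified
        result.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(categoryCountSql("gulivideo_orc", 10));

        // Made-up sample: four videos and their categories
        List<List<String>> sample = Arrays.asList(
                Arrays.asList("Music", "Comedy"),
                Arrays.asList("Music"),
                Arrays.asList("Sports", "Music"),
                Arrays.asList("Comedy"));
        for (Map.Entry<String, Integer> e : countByCategory(sample)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

A video that belongs to several categories is counted once in each of them, which matches what `lateral view explode` does on the Hive side.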
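- Guli Video - Requirement (3)
- Guli Video - Requirement (4)
- Guli Video - Requirement (5)
- Guli Video - Requirement (6)
- Guli Video - Requirement (7)
Written by someone else and saved here for my own reference; it is very well written.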