November 2015_duncandai

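RStudio Server Installation Guide

Steps for installing rstudio-server:
I. Install R
1. Install the R core package: download the package from the R website and install it with the following command:
    rpm -ivh R-core-3.2.0-2.el6.x86_64.rpm
2. Check that the installation succeeded: run R and confirm that the R console starts, then enter 1 + 1 at the interactive prompt and check that it is evaluated correctly. [running]slave1@192.168.13.1...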

2015-11-25 10:27:18 627

Introduction to Basic spark sql Usage

Spark can query data in Hive or Impala directly through spark sql.
I. Starting spark-sql
    /data/spark-1.4.0-bin-cdh4/bin/spark-sql --master spark://master:7077 --total-executor-cores 10 --executor-memory 1g --executor-cores 2
Note: /data/spark-1.4.0-bin-cdh4/ is the Spark installation path.
To list the startup options:
    /data/spark-1.4.0-bin-cdh4/bin/spark-sql --help
    --master MASTER_URL      the master URL
    --executor-memory MEM    memory per executor, 1G by default
    --total-executor...

2015-11-23 19:22:43 624

Impala Data Insertion Methods in Detail

Impala is a database that computes in memory; according to its official site, its queries run 100 times faster than Hive's. Data can be inserted into a table in the following ways:
1. insert into
    [slave12:21000] > insert into parquet_snappy select * from raw_text_data;
    Inserted 1000000000 rows in 181.98s
2. CTAS
    [slave12:21000] > create table test_table STORED AS PARQUET as select * from table;
    Query: create table test_table STORED AS PARQUET as select * from table
    +---...

2015-11-19 10:38:47 7916

How to Check the Total File Size of a Hive Table

The total size of a Hive table's files can be obtained quickly with a one-line script:
    $ hadoop fs -ls /user/hive/warehouse/test_table/ds=20151111|awk -F ' ' '{print $5}'|awk '{a+=$1}END{print a}'
    32347122009
This saves adding up the sizes (in bytes) by hand. The following command lists the table's files in detail:
    hadoop fs -ls /user/hive/warehouse/test_table/ds=20151111
Method 2: show the table's total size in GB:
    hadoop fs -du /user/hive/warehouse/test_table|awk ' { SUM += $1 } END { print SUM/(1024*1024*1024) }...

2015-11-12 18:02:17 3785
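
The awk summation used above can be tried without a cluster. The sketch below is only an illustration: the listing in sample_listing is made up (hypothetical file names and sizes) in the column layout that hadoop fs -ls prints, where column 5 is the file size in bytes. On a real cluster the output of hadoop fs -ls would be piped in instead.

```shell
# Hypothetical listing in the layout printed by `hadoop fs -ls`.
# The "Found N items" header has no fifth column, so it adds 0 to the sum.
sample_listing='Found 3 items
-rw-r--r--   3 hive hive  536870912 2015-11-11 10:00 /user/hive/warehouse/test_table/ds=20151111/000000_0
-rw-r--r--   3 hive hive  536870912 2015-11-11 10:00 /user/hive/warehouse/test_table/ds=20151111/000001_0
-rw-r--r--   3 hive hive  536870912 2015-11-11 10:00 /user/hive/warehouse/test_table/ds=20151111/000002_0'

# Sum column 5 (size in bytes) in a single awk pass.
total_bytes=$(printf '%s\n' "$sample_listing" | awk '{ sum += $5 } END { print sum }')

# Same sum, converted to GB.
total_gb=$(printf '%s\n' "$sample_listing" | awk '{ sum += $5 } END { print sum / (1024 * 1024 * 1024) }')

echo "$total_bytes bytes = $total_gb GB"   # prints: 1610612736 bytes = 1.5 GB
```

A single awk pass over column 5 gives the same total as the two chained awk calls in the post.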

How to Defragment MySQL Tables

When rows are frequently inserted into and deleted from a table, many non-contiguous fragments build up. Over time the table takes up a lot of space even though it holds very few rows, which both wastes space and slows queries down. The following approaches address the problem:
1. Defragmenting a MyISAM table
    OPTIMIZE TABLE table_name
2. Defragmenting an InnoDB table
    ALTER TABLE tablename ENGINE=InnoDB
3. Checking a table's fragmentation
    mysql> select ROW_FORMAT,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,MAX_D...

2015-11-10 11:55:44 650

[Original] let and expr in Shell: Usage and Performance Comparison

1. expr computes the value of integer expressions
Format: expr arg
Example: compute (2 + 3) × 4
1. Step by step: compute 2 + 3 first, then multiply the sum by 4
    s=`expr 2 + 3`
    expr $s \* 4
2. In a single step:
    expr `expr 2 + 3 ` \* 4
Note: operators and operands must be separated by spaces. The wildcard character (*), when used as the multiplication operator, must be escaped with \ or quoted with "" or '':
    expr 3 \* 2
    expr 3 "*" 2
    expr 3 '*' 2
Note: ` (the backquote) is on the same key as ~ on the keyboard.
    [fsy@localhost ~]$ s=`expr 2 + 3`
    [fsy@localhost ~]$ echo $s
    5
    [fsy@localhost ~]$ expr $s \* 4
    2...

2015-11-10 11:25:56 985
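
The expr forms described above can be collected into one runnable sketch. It uses $(...) for command substitution, which behaves like the backquotes in the post but is easier to nest; the variable names are chosen here for illustration only.

```shell
# Step by step: compute 2 + 3 first, then multiply the sum by 4.
s=$(expr 2 + 3)
stepwise=$(expr "$s" \* 4)

# Single step: nest one expr call inside another.
nested=$(expr $(expr 2 + 3) \* 4)

# Three equivalent ways of protecting * from filename expansion.
escaped=$(expr 3 \* 2)
double_quoted=$(expr 3 "*" 2)
single_quoted=$(expr 3 '*' 2)

echo "$stepwise $nested $escaped $double_quoted $single_quoted"   # prints: 20 20 6 6 6
```

Leaving out the spaces (for example expr 2+3) makes expr print the string 2+3 instead of evaluating it, which is why the spacing rule matters.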

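Using dirname in Shell

1. Purpose
The dirname command returns the directory part of a given path; if the argument is itself a directory, it returns that directory's parent. The command is rarely used directly on the command line. It is usually used in shell scripts to obtain the directory containing the script file and then change the current directory to it.
    Usage: dirname NAME
      or:  dirname OPTION
    Print NAME with its trailing /...

2015-11-09 19:11:40 1522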
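
A minimal sketch of the behaviour described above. The paths are arbitrary examples, and script_dir is a hypothetical helper named here for illustration; inside a real script the usual form is cd "$(dirname "$0")".

```shell
# dirname strips the last path component.
dir_of_file=$(dirname /usr/bin/sort)   # a file path gives its directory: /usr/bin
dir_of_dir=$(dirname /usr/bin/)        # a directory path gives its parent: /usr
dir_of_bare=$(dirname stdio.h)         # a bare name gives the current directory: .

# Hypothetical helper: print the absolute directory that contains the given path.
# The subshell keeps the caller's working directory unchanged.
script_dir() {
    (cd "$(dirname "$1")" && pwd)
}

echo "$dir_of_file $dir_of_dir $dir_of_bare"   # prints: /usr/bin /usr .
```

Quoting "$(dirname "$0")" keeps the idiom working when the script path contains spaces.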

Converting Rows to Columns and Columns to Rows in Hive

I. Rows to columns
1. Problem
How can Hive turn
    a b 1
    a b 2
    a b 3
    c d 4
    c d 5
    c d 6
into:
    a b 1,2,3
    c d 4,5,6
2. Data
test.txt
    a b 1
    a b 2
    a b 3
    c d 4
    c d 5
    c d 6
3. Answer
1. Create the table
    drop table tmp_jiangzl_test;
    create table tm...

2015-11-06 19:56:05 1507
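
The Hive answer is cut off in this listing, so the sketch below is not the post's query. It only reproduces the expected rows-to-columns result with plain awk on the same sample data, which can be handy for checking a Hive result against a small file. The function name rows_to_columns is an assumption made for this example.

```shell
# Sample data from the post (the contents of test.txt).
input='a b 1
a b 2
a b 3
c d 4
c d 5
c d 6'

# Group on the first two columns and join the third column with commas.
rows_to_columns() {
    awk '{
        key = $1 " " $2
        if (key in vals) {
            vals[key] = vals[key] "," $3
        } else {
            vals[key] = $3
            order[++n] = key    # remember keys in first-seen order
        }
    }
    END {
        for (i = 1; i <= n; i++) print order[i], vals[order[i]]
    }'
}

result=$(printf '%s\n' "$input" | rows_to_columns)
echo "$result"
# prints:
# a b 1,2,3
# c d 4,5,6
```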

[Original] Using array, map, and struct in Hive

Hive provides complex data types:
Structs: the fields of a struct are accessed with dot (.) notation. For example, if column c has type STRUCT{a INT; b INT}, field a is accessed as c.a.
Maps (key-value pairs): a given entry is accessed with ["key name"]. For example, if map M contains the key-value pair group -> gid, the gid value is retrieved with M['group'].
Arrays: all elements of an array have the same type. For example, if array A holds ['a','b','c'], then A[1] is 'b'.
Using struct
Create the table:
    hive> create table student_test(id INT, info s...

2015-11-06 19:40:35 60

Hive WITH Queries and CTAS Usage

Hive can use WITH queries to improve query performance: the WITH clause first reads the data into memory, and the queries that follow can then use it directly.
    with q1 as ( select key from src where key = '5')
    select *
    from q1;
    -- from style
    with q1 as (select * from src where key= '5')
    from q1
    select *;
    -- chaining CTEs
    with q1 as ( select key from q2 where key = '5'),
    q2 as ( select key from src where key = '...

2015-11-04 17:48:02 1849
