hadoop
HBase command-line operations
hbase(main):001:0> create 'user_info',{NAME=>'base_info',VERSIONS=>3},{NAME=>'extra_info'}
hbase(main):001:0> put 'user_info', 'rk001', 'base_info:id', '1'
hbase(main):001:0> put 'user_info', 'rk0012', …

Original · 2017-09-13 18:09:53 · 411 views · 0 comments
storm.yaml configuration file
The main storm.yaml settings:

| Setting | Description |
| --- | --- |
| storm.zookeeper.servers | List of ZooKeeper servers |
| storm.zookeeper.port | ZooKeeper connection port |
| storm.local.dir | Local filesystem directory used by Storm (must exist and be readable/writable by the Storm processes) |
| storm.cluster.mode | St… |

Original · 2017-10-17 22:42:35 · 367 views · 0 comments
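To make the settings concrete, here is a minimal storm.yaml sketch in the file's own YAML format; the host names, directory, and slot ports are placeholder assumptions, not values from the post:

```yaml
# Minimal storm.yaml sketch; hosts, directory and ports are placeholders
storm.zookeeper.servers:
  - "os-1"
  - "os-2"
  - "os-3"
storm.zookeeper.port: 2181
storm.local.dir: "/home/os/storm/data"
nimbus.seeds: ["os-1"]
supervisor.slots.ports:
  - 6700
  - 6701
```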
Storm WordCount
WordCount spout:

package storm.demo;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.a…

Original · 2017-10-18 00:52:01 · 274 views · 0 comments
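The preview cuts off inside the spout. As a rough sketch of how such a topology is typically wired together (WordCountSpout, SplitBolt, and CountBolt are hypothetical stand-ins for the post's truncated classes, not its actual code):

```scala
import org.apache.storm.{Config, LocalCluster}
import org.apache.storm.topology.TopologyBuilder
import org.apache.storm.tuple.Fields

object WordCountTopology {
  def main(args: Array[String]): Unit = {
    val builder = new TopologyBuilder
    // WordCountSpout, SplitBolt and CountBolt are hypothetical stand-ins
    // for the post's truncated spout/bolt classes.
    builder.setSpout("word-spout", new WordCountSpout, 1)
    builder.setBolt("split-bolt", new SplitBolt, 2).shuffleGrouping("word-spout")
    builder.setBolt("count-bolt", new CountBolt, 2).fieldsGrouping("split-bolt", new Fields("word"))

    val cluster = new LocalCluster   // local test run, not a cluster submit
    cluster.submitTopology("wordcount", new Config, builder.createTopology())
    Thread.sleep(10000)              // let the topology run for a while
    cluster.shutdown()
  }
}
```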
Hadoop HA on YARN: cluster configuration
Cluster setup. Because the number of servers is limited, each machine here runs quite a few processes:

| Host | Installed software | Running processes |
| --- | --- | --- |
| hadoop001 | Hadoop, Zookeeper | NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, QuorumPeerMa… |

Original · 2017-10-19 12:00:32 · 275 views · 0 comments
Spark Streaming
package storm.stream
import org.apache.spark.streaming.dstream.{DStream, InputDStream, ReceiverInputDStream}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{Spa…

Original · 2017-10-19 21:31:32 · 253 views · 0 comments
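The imports suggest a DStream-based word count. A minimal runnable sketch (the host reuses the cluster address from the other posts; port 9999 is an assumption):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))   // 5-second micro-batches
    // Host mirrors the cluster address used elsewhere in these posts; port 9999 is an assumption
    val lines = ssc.socketTextStream("192.168.8.128", 9999)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```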
Spark transformation operators
package os.Streaming
import org.apache.spark.rdd.RDD
import org.apache.spark.{Partition, SparkConf, SparkContext}
import org.junit
import org.junit.{Before, Test}
import scala.collection.mutable
class St…

Original · 2017-10-15 17:21:48 · 382 views · 0 comments
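A small self-contained sketch of the common transformations this kind of post covers, map, filter, and reduceByKey; the sample data is illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TransformationDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("transformations").setMaster("local"))
    val nums  = sc.makeRDD(1 to 10)
    val pairs = sc.makeRDD(Seq(("a", 1), ("b", 2), ("a", 3)))
    // Transformations are lazy; nothing runs until an action such as collect()
    println(nums.map(_ * 2).collect().toList)          // List(2, 4, ..., 20)
    println(nums.filter(_ % 2 == 0).collect().toList)  // List(2, 4, 6, 8, 10)
    println(pairs.reduceByKey(_ + _).collect().toList) // List((a,4), (b,2)), order may vary
    sc.stop()
  }
}
```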
Spark: setting checkpoints
cache: caches the data in memory

@Test
def test1(): Unit = {
  val rdd1: RDD[String] = sc.textFile("hdfs://192.168.8.128:9000/test/README.txt")
  // cache the RDD
  val rdd2: RDD[String] = rdd1.cache()
  …

Original · 2017-10-15 19:38:51 · 1201 views · 0 comments
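A sketch of how cache and checkpoint combine; the paths reuse the preview's cluster address and are otherwise assumptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("checkpoint").setMaster("local"))
    // Paths reuse the preview's cluster address; adjust to your own HDFS
    sc.setCheckpointDir("hdfs://192.168.8.128:9000/test/checkpoint")
    val rdd = sc.textFile("hdfs://192.168.8.128:9000/test/README.txt")
    rdd.cache()          // keep it in memory so the checkpoint write does not recompute the lineage
    rdd.checkpoint()     // only marks the RDD; the data is written on the first action
    println(rdd.count()) // this action triggers both the caching and the checkpoint
    sc.stop()
  }
}
```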
Spark action operators
first()  count()  reduce()

@Test
def test3(): Unit = {
  val rdd1: RDD[(String, Int)] = sc.makeRDD(Array(("A",1),("B",2),("C",3)))
  println(rdd1.first())  // returns the first element: ("A",1)
  print…

Original · 2017-10-15 20:29:37 · 320 views · 0 comments
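A runnable sketch of the main actions, using the preview's sample data:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ActionDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("actions").setMaster("local"))
    val rdd = sc.makeRDD(Array(("A", 1), ("B", 2), ("C", 3)))
    println(rdd.first())              // (A,1): the first element
    println(rdd.count())              // 3: the number of elements
    // reduce folds all elements pairwise; for tuples this concatenates keys and sums values
    println(rdd.reduce((a, b) => (a._1 + b._1, a._2 + b._2)))   // (ABC,6), order not guaranteed
    println(rdd.take(2).toList)       // the first two elements
    println(rdd.collect().toList)     // all elements, pulled to the driver
    sc.stop()
  }
}
```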
SparkSQL basic operations
Click Edit Configurations, select this project on the left, and enter "-Dspark.master=local" under VM options on the right, which tells the program to run locally in a single thread.

new.txt:
001,goods0001,10,20.00
002,goods0001,10,20.00
003,goods0002,50,30.00
004,goods0001,10,30.00
005,goods0003,90,10.…

Original · 2017-10-16 15:10:25 · 1053 views · 0 comments
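A hedged sketch of querying the new.txt order data with SparkSQL; the column meanings (order id, goods id, quantity, price) are inferred from the sample rows, not stated in the preview:

```scala
import org.apache.spark.sql.SparkSession

// Column meanings (order id, goods id, quantity, price) are inferred from the sample rows
case class Order(id: String, goods: String, amount: Int, price: Double)

object OrderSql {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("OrderSql").master("local").getOrCreate()
    import spark.implicits._
    val orders = spark.sparkContext.textFile("new.txt")
      .map(_.split(","))
      .map(f => Order(f(0), f(1), f(2).toInt, f(3).toDouble))
      .toDF()
    orders.createOrReplaceTempView("orders")
    spark.sql("SELECT goods, SUM(amount) AS total FROM orders GROUP BY goods").show()
    spark.stop()
  }
}
```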
Working with SparkSQL result sets
Raw data:
zhang san,15
li si,15
wang wu,20
zhao liu,22
zhang san,42
li wu,22
li si,20
hello world,18
hello world,18

/**
 * Create an RDD of Person objects from a text file and convert it to a DataFrame
 */
@Test
def test4(): Unit…

Original · 2017-10-16 16:11:32 · 2383 views · 0 comments
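A sketch of the reflection-based approach the comment describes, building a DataFrame from an RDD of Person objects; the input file name is an assumption:

```scala
import org.apache.spark.sql.SparkSession

// Field names are inferred from the "name,age" sample lines
case class Person(name: String, age: Int)

object PersonDf {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("PersonDf").master("local").getOrCreate()
    import spark.implicits._
    val people = spark.sparkContext.textFile("people.txt")   // file name is an assumption
      .map(_.split(","))
      .map(f => Person(f(0), f(1).trim.toInt))
      .toDF()                                                // schema inferred from the case class
    people.createOrReplaceTempView("people")
    spark.sql("SELECT name, age FROM people WHERE age BETWEEN 15 AND 22").show()
    spark.stop()
  }
}
```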
SparkSQL: specifying a schema programmatically
def test5(): Unit = {
  val ss: SparkSession = SparkSession.builder().appName("Spark SQL basic example")
    .config("spark.some.config.option", "some-value").getOrCreate()
  import ss.…

Original · 2017-10-16 16:33:46 · 740 views · 0 comments
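A sketch of the programmatic-schema pattern the title refers to, using StructType and Row; the file name and column names are assumptions:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ProgrammaticSchema {
  def main(args: Array[String]): Unit = {
    val ss = SparkSession.builder().appName("Spark SQL basic example").master("local").getOrCreate()
    // Build the schema by hand instead of inferring it from a case class
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = true),
      StructField("age", IntegerType, nullable = true)))
    val rowRdd = ss.sparkContext.textFile("people.txt")      // file name is an assumption
      .map(_.split(","))
      .map(f => Row(f(0), f(1).trim.toInt))
    val df = ss.createDataFrame(rowRdd, schema)
    df.createOrReplaceTempView("people")
    ss.sql("SELECT * FROM people").show()
    ss.stop()
  }
}
```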
Spark WordCount in Java
package os.unix;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.…

Original · 2017-09-29 11:59:24 · 298 views · 0 comments
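The truncated preview shows the Java-API imports. For comparison, here is a minimal Scala sketch of the same word-count pipeline, not the post's Java code:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ScalaWordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("wordcount").setMaster("local"))
    sc.textFile(args(0))          // input path
      .flatMap(_.split(" "))      // line -> words
      .map((_, 1))                // word -> (word, 1)
      .reduceByKey(_ + _)         // sum counts per word
      .saveAsTextFile(args(1))    // output path
    sc.stop()
  }
}
```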
ZooKeeper basic operations
package os.zk.demo;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.List;
import org.apache.hadoop.ha.protocolPB.ZKFCProtocolClientSideTranslatorPB;
import org.…

Original · 2017-09-13 08:45:41 · 213 views · 0 comments
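A minimal Scala sketch of basic ZooKeeper client calls: connect, create, read, list. The quorum string mirrors the os-1..os-3 hosts used in the HBase entries, and the /demo znode is illustrative:

```scala
import java.util.concurrent.CountDownLatch
import org.apache.zookeeper.{CreateMode, WatchedEvent, Watcher, ZooDefs, ZooKeeper}
import org.apache.zookeeper.Watcher.Event.KeeperState

object ZkDemo {
  def main(args: Array[String]): Unit = {
    val connected = new CountDownLatch(1)
    // Quorum string mirrors the HBase entries' os-1..os-3 hosts
    val zk = new ZooKeeper("os-1:2181,os-2:2181,os-3:2181", 30000, new Watcher {
      override def process(event: WatchedEvent): Unit =
        if (event.getState == KeeperState.SyncConnected) connected.countDown()
    })
    connected.await()   // the handle is usable only after the session connects
    // Assumes /demo does not exist yet; create throws NodeExistsException otherwise
    zk.create("/demo", "hello".getBytes("UTF-8"), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT)
    println(new String(zk.getData("/demo", false, null), "UTF-8"))
    zk.getChildren("/", false).forEach(c => println(c))
    zk.close()
  }
}
```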
HBase client API: creating a table
package os.hbase.index;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
impor…

Original · 2017-09-13 18:39:01 · 382 views · 0 comments
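A Scala sketch of creating the same user_info table as the shell entry above, via the HBase 1.x client API the imports point at; the quorum hosts mirror the other entries:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

object CreateTable {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "os-1:2181,os-2:2181,os-3:2181")
    val conn = ConnectionFactory.createConnection(conf)
    val admin = conn.getAdmin
    // Same table as the shell entry: base_info keeps 3 versions, plus extra_info
    val desc = new HTableDescriptor(TableName.valueOf("user_info"))
    desc.addFamily(new HColumnDescriptor("base_info").setMaxVersions(3))
    desc.addFamily(new HColumnDescriptor("extra_info"))
    if (!admin.tableExists(TableName.valueOf("user_info"))) admin.createTable(desc)
    admin.close()
    conn.close()
  }
}
```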
HBase client query API
/*
 * Query data
 */
@Test
public void testGet() throws IOException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.zookeeper.quorum", "os-1:2181,os-2:2181,o…

Original · 2017-09-13 19:55:37 · 494 views · 0 comments
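A Scala sketch of the two basic read paths, a point Get and a full Scan; the table and row key follow the earlier entries:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Scan}
import org.apache.hadoop.hbase.util.Bytes

object QueryDemo {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "os-1:2181,os-2:2181,os-3:2181")
    val conn = ConnectionFactory.createConnection(conf)
    val table = conn.getTable(TableName.valueOf("user_info"))
    // Point lookup by row key
    val result = table.get(new Get(Bytes.toBytes("rk001")))
    println(Bytes.toString(result.getValue(Bytes.toBytes("base_info"), Bytes.toBytes("id"))))
    // Full-table scan
    val scanner = table.getScanner(new Scan())
    scanner.forEach(r => println(Bytes.toString(r.getRow)))
    scanner.close()
    table.close()
    conn.close()
  }
}
```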
HBase client API: insert, update, delete
package os.hbase.index;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hb…

Original · 2017-09-13 19:24:27 · 350 views · 0 comments
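A Scala sketch of the write-side operations, Put and Delete; the table and row keys follow the earlier entries:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Put}
import org.apache.hadoop.hbase.util.Bytes

object MutateDemo {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "os-1:2181,os-2:2181,os-3:2181")
    val conn = ConnectionFactory.createConnection(conf)
    val table = conn.getTable(TableName.valueOf("user_info"))
    // Insert/update: a Put on an existing cell simply writes a newer version
    val put = new Put(Bytes.toBytes("rk001"))
    put.addColumn(Bytes.toBytes("base_info"), Bytes.toBytes("name"), Bytes.toBytes("zhangsan"))
    table.put(put)
    // Delete an entire row
    table.delete(new Delete(Bytes.toBytes("rk0012")))
    table.close()
    conn.close()
  }
}
```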
hadoop
Installing Hadoop: first install the JDK, then download the Hadoop package and unpack it.

[os@localhost conf]$ pwd
/home/os/hadoop-1.2.1/conf
vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_144
# edit core-site.xml
[os@localhost conf]$ cat core-site.xm…

Original · 2017-09-06 22:37:00 · 201 views · 0 comments
SparkSQL WordCount
Raw data:
hello word
sord RDD
RDD hello
hello world
hello c++
hello world
world ni hao

Output:
+-----+------+
| word|counts|
+-----+------+
|hello|     5|
|world|     3|
|  RDD|     2|
|  hao|     1|
| sord|…

Original · 2017-09-27 12:33:54 · 1016 views · 0 comments
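A hedged SparkSQL sketch that produces a word/counts table like the output above; the input file name is an assumption:

```scala
import org.apache.spark.sql.SparkSession

object SqlWordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SqlWordCount").master("local").getOrCreate()
    import spark.implicits._
    spark.read.textFile("words.txt")          // file name is an assumption
      .flatMap(_.split(" "))                  // one word per row, in column "value"
      .withColumnRenamed("value", "word")
      .groupBy("word").count()
      .withColumnRenamed("count", "counts")
      .orderBy($"counts".desc)
      .show()
    spark.stop()
  }
}
```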
hadoop
./hadoop-daemon.sh start namenode
./hadoop-daemon.sh start datanode
./hadoop-daemon.sh start secondarynamenode
# resourcemanager nodemanager
./yarn-daemon.sh start resourcemanager
./yarn-daemon.sh s…

Original · 2017-09-11 19:47:28 · 168 views · 0 comments
Some HDFS operations
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockL…

Original · 2017-09-11 19:50:26 · 263 views · 0 comments
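A Scala sketch of common HDFS client calls (mkdirs, upload, recursive listing); the NameNode address and user name mirror the posts' setup and are otherwise assumptions:

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsDemo {
  def main(args: Array[String]): Unit = {
    // NameNode address and user name mirror the posts' setup; adjust to your cluster
    val fs = FileSystem.get(new URI("hdfs://192.168.8.128:9000"), new Configuration(), "os")
    fs.mkdirs(new Path("/test/demo"))
    fs.copyFromLocalFile(new Path("README.txt"), new Path("/test/demo/README.txt"))
    val files = fs.listFiles(new Path("/test"), true)   // true = recurse into subdirectories
    while (files.hasNext) {
      val status = files.next()
      println(s"${status.getPath}  ${status.getLen} bytes  blockSize=${status.getBlockSize}")
    }
    fs.close()
  }
}
```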
wordCount MapReduce
map:

package os.unix.cn;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
/*
 * KEYIN: key type of the input KV pair (the starting offset of the line…

Original · 2017-09-11 22:57:49 · 211 views · 0 comments
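The preview shows the Java Mapper. A compact Scala port of the classic WordCount job (mapper, reducer, driver) might look like the following; this is a port, not the post's original code:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import scala.collection.JavaConverters._

// KEYIN = LongWritable (byte offset of the line), VALUEIN = Text (the line itself)
class WcMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val word = new Text()
  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit =
    value.toString.split(" ").foreach { w => word.set(w); ctx.write(word, one) }
}

class WcReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit =
    ctx.write(key, new IntWritable(values.asScala.map(_.get).sum))
}

object WordCountJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "word count")
    job.setJarByClass(classOf[WcMapper])
    job.setMapperClass(classOf[WcMapper])
    job.setReducerClass(classOf[WcReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```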
Traffic-aggregation MapReduce
FlowCountMapper.java:

package os.os.flowcount;
import java.io.IOException;
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org…

Original · 2017-09-12 08:11:03 · 310 views · 0 comments
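A traffic-aggregation job like this typically revolves around a custom Writable bean. A minimal Scala sketch of such a bean; the field names upFlow/downFlow are assumptions, not the post's code:

```scala
import java.io.{DataInput, DataOutput}
import org.apache.hadoop.io.Writable

// A custom Writable holding upstream/downstream traffic for one phone number.
// Field names are assumptions based on the post's topic.
class FlowBean(var upFlow: Long, var downFlow: Long) extends Writable {
  def this() = this(0L, 0L)   // Hadoop needs a no-arg constructor for deserialization

  def sumFlow: Long = upFlow + downFlow

  // Serialization order must match deserialization order exactly
  override def write(out: DataOutput): Unit = {
    out.writeLong(upFlow)
    out.writeLong(downFlow)
  }

  override def readFields(in: DataInput): Unit = {
    upFlow = in.readLong()
    downFlow = in.readLong()
  }

  override def toString: String = s"$upFlow\t$downFlow\t$sumFlow"
}
```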
Partitioned traffic-aggregation MapReduce
ProviceCountMapper.java:

package os.bigdata.provincflowcount;
import java.io.IOException;
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.T…

Original · 2017-09-12 09:42:19 · 397 views · 0 comments
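Per-province partitioning hinges on a custom Partitioner. A hedged Scala sketch; the value type and the prefix table here are illustrative assumptions, not the post's code:

```scala
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapreduce.Partitioner

// Route records to reducers by the province prefix of the phone number.
// The prefix-to-partition table is a made-up assumption for illustration.
class ProvincePartitioner extends Partitioner[Text, IntWritable] {
  private val provinces = Map("136" -> 0, "137" -> 1, "138" -> 2, "139" -> 3)
  override def getPartition(key: Text, value: IntWritable, numPartitions: Int): Int =
    provinces.getOrElse(key.toString.take(3), 4)   // everything else goes to partition 4
}

// Wiring in the driver (the job needs as many reducers as partitions):
//   job.setPartitionerClass(classOf[ProvincePartitioner])
//   job.setNumReduceTasks(5)
```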
Scala implicit conversions
package test.zhuanhuan
import java.io.File
import scala.io.Source

// implicit conversion via an implicit def
object ImplicitDefDemo {
  object MyImplicitTypeConversion {
    implicit def strToInt(str: String) =…

Original · 2017-10-16 22:26:59 · 325 views · 0 comments
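A self-contained sketch of the two implicit-conversion styles the truncated code hints at: strToInt mirrors the preview, while RichFile is a hypothetical extension class added for illustration:

```scala
object ImplicitSketch {
  object MyImplicitTypeConversion {
    // implicit def: the compiler inserts strToInt wherever a String appears but an Int is needed
    implicit def strToInt(str: String): Int = str.toInt
  }

  // implicit class: adds a readAll method to String (hypothetical example, not the post's code)
  implicit class RichFile(path: String) {
    def readAll: String = scala.io.Source.fromFile(path).mkString
  }

  def square(n: Int): Int = n * n

  def main(args: Array[String]): Unit = {
    import MyImplicitTypeConversion._
    println(square("6"))   // "6" is implicitly converted to 6, prints 36
    // println("README.txt".readAll)   // would read a file; the path is an assumption
  }
}
```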