- Blog (20)
Original: Personal pom dependency file
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2...
2020-04-07 19:38:30
131
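The preview above is cut off at the version element. A complete hadoop-common dependency block would look like the following sketch; the version number is an assumption, since the original is truncated at "2...":

```xml
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <!-- version is a guess; the original preview is truncated at "2..." -->
        <version>2.7.2</version>
    </dependency>
</dependencies>
```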
Original: HBase API
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;
public class HBaseAPI2 {
    private static Connection connecti
2020-05-26 12:35:51
160
1
Original: sqoop: export from HDFS to a MySQL table
$ bin/sqoop export \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--table staff \
--num-mappers 1 \
--export-dir /user/hive/warehouse/staff_hive \
--input-fields-terminated-by "\t"
2020-05-20 12:20:24
95
Original: sqoop: importing SQL data into HDFS
(1) Full import
$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--table staff \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"
(2) Query import
$ bin/sqoop import
2020-05-20 12:19:15
107
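The query-import command above is truncated. A hedged sketch of what a sqoop `--query` import typically looks like (the SELECT statement and column names are illustrative, not from the original; sqoop itself does require the literal `$CONDITIONS` token in the WHERE clause when `--query` is used):

```shell
$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query 'select name,sex from staff where id <= 1 and $CONDITIONS'
```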
Original: Real-time monitoring of multiple appended files in a directory
a3.sources = r3
a3.sinks = k3
a3.channels = c3
# Describe/configure the source
a3.sources.r3.type = TAILDIR
a3.sources.r3.positionFile = /opt/module/flume-1.7.0/tail_dir.json
a3.sources.r3.filegroups = f1 f2
a3.sources.r3.filegroups.f1 = /opt/module/flume
2020-05-19 23:37:18
264
Original: Real-time monitoring of multiple new files in a directory
a3.sources = r3
a3.sinks = k3
a3.channels = c3
# Describe/configure the source
a3.sources.r3.type = spooldir
a3.sources.r3.spoolDir = /opt/module/flume-1.7.0/upload
a3.sources.r3.fileSuffix = .COMPLETED
a3.sources.r3.fileHeader = true
# Ignore all files ending in .tmp; do not upload them
2020-05-19 23:36:45
241
Original: Real-time monitoring of a single appended file
# Name the components on this agent
a2.sources = r2
a2.sinks = k2
a2.channels = c2
# Describe/configure the source
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /opt/module/datas/A.log
a2.sources.r2.shell = /bin/bash -c
# Describe the sink
a2
2020-05-19 23:35:14
248
Original: Monitoring port data
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use
2020-05-19 23:32:55
314
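Each of the Flume configurations above is launched the same way with the `flume-ng` script. A minimal sketch of starting the netcat example (the conf directory and job file name are assumptions; `--conf`, `--name`, and `--conf-file` are standard flume-ng flags):

```shell
$ bin/flume-ng agent \
--conf conf/ \
--name a1 \
--conf-file job/flume-netcat-logger.conf \
-Dflume.root.logger=INFO,console

# In another terminal, send test events to the netcat source:
$ nc localhost 44444
```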
Original: Custom Source
Import the pom dependency
<dependencies>
    <dependency>
        <groupId>org.apache.flume</groupId>
        <artifactId>flume-ng-core</artifactId>
        <version>1.7.0</version>
    </dependency>
</dependencies>
2020-05-19 23:29:48
137
Original: YARN deployment
./spark-submit \
--class com.bawei.foryk.SparkStreamTraffic01 \
--master yarn --deploy-mode cluster \
--executor-memory 1G \
--executor-cores 3 \
/opt/1710e-1.0-SNAPSHOT.jar \
hdfs://hdp2:8020/checkdir
2020-05-13 09:00:26
132
Original: Spark SQL: send processed data to MySQL
import java.util.Properties
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}
case class Student(name:String,sex:String,age:Int)
object SparkSqlReview01 {
  def main(args: Array[String]): Unit = {
    val spark: Spark
2020-05-12 22:32:15
169
Original: Generate mock data and send it with Kafka
import java.io.PrintWriter
import java.text.SimpleDateFormat
import java.util.{Date, Properties}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
/**
 * Created by xiang on 2020/5/11.
 */
object SparkTrafficMockData {
  def mai
2020-05-12 22:16:26
231
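The generator above is cut off right after `def mai`. As a hedged sketch of the timestamp-formatting step in plain Java (the record layout, a tab-separated timestamp and zero-padded monitor id, is an assumption; the Kafka `ProducerRecord` send is omitted so the snippet stays self-contained):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class TrafficMockRecord {
    // Formats one mock traffic record as "yyyy-MM-dd HH:mm:ss<TAB>monitorId".
    // The field layout is an assumption; the original post's format is truncated.
    static String buildRecord(long millis, int monitorId) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        return sdf.format(new Date(millis)) + "\t" + String.format("%04d", monitorId);
    }

    public static void main(String[] args) {
        // In the original, this string would become the value of a Kafka ProducerRecord.
        System.out.println(buildRecord(System.currentTimeMillis(), 7));
    }
}
```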
Original: Fetch data from Kafka, process it, and send it to MySQL
import java.sql.DriverManager
import com.bawei.stream.StreamFromKafka.createFunc
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.dstream.{DStream, In
2020-05-12 22:14:35
1368
Original: Spark-kafka
package com.bawei.mytest
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.dstream.{DStream, In...
2020-05-06 00:18:30
73
Original: Spark-SQL
package com.bawei.mytest
import org.apache.spark.sql.SparkSession
/**
 * Created by xiang on 2020/5/3.
 */
case class People(id:Int,sex:String,height:Int)
object Test1 {
  def main(args: Array[Str...
2020-05-06 00:17:16
97
Original: App-SQL
Daily active users
select
count(distinct deviceid)
from ext_startup_logs
where appid = 'sdk34734'
and createdatms >= getdaybegin() and createdatms < getdaybegin(1);
Weekly active users
select
count(distinct deviceid)
f...
2020-04-13 08:57:21
165
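Following the daily/weekly pattern above, a monthly-active-users query might look like this sketch; `getmonthbegin()` is a hypothetical UDF analogous to the `getdaybegin()` used in the original queries, not one confirmed by this post:

```sql
-- Monthly active users (getmonthbegin() is a hypothetical UDF,
-- assumed analogous to getdaybegin() above)
select
count(distinct deviceid)
from ext_startup_logs
where appid = 'sdk34734'
and createdatms >= getmonthbegin() and createdatms < getmonthbegin(1);
```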
Original: Log file log4j.properties
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%...
2020-04-07 20:49:20
68
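The ConversionPattern above is truncated mid-pattern. A common complete stdout configuration for log4j 1.x looks like the following; the part after `[%` is a guess at the original, using a widely used pattern:

```properties
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
# %d date, %p level, %c logger category, %m message, %n newline
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
```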
Original: Hive Java API
public class JdbcAip {
    public static void main(String[] args) throws SQLException, ClassNotFoundException {
        Connection connection = getConnection();
        Statement statement = connectio...
2020-04-07 20:44:07
98
Original: HdfsApi
public class HdfsApi {
    public static void main(String[] args) throws Exception {
        Configuration cfg = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.72.13...
2020-04-07 20:39:19
65
Original: MapReduce: merging file contents by primary key
JavaBean
package one;
import org.apache.hadoop.io.WritableComparable;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
public class OrderBean implements WritableComp...
2020-04-07 20:23:36
99