June 2019 - Keep Going, Just a Little Longer (坚持，再坚持一下)

Original: Custom MysqlSink component for Structured Streaming

1. JAR dependencies used by the project: <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-sql-kafka-0-10_2.11</artifactId> <version>${spark.version}</versi...

2019-06-27 16:39:17 415

Repost: Stream-stream joins in Structured Streaming

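A use case for stream-stream joins: ad monetization. Imagine you have two streams: a stream of ad impressions (emitted when an ad is shown to a user) and a stream of ad clicks (emitted when a user clicks a displayed ad). To monetize the ads, you have to match each click to the impression that led to it. In other words, you need to join the two streams on a common key, the unique identifier of each ad, which is present in the events of both streams. At a high level, the problem looks as follows. Although the idea is conceptually simple, there are a few core technical challenges to overcome. Using buffering to handle delayed/...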

2019-06-27 16:30:35 1190
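
The matching described above can be illustrated with a small plain-Python model. This is only a sketch of the idea of buffering both sides of a join, not Spark's implementation or API: the class `BufferedJoin`, its method names, and the `max_delay_s` bound are all hypothetical, and event times are assumed to be plain seconds.

```python
from collections import defaultdict


class BufferedJoin:
    """Toy stream-stream join of ad impressions and clicks on ad id.

    Hypothetical model: both sides are buffered because either event
    may arrive first. A click matches an impression when it happens at
    or after the impression and at most max_delay_s seconds later.
    """

    def __init__(self, max_delay_s):
        self.max_delay_s = max_delay_s
        self.impressions = defaultdict(list)  # ad_id -> impression times
        self.clicks = defaultdict(list)       # ad_id -> click times

    def _matches(self, imp_t, click_t):
        return 0 <= click_t - imp_t <= self.max_delay_s

    def on_impression(self, ad_id, imp_t):
        """Buffer an impression and return matches with buffered clicks."""
        self.impressions[ad_id].append(imp_t)
        return [(ad_id, imp_t, c) for c in self.clicks[ad_id]
                if self._matches(imp_t, c)]

    def on_click(self, ad_id, click_t):
        """Buffer a click and return matches with buffered impressions."""
        self.clicks[ad_id].append(click_t)
        return [(ad_id, i, click_t) for i in self.impressions[ad_id]
                if self._matches(i, click_t)]
```

In this toy version the buffers grow without bound. A real streaming engine has to decide when buffered events can be dropped, which is one of the technical challenges the excerpt alludes to.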

Original: Error when running Structured Streaming

Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'timewindow(timestamp, 10000000, 5000000, 0)' due to data type mismatch: argument 1 requires time...

2019-06-24 10:02:20 447
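
The message above reports a data type mismatch on the first argument of the window expression. The two large numbers look like a 10-second window and a 5-second slide expressed in microseconds; that reading is an assumption. The plain-Python sketch below (not Spark API; `sliding_windows` and its parameters are hypothetical) shows the same two ideas: the event time must be a real time value rather than a string, and with a slide shorter than the window each event falls into more than one window.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sliding_windows(event_time, window_s=10, slide_s=5):
    """Return the [start, end) windows that contain event_time.

    event_time must be a timezone-aware datetime; a string is rejected,
    mirroring the type check that the error message complains about.
    """
    if not isinstance(event_time, datetime):
        raise TypeError(
            "event_time must be a datetime, got " + type(event_time).__name__
        )
    seconds = (event_time - EPOCH).total_seconds()
    # Latest window start that is <= the event time.
    start = (seconds // slide_s) * slide_s
    windows = []
    # Walk backwards while the window [start, start + window_s) still
    # covers the event.
    while start > seconds - window_s:
        windows.append((EPOCH + timedelta(seconds=start),
                        EPOCH + timedelta(seconds=start + window_s)))
        start -= slide_s
    return sorted(windows)
```

With the defaults, an event at second 12 belongs to the windows starting at seconds 5 and 10, which is why a sliding-window count can report the same event twice.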

Original: Structured Streaming + Kafka time-window operations

import java.sql.Timestamp
import org.apache.spark.sql.streaming.OutputMode
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
object StructStreamingWindows { def main(args: Array[String...

2019-06-24 09:59:41 1074 3

HIERARCHICAL CLUSTERING SCHEMES

Techniques for partitioning objects into optimally homogeneous groups on the basis of empirical measures of similarity among those objects have received increasing attention in several different fields. This paper develops a useful correspondence between any hierarchical system of such clusters, and a particular type of distance measure. The correspondence gives rise to two methods of clustering that are computationally rapid and invariant under monotonic transformations of the data. In an explicitly defined sense, one method forms clusters that are optimally "connected," while the other forms clusters that are optimally "compact."

2018-10-29
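
The abstract above mentions two clustering methods, one "connected" and one "compact". The sketch below assumes these correspond to what is commonly called single linkage (merge on the minimum inter-cluster distance) and complete linkage (merge on the maximum). It is a naive plain-Python illustration, not the paper's algorithm as published; the function name `agglomerate` and the merge-list format are hypothetical.

```python
def agglomerate(dist, linkage=min):
    """Naive agglomerative clustering on a symmetric distance matrix.

    dist: list of lists with dist[i][j] the distance between i and j.
    linkage: min for single linkage, max for complete linkage.
    Returns a list of merges (level, members_a, members_b) in order.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(dist[i][j]
                            for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d, sorted(clusters[a]), sorted(clusters[b])))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (a, b)] + [merged]
    return merges
```

Because both rules only compare distances and never add or average them, replacing every distance by a monotonically increasing function of itself (for example its square) leaves the sequence of merges unchanged, which is the invariance the abstract refers to.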

Raw dataset for clustering

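Clustering dataset
%% Cluster the bond samples using different methods
% Notes
% Cluster the data with several different methods.
% Distance metrics available for pdist/clustering:
% methods = {'euclidean'; 'seuclidean'; 'cityblock'; 'chebychev'; ...
%            'mahalanobis'; 'minkowski'; 'cosine'; 'correlation'; ...
%            'spearman'; 'hamming'; 'jaccard'};
% Y = pdist(X) returns a row vector of the pairwise distances between the data points
% squareform(Y) returns a square matrix whose (i,j) entry is the distance between point i and point j
% Clustering methods:
% k-means
% kidx = kmeans(bonds, numClust, 'distance', dist_k);
% Hierarchical clustering
% hidx = clusterdata(bonds, 'maxclust', numClust, 'distance', dist_h, 'linkage', link);
% linkage builds the hierarchical cluster tree
% Get the distance matrix; the second argument specifies the distance metric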

2018-10-26
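
The relationship between the condensed distance vector and the square distance matrix mentioned in the comments above can be sketched in plain Python. These are hypothetical stand-ins that borrow the names `pdist` and `squareform` for readability; they are not the MATLAB functions, they support only Euclidean distance, and the pair ordering (0,1), (0,2), ..., (1,2), ... is an assumption of this sketch.

```python
import math


def pdist(points):
    """Euclidean distance for every pair i < j, as a flat list."""
    n = len(points)
    return [
        math.sqrt(sum((p - q) ** 2 for p, q in zip(points[i], points[j])))
        for i in range(n)
        for j in range(i + 1, n)
    ]


def squareform(condensed, n):
    """Expand a condensed distance list into a symmetric n x n matrix."""
    square = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            square[i][j] = square[j][i] = condensed[k]
            k += 1
    return square
```

For n points the condensed form holds n(n-1)/2 values, so it takes roughly half the memory of the square matrix, while the square form is easier to index by point pair.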
