ETL
栗子ma
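[Spark] Extracting, Transforming, and Selecting Features: Spark Machine Learning
Extracting, transforming and selecting features - spark.ml. This section covers algorithms for working with features, roughly divided into: Extraction: extracting features from raw data. Transformation: scaling, converting, or modifying features. Selection: selecting a subset from a larger set of features...
Translation · 2018-06-06 01:01:31 · 1087 views · 0 comments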
[Spark] TF-IDF
TF-IDF. Term frequency-inverse document frequency (TF-IDF) is a feature vectorization method widely used in text mining to reflect the importance of a term to a document in the corpus. Denote a term b...
Translation · 2018-06-06 02:09:58 · 399 views · 0 comments