Data Mining: Pentaho Weka

Latest recommended article published on 2023-09-24 00:02:30

zxs421819166

Category: Data Mining. Tags: data mining, classification, attributes, visualization, processing, statistics

Using the Pentaho Weka Experimenter

Using the Pentaho Weka Knowledge Flow

·Define a data mining “process”
·As in the Explorer, all of WEKA's algorithms are available
·Data flows through the process from node to node
·Accommodates both batch-based processing and data streams
·WEKA's command-line interface can also train incremental classifiers on data streams
·Fully multi-threaded
·Accommodates multiple independent “flows” on the same layout
·The Knowledge Flow's Classifier step is multi-threaded
·Models for more than one cross-validation fold can be built in parallel
·Binary and XML-based persistence of flow layouts
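
To make the idea of building models for several cross-validation folds in parallel more concrete, here is a minimal Python sketch. It is not WEKA code: the majority-class "model", the function names `make_folds`, `evaluate_fold` and `parallel_cross_validate`, and the toy labels are all assumptions made purely for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_folds(n_items, k):
    """Split indices 0..n_items-1 into k interleaved folds."""
    return [list(range(i, n_items, k)) for i in range(k)]


def evaluate_fold(labels, test_idx):
    """Train a majority-class model on the other folds, return test accuracy."""
    test_set = set(test_idx)
    train = [y for i, y in enumerate(labels) if i not in test_set]
    majority = Counter(train).most_common(1)[0][0]
    hits = sum(1 for i in test_idx if labels[i] == majority)
    return hits / len(test_idx)


def parallel_cross_validate(labels, k=5, workers=4):
    """Submit all k folds to a thread pool and return the mean accuracy."""
    folds = make_folds(len(labels), k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda f: evaluate_fold(labels, f), folds))
    return sum(scores) / k


# Hypothetical toy data: 8 "yes" instances followed by 2 "no" instances.
labels = ["yes"] * 8 + ["no"] * 2
mean_accuracy = parallel_cross_validate(labels, k=5)
```

Each fold is independent of the others, which is what makes this kind of evaluation easy to parallelize. In CPython, threads will not speed up CPU-bound training, so the sketch only shows the structure of the computation.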

Using the Pentaho Weka Explorer

·“Preprocess” panel

·Load data from various sources (file, SQL database, URL etc.)

·Apply pre-processing “filters” to the data

·Summary statistics & histograms
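
As a rough illustration of the kind of summary the “Preprocess” panel shows for a numeric attribute, the following Python sketch computes basic statistics and an equal-width histogram. The helper names `summarize` and `histogram`, the choice of three bins, and the sample values are assumptions for this example. It does not reproduce how WEKA itself chooses bins.

```python
import statistics


def summarize(values):
    """Return basic statistics for one numeric attribute."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def histogram(values, bins):
    """Count values in `bins` equal-width intervals between min and max."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # avoid zero width for constant data
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical sample values for a "temperature" attribute.
temperatures = [64, 65, 68, 69, 70, 71, 72, 72, 75, 75, 80, 81, 83, 85]
stats = summarize(temperatures)
counts = histogram(temperatures, bins=3)
```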

·“Classify” panel

·Apply classification and regression algorithms

·Evaluate resulting models

·Numerically via statistical estimation

·Graphically through visualization (data and model)
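
Numerical evaluation usually comes down to comparing predicted labels with actual ones. The Python sketch below computes a confusion matrix, accuracy, and per-class precision and recall. The function names and the two label lists are hypothetical, and the sketch makes no claim about how WEKA computes its own evaluation report.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of instances whose predicted label matches the actual one."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def precision_recall(actual, predicted, positive):
    """Precision and recall for one class treated as the positive class."""
    matrix = confusion_matrix(actual, predicted)
    tp = matrix[(positive, positive)]
    fp = sum(n for (a, p), n in matrix.items() if p == positive and a != positive)
    fn = sum(n for (a, p), n in matrix.items() if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical test-set labels and model predictions.
actual = ["yes", "yes", "yes", "no", "no", "yes", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "yes", "yes", "no", "yes"]
acc = accuracy(actual, predicted)
prec, rec = precision_recall(actual, predicted, positive="yes")
```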

·“Cluster” panel

·Apply clustering algorithms to the data

·Visualize the outcome

·Clusters that represent density estimates can be evaluated based on the statistical likelihood of the data
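
To show what evaluating clusters by the likelihood of the data can mean, here is a small Python sketch that treats each cluster as a one-dimensional Gaussian and computes the average log-likelihood of the data under the resulting mixture. The `(weight, mean, stdev)` representation, the function names, and the data points are assumptions for illustration only.

```python
import math


def gaussian_pdf(x, mean, stdev):
    """Density of a one-dimensional normal distribution at x."""
    z = (x - mean) / stdev
    return math.exp(-0.5 * z * z) / (stdev * math.sqrt(2 * math.pi))


def log_likelihood(data, clusters):
    """Average log-likelihood of the data under a Gaussian mixture.

    `clusters` is a list of (weight, mean, stdev) tuples whose weights sum to 1.
    """
    total = 0.0
    for x in data:
        density = sum(w * gaussian_pdf(x, m, s) for w, m, s in clusters)
        total += math.log(density)
    return total / len(data)


# Hypothetical data with two obvious groups, near 1 and near 5.
data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
two_clusters = [(0.5, 1.0, 0.2), (0.5, 5.0, 0.2)]
one_cluster = [(1.0, 3.0, 2.0)]
better = log_likelihood(data, two_clusters)
worse = log_likelihood(data, one_cluster)
```

A clustering that fits the data well assigns it a higher log-likelihood, so here the two-cluster model scores above the single broad cluster.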

·“Associate” panel

·Learn association rules for market-basket type analysis
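
Market-basket rules are usually judged by their support and confidence. The Python sketch below computes both for a single rule over a handful of made-up baskets. The function names and the transactions are hypothetical, and the sketch only scores a given rule rather than searching for rules the way an association learner does.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
rule_support = support(baskets, {"diapers", "beer"})
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
```

For the rule "diapers implies beer", three of the five baskets contain both items and four contain diapers, so the support is 3/5 and the confidence is 3/4.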

·“Select attributes” panel

·Mix and match algorithms for evaluating the utility of attributes and sets of attributes with different search methods
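
The pairing of an attribute evaluator with a search method can be sketched in Python as two interchangeable functions. Here the evaluator is information gain and the "search" simply ranks attributes one at a time. Both functions, their names, and the toy rows are assumptions for illustration, not WEKA's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Evaluator: reduction in class entropy after splitting on `attribute`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def rank_attributes(rows, attributes, target, evaluator):
    """Search method: score each attribute alone and sort, best first."""
    scored = [(evaluator(rows, a, target), a) for a in attributes]
    return [a for score, a in sorted(scored, reverse=True)]


# Hypothetical nominal data with "play" as the class attribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
ranking = rank_attributes(rows, ["outlook", "windy"], "play", information_gain)
```

Because `rank_attributes` takes the evaluator as a parameter, a different scoring function can be swapped in without touching the search, which is the "mix and match" idea in miniature.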

·“Visualize” panel

·Color-coded scatter plot matrix of the data

·Select, enlarge, zoom in etc.
