java weka.classifiers.trees.J48 -t data/weather.arff
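The form is java followed by the fully qualified class name; -t indicates that the next argument is the name of the training dataset.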
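How a flag such as -t is paired with the value that follows it can be sketched in a few lines of Java. The getOption helper below is hypothetical and written only for illustration; it is not Weka's own option parser.

```java
// Hypothetical sketch: look up the value that follows a command-line flag.
class OptionSketch {

    /** Returns the argument following "-<flag>", or null if the flag is absent or has no value. */
    static String getOption(String flag, String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("-" + flag)) {
                return args[i + 1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The same arguments as in the command above, without "java" and the class name.
        String[] demo = {"-t", "data/weather.arff"};
        System.out.println("training file: " + getOption("t", demo));
    }
}
```

The comparison is case-sensitive, so in this sketch -t and -T are looked up independently.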
java weka.classifiers.trees.J48 -h
This prints the meaning of each command-line option:
-h or -help
    Output help information.
-synopsis or -info
    Output synopsis for classifier (use in conjunction with -h)
-t
    Sets training file.
-T
    Sets test file. If missing, a cross-validation will be performed
    on the training data.
-c
    Sets index of class attribute (default: last).
-x
    Sets number of folds for cross-validation (default: 10).
-no-cv
    Do not perform any cross validation.
-force-batch-training
    Always train classifier in batch mode, never incrementally.
-split-percentage
    Sets the percentage for the train/test set split, e.g., 66.
-preserve-order
    Preserves the order in the percentage split.
-s
    Sets random number seed for cross-validation or percentage split
    (default: 1).
-m
    Sets file with cost matrix.
-disable
    Comma separated list of metric names not to print to the output.
    Available metrics:
    Correct,Incorrect,Kappa,Total cost,Average cost,KB relative,KB information,
    Correlation,Complexity 0,Complexity scheme,Complexity improvement,
    MAE,RMSE,RAE,RRSE,Coverage,Region size,TP rate,FP rate,Precision,Recall,
    F-measure,MCC,ROC area,PRC area
-l
    Sets model input file. In case the filename ends with '.xml',
    a PMML file is loaded or, if that fails, options are loaded
    from the XML file.
-d
    Sets model output file. In case the filename ends with '.xml',
    only the options are saved to the XML file, not the model.
-v
    Outputs no statistics for training data.
-o
    Outputs statistics only, not the classifier.
-i
    Outputs detailed information-retrieval statistics for each class.
-k
    Outputs information-theoretic statistics.
-classifications "weka.classifiers.evaluation.output.prediction.AbstractOutput + options"
    Uses the specified class for generating the classification output.
    E.g.: weka.classifiers.evaluation.output.prediction.PlainText
-p range
    Outputs predictions for test instances (or the train instances if
    no test instances provided and -no-cv is used), along with the
    attributes in the specified range (and nothing else).
    Use '-p 0' if no attributes are desired.
    Deprecated: use "-classifications ..." instead.
-distribution
    Outputs the distribution instead of only the prediction
    in conjunction with the '-p' option (only nominal classes).
    Deprecated: use "-classifications ..." instead.
-r
    Only outputs cumulative margin distribution.
-z
    Only outputs the source representation of the classifier,
    giving it the supplied name.
-g
    Only outputs the graph representation of the classifier.
-xml filename | xml-string
    Retrieves the options from the XML-data instead of the command line.
-threshold-file
    The file to save the threshold data to.
    The format is determined by the extensions, e.g., '.arff' for ARFF
    format or '.csv' for CSV.
-threshold-label
    The class label to determine the threshold data for
    (default is the first label)
Options specific to weka.classifiers.trees.J48:
-U
    Use unpruned tree.
-O
    Do not collapse tree.
-C
    Set confidence threshold for pruning.
    (default 0.25)
-M
    Set minimum number of instances per leaf.
    (default 2)
-R
    Use reduced error pruning.
-N
    Set number of folds for reduced error
    pruning. One fold is used as pruning set.
    (default 3)
-B
    Use binary splits only.
-S
    Don't perform subtree raising.
-L
    Do not clean up after the tree has been built.
-A
    Laplace smoothing for predicted probabilities.
-J
    Do not use MDL correction for info gain on numeric attributes.
-Q
    Seed for random data shuffling (default 1).
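To make the interplay of -split-percentage, -preserve-order and -s more concrete, here is a rough sketch of a percentage split over instance indices. The class and method names are invented for this note, and rounding the training-set size with Math.round is an assumption; Weka's actual evaluation code may differ in detail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a train/test percentage split; not Weka's implementation.
class SplitSketch {

    /** Returns two index lists: element 0 is the training part, element 1 the test part. */
    static List<List<Integer>> split(int numInstances, double percentage, long seed, boolean preserveOrder) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            indices.add(i);
        }
        // Without -preserve-order the data is shuffled first, using the seed given by -s.
        if (!preserveOrder) {
            Collections.shuffle(indices, new Random(seed));
        }
        // Assumption: the training size is the rounded percentage of all instances.
        int trainSize = (int) Math.round(numInstances * percentage / 100.0);
        List<List<Integer>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(indices.subList(0, trainSize)));
        parts.add(new ArrayList<>(indices.subList(trainSize, numInstances)));
        return parts;
    }

    public static void main(String[] args) {
        // 14 instances, 66 percent for training, seed 1, original order kept.
        List<List<Integer>> parts = split(14, 66, 1, true);
        System.out.println("train: " + parts.get(0).size() + ", test: " + parts.get(1).size());
    }
}
```

With 14 instances and a 66 percent split, this sketch puts 9 instances in the training part and 5 in the test part; changing the seed changes which instances land where, but not the sizes.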
weka.core
The core Weka package; almost every other class depends on it.
Key classes in the core package:
Attribute: holds an attribute's name, its type, and, in the case of a nominal or string attribute, its possible values
Instance: contains the attribute values of a particular instance
Instances: holds an ordered set of instances, in other words, a dataset
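The relationship between these three classes can be illustrated with a small stand-alone model. The Sketch* classes below are simplified stand-ins written for this note, not Weka's own classes, and the two rows are illustrative values. Like Weka, the sketch stores every value as a double, with a nominal value kept as the index of its label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for Attribute, Instance and Instances; not the Weka API.
class DataModelSketch {

    /** An attribute: a name plus, for nominal attributes, the list of allowed labels. */
    static final class SketchAttribute {
        final String name;
        final List<String> labels; // null means the attribute is numeric

        SketchAttribute(String name, List<String> labels) {
            this.name = name;
            this.labels = labels;
        }

        boolean isNominal() {
            return labels != null;
        }
    }

    /** An instance: one double per attribute (a label index for nominal attributes). */
    static final class SketchInstance {
        final double[] values;

        SketchInstance(double... values) {
            this.values = values;
        }
    }

    /** A dataset: the attribute definitions plus an ordered list of instances. */
    static final class SketchInstances {
        final String relation;
        final List<SketchAttribute> attributes;
        final List<SketchInstance> rows = new ArrayList<>();

        SketchInstances(String relation, List<SketchAttribute> attributes) {
            this.relation = relation;
            this.attributes = attributes;
        }

        void add(SketchInstance instance) {
            if (instance.values.length != attributes.size()) {
                throw new IllegalArgumentException("instance does not match the attribute list");
            }
            rows.add(instance);
        }

        /** Translates a stored double back into a label or a number string. */
        String valueAsString(int row, int attribute) {
            SketchAttribute a = attributes.get(attribute);
            double v = rows.get(row).values[attribute];
            return a.isNominal() ? a.labels.get((int) v) : String.valueOf(v);
        }
    }

    static SketchInstances weatherSample() {
        SketchAttribute outlook = new SketchAttribute("outlook", Arrays.asList("sunny", "overcast", "rainy"));
        SketchAttribute temperature = new SketchAttribute("temperature", null);
        SketchInstances data = new SketchInstances("weather", Arrays.asList(outlook, temperature));
        data.add(new SketchInstance(0, 85)); // sunny, 85
        data.add(new SketchInstance(1, 83)); // overcast, 83
        return data;
    }

    public static void main(String[] args) {
        SketchInstances data = weatherSample();
        System.out.println(data.valueAsString(0, 0) + ", " + data.valueAsString(0, 1));
    }
}
```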
weka.classifiers
Contents: implementations of most of the algorithms for classification and numeric prediction
Key abstract class: Classifier, which defines the general structure of any scheme for classification or numeric prediction (in Weka 3.7 and later, Classifier is an interface and AbstractClassifier is the abstract base class)
It has three core methods: buildClassifier(), classifyInstance(), distributionForInstance()
An example of a class that extends it:
weka.classifiers.trees.DecisionStump
It overrides distributionForInstance()
getRevision(): simply returns the revision number of the classifier; used by Weka maintainers when diagnosing and debugging problems reported by users
globalInfo(): returns a string describing the classifier, which is shown along with the scheme's options in the generic object editor
toString(): returns a textual representation of the classifier
toSource(): used to obtain a source code representation of the learned classifier
main(): called when you ask for a decision stump from the command line; it is the entry point for running the class
getCapabilities(): called by the generic object editor to provide information about the capabilities of a learning scheme
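A scheme such as DecisionStump can override only distributionForInstance() because a class prediction can be derived from the distribution by picking the most probable class. The sketch below shows that idea with simplified stand-ins: the class names are invented for this note, training data is reduced to an array of class-label indices, and the learner merely predicts the class frequencies, so it is not a decision stump and not Weka's code.

```java
// Hypothetical sketch of the classifier structure; not the Weka API.
class ClassifierSketchDemo {
    public static void main(String[] args) {
        // 14 training labels: 9 of class 0 ("yes") and 5 of class 1 ("no").
        int[] labels = new int[14];
        for (int i = 9; i < 14; i++) {
            labels[i] = 1;
        }
        SketchClassifier classifier = new MajoritySketch();
        classifier.buildClassifier(labels, 2);
        System.out.println("predicted class index: " + (int) classifier.classifyInstance(new double[0]));
    }
}

abstract class SketchClassifier {
    /** Learns a model from training labels, each in the range 0..numClasses-1. */
    abstract void buildClassifier(int[] trainingLabels, int numClasses);

    /** Returns one probability per class for the given instance. */
    abstract double[] distributionForInstance(double[] instance);

    /** Default prediction: the index of the most probable class. */
    double classifyInstance(double[] instance) {
        double[] distribution = distributionForInstance(instance);
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }
}

/** Ignores the instance and always returns the class frequencies seen in training. */
class MajoritySketch extends SketchClassifier {
    private double[] classProbabilities;

    @Override
    void buildClassifier(int[] trainingLabels, int numClasses) {
        classProbabilities = new double[numClasses];
        for (int label : trainingLabels) {
            classProbabilities[label] += 1.0;
        }
        for (int c = 0; c < numClasses; c++) {
            classProbabilities[c] /= trainingLabels.length;
        }
    }

    @Override
    double[] distributionForInstance(double[] instance) {
        return classProbabilities.clone();
    }
}
```

The subclass never defines classifyInstance(); it inherits the default that takes the arg-max of the distribution.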
Some other important packages
weka.associations
:contains association-rule learners
weka.clusterers
:contains methods for unsupervised learning
weka.datagenerators
:generates artificial data
weka.estimators
:computes different types of probability distribution
weka.filters
:provides methods for data cleaning