Collections of datasets

Available separately:
- A jarfile containing 37 classification problems, originally obtained from the UCI repository (datasets-UCI.jar, 1,190,961 bytes).
- A jarfile containing 37 regression problems, obtained from various sources (datasets-numeric.jar, 169,344 bytes).
- A jarfile containing 6 agricultural datasets obtained from agricultural researchers in New Zealand (agridatasets.jar, 31,200 bytes).
- A jarfile containing 30 regression datasets collected by Luis Torgo (regression-datasets.jar, 10,090,266 bytes).
- A gzip'ed tar containing UCI and UCI KDD datasets (uci-20070111.tar.gz, 17,952,832 bytes).
- A gzip'ed tar containing StatLib datasets (statlib-20050214.tar.gz, 12,785,582 bytes).
- A gzip'ed tar containing ordinal, real-world datasets donated by Dr. Arie Ben David (Holon Inst. of Technology/Israel) (datasets-arie_ben_david.tar.gz, 11,348 bytes).
- A zip file containing 19 multi-class (1-of-n) text datasets donated by George Forman/Hewlett-Packard Labs (19MclassTextWc.zip, 14,084,828 bytes).
- A bzip'ed tar file containing the Reuters21578 dataset split into separate files according to the ModApte split (reuters21578-ModApte.tar.bz2, 81,745,032 bytes).

After expanding an archive into a directory with your jar utility (or, for the compressed tars and zip files, an archive program that handles tar archives and zip files), these datasets may be used with Weka.

Other datasets in ARFF format:
http://www.cs.waikato.ac.nz/ml/weka/index_datasets.html
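
For readers without a jar utility at hand, the expansion step described above can also be scripted. The following is a minimal sketch in Python, which is not part of Weka; it relies on the fact that jarfiles use the zip format, so the standard zipfile module can read them. The function name expand_archive and the destination directory are hypothetical choices made for illustration, and the sketch assumes the archive has already been downloaded.

```python
import tarfile
import zipfile
from pathlib import Path


def expand_archive(archive, dest_dir):
    """Expand a jar/zip or a (gzip/bzip2-compressed) tar into dest_dir.

    Hypothetical helper for illustration; returns the .arff files found
    under dest_dir after extraction, sorted by path.
    """
    archive = str(archive)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive):
        # Jarfiles are zip archives, so .jar and .zip share this branch.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    elif tarfile.is_tarfile(archive):
        # "r:*" lets tarfile detect gzip or bzip2 compression itself.
        with tarfile.open(archive, "r:*") as tf:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions: refuse unsafe member paths.
                tf.extractall(dest, filter="data")
            else:
                tf.extractall(dest)
    else:
        raise ValueError("not a jar/zip or tar archive: " + archive)

    return sorted(dest.rglob("*.arff"))
```

Calling, for example, expand_archive("datasets-UCI.jar", "datasets") would unpack that jarfile into a directory named datasets and return the paths of any ARFF files it contains, which can then be opened in Weka.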