- 1 weka简介
Weka的全名是怀卡托智能分析环境(Waikato Environment for Knowledge Analysis),是一款免费的,非商业化(与之对应的是SPSS公司商业数据挖掘产品--Clementine )的,基于JAVA环境下开源的机器学习(machine learning)以及数据挖掘(data mining)软件。
2 weka的下载与安装
首先需要到官网下载https://www.cs.waikato.ac.nz/ml/weka/downloading.html,分别有windowns32\64位、mac ios、Linux等,其中windows版本32、64位分别有包含jre和不包含jre版本,这个根据电脑是否安装jdk\jre而定。
下载之后,直接点击exe文件,傻瓜式安装,直接下一步,直至安装完成!
3 数据库文件的配置
weka是一个比较强大的数据挖掘分析工具,支持mysql、pg、oracle等数据库,只需要修改里面的配置文件即可。
解压安装文件目录里面的weka.jar解压打开,找到weka\experiment 文件夹下的DatabaseUtils.props(连接mysql)或DatabaseUtils.props.mysql(连接mysql)进行修改:
# Database settings for MySQL 3.23.x, 4.x
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url: http://www.mysql.com/
# jdbc: http://www.mysql.com/products/connector/j/
# author: Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 5836 $
# JDBC driver (comma-separated list)
#jdbcDriver=org.gjt.mm.mysql.Driver
jdbcDriver=com.mysql.jdbc.Driver;#数据库驱动
# database URL
#jdbcURL=jdbc:mysql://server_name:3306/database_name
jdbcURL=jdbc:mysql://localhost:3306/数据库名;#数据库url
# specific data types
# string, getString() = 0; --> nominal
# boolean, getBoolean() = 1; --> nominal
# double, getDouble() = 2; --> numeric
# byte, getByte() = 3; --> numeric
# short, getByte()= 4; --> numeric
# int, getInteger() = 5; --> numeric
# long, getLong() = 6; --> numeric
# float, getFloat() = 7; --> numeric
# date, getDate() = 8; --> date
# text, getString() = 9; --> string
# time, getTime() = 10; --> date
#增加的数据类型说明
TINYINT=3
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url: http://www.mysql.com/
# jdbc: http://www.mysql.com/products/connector/j/
# author: Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 5836 $
# JDBC driver (comma-separated list)
#jdbcDriver=org.gjt.mm.mysql.Driver
jdbcDriver=com.mysql.jdbc.Driver;#数据库驱动
# database URL
#jdbcURL=jdbc:mysql://server_name:3306/database_name
jdbcURL=jdbc:mysql://localhost:3306/数据库名;#数据库url
# specific data types
# string, getString() = 0; --> nominal
# boolean, getBoolean() = 1; --> nominal
# double, getDouble() = 2; --> numeric
# byte, getByte() = 3; --> numeric
# short, getByte()= 4; --> numeric
# int, getInteger() = 5; --> numeric
# long, getLong() = 6; --> numeric
# float, getFloat() = 7; --> numeric
# date, getDate() = 8; --> date
# text, getString() = 9; --> string
# time, getTime() = 10; --> date
#增加的数据类型说明
TINYINT=3