Learning Mahout: Day Four

Learning Mahout, day four. Today's goal: finish studying Mahout's recommender section.
 
3.2.4 Update files
    FileDataModel supports update files. These are just more data files that are read after the main data file, and that overwrite any previously read data. New preferences are added and existing ones are updated. Deletes are handled by providing an empty preference value string.
    In short: FileDataModel supports update files. Based on their contents, the data model overwrites preferences that already exist, adds those that do not, and removes those marked as deleted.
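To make the update semantics described above concrete, here is a minimal sketch in plain Java. It does not use Mahout and is not how FileDataModel is implemented; the class name UpdateFileSketch, the method names, and the "userID,itemID,value" sample lines are all assumptions made for illustration. It only mimics the rule: later lines overwrite earlier ones, new lines are added, and an empty value deletes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Mahout.
class UpdateFileSketch {

    // Applies lines of the assumed form "userID,itemID,value" to the preference map.
    static void apply(Map<String, Float> prefs, List<String> lines) {
        for (String line : lines) {
            String[] parts = line.split(",", -1); // -1 keeps a trailing empty field
            String key = parts[0] + ":" + parts[1];
            if (parts.length < 3 || parts[2].isEmpty()) {
                prefs.remove(key);                          // empty value: delete
            } else {
                prefs.put(key, Float.parseFloat(parts[2])); // add or overwrite
            }
        }
    }

    // Reads the main file first, then the update file, as the text describes.
    static Map<String, Float> load(List<String> mainFile, List<String> updateFile) {
        Map<String, Float> prefs = new LinkedHashMap<>();
        apply(prefs, mainFile);
        apply(prefs, updateFile);
        return prefs;
    }

    public static void main(String[] args) {
        List<String> mainFile = Arrays.asList("1,101,5.0", "1,102,3.0", "2,101,2.0");
        List<String> updateFile = Arrays.asList(
                "1,102,4.5",  // updates an existing preference
                "2,101,",     // deletes a preference (empty value)
                "3,103,1.0"); // adds a new preference
        System.out.println(load(mainFile, updateFile));
        // prints {1:101=5.0, 1:102=4.5, 3:103=1.0}
    }
}
```

With real Mahout you would not write this logic yourself; you would just place the update files next to the main data file and let FileDataModel read them.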
3.2.5 Database-based data
    Sometimes data is just too large to fit into memory. Once the data set is several tens of millions of preferences, memory requirements grow to several gigabytes, and this amount of memory may be unavailable in some contexts.
    It's possible to store and access preference data from a relational database; Mahout supports this. Several classes in Mahout's recommender implementation will attempt to push computations into the database for performance.
    In short: when the data set is very large, memory may run out, and Mahout addresses this by supporting preference data stored in a relational database.
    Note that running a recommender engine from data in a database will be much slower, by orders of magnitude, than using in-memory data representations. It's no fault of the database; properly tuned and configured, a modern database is excellent at indexing and retrieving information efficiently, but the overhead of retrieving, marshalling, serializing, transmitting, and deserializing result sets is still much greater than the overhead of reading data from optimized in-memory data structures. This adds up quickly for recommender algorithms, which are data intensive. Yet, working from a database may be desirable in cases where you have no choice, or where the data set isn't huge and reusing an existing table of data is desirable for integration purposes.
    In short: unless you have no other choice, avoid the database-backed approach, because processing data held in memory is far more efficient.
I am skipping the rest of the database-backed sections for now!

3.2.8 Configuring programmatically
You also don't have to use JNDI directly; you can instead pass a DataSource directly to
the MySQLJDBCDataModel constructor. The next listing shows a full example of configuring
MySQLJDBCDataModel, including the use of the MySQL Connector/J driver
(http://www.mysql.com/products/connector/) and a DataSource with customized
table and column names. In other words, the data source can be configured in code.

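// Import paths assume Mahout 0.x and MySQL Connector/J 5.x; Connector/J 8 moved the class to com.mysql.cj.jdbc.
import com.mysql.jdbc.jdbc2.optional.MysqlDataSource;
import org.apache.mahout.cf.taste.impl.model.jdbc.MySQLJDBCDataModel;
import org.apache.mahout.cf.taste.model.JDBCDataModel;

MysqlDataSource dataSource = new MysqlDataSource();
dataSource.setServerName("my_database_host");
dataSource.setUser("my_user");
dataSource.setPassword("my_password");
dataSource.setDatabaseName("my_database_name");
// Arguments: data source, preference table, user ID column, item ID column, preference value column.
JDBCDataModel dataModel = new MySQLJDBCDataModel(dataSource, "my_prefs_table", "my_user_column", "my_item_column", "my_pref_value_column");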
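Today's summary:
Mahout supports several kinds of data sources, including file-based and database-backed ones. For a database data source, the connection parameters can be set in code.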