Big Data
zhumin726
Mahout: Traditional Naive Bayes Classification
Naive Bayes is an algorithm that can be used to classify objects into categories, usually binary ones. It is one of the most common learning algorithms in spam filters. Despite its s… Translated · 2012-12-24 16:52:40 · 1818 views · 0 comments
Random Forests
How to grow a decision tree (source: [3]). LearnUnprunedTree(X, Y). Input: X, a matrix of R rows and M columns, where Xij is the value of the j'th attribute in the i'th input datapoint. Each column… Reposted · 2012-12-24 17:45:09 · 1026 views · 0 comments
Mahout vector creation problem: There are too many documents that do not have a term vector
bin/mahout lucene.vector --dir /home/hadoop/index --output /user/hadoop/out/part-out.vec --field title --idField id --dictOut /user/hadoop/out/dict.out --maxPercentErrorDocs 0.1 Exception i… Original · 2012-12-25 14:31:02 · 823 views · 0 comments
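High-Performance Database Practices
1. When data volume grows: go distributed and share the load across multiple instances. 2. When read requests are heavy: use a second-level cache such as memcached. 3. When write requests increase: remove the second-level nodes and change the architecture, moving from the original master/slaves setup to a cluster that supports multiple write nodes; or upgrade the master to a more powerful server (16 cores, 128 GB RAM, 15k rpm hard drives); reduce indexes and change the design. 4. When requirements change, queries become complex, and join operations multiply: data redundancy Original · 2012-12-27 17:25:53 · 516 views · 0 comments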
HBase Notes
1. Installation: download and unpack. Configuration: in hbase-env.sh, set JAVA_HOME and HBASE_MANAGES_ZK=true: export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64 export HBASE_MANAGES_ZK=true hbase-site.xml h… Original · 2012-12-27 20:46:45 · 842 views · 0 comments
Hadoop 2.0 Source Compilation: Errors and Causes
1. Compile: mvn clean install -Dmaven.test.skip=true Error: [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable fu… Original · 2013-02-18 14:17:02 · 2794 views · 0 comments