Random Forest Libraries for Matlab


What is a random forest?

Random forest is a classification technique proposed by Leo Breiman (2001) that, given a set of class-labeled data, builds a set of classification trees. Each tree is grown from a bootstrap sample of the training data. When growing an individual tree, a random subset of attributes is drawn at each split (hence the term "random"), and the best attribute for the split is selected from that subset. The final classification is the majority vote of the individual tree classifiers in the forest.
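The three ingredients described above (bootstrap sampling, a random attribute subset at each split, and majority voting) can be illustrated with a minimal pure-Python sketch. This is not the code of any of the libraries linked in this post; the function names, the Gini impurity criterion, the midpoint thresholds, the depth limit and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most frequent label in the list."""
    return Counter(labels).most_common(1)[0][0]


def best_split(rows, labels, feature_ids):
    """Best (feature, threshold) among the candidate features, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # assumed: thresholds at midpoints
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, n_try, rng, depth=0, max_depth=5):
    """Grow one tree; every split only looks at n_try randomly drawn attributes."""
    if depth >= max_depth or len(set(labels)) == 1:
        return {"label": majority(labels)}
    feature_ids = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, feature_ids)
    if split is None:
        return {"label": majority(labels)}
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow_tree([rows[i] for i in left], [labels[i] for i in left],
                          n_try, rng, depth + 1, max_depth),
        "right": grow_tree([rows[i] for i in right], [labels[i] for i in right],
                           n_try, rng, depth + 1, max_depth),
    }


def predict_tree(node, row):
    """Follow the splits down to a leaf and return its label."""
    while "label" not in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["label"]


def train_forest(rows, labels, n_trees=25, n_try=1, seed=0):
    """Each tree is trained on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over the individual trees."""
    return majority([predict_tree(tree, row) for tree in forest])


# Toy data (made up for illustration): two well-separated classes in 2D.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
forest = train_forest(rows, labels)
print(predict_forest(forest, (2, 2)), predict_forest(forest, (9, 9)))
```

A real implementation would add out-of-bag error estimates and variable importance, which this sketch leaves out.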


For a more detailed explanation: http://en.wikipedia.org/wiki/Random_forest


Matlab library downloads

Original implementation:

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_software.htm (a new version is to be released soon)

Implementation ported from R:

http://randomforest-matlab.googlecode.com/files/Windows-Precompiled-RF_MexStandalone-v0.02-.zip


Ensemble classification applications based on random forests:

ENSEMBLE CLASSIFICATION

(1) A conference paper investigating binary classification strategies with ensemble classification has been published. [Chan J.C.-W., Demarchi, L., Van De Voorde, T., & Canters, F. (2008), "Binary classification strategies for mapping urban land cover with ensemble classifiers", Proceedings of IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 6-11, 2008, Boston, Massachusetts, USA. Vol. III, pp. 1004-1007.] (see Annex A.9)
Since the data sets related to HABISTAT were not ready at the beginning of 2008, a study on binary classification with ensemble classifiers was conducted using 2 data sets in suburban areas. In the paper, two binary classification strategies were examined to further extend the strength of ensemble classifiers for mapping urban objects. The first strategy was a one-against-one approach: a pairwise binary classification in which n(n-1)/2 classifiers are created, n being the number of classes. Each of the n(n-1)/2 classifiers was trained using only training cases from two classes at a time, and the ensemble was then combined by majority voting. The second strategy was a one-against-all binary approach: if there are n classes, with a in {1, …, n} being one of the classes, then n classifiers were generated, each representing a binary classification of a versus non-a. This ensemble was combined using accuracy estimates obtained for each class.
Both binary strategies were applied to two single classifiers (decision trees and artificial neural networks) and two ensemble classifiers (Random Forest and Adaboost). Two multi-source data sets were used: one was prepared for an object-based classification and one for a conventional pixel-based approach.
Our results indicate that ensemble classifiers generate significantly higher accuracies than a single classifier. Compared to a single C5.0 tree, Random Forest and Adaboost increased the accuracy by 2 to 12%, depending on the data set used. Applying binary classification strategies often increases accuracy, but only marginally (by 1 to 3%). All increases are statistically significant, except on one occasion. Coupling ensemble classifiers with binary classification always yielded the highest accuracies.
For our first data set, the highest accuracy was obtained with Adaboost and a one-against-one strategy, 4.3% better than for a single tree; for the second data set, it was obtained with Random Forest and a one-against-all strategy, 13.6% higher than for a single tree.
While the improvement from the binary strategies is statistically significant, the increase in accuracy is marginal. Given their long training time, we have to consider carefully whether applying them is worthwhile.
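The two binary strategies discussed in item (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the setup used in the paper: the base learner here is a hypothetical nearest-centroid scorer (the paper used decision trees, neural networks, Random Forest and Adaboost), the class names and toy data are made up, and the one-against-all ensemble is combined by taking the highest binary score, which is a simpler stand-in for the per-class accuracy estimates mentioned above.

```python
import math
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Component-wise mean of a list of points."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_binary(pos_rows, neg_rows):
    """Hypothetical base learner: returns a scorer, positive score => positive class."""
    cp, cn = centroid(pos_rows), centroid(neg_rows)
    return lambda row: math.dist(row, cn) - math.dist(row, cp)


def train_one_vs_one(rows, labels):
    """One classifier per pair of classes: n(n-1)/2 in total."""
    models = {}
    for a, b in combinations(sorted(set(labels)), 2):
        models[(a, b)] = fit_binary(
            [r for r, y in zip(rows, labels) if y == a],
            [r for r, y in zip(rows, labels) if y == b],
        )
    return models


def predict_one_vs_one(models, row):
    """Each pairwise classifier casts one vote; the majority wins."""
    votes = Counter(a if score(row) > 0 else b for (a, b), score in models.items())
    return votes.most_common(1)[0][0]


def train_one_vs_all(rows, labels):
    """One classifier per class a: a versus non-a, n in total."""
    return {
        a: fit_binary(
            [r for r, y in zip(rows, labels) if y == a],
            [r for r, y in zip(rows, labels) if y != a],
        )
        for a in sorted(set(labels))
    }


def predict_one_vs_all(models, row):
    """Assumed combination rule: pick the class with the highest binary score."""
    return max(models, key=lambda a: models[a](row))


# Toy data (made up): four land-cover-like classes in a 2D feature space.
rows = [(0, 0), (1, 1), (10, 0), (11, 1), (0, 10), (1, 11), (10, 10), (11, 11)]
labels = ["grass", "grass", "road", "road", "roof", "roof", "water", "water"]

ovo = train_one_vs_one(rows, labels)
ova = train_one_vs_all(rows, labels)
print(len(ovo), len(ova))
print(predict_one_vs_one(ovo, (9, 1)), predict_one_vs_all(ova, (9, 1)))
```

With n = 4 classes the one-against-one ensemble holds 4*3/2 = 6 classifiers and the one-against-all ensemble holds 4, which is part of why the pairwise strategy costs more training time as the number of classes grows.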

(2) We used the ensemble classifier Random Forest to produce four levels of classification using 3 different data sets in the framework of workpackage Validation WP 5200. The data used for this experiment are AHS airborne data. A total of 12 classifications were made (see Figure 9). The results with Random Forest were compared with the performance of other classifiers: Linear Discriminant Analysis and Markov Random Field.

The processing has a problem in terms of the number of training samples and also spatial independence (see Table 5). This issue with the training, testing and validation sets has been discussed during the mid-term evaluation and is under investigation.
Figure 9. Validation exercise using airborne AHS data. The columns represent 3 data sets and rows represent 4 levels of classification. Classifications were done using Random Forest.

Table 5. Table showing the classification scheme and training size at each level.

(3) The use of ensemble classification was studied in all classification tasks with spaceborne data. Two conference papers on the classification of heathlands using superresolution enhanced CHRIS data were presented. Random Forest was used for the classifications, and it gave rather consistent and satisfactory results. Below are two illustrations (Figure 10 and Figure 11) of the application of Random Forest to the original CHRIS and superresolution enhanced CHRIS data sets. For more details, please refer to the paper attached in Annex A.8. Random Forest seems to have worked very well with our data sets. We will continue to use and investigate the strength of this ensemble classifier.

Figure 10. Random Forest classification of SR CHRIS (Kalmthout, Belgium). Results presented at IGARSS, July 6-11, 2008, Boston, Massachusetts, USA. (see Annex B of annual report #1)


Figure 11. Random Forest classification of SR CHRIS (Ginkel, the Netherlands). Results presented at the 6th EARSeL SIG Imaging Spectroscopy workshop 2009, Tel Aviv, March 16-19 2009. (see Annex A.8)


Source: http://habistat.vgt.vito.be/modules/Results/EC.php


Random forest implementation provided by the Orange software

http://orange.biolab.si/doc/widgets/_static/Classify/RandomForest.htm





