Top 20 R Machine Learning and Data Science packages

The CRAN package repository features 6778 active packages. Which of these should you know? Here is an analysis. See also the link to the raw data at the bottom of the post.

Most of these R packages are favorites of Kagglers, endorsed by many authors, and rated according to the dependencies between packages. They are also rated and reviewed by users, as a crowdsourced solution, on Crantastic.org. However, these user ratings are too few to base an analysis on.

Let us explore how often machine learning packages were downloaded from January to May 2015, by analysing CRAN daily downloads.

  1. e1071 Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, etc. (142479 downloads)
  2. rpart Recursive Partitioning and Regression Trees. (135390)
  3. igraph A collection of network analysis tools. (122930)
  4. nnet Feed-forward Neural Networks and Multinomial Log-Linear Models. (108298)
  5. randomForest Breiman and Cutler's random forests for classification and regression. (105375)
  6. caret Short for Classification And REgression Training, a set of functions that attempt to streamline the process for creating predictive models. (87151)
  7. kernlab Kernel-based Machine Learning Lab. (62064)
  8. glmnet Lasso and elastic-net regularized generalized linear models. (56948)
  9. ROCR Visualizing the performance of scoring classifiers. (51323)
  10. gbm Generalized Boosted Regression Models. (44760)
  11. party A Laboratory for Recursive Partitioning. (43290)
  12. arules Mining Association Rules and Frequent Itemsets. (39654)
  13. tree Classification and regression trees. (27882)
  14. klaR Classification and visualization. (27828)
  15. RWeka R/Weka interface. (26973)
  16. ipred Improved Predictors. (22358)
  17. lars Least Angle Regression, Lasso and Forward Stagewise. (19691)
  18. earth Multivariate Adaptive Regression Spline Models. (15901)
  19. CORElearn Classification, regression, feature evaluation and ordinal evaluation. (13856)
  20. mboost Model-Based Boosting. (13078)
It is interesting to note that some open source R tools are also gaining popularity, such as Rattle, a GUI for data mining using R (35539 downloads), and fastcluster, fast hierarchical clustering routines for R and Python (14214 downloads).
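
For readers who want to reproduce this kind of tally, here is a minimal sketch in Python. It assumes each daily CRAN log is a CSV file with one row per download and a `package` column. That layout, the function name `count_downloads`, and the tiny sample data are illustrative assumptions, not the exact procedure used for the ranking above.

```python
import csv
import io
from collections import Counter


def count_downloads(log_files, packages):
    """Tally downloads per package across daily log files.

    log_files: iterable of open text files in CSV format, each assumed to
               have a 'package' column with one row per download.
    packages:  names of the packages to keep in the tally.
    """
    wanted = set(packages)
    totals = Counter()
    for fh in log_files:
        for row in csv.DictReader(fh):
            name = row.get("package")
            if name in wanted:
                totals[name] += 1
    return totals


# Made-up sample standing in for two daily logs (not real CRAN data).
day1 = io.StringIO(
    "date,package,version\n"
    "2015-01-01,e1071,1.6-4\n"
    "2015-01-01,rpart,4.1-8\n"
)
day2 = io.StringIO(
    "date,package,version\n"
    "2015-01-02,e1071,1.6-4\n"
    "2015-01-02,ggplot2,1.0.0\n"
)

totals = count_downloads([day1, day2], ["e1071", "rpart", "igraph"])

# Print packages from most to least downloaded.
for name, n in totals.most_common():
    print(name, n)
```

With real data, the same function could be fed one opened file per day for the whole period. Packages outside the list of interest are simply skipped, so the tally stays limited to the machine learning packages being compared.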

Did we miss your favorites? Light up this space and contribute to the community by letting us know which R packages you use!! 

For completeness, here is the data on 135 R package downloads, from January to May 2015.


Bio: Bhavya Geethika is pursuing a masters in Management Information Systems at University of Illinois at Chicago. Her areas of interests include Statistics & Data Mining for Business, Machine learning and Data-Driven Marketing.


