XGBoost: A Scalable Tree Boosting System
ABSTRACT
Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.