This is a master's thesis by Murat Pojon of the University of Tampere, Finland. It is 39 pages long.
This thesis examines the application of machine learning algorithms to predict whether a student will be successful or not. Its specific focus is a comparison of machine learning methods and feature engineering techniques in terms of how much each improves prediction performance. Three machine learning methods were used: linear regression, decision trees, and naïve Bayes classification. Feature engineering, the process of modifying and selecting the features of a data set, was used to improve the predictions made by these learning algorithms. Two different data sets containing records of student information were used. The machine learning methods were applied to both the raw version and the feature-engineered version of each data set to predict student success. The thesis comes to the same conclusion as earlier studies: the results show that it is possible to predict student performance successfully by using machine learning. The best algorithm was naïve Bayes classification for the first data set, with 98 percent accuracy, and decision trees for the second data set, with 78 percent accuracy. In the data used in this study, feature engineering was found to be a more important factor in prediction performance than method selection.
1 Introduction
2 Previous Work
3 Research Methods
4 Research Material
5 Implementation and Results
6 Evaluation
7 Discussion and Conclusions
8 Future Work