A Survey of Knowledge Tracing (personal notes)

Latest recommended article published 2024-08-18 14:22:39

RSociopath


Article link: https://blog.csdn.net/RSociopath/article/details/140500135


“1 INTRODUCTION”

Online Learning

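“optimal and adaptive learning experience.” Goal: an optimal, adaptive learning experience.
“more effective than traditional learning styles” More effective than traditional learning.
“assess their knowledge levels and learning preferences” Assess students' knowledge levels and learning preferences.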

“Knowledge Tracing”

“KT utilizes a series of sequence modeling-oriented machine learning methods capable of exploiting educationally related data” KT is a family of sequence-modeling machine learning methods that can exploit education-related data.

Process

Aim

“maintain an estimate of student’s changing knowledge state” Keep a running estimate of the student's evolving knowledge state.
“provide intelligent services to students” Deliver intelligent services to students.

Categorization

“three categories: (1) probabilistic models, (2) logistic models, and (3) deep learning-based models.”

“2 OVERVIEW”

Basic Concepts

“a set of students S and a set of exercises E” The set of students and the set of exercises.
“Knowledge Concepts (KCs)” Knowledge concepts.
“the set of all KCs as KC” The set of all knowledge concepts.
“M and K are used to represent the total number of different exercises and KCs” M: number of distinct exercises; K: number of KCs.
“tuple (e_t, a_t, r_t)” t: time step; e: exercise; a: answer; r: side information.
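The notation above maps naturally onto a small record type. The following is a minimal sketch in Python, assuming integer exercise ids, a binary correctness label for the answer, and a free-form dictionary for side information. The names `Interaction`, `side_info`, and the example fields are illustrative and do not come from the survey.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Interaction:
    """One learning interaction (e_t, a_t, r_t) at time step t."""
    exercise: int   # e_t: id of the exercise answered (assumed to be an integer)
    correct: int    # a_t: 1 if answered correctly, else 0 (assumed binary)
    side_info: Dict[str, Any] = field(default_factory=dict)  # r_t: side information


# A student's learning sequence is an ordered list of interactions.
sequence: List[Interaction] = [
    Interaction(exercise=3, correct=0, side_info={"seconds": 41}),
    Interaction(exercise=3, correct=1),
    Interaction(exercise=7, correct=1, side_info={"hints": 1}),
]

# Fraction of correct answers in the sequence.
accuracy = sum(x.correct for x in sequence) / len(sequence)
```

Keeping the side information optional means the same record works for models that use only the exercise and the answer.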

“3 BASIC KNOWLEDGE TRACING MODELS”

“3.1 Probabilistic Models”

“The basic paradigm for probabilistic models in KT assumes that the learning process follows a Markov process, where students’ latent knowledge state can be estimated by their observed learning performance [13].” Probabilistic KT models assume that learning follows a Markov process, so a student's latent knowledge state can be estimated from observed performance.

“3.1.1 Bayesian Knowledge Tracing” (BKT)

“In fact, BKT is a special case of Hidden Markov Model (HMM).” BKT is a special case of the hidden Markov model.

“There are two types of parameters in HMM: transition probabilities and emission probabilities.” An HMM has two kinds of parameters: transition probabilities and emission probabilities.

Basic Concepts

“P(T), the probability of transition from the unlearned state to the learned state;” Probability of moving from unlearned to learned.
“P(F), the probability of forgetting a previously known KC, which is assumed to be zero in BKT” Probability of forgetting; fixed at 0, so BKT assumes no forgetting.
“P(G), the probability that a student will guess correctly in spite of non-mastery;” Probability of a correct guess without mastery.
“P(S), the probability a student will make a mistake in spite of mastery.” Probability of a slip (a wrong answer despite mastery).
“the parameter P(L_0) represents the initial probability of mastery.” Initial probability of mastery.

“estimate the knowledge state and the probability of correct answers:” Estimate the knowledge state and the probability of answering the next question correctly.

“The posterior probability P(L_n | Answer) is estimated by a Bayesian inference scheme, as follows:”
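The equations did not survive in these notes. The following is a minimal Python sketch of the commonly stated BKT update, built only from the five parameters listed above, and assuming no forgetting and parameters strictly between 0 and 1. The function names are illustrative.

```python
def predict_correct(p_l, p_g, p_s):
    """P(correct) = P(L)(1 - P(S)) + (1 - P(L)) P(G)."""
    return p_l * (1.0 - p_s) + (1.0 - p_l) * p_g


def posterior(p_l, correct, p_g, p_s):
    """Bayes rule for P(L_n | Answer) given one observed answer."""
    if correct:
        mastered = p_l * (1.0 - p_s)          # mastered and did not slip
        unmastered = (1.0 - p_l) * p_g        # not mastered but guessed
    else:
        mastered = p_l * p_s                  # mastered but slipped
        unmastered = (1.0 - p_l) * (1.0 - p_g)  # not mastered, no lucky guess
    return mastered / (mastered + unmastered)


def bkt_step(p_l, correct, p_t, p_g, p_s):
    """Posterior update followed by the learning transition (P(F) = 0)."""
    post = posterior(p_l, correct, p_g, p_s)
    return post + (1.0 - post) * p_t


def trace(answers, p_l0, p_t, p_g, p_s):
    """Run BKT over a sequence of 0/1 answers for a single KC."""
    p_l, predictions = p_l0, []
    for answer in answers:
        predictions.append(predict_correct(p_l, p_g, p_s))
        p_l = bkt_step(p_l, bool(answer), p_t, p_g, p_s)
    return p_l, predictions


# Illustrative parameter values, not fitted to any data.
final_mastery, predictions = trace([1, 1, 0, 1], p_l0=0.5, p_t=0.1, p_g=0.2, p_s=0.1)
```

A correct answer raises the mastery estimate and an incorrect one lowers it, but the transition term P(T) always pulls the estimate upward afterwards because forgetting is ruled out.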

“3.1.2 Dynamic Bayesian Knowledge Tracing” (DBKT)

“KCs are not completely independent of each other, but rather hierarchical and closely related [36]. Dynamic Bayesian networks are able to jointly represent multiple skills within one model, which can potentially increase the representational power of knowledge tracing.” KCs are not independent; they are hierarchical and closely related. A dynamic Bayesian network represents several skills jointly in one model, which may increase the representational power of knowledge tracing.

Basic Concepts

“The objective of DBKT is to find the parameters θ that maximize the likelihood of the joint probability p(a_m, h_m | θ).” DBKT looks for the parameters θ that maximize the joint probability p(a_m, h_m | θ).

“3.2 Logistic Models”

“logistic models are a large class of models based on logistic functions, the underlying concept behind which is that the probability of answering exercises correctly can be represented by a mathematical function of student and KC parameters.” Logistic models are a broad class built on the logistic function: the probability of a correct answer is expressed as a function of student and KC parameters.

“3.2.1 Learning Factor Analysis”

Basic concepts

“parameter α estimates the initial knowledge state of each student;” α: each student's initial knowledge state.
“parameter β captures the easiness of different KCs” β: how easy each KC is.
“parameter γ denotes the learning rate of KCs” γ: the learning rate of each KC.

“The standard form of the LFA model is as follows:”

“S_i is the covariates for the student i, T_j represents the covariate for the number of practice opportunities on KC j, K_j is the covariate for KC j, θ is the estimation of the probability of student and KC parameters, and p(θ) is the estimation of the probability of a correct answer.” S_i: covariate for student i; T_j: covariate for the number of practice opportunities on KC j; K_j: covariate for KC j; θ: the quantity estimated from the student and KC parameters; p(θ): the estimated probability of a correct answer.
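The formula itself is missing from these notes. The following is a hedged Python sketch of the additive-logistic form usually given for LFA, assuming that for one student the indicator covariates S_i and K_j simply select that student's α and the KCs involved in the exercise, and that p(θ) is the logistic function of θ. The dictionary layout and names are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lfa_probability(alpha, beta, gamma, practice_counts, kcs):
    """Probability of a correct answer for one student on one exercise.

    alpha: the student's initial knowledge parameter
    beta, gamma: dicts mapping KC name -> easiness / learning rate
    practice_counts: dict mapping KC name -> prior practice opportunities (T_j)
    kcs: the KCs the exercise involves
    """
    theta = alpha + sum(
        beta[j] + gamma[j] * practice_counts.get(j, 0) for j in kcs
    )
    return sigmoid(theta)


# Illustrative values only.
beta = {"fractions": 0.0}
gamma = {"fractions": 0.5}
p_before = lfa_probability(0.0, beta, gamma, {}, ["fractions"])
p_after = lfa_probability(0.0, beta, gamma, {"fractions": 2}, ["fractions"])
```

Note that LFA counts practice opportunities regardless of whether the answers were right or wrong, which is the limitation PFA addresses.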

“3.2.2 Performance Factor Analysis”

“The PFA model [23] can be seen as an extension of the LFA model that is especially sensitive to the strongest indicator of student learning performance.” PFA extends LFA, and it is PFA (not LFA) that is especially sensitive to the strongest indicator of student learning performance.

Basic concepts

“parameter f is the prior failures for the KC of the student” f: the student's prior failures on the KC.
“parameter s represents the prior successes for the KC of the student;” s: the student's prior successes on the KC.

“μ and ν are the coefficients for s and f , which denote the learning rates for successes and failures, respectively” μ and ν are the coefficients of s and f: the learning rates for successes and failures, respectively.
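To make the role of s, f, μ, and ν concrete, here is a small Python sketch assuming the usual PFA logit, a per-KC easiness term β plus μ·s + ν·f, passed through a logistic function. The β term and all numeric values are assumptions added for illustration.

```python
import math


def pfa_probability(beta, mu, nu, successes, failures, kcs):
    """Probability of a correct answer under a PFA-style logit.

    beta, mu, nu: dicts mapping KC name -> easiness / success rate / failure rate
    successes, failures: dicts mapping KC name -> prior counts (s and f)
    kcs: the KCs the exercise involves
    """
    theta = sum(
        beta[j] + mu[j] * successes.get(j, 0) + nu[j] * failures.get(j, 0)
        for j in kcs
    )
    return 1.0 / (1.0 + math.exp(-theta))


# Illustrative values only: 3 prior successes and 2 prior failures on one KC.
p = pfa_probability(
    beta={"fractions": -1.0},
    mu={"fractions": 0.4},
    nu={"fractions": 0.1},
    successes={"fractions": 3},
    failures={"fractions": 2},
    kcs=["fractions"],
)
```

Because successes and failures carry separate coefficients, the same number of attempts can yield different predictions depending on how they turned out.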

“3.2.3 Knowledge Tracing Machines”

“The KTM model [32] takes advantage of factorization machines (FMs) [38, 39] to generalize previous logistic models to higher dimensions” KTM uses factorization machines (FMs) to generalize earlier logistic models to higher dimensions.

“FMs provide a means of encoding side information about exercises or students into the model;” FMs offer a way to encode side information about exercises or students into the model.
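A second-order factorization machine scores a sparse feature vector with a bias, a linear term, and pairwise interactions given by dot products of per-feature embeddings. The sketch below is plain Python and assumes one-hot features for a student, an exercise, and a KC; the feature layout and all weights are made up for illustration and are not taken from the KTM paper.

```python
import math
from itertools import combinations


def fm_logit(active, bias, w, v):
    """Second-order FM score.

    active: dict feature index -> value (the nonzero entries of x)
    w: list of linear weights, one per feature
    v: list of embedding vectors, one per feature
    """
    linear = sum(w[i] * x for i, x in active.items())
    pairwise = sum(
        active[i] * active[j] * sum(a * b for a, b in zip(v[i], v[j]))
        for i, j in combinations(sorted(active), 2)
    )
    return bias + linear + pairwise


# Hypothetical feature indices: 0 = student "u1", 1 = exercise "q5", 2 = KC "fractions".
w = [0.1, -0.3, 0.2]
v = [[0.5, 0.0], [0.5, 1.0], [0.0, 1.0]]
x = {0: 1.0, 1: 1.0, 2: 1.0}

logit = fm_logit(x, bias=0.0, w=w, v=v)
p_correct = 1.0 / (1.0 + math.exp(-logit))
```

Side information such as attempt counts or time gaps would enter as additional entries in `x`, each with its own weight and embedding.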

“3.3 Deep Learning-based Models”

“Deep learning has a powerful ability to achieve non-linearity and feature extraction, making it well suited to modeling the complex learning process, especially when a much larger amount of learning interaction data is available” Deep learning handles non-linearity and feature extraction well, which suits the complex learning process, especially when a large amount of interaction data is available.

“Nevertheless, deep learning-based models are poorly interpretable due to their end-to-end learning strategy, which limits their further applicability owing to the crucial significance of interpretability for students modeling.” (Shen et al., 2024, p. 5) The end-to-end training makes these models hard to interpret, and that limits their applicability because interpretability is crucial in student modeling.

“3.3.1 Deep Knowledge Tracing”

“Deep Knowledge Tracing utilizes recurrent neural networks (RNNs) [25] to model the students’ knowledge states” DKT models students' knowledge states with recurrent neural networks (RNNs).

“DKT sets each input vector x_t to a corresponding random vector, then takes the embedded learning sequence as the input of RNNs and applies a linear mapping and activation function to the output hidden states to obtain the knowledge state of students” DKT maps each input x_t to a corresponding random vector, feeds the embedded sequence to the RNN, and applies a linear mapping and an activation function to the hidden states to obtain the student's knowledge state.
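The pipeline in the quote can be sketched end to end in plain Python. This is only a forward pass with untrained, randomly initialized weights, assuming a vanilla tanh RNN, one fixed random vector per (exercise, correctness) pair, and a sigmoid output per exercise. All sizes and names are illustrative choices.

```python
import math
import random

rng = random.Random(0)
M, D, H = 4, 6, 5  # number of exercises, embedding size, hidden size (arbitrary)


def rand_matrix(rows, cols, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(mat, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in mat]


embed = rand_matrix(2 * M, D)  # one fixed random vector per (exercise, correctness)
W_xh = rand_matrix(H, D)       # input-to-hidden weights
W_hh = rand_matrix(H, H)       # hidden-to-hidden weights
W_hy = rand_matrix(M, H)       # hidden-to-output linear mapping


def dkt_forward(sequence):
    """sequence: list of (exercise_id, correct) with correct in {0, 1}."""
    h = [0.0] * H
    states = []
    for exercise, correct in sequence:
        x = embed[exercise + M * correct]  # look up the input vector x_t
        h = [math.tanh(a + b) for a, b in zip(matvec(W_xh, x), matvec(W_hh, h))]
        # Linear mapping plus sigmoid: one predicted probability per exercise.
        states.append([1.0 / (1.0 + math.exp(-z)) for z in matvec(W_hy, h)])
    return states


states = dkt_forward([(0, 1), (2, 0), (2, 1)])
```

Each entry of `states` is a vector of length M that can be read as the predicted probability of answering each exercise correctly after that step; a real model would train the weights and typically use an LSTM or GRU.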

“unreasonable phenomena”

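“it fails to reconstruct the observed input” DKT can fail to reconstruct the input it has just observed.
“the predicted knowledge state is not consistent across time-steps” The predicted knowledge state is inconsistent from one time step to the next.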

“3.3.2 Memory-aware Knowledge Tracing”

“memory-aware knowledge tracing introduces an external memory module [45] to store the knowledge concepts and update the corresponding knowledge mastery of the student.” Memory-aware KT adds an external memory module that stores the knowledge concepts and updates the student's mastery of each.

“For the read operation, DKVMN can predict student performance based on the student’s knowledge mastery.” Read: DKVMN predicts performance from the student's current knowledge mastery.

“For the write operation, after an exercise has been responded to, DKVMN updates students’ knowledge mastery (i.e., the value matrix) based on their performance.” Write: once an exercise has been answered, DKVMN updates the student's knowledge mastery (the value matrix) according to the outcome.
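The read and write operations can be sketched with plain Python lists. This assumes the usual key-value memory formulation: softmax correlation weights between the exercise query and the key slots, a weighted read from the value matrix, and an erase-then-add write. In the real model the erase and add vectors come from learned layers; here they are passed in directly, and all numbers are illustrative.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def correlation_weights(query, key_memory):
    """Attention of the exercise query over the key (concept) slots."""
    return softmax([sum(q * k for q, k in zip(query, key)) for key in key_memory])


def read(weights, value_memory):
    """Weighted sum of value slots: the mastery summary used for prediction."""
    dim = len(value_memory[0])
    return [sum(w * slot[d] for w, slot in zip(weights, value_memory)) for d in range(dim)]


def write(weights, value_memory, erase, add):
    """Erase-then-add update; returns a new value matrix."""
    return [
        [v * (1.0 - w * e) + w * a for v, e, a in zip(slot, erase, add)]
        for w, slot in zip(weights, value_memory)
    ]


key_memory = [[1.0, 0.0], [0.0, 1.0]]    # two concept slots
value_memory = [[0.2, 0.2], [0.8, 0.8]]  # current mastery per slot
weights = correlation_weights([2.0, 0.0], key_memory)
summary = read(weights, value_memory)
updated = write(weights, value_memory, erase=[1.0, 1.0], add=[1.0, 1.0])
```

The slot most correlated with the exercise receives most of the update, while weakly related slots barely change.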

“although DKVMN modeled students’ knowledge state through their most recent practices, it failed to capture long-term dependencies in learning process” DKVMN models the knowledge state from the student's most recent practice, but it does not capture long-term dependencies in the learning process.

“3.3.3 Exercise-aware Knowledge Tracing”

“the Exercise-aware Knowledge Tracing (EKT) model to leverage the effectiveness of the text content of exercises in order to enhance the KT process” EKT uses the text content of exercises to improve the KT process.

“3.3.4 Attentive Knowledge Tracing”

“this model abandons recurrence and relies entirely on an attention mechanism to capture global dependencies within a sequence. The transformer has demonstrated superior power in feature extraction and dependency capture while maintaining high computational efficiency.” The Transformer drops recurrence and relies entirely on attention to capture global dependencies within a sequence. It performs strongly at feature extraction and dependency capture while staying computationally efficient.
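The attention mechanism referred to here is scaled dot-product attention. The sketch below is a single-head version in plain Python with a causal mask, assuming each position may attend to itself and earlier positions; attentive KT models differ in exactly what forms the queries, keys, and values, so treat the inputs as placeholders.

```python
import math


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Position t attends only to positions 0..t, so a prediction never
    uses interactions from the future.
    """
    d = len(keys[0])
    outputs = []
    for t, q in enumerate(queries):
        scores = [
            sum(a * b for a, b in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy inputs: three time steps with 2-dimensional queries and keys.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0], [2.0], [3.0]]
outputs = causal_attention(queries, keys, values)
```

Unlike an RNN, every position is computed independently of the others, which is where the efficiency on parallel hardware comes from.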

“3.3.5 Graph-based Knowledge Tracing”

“the graph-based knowledge tracing (GKT), which conceptualizes the potential graph structure of the knowledge concepts as a graph G = (V, E);” GKT treats the latent structure among knowledge concepts as a graph G = (V, E).

“The architecture for graph-based knowledge tracing is presented in Fig. 9, which is composed of three parts: (1) aggregate, (2) update and (3) predict.” The architecture (Fig. 9 of the survey) has three parts: (1) aggregate, (2) update, and (3) predict.

“In the aggregate module, GKT aggregates the temporal knowledge state and the embedding for the answered KC i and its neighboring KC j” Aggregate: combine the temporal knowledge state and the embedding for the answered KC i and its neighboring KC j.
“In the update module, GKT updates the temporal knowledge state based on the aggregated features and the knowledge graph structure” Update: revise the temporal knowledge state from the aggregated features and the knowledge graph structure.
“In the predict module, GKT predicts the student’s performance at the next time step according to the updated temporal knowledge state” Predict: use the updated temporal knowledge state to predict performance at the next time step.

“4 VARIANTS OF KNOWLEDGE TRACING MODELS”

“As a consequence, the above basic KT models are straightforward, but have reduced performance in real-world learning scenarios.” The basic KT models above are straightforward, but they perform worse in real-world learning scenarios.

“4.1 Modeling Individualization before Learning”

“individualization in the KT task refers to that different students tend to have different learning characteristics” Individualization in KT means that different students tend to have different learning characteristics.

“4.1.1 Modeling Individualization in BKT”

“Individualize students’ initial probability of mastery and the probability of transition from the unlearned state to the learned state, respectively” Individualize each student's initial probability of mastery and the probability of moving from unlearned to learned.

“In this case, the student node gives individualized P(T) parameters to each student, as shown in Fig” Here the student node supplies an individualized P(T) to each student.

“Another means of modeling individualization for a larger range of students is clustering, where we can train more appropriate models for different groups of students” For a larger population, clustering is another option: train a better-suited model for each group of students.

“4.1.2 Modeling Individualization in DKT”

“According to students’ previous performance, DKT-DSC assigns students with similar learning ability into the same group” DKT-DSC groups students with similar learning ability, judged by their previous performance.

“At the start of each time interval, DKT-DSC will reassess students’ learning ability and reassign their groups.” At the start of each time interval, DKT-DSC reassesses learning ability and reassigns the groups.

“4.2 Incorporating Engagement during Learning”

“Student engagement is defined as ‘the quality of effort students themselves devote to educationally purposeful activities that contribute directly to desired outcomes’” Engagement is the quality of effort that students themselves put into educationally purposeful activities that contribute directly to the desired outcomes.

“4.2.1 Incorporating Engagement into BKT”

“inexpensive portable electroencephalography (EEG) devices can successfully help to detect a variety of student mental states related to the learning process” Inexpensive portable EEG devices can help detect a range of student mental states related to learning.

“Rather than assuming equal influence of knowledge and engagement on students’ knowledge state, one variation on the KAT model defines the connection between knowledge and engagement, and accordingly considers that a student’s knowledge state will influence their engagement.” One variant of the KAT model does not assume that knowledge and engagement influence the knowledge state equally. It defines a link between the two and treats the student's knowledge state as something that influences engagement.

“4.2.2 Incorporating Engagement into DKT”

“These features reflect student engagement from various aspects, that are playback speed, whether or not the video was paused, fast-forwarded or rewound, and whether or not the video was completed.” These features capture engagement from several angles: playback speed; whether the video was paused, fast-forwarded, or rewound; and whether it was watched to the end.

“4.3 Utilizing Side Information during Learning”

“4.3.1 BKT with side information”

“we first introduce several works that extend BKT to enable modeling only one kind of side information for specific purposes, after which we present a general model that can utilize all types of side information.” The survey first covers BKT extensions that each model a single kind of side information for a specific purpose, then presents a general model that can use all types of side information.

“To deal with this large number of features, FAST uses logistic regression parameters instead of conditional probability tables; thus, its number of features and complexity grow linearly rather than exponentially.” To handle the large number of features, FAST replaces conditional probability tables with logistic regression parameters, so the number of features and the complexity grow linearly rather than exponentially.
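The idea of swapping a probability table for logistic regression can be illustrated in a few lines of Python. The sketch assumes one weight vector per hidden state (unlearned, learned) that maps features to the probability of a correct answer, from which feature-dependent guess and slip probabilities follow. Feature names and weights are invented for the example, and fitting them is out of scope.

```python
import math


def emission_prob(weights, features):
    """P(correct | hidden state, features) as a logistic regression."""
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical features of one interaction.
features = {"bias": 1.0, "hint_used": 1.0, "hard_item": 0.0}

# One weight per feature and per hidden state: the parameter count grows
# linearly with the number of features, whereas a conditional probability
# table over n binary features would need 2**n rows.
w_unlearned = {"bias": -1.5, "hint_used": 0.8, "hard_item": -0.5}
w_learned = {"bias": 2.0, "hint_used": 0.3, "hard_item": -1.0}

p_guess = emission_prob(w_unlearned, features)      # correct without mastery
p_slip = 1.0 - emission_prob(w_learned, features)   # wrong despite mastery
```

These two values would take the place of the constant P(G) and P(S) in the standard BKT update for this particular interaction.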

“4.3.2 DKT with side information”

“it incorporates an autoencoder network layer (a multi-layer neural network, as shown in Fig. 12(b)) to convert the higher-dimensional input data into smaller representative feature vectors, thereby reducing both the resource requirement and time needed for training.” An autoencoder layer (a multi-layer neural network, Fig. 12(b) of the survey) compresses the high-dimensional input into smaller representative feature vectors, which reduces the resources and time needed for training.

“4.4 Considering Forgetting after Learning”

“5 APPLICATIONS”

“5.1 Learning Resources Recommendation”

“automatically recommend appropriate exercises to each student based on artificially designed intelligent algorithms.” Automatically recommend suitable exercises to each student using hand-designed intelligent algorithms.

“three more beneficial and specific objectives, which are review and explore, smoothness of difficulty level and student engagement, respectively” Three more beneficial and specific objectives: review and explore, smoothness of difficulty level, and student engagement.

“5.2 Adaptive Learning”

“In contrast to learning resources recommendation, adaptive learning needs to design efficient learning schemes and dynamic learning paths to organize learning resources for students based on specific knowledge structures.” Unlike resource recommendation, adaptive learning has to design efficient learning schemes and dynamic learning paths that organize resources for students around a specific knowledge structure.

“5.3 Educational Gaming”

“it is possible to enable students to learn effectively and happily by designing suitable educational games on an online learning platform” Well-designed educational games on an online learning platform can let students learn effectively and enjoyably.

“6 FUTURE RESEARCH DIRECTIONS”

“6.1 Knowledge Tracing with Interpretability”

“interpretability is of significant importance in the domain of education; for example, students usually care more about why a specific item is recommended rather than which/what item is recommended” Interpretability matters a great deal in education; students usually care more about why an item is recommended than about which item it is.

“6.2 Knowledge Tracing with Continuous Responses”

“Simple binarization of the continuous responses introduces inevitable systemic errors to the estimation of students’ knowledge states.” Simply binarizing continuous responses inevitably introduces systematic error into the estimate of students' knowledge states.

“6.3 Knowledge Tracing with Student Feedback”

“student feedback provides us with their proactive understanding about their knowledge states, which in turn yields direct and real indicators of their learning situation.” Student feedback conveys students' own proactive understanding of their knowledge states, which gives direct and genuine indicators of how their learning is going.

“6.4 Knowledge Tracing with Less Learning Data”

“practical educational scenarios often suffer from the cold-start problem and the data isolation problem” Real educational settings often run into the cold-start problem and the data isolation problem.

“6.5 Knowledge Tracing for General User Modeling”

“In addition to education, knowledge tracing can be generally applied in a number of domains for user modeling, such as games, sports and recruitment.” Beyond education, knowledge tracing can be applied to user modeling in other domains such as games, sports, and recruitment.
