Context-aware In-process Crowdworker Recommendation: Paper Notes

VinCinx

Original post: https://blog.csdn.net/qq_46875641/article/details/132977844

Context-aware In-process Crowdworker Recommendation

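Key idea: recommend crowdtesting workers while the testing process is under way (in-process), adjusting the recommendation dynamically as the task evolves. Worth borrowing: the way workers are profiled (tagged), and the way a task's test adequacy is measured using terms.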

Problems with existing approaches:

long-sized non-yielding windows

no new bugs are revealed in consecutive test reports during the process of a crowdtesting task

Existing approaches model the worker's testing environment, experience, capability, and expertise:

by modeling the workers’ testing environment [51, 60], experience [13, 60], capability [51], or expertise with the task

These approaches (see "Characterizing Crowds to Better Optimize Worker Recommendation in Crowdsourced Testing") only make a one-time recommendation at the moment a task is first published.

They merely provide one-time recommendation at the beginning of a new task, without considering constantly changing context information of ongoing testing processes.

Solution:

The paper proposes iRec, a context-aware approach that recommends workers during the crowdtesting process

at a specific point of crowdtesting process

The three components of iRec:

iRec consists of three main components: testing context modeling, learning-based ranking, and diversity-based re-ranking

These correspond, respectively, to:

• The crowdtesting context model which consists of two perspectives, i.e., process context and resource context to facilitate in-process crowdworker recommendation.

Building the context model:

Resource context: activeness, preference, expertise, device

Process context: test adequacy

• The development of the learning-based ranking method to learn appropriate crowdworkers who can detect bugs in a dynamic manner.

Find the workers who are most likely to detect bugs.

• The development of the diversity-based re-ranking method to adjust the ranked workers to reduce duplicate bugs.

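Reduce duplication: make sure the workers who are likely to detect bugs differ from one another to some degree, so that they do not produce large numbers of duplicate bugs.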

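Background:

• A task may be assigned to workers who are not suited to it at all, so that no bugs are found in the end.

• A task may be assigned to many workers of the same type, so that many of the bugs they find are duplicates.

• In the paper's figure, this shows up as a flat horizontal segment, called a non-yielding window: no new bugs are found.

Crowdworkers are characterized by activeness, preference, and expertise. (A worker preferring certain tasks does not mean they will detect bugs after taking one.)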

Preference focuses more on whether a crowdworker would take a specific task, and expertise focuses more on whether a crowdworker can detect bugs in the task.

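Approach:

Data preprocessing

The inputs are the task information, all reports received for the task so far, all worker information (every historical report each worker has submitted, plus the reports they have submitted for the current task), and the historical crowdtesting tasks.

Following existing studies, each document is word-segmented, stopwords are removed, and synonyms are replaced, so that the document ends up represented as a term vector.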

There are two types of textual documents in our data repository: one is test reports and the other is test requirements. Following the existing studies [48, 52], each document goes through standard word segmentation, stopwords removal, with synonym replacement being applied to reduce noise. As an output, each document is represented using a vector of term

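All terms are ranked by document frequency (df), and the top 5% and bottom 5% are removed. What remains is the descriptive terms list used to represent documents.

Because test reports are usually short, term frequency (tf) is not discriminative and is therefore not used here.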

We rank the terms according to the number of documents in which a term appears (i.e., document frequency, also known as df )
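The df-based filtering above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `descriptive_terms` and `to_term_vector`, the `trim_ratio` parameter, and the choice of a binary (present/absent) vector are mine, not taken from the paper, and the input is assumed to be already segmented with stopwords removed.

```python
from collections import Counter


def descriptive_terms(documents, trim_ratio=0.05):
    """Return the terms that survive df-based trimming.

    documents: list of token lists (already segmented, stopwords removed).
    trim_ratio: fraction of terms dropped at each end of the df ranking
                (hypothetical parameter; the notes use 5%).
    """
    # Document frequency: number of documents in which each term appears.
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))

    # Rank terms by df, highest first; ties broken alphabetically.
    ranked = sorted(df, key=lambda t: (-df[t], t))

    # Drop the top and bottom trim_ratio of the ranking.
    cut = int(len(ranked) * trim_ratio)
    return ranked[cut:len(ranked) - cut] if cut else ranked


def to_term_vector(tokens, vocabulary):
    """Binary vector over the vocabulary: 1 if the term occurs in the document.

    Assumption: a presence/absence encoding, chosen because the notes say
    tf is not used for short reports.
    """
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]
```

With a small corpus the 5% cut rounds down to zero terms, so the trimming only has an effect once the vocabulary has at least 20 distinct terms.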

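1. Testing Context Modeling

Process context: test adequacy

The task's requirements are also converted into a term list.

Test adequacy is defined for each term in the descriptive terms of the task requirements.

tj denotes one term in the descriptive terms of the task requirements.

That is, count how often this term appears across all test reports received for the task so far; the higher the frequency, the more adequately that aspect has been tested.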
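As a rough sketch of the test adequacy idea, the snippet below computes, for each requirement term, the share of received reports that mention it. The normalization by the number of received reports and the name `term_adequacy` are assumptions made for illustration; the notes only say that a higher frequency means more adequate testing.

```python
def term_adequacy(requirement_terms, received_reports):
    """Map each requirement term to the share of received reports mentioning it.

    requirement_terms: descriptive terms of the task requirements.
    received_reports: list of token lists, one per report received so far.
    """
    total = len(received_reports)
    report_sets = [set(tokens) for tokens in received_reports]
    adequacy = {}
    for term in requirement_terms:
        hits = sum(1 for terms in report_sets if term in terms)
        # With no reports yet, nothing has been tested: adequacy is 0.
        adequacy[term] = hits / total if total else 0.0
    return adequacy


reports = [["login", "crash"], ["login", "slow"], ["payment"], ["login"]]
print(term_adequacy(["login", "payment", "search"], reports))
```

Terms with low adequacy (here, "search") mark the parts of the requirements that still need testing, which is what makes this a useful in-process signal.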

Resource context

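Activeness measures:

Preference measure:

ProbPref: the probability of recommending worker w when the goal is to produce a test report containing term tj.

Figure:

Expertise measure:

As shown in the figure.

The only difference from preference is that here tf and df are taken from the bug reports.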
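Since the formula itself is only available as a figure, the following is a deliberately simplified sketch of the idea rather than the paper's definition. It estimates P(worker | term) purely from term frequency and omits any df weighting; the name `term_probabilities` and the input layout are hypothetical. The same function can be applied to all of a worker's reports (preference) or only to their bug reports (expertise).

```python
from collections import Counter


def term_probabilities(reports_by_worker, term):
    """Estimate P(worker | term) from each worker's past reports.

    reports_by_worker: dict mapping worker id -> list of token lists.
    Returns a dict worker id -> probability; the values sum to 1 if the
    term occurs at all, and are all 0.0 otherwise.
    Simplification: uses tf only, with no df-based weighting.
    """
    # tf(w, term): how many times the term occurs across worker w's reports.
    tf = Counter()
    for worker, reports in reports_by_worker.items():
        tf[worker] = sum(tokens.count(term) for tokens in reports)

    total = sum(tf.values())
    return {w: (tf[w] / total if total else 0.0) for w in reports_by_worker}


# Preference: computed over every report the workers submitted.
all_reports = {
    "w1": [["login", "crash"], ["login"]],
    "w2": [["login", "slow"]],
    "w3": [["payment"]],
}
# Expertise: same computation, restricted to reports that were real bugs.
bug_reports = {"w1": [["login", "crash"]], "w2": [], "w3": []}

print(term_probabilities(all_reports, "login"))
print(term_probabilities(bug_reports, "login"))
```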

Device:

• Phone type used to run the testing task

• Operating system of the device model

• ROM type of the phone

• Network environment

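2. Learning-based Ranking

(Detailed computations, matching, and so on.)

Three similarity measures and a learning model are defined. Workers are then sorted in descending order of their score, with a threshold below which a worker is not considered for recommendation.

Open question: how exactly the model is built.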
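The final "sort and cut off" step can be illustrated independently of how the scores are learned. In the sketch below the scores are assumed to come from some already-trained model; `rank_workers`, the default `threshold` of 0.5, and the `rec_num` cap are hypothetical names and values chosen for the example.

```python
def rank_workers(scores, threshold=0.5, rec_num=None):
    """Return worker ids sorted by descending score, dropping low scorers.

    scores: dict worker id -> predicted bug-detection score (from any model).
    threshold: workers scoring below this value are not recommended.
    rec_num: optional cap on the length of the returned list.
    """
    kept = [(w, s) for w, s in scores.items() if s >= threshold]
    # Sort by score, highest first; ties broken by worker id for determinism.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    ranked = [w for w, _ in kept]
    return ranked if rec_num is None else ranked[:rec_num]


print(rank_workers({"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.7}))
```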

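3. Diversity-based Re-ranking

Two diversity measures are defined, one from the expertise perspective and one from the device perspective.

Step 2 produces a recommendation list w1 ... w(recnum).

The following operations are then iterated.

A set S is used, initially empty.

1. Move w1 into S.

2. Compute the two diversity measures defined above.

3. Apply a device weight, compute the combined diversity value, and move the worker among w2 ... w(recnum) with the smallest combined value into S.

The order of the resulting list thus takes into account both the workers' ability and bug duplication (avoiding too many duplicate bugs).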
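The greedy loop can be sketched as follows. Several things here are assumptions rather than the paper's definitions: diversity is taken to be the number of a worker's terms or device attributes not yet covered by S, and the combined value is built from rank positions (0 = most diverse), which is one reading under which picking the smallest combined value selects the most diverse worker. The names `rerank` and `device_weight` are hypothetical.

```python
def rerank(ranked, exp_terms, devices, device_weight=0.5):
    """Greedy diversity-based re-ranking.

    ranked: worker ids from the ranking step, best first.
    exp_terms: dict worker id -> set of terms the worker has found bugs on.
    devices: dict worker id -> set of device attributes.
    device_weight: weight of the device rank in the combined value.
    """
    if not ranked:
        return []
    selected = [ranked[0]]          # Step 1: w1 goes into S.
    remaining = list(ranked[1:])
    while remaining:
        covered_terms = set().union(*(exp_terms[w] for w in selected))
        covered_devs = set().union(*(devices[w] for w in selected))

        # Step 2: diversity = number of attributes not yet covered by S.
        exp_div = {w: len(exp_terms[w] - covered_terms) for w in remaining}
        dev_div = {w: len(devices[w] - covered_devs) for w in remaining}

        # Rank positions: 0 = most diverse. sorted() is stable, so ties
        # keep the order produced by the ranking step.
        exp_order = sorted(remaining, key=lambda w: -exp_div[w])
        dev_order = sorted(remaining, key=lambda w: -dev_div[w])
        exp_rank = {w: i for i, w in enumerate(exp_order)}
        dev_rank = {w: i for i, w in enumerate(dev_order)}

        # Step 3: the smallest combined rank value wins and moves into S.
        best = min(remaining,
                   key=lambda w: exp_rank[w] + device_weight * dev_rank[w])
        selected.append(best)
        remaining.remove(best)
    return selected


exp_terms = {"w1": {"login", "crash"}, "w2": {"login", "crash"}, "w3": {"payment"}}
devices = {"w1": {"android"}, "w2": {"android"}, "w3": {"ios"}}
print(rerank(["w1", "w2", "w3"], exp_terms, devices))
```

In this toy input w2 duplicates w1 in both expertise and device, so w3 is promoted ahead of it even though w2 was ranked higher in step 2.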
