Content
Crowdsourcing
Outsourcing some tasks to a crowd -> Crowdsourcing
Improve the quality, timeliness and breadth of data
Key questions:
- What computational problems can/should be solved?
  Data augmenting, data processing
- What are the programming paradigms/platforms?
  A programming paradigm is a classification, style or way of programming: an approach to solving problems with programming languages.
- How do we guarantee that the solution is accurate, efficient and economical?
  Quality, cost and latency
- How do we motivate participation and leverage the unique expertise and interests of workers?
- How do we leverage the joint efforts of both automated and human computers as workers?
3 central aspects of crowdsourcing
- What
  - What tasks can be performed by machines
  - Decompose the work into macro and micro tasks
- Who
  - Expertise of workers (how to model workers' expertise)
  - Manage cultural aspects and language barriers
- How
  - How to design and execute tasks
  - Aggregate noisy and complex output (this determines how intelligent the aggregation technique needs to be, e.g. hierarchical, cluster-based aggregation)
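As a rough illustration of aggregating noisy output, here is a minimal sketch using plain majority voting. This is a deliberately simpler technique than the hierarchical, cluster-based aggregation mentioned above. The function name `majority_vote`, the task IDs and the labels are all hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return (label, support) for the most frequent answer.

    Ties are broken in favour of the answer seen first.
    """
    if not answers:
        raise ValueError("no answers to aggregate")
    counts = Counter(answers)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(answers)


# Hypothetical worker labels for two micro tasks.
worker_answers = {
    "task-1": ["cat", "cat", "dog"],
    "task-2": ["dog", "dog", "dog"],
}

# Aggregate each task independently.
aggregated = {task: majority_vote(labels) for task, labels in worker_answers.items()}
```

The support value (the share of workers who agree) gives a crude quality signal; a low value could be used to flag a task for more answers.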
Overall process
Process
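- Arranging workers in parallel
  - Operations & Control: multiple pipelines run at the same time
  - Cost vs latency: high cost, low latency
- Arranging workers sequentially
  - Operations & Control: workers run one after another
  - Cost vs latency: high latency, because each worker must wait for the previous worker's result; cost can be lower, though: if three workers are planned and the first two agree on the result, the third HIT does not need to be run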
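The cost/latency trade-off between the two arrangements can be sketched as follows. This is only an illustration under simple assumptions: each worker is modelled as a hypothetical Python callable, every HIT costs one unit, and latency is counted in rounds of waiting. The names `run_parallel` and `run_sequential` are made up for this example.

```python
from collections import Counter


def run_parallel(workers, task):
    """Ask every worker at once: cost = all HITs, latency = one round."""
    if not workers:
        raise ValueError("at least one worker is required")
    answers = [worker(task) for worker in workers]
    label, _ = Counter(answers).most_common(1)[0]
    return label, len(answers), 1  # (result, cost in HITs, latency in rounds)


def run_sequential(workers, task, needed=2):
    """Ask workers one at a time and stop once `needed` answers agree."""
    if not workers:
        raise ValueError("at least one worker is required")
    counts = Counter()
    for hits, worker in enumerate(workers, start=1):
        answer = worker(task)
        counts[answer] += 1
        if counts[answer] >= needed:
            # Early stop: the remaining HITs are never issued.
            return answer, hits, hits
    label, _ = counts.most_common(1)[0]
    return label, len(workers), len(workers)


# Three hypothetical workers; the first two agree.
workers = [lambda t: "yes", lambda t: "yes", lambda t: "no"]

parallel_result = run_parallel(workers, "is this a cat?")
sequential_result = run_sequential(workers, "is this a cat?")
```

With these three workers, the parallel run issues all three HITs in a single round, while the sequential run stops after the second HIT because two answers already agree.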
- Operations & Control:
- Repetition
  You repeat the tasks until you are satisfied
- Selection
  You retrieve tasks using selection mechanisms
- Repetition