Learning from Humans
Robot learning
- Four main questions: what, how, when, and whom to imitate?
- Which interface and which channels are used to convey the demonstrations?
- LFH differs from settings in which the robot learns on its own.
- Synonyms: LFD (learning from demonstration), PbD (programming by demonstration), imitation learning, and apprenticeship learning.
Principle
- Main principle: end users can teach robots new tasks without programming.
- LFD is not a record-and-play technique. LFD implies learning and, hence, generalization.
Brief History
- Started in the 1980s
- The motivation is threefold:
(1) It is a powerful mechanism for reducing the complexity of search spaces for learning.
(2) It offers an implicit means of training a machine, so that explicit and tedious programming can be minimized or eliminated.
(3) It supports studying and modeling the coupling of perception and action.
Key Issues in LFD
What and how to imitate
What
- What to imitate: extract the invariant features of the demonstrations.
- In continuous control tasks, what to imitate means automatically defining the feature space for learning, the constraints, and the cost function.
- In discrete control tasks, such as those treated by RL and symbolic reasoning, what to imitate means defining the state and action spaces and automatically learning the pre/post conditions in an autonomous decision system.
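The idea of learning pre/post conditions in the discrete case can be illustrated with a minimal sketch. It assumes that each demonstrated step is available as a set of symbolic facts before and after a named action; the fact names, the action name, and the function `learn_conditions` are all hypothetical and not taken from the Handbook. Preconditions are estimated as the facts that held before every demonstration of an action, and effects as the facts that were consistently added or removed.

```python
from typing import Dict, FrozenSet, List, Tuple

State = FrozenSet[str]
# One demonstrated step: (state before, action name, state after).
Step = Tuple[State, str, State]


def learn_conditions(steps: List[Step]) -> Dict[str, Dict[str, FrozenSet[str]]]:
    """Estimate pre/post conditions per action from demonstrated steps."""
    model: Dict[str, Dict[str, FrozenSet[str]]] = {}
    for before, action, after in steps:
        added = after - before      # facts that became true
        removed = before - after    # facts that stopped being true
        if action not in model:
            model[action] = {"pre": before, "add": added, "delete": removed}
        else:
            entry = model[action]
            # Keep only facts that held before every demonstration so far.
            entry["pre"] = entry["pre"] & before
            # Keep only effects observed in every demonstration so far.
            entry["add"] = entry["add"] & added
            entry["delete"] = entry["delete"] & removed
    return model


# Two hypothetical demonstrations of the same action in different contexts.
demos: List[Step] = [
    (frozenset({"hand_empty", "cup_on_table", "light_on"}), "pick_cup",
     frozenset({"holding_cup", "light_on"})),
    (frozenset({"hand_empty", "cup_on_table"}), "pick_cup",
     frozenset({"holding_cup"})),
]
model = learn_conditions(demos)
print(sorted(model["pick_cup"]["pre"]))
print(sorted(model["pick_cup"]["add"]))
```

The fact `light_on` appears in only one demonstration, so it is dropped from the preconditions. This is the kind of invariant extraction the notes refer to, in its simplest possible form.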
How
- How to imitate: determine how the robot should perform the learned behaviors so as to maximize the metric found when solving the what-to-imitate problem.
- Perceptual equivalence: the information necessary to perform the task must be available to both humans and robots.
- Physical equivalence: humans and robots must be able to affect and interact with the world in comparable ways.
Interfaces for Demonstration
- Record human motions.
- Kinesthetic teaching.
- Immersive teleoperation scenarios: the human operator is limited to using the robot’s own sensors and effectors to perform the task.
- Explicit information, such as speech.
Algorithms for Learning from Humans
Two trends
- A low-level representation of the skill, taking the form of a non-linear mapping between sensory and motor information
- A high-level representation of the skill that decomposes the skill into a sequence of action-perception units.
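To make the low-level representation more concrete, here is a minimal sketch of a non-linear mapping from sensory state to motor command fitted to demonstrations. It uses Nadaraya-Watson kernel regression with a Gaussian kernel, which is an assumption chosen for brevity and not a method named in these notes. The function `kernel_policy`, the bandwidth value, and the toy data are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# One demonstrated sample: (sensory state, motor command).
Sample = Tuple[Sequence[float], Sequence[float]]


def kernel_policy(demos: List[Sample],
                  bandwidth: float = 0.2) -> Callable[[Sequence[float]], List[float]]:
    """Build a state-to-action mapping by Gaussian kernel regression."""

    def policy(state: Sequence[float]) -> List[float]:
        # Squared distance from the query state to each demonstrated state.
        dists = [sum((a - b) ** 2 for a, b in zip(state, s)) for s, _ in demos]
        weights = [math.exp(-d / (2.0 * bandwidth ** 2)) for d in dists]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed: fall back to the nearest demonstration.
            nearest = min(range(len(demos)), key=lambda i: dists[i])
            return list(demos[nearest][1])
        dim = len(demos[0][1])
        # Weighted average of the demonstrated actions.
        return [
            sum(w * action[i] for w, (_, action) in zip(weights, demos)) / total
            for i in range(dim)
        ]

    return policy


# Toy demonstrations: a 1-D state x and a 1-D command equal to x squared.
demos: List[Sample] = [([i / 10], [(i / 10) ** 2]) for i in range(11)]
policy = kernel_policy(demos, bandwidth=0.05)
print(policy([0.55]))  # query at a state that was never demonstrated
```

The query state 0.55 does not appear in the demonstrations, yet the policy returns a command interpolated from its neighbors. That interpolation is what separates learning from simple record and play.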
Algorithms
- Learning individual motions: Teaching Force-Control Tasks.
- Learning Compound Actions:
(1) Learn all individual motions, then learn the right sequence and combination of these actions.
(2) Observe the whole task and segment the task automatically to extract the primitive actions.
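Approach (2) can be sketched with a very simple segmentation rule: split a demonstrated trajectory wherever the demonstrator pauses. This is only one possible heuristic, assumed here for illustration; the notes do not say which segmentation method is used. The function `segment_by_pauses`, the threshold, and the 1-D trajectory are hypothetical.

```python
from typing import List, Tuple


def segment_by_pauses(positions: List[float], dt: float,
                      speed_threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of the stretches with motion."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i in range(1, len(positions)):
        # Finite-difference speed between consecutive samples.
        speed = abs(positions[i] - positions[i - 1]) / dt
        moving = speed > speed_threshold
        if moving and start is None:
            start = i - 1                    # motion begins at the previous sample
        elif not moving and start is not None:
            segments.append((start, i - 1))  # motion ended at the previous sample
            start = None
    if start is not None:
        # The trajectory ended while still moving.
        segments.append((start, len(positions) - 1))
    return segments


# Hypothetical 1-D demonstration: move, pause, move again, stop.
trajectory = [0.0, 0.0, 0.1, 0.2, 0.3, 0.3, 0.3, 0.5, 0.7, 0.7]
segments = segment_by_pauses(trajectory, dt=0.1, speed_threshold=0.5)
print(segments)
```

Each returned index range is a candidate primitive action. A real system would need a more robust criterion, since human demonstrations do not always contain clean pauses between primitives.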
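Overall, after reading the Learning from Humans part of the Handbook of Robotics, I found it too abstract: it only shows a few demo videos and gives a general introduction to some algorithmic concepts. It does not explain how the learning is actually done, so I still need to read the specific papers and explore further.
Also, Learning from Humans and Visual Servoing are not really related; the two topics are independent. The Handbook of Robotics has a chapter on Visual Servoing as well, which I also need to read on my own.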