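Task Description
For details on SemEval-2010 Task 8, refer to the official documentation.
Task:
Given a sentence and two annotated nominals, choose the most suitable relation from a fixed inventory.
The relation inventory (9+1) is as follows:
Relation | Definition | Example |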
Cause-Effect | Cause-Effect(X, Y) is true for a sentence S that mentions entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails that X is the cause of Y, or that X causes/makes/produces/emits/... Y. | "A person infected with a particular <e1>flu</e1> <e2>virus</e2> strain develops an antibody against that virus." Cause-Effect(e2, e1) Comment: flu is a state, virus is the causal agent, thus (a) is satisfied; the virus is actively involved in causing flu and thus (c) is satisfied. |
Instrument-Agency | Instrument-Agency(X, Y) is true for a sentence S that mentions entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails the fact that X is the instrument (tool) of Y or, equivalently, that Y uses X. | |
Product-Producer | Product-Producer(X, Y) is true for a sentence S that mentions entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails the fact that X is a product of Y, or Y produces X. | "The <e1>honey</e1> <e2>bee</e2> is the third insect genome published by scientists, after a lab workhorse, the fruit fly, and a health menace, the mosquito." Product-Producer(e1, e2) Comment: This is a typical example of Product-Producer. Honey is a tangible concrete object (c), and the bee is actively involved in producing it (a). |
Content-Container | Content-Container(X, Y) is true for a sentence S that mentions entities X and Y if and only if (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails that X is or was (usually temporarily) stored or carried inside Y. | "The <e1>apples</e1> are in the <e2>basket</e2>." Content-Container(e1, e2) Comment: This is a prototypical example of Content-Container. |
Entity-Origin | Entity-Origin(X, Y) is true for a sentence S that mentions the entities X and Y if and only if (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails that Y is the origin of an entity X (rather than its location), and X is coming or derived from that origin. | "Under state law, minors are not permitted to have <e1>grain</e1> <e2>alcohol</e2>, even if a parent provides it to their children." Entity-Origin(e2, e1) Comment: This is a prototypical example of a material Entity-Origin relation. Restriction (b.4) applies. |
Entity-Destination | Entity-Destination(X, Y) is true for a sentence S that mentions the entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails the fact that Y is the destination of X in the sense of X moving (in a physical or abstract sense) toward Y. | "The <e1>boy</e1> ran into the school <e2>cafeteria</e2>." Entity-Destination(e1, e2) Comment: school cafeteria is a spatial/geographical destination. |
Component-Whole | Component-Whole(X, Y) is true for a sentence S that mentions entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails that X is a component of Y; (3) X has a functional relation with Y. In other words, X has an operating or usable purpose within Y. | "We don't need Einstein's quantum mechanics to understand why each <e1>hand</e1> has 5 <e2>fingers</e2>, and not 4 or 6." Component-Whole(e2, e1) Comment: Fingers are functional, integral parts of the hand. |
Member-Collection | Member-Collection(X, Y) is true for a sentence S that mentions entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails the fact that X is a member of Y. | "Italian playing cards most commonly consist of a <e1>deck</e1> of 40 <e2>cards</e2>." Member-Collection(e2, e1) Comment: A deck is a collection of cards, cards are different and separable from the deck, not functional to the deck. |
Message-Topic | Message-Topic(X, Y) is true for a sentence S that mentions the entities X and Y if and only if: (1) S, X and Y are in accordance with the general annotation guidelines (http://docs.google.com/Doc?docid=dfhkmm46_0f63mfvf7) (2) the situation described in S entails the fact that X is a communicative message containing information about Y. | "The recommendations contained the following key <e1>points</e1> about the <e2>new politics</e2> of the government." Message-Topic(e1, e2) Comment: politics is the topic of the key points. |
Other | Assigned when the two entities in the sentence do not stand in any of the nine relations above. |
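
The examples in the table follow a regular pattern: the two nominals are wrapped in `<e1>…</e1>` and `<e2>…</e2>` tags, and the label has the form `Relation(eX, eY)`. The following is a minimal sketch of how such an annotated example could be parsed. The function name `parse_example` and the returned dictionary layout are illustrative assumptions, not part of the official task tooling.

```python
import re

# Tag and label formats are taken from the examples in the table above.
# parse_example and its return layout are hypothetical, for illustration only.
ENTITY_RE = re.compile(r"<(e[12])>(.*?)</\1>")
LABEL_RE = re.compile(r"^\s*([A-Za-z-]+)\s*\(\s*(e[12])\s*,\s*(e[12])\s*\)\s*$")


def parse_example(sentence, label):
    """Split an annotated sentence and its label into plain fields."""
    # Map each tag name ("e1", "e2") to the text it encloses.
    entities = {tag: text for tag, text in ENTITY_RE.findall(sentence)}
    # Drop the tags but keep the enclosed words.
    plain = ENTITY_RE.sub(r"\2", sentence)

    if label.strip() == "Other":
        # Other carries no argument order.
        return {"text": plain, "entities": entities,
                "relation": "Other", "x": None, "y": None}

    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    relation, x_tag, y_tag = match.groups()
    # X is the first argument of the relation, Y the second.
    return {"text": plain, "entities": entities, "relation": relation,
            "x": entities[x_tag], "y": entities[y_tag]}


example = parse_example(
    '"The <e1>apples</e1> are in the <e2>basket</e2>."',
    "Content-Container(e1, e2)",
)
print(example["relation"], example["x"], example["y"])
```

Note that the argument order matters: for a label such as `Cause-Effect(e2, e1)`, X is the text of `<e2>` and Y is the text of `<e1>`.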
The proportion of examples for each relation is shown in the figure below:
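Datasets
- Trial Dataset: released on August 30, 2009, the trial dataset contains data for the first five relations. It also includes some instances of the other four relations; within the trial dataset these can simply be treated as Other without further processing.
- Training Dataset: the training set contains 8000 examples covering the 9+1 relations listed above.
- Development Dataset: no official development set is provided, but participants may use part of the training set to tune their parameters, for example via cross-validation.
- Test Dataset: the test set contains 2717 examples covering the 9+1 relations listed above, and was released on March 18, 2010.
- WordNet senses: unlike SemEval-2007 Task 4, manually annotated WordNet senses are not provided, which makes the task more realistic.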
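
Since no official development set exists, one way to tune parameters is to hold out part of the training data in turn. The following is a minimal sketch of k-fold cross-validation over example indices, using only the standard library. The function name `k_fold_indices`, the choice of `k=5`, and the fixed seed are assumptions made for illustration.

```python
import random


def k_fold_indices(n_examples, k=5, seed=0):
    """Yield (train_indices, dev_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_examples))
    # Shuffle with a fixed seed so the splits are reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        dev = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, dev


# 8000 is the training-set size stated above.
for fold_id, (train, dev) in enumerate(k_fold_indices(8000, k=5)):
    print(fold_id, len(train), len(dev))
```

Each example appears in exactly one held-out fold, so every training example is used for tuning once. This sketch does not stratify by relation; with unbalanced classes, a stratified split may be preferable.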
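SemEval-2010 Task 8 vs. SemEval-2007 Task 4
- Where the 2007 task provided a separate dataset and a corresponding binary classification task for each relation, the 2010 task provides a single multi-class dataset.
- The task is multi-class classification.
- Candidate entities are still provided, but the evaluated system must decide which slot of the relation each entity fills.
- WordNet senses and query strings are no longer provided.
- The dataset is much larger (more than 10,000 annotated sentences).
- The set of relations is also larger.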
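
Because a system must decide which slot each entity fills, the multi-class label space is larger than the ten relation names alone. The following is a minimal sketch that enumerates it, assuming each of the nine relations can be annotated in either direction (as the `(e1, e2)` and `(e2, e1)` labels in the examples suggest) and that Other is undirected. The function name `build_label_set` and the exact label string format are illustrative choices.

```python
# The nine relation names from the inventory above.
RELATIONS = [
    "Cause-Effect",
    "Instrument-Agency",
    "Product-Producer",
    "Content-Container",
    "Entity-Origin",
    "Entity-Destination",
    "Component-Whole",
    "Member-Collection",
    "Message-Topic",
]


def build_label_set(relations=RELATIONS):
    """Return one label per (relation, direction) pair, plus Other."""
    labels = []
    for relation in relations:
        # Assumption: both argument orders are valid labels for every relation.
        labels.append(f"{relation}(e1,e2)")
        labels.append(f"{relation}(e2,e1)")
    labels.append("Other")  # Other has no direction
    return labels


LABELS = build_label_set()
LABEL_TO_ID = {label: i for i, label in enumerate(LABELS)}
print(len(LABELS))
```

Under these assumptions a classifier predicts one of 9 × 2 + 1 = 19 classes, so relation and direction are decided jointly.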
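Difficulties
The relation inventory contains two groups of closely related relations:
Group 1:
- Component-Whole
- Member-Collection
- Both are special cases of Part-Whole.
Group 2:
- Content-Container
- Entity-Origin
- Entity-Destination
- These can be distinguished by considering whether the situation expressed is static or dynamic.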