-
Attribute
- a property of an entity such as its name, alias, descriptor, or type
-
Annotation
- markup of a text span in a specific format that indicates one or more features of the text within the span
-
Benchmark
- assessment of performance according to standard measures
-
Data
- textual input for an information extraction system
-
Dataset
- a set of newswire texts chosen according to pre-specified conditions and meant to represent a rich text stream
-
Database
- data in tabular format stored with the assistance of a relational database management system
-
Developer
- a researcher who implements a system
-
Dry Run
- an end-to-end practice run of an evaluation
-
Entity
- an object of interest such as a person or organization
-
Evaluation
- assessment of performance according to agreed-upon measures
-
Event
- an activity or occurrence of interest such as a terrorist act or an airline crash
-
Fact
- a relationship held between two or more entities
-
Formal Test Material
- a blind dataset, task definitions, test procedure, answer keys, and scoring software
-
Formal Run
- the "official" evaluation
-
Information Extraction
- the extraction, or pulling out, of pertinent information from large volumes of text
-
Information Extraction Systems
- automated systems that extract pertinent information from large volumes of text
-
Information Extraction Technologies
- techniques used to automatically extract specified information from text
-
Metrics
- pre-defined measures of performance calculable by comparison of system output with human-generated answer keys
-
MUC
- Message Understanding Conference, held at the end of the evaluation and attended only by participants and invited potential customers
-
Named Entity
- a named object of interest such as a person, organization, or location
-
SAIC
- Science Applications International Corporation
-
Scoring Software
- fully automated software that compares system output against answer keys and tallies and reports metrics and error types for developers and evaluators
-
Search Engine
- software that ranks the documents in a collection by relevance to a user query
-
Sources of News
- edited electronic feeds from established news organizations such as the Wall Street Journal and the New York Times News Service
-
Statistical Algorithm
- an algorithm that determines the statistical significance of evaluation results
-
Systems Integration
- building a system from off-the-shelf components to accomplish a job previously not automated
-
Systems Integrator
- builder of a system from off-the-shelf components
-
Task Definition
- a document that defines the format and criteria for annotating or extracting text and placing the results into a database or template. For example, task definitions give general guidelines and examples for the extraction of named entities, attributes, facts, and events from texts.
-
Text
- electronically encoded alphabetic material from some human language
-
Training
- process by which a system learns about a dataset
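
The Metrics and Scoring Software entries above describe comparing system output with human-generated answer keys and tallying metrics and error types. The following is a minimal sketch of that idea, not the actual scoring software. It assumes exact-match comparison only, and the annotation representation (start offset, end offset, type), the function name `score`, and the choice of precision, recall, and F-measure as the reported metrics are all illustrative assumptions that the glossary does not specify. A real scorer may award partial credit and distinguish more error types.

```python
from typing import Dict, Set, Tuple

# Hypothetical annotation representation: (start offset, end offset, type).
Annotation = Tuple[int, int, str]


def score(system: Set[Annotation], key: Set[Annotation]) -> Dict[str, float]:
    """Compare system output with an answer key using exact matching.

    An annotation counts as correct only if its span and type both
    match an entry in the answer key (a simplifying assumption).
    """
    correct = len(system & key)    # in both system output and key
    spurious = len(system - key)   # produced by the system, not in key
    missing = len(key - system)    # in key, not produced by the system

    precision = correct / len(system) if system else 0.0
    recall = correct / len(key) if key else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "correct": correct,
        "spurious": spurious,
        "missing": missing,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up example data, for illustration only.
answer_key = {(0, 5, "PERSON"), (10, 14, "ORGANIZATION"), (20, 26, "LOCATION")}
system_output = {(0, 5, "PERSON"), (10, 14, "LOCATION")}

report = score(system_output, answer_key)
```

In this made-up example the system gets one annotation right, produces one spurious annotation (right span, wrong type), and misses two key entries, so precision is 1/2 and recall is 1/3.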