Information extraction

Information Extraction (IE) tasks employ a wide array of data structures. We support the following, all of which can be imported from metametric.structures.ie:

  • Mention: A span of text, defined by its start and end indices in a passage. The boundary indices may be inclusive or exclusive.
  • Relation: A typed binary relation between two Mentions.
  • RelationSet: A collection of Relations.
  • Trigger: A typed, event-denoting Mention.
  • Argument: A typed, entity-denoting Mention that satisfies a certain role in an event.
  • Entity: An entity, represented as a collection of Mentions that refer to it.
  • EntitySet: A collection of Entity objects.
  • Membership: A membership relation between a particular Mention and the Entity it refers to.
  • Event: A complete event, represented by a particular Trigger together with all of its Arguments.
  • EventSet: A collection of Events.
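
To make the relationships among these structures concrete, here is a minimal sketch using local stand-in dataclasses. These classes are defined here for illustration only and are not the definitions in metametric.structures.ie. The field names (left, right, arg1, arg2, mentions, role, and so on) and the half-open span convention are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

# Local stand-ins for illustration only; NOT imported from metametric.
# All field names and the half-open [left, right) convention are assumptions.


@dataclass(frozen=True)
class Mention:
    left: int   # start index (assumed inclusive)
    right: int  # end index (assumed exclusive)


@dataclass(frozen=True)
class Relation:
    type: str
    arg1: Mention
    arg2: Mention


@dataclass(frozen=True)
class Entity:
    mentions: Tuple[Mention, ...]  # all mentions that refer to this entity


@dataclass(frozen=True)
class Trigger:
    mention: Mention
    type: str  # event type


@dataclass(frozen=True)
class Argument:
    mention: Mention
    role: str  # role the argument fills in the event


@dataclass(frozen=True)
class Event:
    trigger: Trigger
    args: Tuple[Argument, ...]


text = "Alice founded Acme. She runs it."
alice, founded, acme, she = Mention(0, 5), Mention(6, 13), Mention(14, 18), Mention(20, 23)

# "Alice" and "She" corefer, so they form one entity.
person = Entity(mentions=(alice, she))

# A typed binary relation between two mentions.
founder_of = Relation(type="founder_of", arg1=alice, arg2=acme)

# An event is a trigger plus its role-labeled arguments.
founding = Event(
    trigger=Trigger(mention=founded, type="Found"),
    args=(Argument(alice, role="founder"), Argument(acme, role="organization")),
)
```

Frozen dataclasses are hashable, which makes it straightforward to place these objects in sets or use them as dictionary keys when comparing predicted and reference structures.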

One can define a wide array of metrics based on these data structures.

Coreference Resolution

We support three of the most widely used metrics for coreference resolution: \(\text{MUC}\) (muc; paper), \(B^3\) (b_cubed_precision, b_cubed_recall; paper), and \(\text{CEAF}_{\phi_4}\) (ceaf_phi4; paper). A metric suite (coref_suite) bundles all three together with their commonly reported average. All of these can be imported from metametric.metrics.coref.
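
As an illustration of what one of these metrics measures, the following is a minimal, self-contained sketch of the standard mention-level definition of \(B^3\) precision and recall. It does not use metametric and is not how the library computes the metric. The function name b_cubed, the helper _cluster_of, and the representation of mentions as plain strings are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Tuple

Cluster = FrozenSet[Hashable]


def _cluster_of(clusters: Iterable[Cluster]) -> Dict[Hashable, Cluster]:
    """Map each mention to the cluster (entity) that contains it."""
    return {m: c for c in clusters for m in c}


def b_cubed(
    predicted: Iterable[Cluster], reference: Iterable[Cluster]
) -> Tuple[float, float, float]:
    """Hypothetical helper: standard B^3 precision, recall, and F1.

    Precision averages |P(m) & R(m)| / |P(m)| over predicted mentions;
    recall averages |P(m) & R(m)| / |R(m)| over reference mentions.
    """
    pred_of = _cluster_of(predicted)
    ref_of = _cluster_of(reference)

    def avg_overlap(source: Dict, other: Dict) -> float:
        if not source:
            return 0.0
        # A mention missing from the other side contributes zero overlap.
        total = sum(
            len(c & other.get(m, frozenset())) / len(c) for m, c in source.items()
        )
        return total / len(source)

    precision = avg_overlap(pred_of, ref_of)
    recall = avg_overlap(ref_of, pred_of)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


reference = [frozenset({"a", "b", "c"}), frozenset({"d", "e"})]
predicted = [frozenset({"a", "b"}), frozenset({"c", "d", "e"})]
p, r, f = b_cubed(predicted, reference)
```

In this example, mention c is placed in the wrong cluster, so both the predicted cluster containing c and the reference cluster containing c are penalized; precision and recall both come out to 11/15.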