Paper ID: 2406.05265
TLEX: An Efficient Method for Extracting Exact Timelines from TimeML Temporal Graphs
Mustafa Ocal, Ning Xie, Mark Finlayson
A timeline provides a total ordering of events and times, and is useful for a number of natural language understanding tasks. However, qualitative temporal graphs that can be derived directly from text -- such as TimeML annotations -- usually explicitly reveal only partial orderings of events and times. In this work, we apply prior work on solving point algebra problems to the task of extracting timelines from TimeML annotated texts, and develop an exact, end-to-end solution which we call TLEX (TimeLine EXtraction). TLEX transforms TimeML annotations into a collection of timelines arranged in a trunk-and-branch structure. Like what has been done in prior work, TLEX checks the consistency of the temporal graph and solves it; however, it adds two novel functionalities. First, it identifies specific relations involved in an inconsistency (which could then be manually corrected) and, second, TLEX performs a novel identification of sections of the timelines that have indeterminate order, information critical for downstream tasks such as aligning events from different timelines. We provide detailed descriptions and analysis of the algorithmic components in TLEX, and conduct experimental evaluations by applying TLEX to 385 TimeML annotated texts from four corpora. We show that 123 of the texts are inconsistent, 181 of them have more than one ``real world'' or main timeline, and there are 2,541 indeterminate sections across all four corpora. A sampling evaluation showed that TLEX is 98--100% accurate with 95% confidence along five dimensions: the ordering of time-points, the number of main timelines, the placement of time-points on main versus subordinate timelines, the connecting point of branch timelines, and the location of the indeterminate sections. We provide a reference implementation of TLEX, the extracted timelines for all texts, and the manual corrections of the inconsistent texts.
Submitted: Jun 7, 2024