Long Document Classification

Long document classification focuses on accurately assigning categories to lengthy texts, a task challenging traditional methods due to computational constraints and the need to capture both local and global contextual information. Current research emphasizes hierarchical models, often incorporating graph neural networks or transformers with specialized attention mechanisms (e.g., sparse attention, hierarchical attention) to efficiently process extensive text while preserving semantic relationships. These advancements are crucial for improving performance in various applications, such as financial analysis, legal document processing, and scientific literature categorization, where handling long, complex documents is essential.

Papers