Long Span
"Long span" research addresses the limitations of current models in processing and generating lengthy sequences of data, whether text, audio, or video. Current efforts focus on improving large language models (LLMs) and other deep learning architectures like transformers (including Longformer and variations) and LSTMs to handle longer contexts effectively, often employing techniques like coreference resolution, hierarchical attention, and efficient attention mechanisms. This research is crucial for advancing natural language processing, improving video and audio analysis, and enabling more sophisticated applications in diverse fields such as medical diagnosis, legal document processing, and personalized search.