Paper ID: 2406.17047

Enhancing Scientific Figure Captioning Through Cross-modal Learning

Mateo Alejandro Rojas, Rafael Carranza

Scientific charts are essential tools for communicating research findings: they convey information and reveal patterns in data. With the rapid advancement of science and technology and the advent of the big data era, the volume and diversity of scientific research data have surged, and with them the number and variety of charts. This trend presents new challenges for researchers, particularly in efficiently and accurately generating appropriate titles that convey a chart's information and results. Automatically generated chart titles can also enhance information retrieval systems by supplying precise text for detailed chart classification. As research in image captioning and text summarization has matured, the automatic generation of scientific chart titles has gained significant attention. By leveraging natural language processing, machine learning, and multimodal techniques, it is possible to automatically extract key information from charts and generate accurate, concise titles that better serve researchers' needs. This paper presents a novel approach to scientific chart title generation and demonstrates its effectiveness in improving the clarity and accessibility of research data.

Submitted: Jun 24, 2024