Paper ID: 2308.09070

Enhancing API Documentation through BERTopic Modeling and Summarization

AmirHossein Naghshzan, Sylvie Ratte

As the amount of textual data in various fields, including software development, continues to grow, there is a pressing demand for efficient and effective extraction and presentation of meaningful insights. This paper presents a unique approach to address this need, focusing on the complexities of interpreting Application Programming Interface (API) documentation. While official API documentation serves as a primary source of information for developers, it can often be extensive and lacks user-friendliness. In light of this, developers frequently resort to unofficial sources like Stack Overflow and GitHub. Our novel approach employs the strengths of BERTopic for topic modeling and Natural Language Processing (NLP) to automatically generate summaries of API documentation, thereby creating a more efficient method for developers to extract the information they need. The produced summaries and topics are evaluated based on their performance, coherence, and interoperability. The findings of this research contribute to the field of API documentation analysis by providing insights into recurring topics, identifying common issues, and generating potential solutions. By improving the accessibility and efficiency of API documentation comprehension, our work aims to enhance the software development process and empower developers with practical tools for navigating complex APIs.

Submitted: Aug 17, 2023