Paper ID: 2304.14516
pyBibX -- A Python Library for Bibliometric and Scientometric Analysis Powered with Artificial Intelligence Tools
Valdecy Pereira, Marcio Pereira Basilio, Carlos Henrique Tarjano Santos
Bibliometric and scientometric analyses offer valuable perspectives on the research landscape and collaborative dynamics across academic disciplines. This paper presents pyBibX, a Python library for conducting bibliometric and scientometric analyses on raw data files sourced from Scopus, Web of Science, and PubMed, integrating state-of-the-art AI capabilities into its core functionality. The library performs a comprehensive exploratory data analysis (EDA) and presents the outcomes as graphical illustrations. Network capabilities are also integrated, encompassing Citation, Collaboration, and Similarity Analysis. Furthermore, the library incorporates AI capabilities, including Embedding vectors, Topic Modeling, Text Summarization, and other general Natural Language Processing tasks, employing models such as Sentence-BERT, BERTopic, BERT, ChatGPT, and PEGASUS. As a demonstration, we analyzed 184 documents associated with multiple-criteria decision analysis published between 1984 and 2023. The EDA showed a growing interest in decision-making and fuzzy logic methodologies. Network Analysis then highlighted the significance of central authors and intra-continental collaboration, identifying Canada and China as crucial collaboration hubs. Finally, AI Analysis distinguished two primary topics and showed ChatGPT's preeminence in Text Summarization. ChatGPT also proved to be a useful instrument for interpreting results, as our library enables researchers to pose questions to ChatGPT regarding bibliometric outcomes. Even so, data homogeneity remains a considerable challenge due to database inconsistencies. pyBibX is the first application integrating cutting-edge AI capabilities for analyzing scientific publications, enabling researchers to examine and interpret these outcomes more effectively.
Submitted: Apr 27, 2023