Paper ID: 2210.01637

Mining Duplicate Questions of Stack Overflow

Mihir Kale, Anirudha Rayasam, Radhika Parik, Pranav Dheram

There has been a significant rise in the use of Community Question Answering sites (CQAs) over the last decade, owing primarily to their ability to leverage the wisdom of the crowd. Duplicate questions have a crippling effect on the quality of these sites. Tackling duplicate questions is therefore an important step towards improving the quality of CQAs. In this regard, we propose two neural-network-based architectures for duplicate question detection on Stack Overflow. We also propose explicitly modeling the code present in questions to achieve results that surpass the state of the art.

Submitted: Oct 4, 2022