Stack Overflow is a widely used, community-driven platform where developers seek help with programming problems. Users can post questions and receive multiple answers, but a significant portion of questions never receive an accepted answer. Without a clearly identified best answer, both the original poster and later visitors must read through numerous responses to find a workable solution, which costs time and causes confusion. To address this, we present a method for automatically identifying the most promising answer to a question that has no accepted answer. Our approach applies text mining techniques to extract 13 features from a dataset of 15,464 questions, 37,275 answers, and 72,025 comments. These features capture textual, structural, and user-related aspects of the posts. The extracted features are then used to train machine learning models that predict which answer is most likely to be accepted. The study considers only English-language content on Stack Overflow. The proposed method achieves an overall accuracy of 71% and an F1 score of 70%. These results suggest that automated answer recommendation can improve the user experience by reducing ambiguity and making information retrieval on Q&A platforms more efficient.
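As a rough illustration of the kind of pipeline described above, the following Python sketch derives a few features from an answer and computes accuracy and F1 for a set of predictions. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not list the 13 features, so the `Answer` fields and the six features shown here (word count, presence of code, link count, score, comment count, author reputation) are hypothetical stand-ins for the textual, structural, and user-related categories it mentions.

```python
import re
from dataclasses import dataclass


@dataclass
class Answer:
    # Hypothetical record layout; the field names are assumptions.
    body: str
    score: int
    comment_count: int
    author_reputation: int


def extract_features(answer):
    """Return illustrative features (NOT the paper's actual 13 features)."""
    words = re.findall(r"\w+", answer.body)
    return {
        "word_count": len(words),                    # textual
        "has_code": int("<code>" in answer.body),    # structural
        "link_count": answer.body.count("http"),     # structural
        "score": answer.score,                       # community signal
        "comment_count": answer.comment_count,       # community signal
        "author_reputation": answer.author_reputation,  # user-related
    }


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 for binary labels (1 = accepted, 0 = not accepted)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1
```

In a full pipeline, the feature dictionaries would be turned into vectors and passed to a classifier; the abstract does not say which models were trained, so that step is left out here.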
Jamshidiyan Tehrani, M., Arjomand, P., & Haghighat, S. (2021). Finding the Potential Accepted Answer on Stack Overflow: a Text Mining Approach. Transactions on Machine Intelligence, 4(4), 238–244. https://doi.org/10.47176/TMI.2021.238