A language-independent method for extracting the essence of a text in the form of phrases

Document Type: Original Article


K. N. Toosi University of Technology, Computer Engineering Faculty, Tehran, Iran


The growing adoption of Information Technology (IT) and its influence on individual learning preferences create a need to improve Social Learning Networks (SLNs). Accurately anticipating learners' needs plays a crucial role in facilitating the learning process and improving overall performance. This paper introduces an interpreter designed to predict users' learning needs within SLNs: it suggests subsequent learning topics based on the subjects a user has previously explored. To this end, we propose a user-centric Collaborative Filtering (CF) method. To assess its effectiveness, we used a dataset from a reputable SLN. The results indicate that individuals who engage with similar learning topics within a network exhibit consistent learning needs. According to the recall criterion, the method predicted approximately 60% of learning needs.
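The idea of suggesting next topics from the histories of similar learners, and scoring the suggestions by recall, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "user-centric" CF means standard neighbourhood-based (user-based) collaborative filtering over binary topic histories with cosine similarity, and the function names (`cosine`, `recommend`, `recall`), the parameters `k` and `n`, and the toy data are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two topic sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(target, histories, k=2, n=3):
    """Suggest up to n unseen topics for `target` from its k most similar users.

    Assumption: each candidate topic is scored by the summed similarity of
    the neighbours who studied it; ties are broken alphabetically.
    """
    seen = histories[target]
    neighbours = sorted(
        (
            (cosine(seen, topics), user)
            for user, topics in histories.items()
            if user != target
        ),
        reverse=True,
    )
    # Keep only the k nearest users that share at least one topic.
    neighbours = [(sim, user) for sim, user in neighbours if sim > 0][:k]

    scores = defaultdict(float)
    for sim, user in neighbours:
        for topic in histories[user] - seen:
            scores[topic] += sim
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n]


def recall(predicted, actual):
    """Fraction of the topics actually studied later that were predicted."""
    if not actual:
        return 0.0
    return len(set(predicted) & set(actual)) / len(actual)


# Toy, made-up histories: user -> set of topics already explored.
histories = {
    "u1": {"python", "pandas", "numpy"},
    "u2": {"python", "pandas", "matplotlib"},
    "u3": {"java", "spring"},
    "u4": {"python", "numpy", "scipy"},
}

suggested = recommend("u1", histories, k=2, n=2)
# Hypothetical held-out topics that u1 went on to study.
later_topics = {"scipy", "sql"}
print(suggested, recall(suggested, later_topics))
```

In this toy setting, recall is computed against topics held out from the target user's later activity, which is one common way to read "predicted approximately 60% of learning needs"; the paper's exact evaluation protocol is not given in the abstract.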

