Investigation and Analysis of Gene Expression Using the Fusion Method of Feature Selection and Dynamic Neural Network Classification

Document Type: Original Article

Authors

Mechanical Engineering, K. N. Toosi University of Technology, Tehran, Iran

Abstract

The analysis of high-volume microarray data faces challenges such as limited sample size, computational complexity, and the risk of inappropriate gene selection. A scarcity of samples complicates computational analysis and classification, and it reduces a classifier's ability to generalize to and predict new samples. Moreover, datasets with a high gene-to-sample ratio raise concerns about whether the selected genes are truly relevant for accurate predictive models. Identifying disease-causing genes is difficult because only a subset of genes offers precise biological insight into the disease. To address these issues, it is crucial to focus on a smaller set of gene expression data so that the informative genes can be understood more effectively. Hence, the primary objective in microarray data analysis is to significantly reduce the number of genes through discriminative gene selection, which sharpens the information contained in the data. This article performs gene expression classification on various cancer types, including colon cancer, breast cancer, leukemia, prostate tumors, and diffuse large B-cell lymphoma (DLBCL). Each cancer type is evaluated independently in the feature selection cycle and classified using varying numbers of features. This approach aims to overcome the challenges of microarray data analysis and to improve the accuracy and interpretability of gene expression classification.
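The idea of selecting a small set of discriminative genes and then classifying with varying numbers of features can be illustrated with a minimal sketch. This is not the article's fusion method or its dynamic neural network: as stand-ins it assumes a signal-to-noise ranking of genes and a nearest-centroid classifier, and it runs on synthetic data rather than on the cancer datasets named above. All function names and parameters (`make_synthetic`, `snr_scores`, `top_k_genes`, `loocv_accuracy`, `shift`) are hypothetical.

```python
import random
import statistics


def make_synthetic(n_per_class=10, n_genes=200, n_informative=5, shift=3.0, seed=0):
    """Toy stand-in for a microarray matrix: few samples, many genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                # Only the first n_informative genes carry class signal.
                for g in range(n_informative):
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


def snr_scores(X, y):
    """Per-gene signal-to-noise ratio: |mean0 - mean1| / (sd0 + sd1)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        denom = statistics.stdev(a) + statistics.stdev(b)
        diff = abs(statistics.mean(a) - statistics.mean(b))
        scores.append(diff / denom if denom else 0.0)
    return scores


def top_k_genes(X, y, k):
    """Indices of the k genes with the highest score."""
    scores = snr_scores(X, y)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]


def nearest_centroid_predict(X_train, y_train, x, genes):
    """Assign x to the class whose centroid (on the selected genes) is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [statistics.mean(r[g] for r in rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy; genes are re-selected inside every fold."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = top_k_genes(X_tr, y_tr, k)
        correct += nearest_centroid_predict(X_tr, y_tr, X[i], genes) == y[i]
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_synthetic()
    for k in (1, 5, 20, 100):
        print(f"top {k:3d} genes -> LOOCV accuracy {loocv_accuracy(X, y, k):.2f}")
```

Re-selecting the genes inside each leave-one-out fold is a deliberate choice in this sketch: with so few samples, ranking genes on the full dataset before validation would leak information from the held-out sample into the selection step and inflate the reported accuracy.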

Keywords

References

  • Zhang, H. (2021). Feature Selection Using Approximate Conditional Entropy Based on Fuzzy Information Granule for Gene Expression Data Classification. Frontiers in Genetics, 12, https://doi.org/10.3389/fgene.2021.631505.
  • Rezaee, K., Jeon, G., Khosravi, M., Attar, H., & Sabzevari, A. (2022). Deep learning-based microarray cancer classification and ensemble gene selection approach. IET Systems Biology, 16, 120-131. https://doi.org/10.1049/syb2.12044.
  • Gakii, C., & Rimiru, R. (2021). Identification of cancer related genes using feature selection and association rule mining. Informatics in Medicine Unlocked, 24, 100595. https://doi.org/10.1016/J.IMU.2021.100595.
  • Seryasat, O. R., & Haddadnia, J. (2018). Evaluation of a new ensemble learning framework for mass classification in mammograms. Clinical Breast Cancer, 18(3), e407-e420. https://doi.org/10.1016/j.clbc.2017.05.009
  • Gordon, A. D. (1999). Classification. CRC Press. https://doi.org/10.1201/9780367805302
  • Boguslawski, L. (2004). Influence of pressure fluctuations distribution on local heat transfer on flat surface impinged by turbulent free jet. Proceedings of the ASME - ZSIS International Thermal Science Seminar II, Bled, Slovenia. https://doi.org/10.1615/ICHMT.2004.IntThermSciSemin.230
  • Brereton, R. G., & Lloyd, G. R. (2010). Support vector machines for classification and regression. Analyst, 135(2), 230-267. https://doi.org/10.1039/B918972F
  • Quinlan, J. R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann, San Mateo.
  • Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Wadsworth International Group, Belmont, CA.
  • Murphy, K. P. (2006). Naive bayes classifiers. University of British Columbia, 18(60), 1-8.
  • Monti, S., Savage, K. J., Kutok, J. L., Feuerhake, F., Kurtin, P., Mihm, M., & Shipp, M. A. (2005). Molecular profiling of diffuse large B-cell lymphoma identifies robust subtypes including one characterized by host inflammatory response. Blood, 105(5), 1851-1861. https://doi.org/10.1182/blood-2004-07-2947
  • Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., & Lander, E. S. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286(5439), 531-537. https://doi.org/10.1126/science.286.5439.531
  • Singh, D., Febbo, P. G., Ross, K., Jackson, D. G., Manola, J., Ladd, C., Sellers, W. R. (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2), 203-209. https://doi.org/10.1016/S1535-6108(02)00030-2
  • Alon, U., Barkai, N., Notterman, D. A., Gish, K., Ybarra, S., Mack, D., & Levine, A. J. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America, 96(12), 6745-6750. https://doi.org/10.1073/pnas.96.12.6745
  • Matamala, N., Vargas, M. T., González-Cámpora, R., Miñambres, R., Arias, J. I., Menéndez, P., Benítez, J. (2015). Tumor microRNA expression profiling identifies circulating microRNAs for early breast cancer detection. Clinical Chemistry, 61(8), 1098-1106. https://doi.org/10.1373/clinchem.2015.238691