An Experimental Study on Supervised Learning and Random Forest Feature Selection for Edible and Toxic Mushroom Classification

Authors

  • G. Ravi Kumar
  • G. Thippanna
  • D. Mahaboob Basha

DOI:

https://doi.org/10.46610/JoIDACS.2026.v03i01.002

Keywords:

Accuracy, Classification accuracy, Machine learning, Precision, Random forest feature selection, Recall, Supervised learning

Abstract

This paper presents an experimental study of feature selection as a means of improving the predictive performance of supervised classifiers. The study uses the UCI Mushroom benchmark dataset, which consists of 8124 instances with 23 categorical features, to classify samples as edible or poisonous. To improve model efficiency and accuracy, a random forest (RF)-based feature selection method was applied to identify and retain the most informative attributes. Three supervised learning algorithms, logistic regression (LR), naïve Bayes (NB), and decision tree (DT), were implemented and evaluated both before and after RF-based feature selection, with accuracy, precision, and recall as the evaluation criteria. The experimental results show that all three classifiers performed better after feature selection, with the decision tree model obtaining the highest classification accuracy. The findings indicate that combining ensemble-based feature selection with traditional classifiers improves classification performance and model robustness, particularly for high-dimensional categorical datasets. The study highlights the effectiveness of feature-driven learning frameworks for reliable and efficient data-driven classification systems.
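The three evaluation criteria named in the abstract follow the standard confusion-matrix definitions. The sketch below is a minimal illustration and not the authors' code: it assumes "poisonous" is treated as the positive class (the abstract does not say which class is positive), and the function name `binary_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "poisonous",  # assumption: poisonous is the positive class
) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for a two-class problem."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy labels for illustration only; these are not results from the study.
y_true = ["poisonous", "poisonous", "edible", "edible", "poisonous", "edible"]
y_pred = ["poisonous", "edible", "edible", "poisonous", "poisonous", "edible"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Under this assumption, recall measures the fraction of poisonous samples that the classifier actually flags, so a low recall corresponds to poisonous samples being labeled edible.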

Published

2026-02-11

How to Cite

Kumar, G. R., Thippanna, G., & Basha, D. M. (2026). An Experimental Study on Supervised Learning and Random Forest Feature Selection for Edible and Toxic Mushroom Classification. Journal of Intelligent Data Analysis and Computational Statistics (p-ISSN: 3049-3056 E-ISSN: 3048-7080), 3(1), 16–23. https://doi.org/10.46610/JoIDACS.2026.v03i01.002