A Hybrid Feature Selection and Extreme Gradient Boosting Model for Binary Classification of Pistachio Varieties

Authors

  • G. Ravi Kumar
  • G. Thippanna

DOI:

https://doi.org/10.46610/JoIDACS.2025.v02i03.005

Abstract

Pistachios provide a rich supply of fiber, protein, antioxidants, healthy fats, and essential nutrients such as thiamine and vitamin B6. Regular consumption can support weight management, reduce cholesterol and blood sugar levels, and promote healthier digestion, vision, and blood vessels. In Turkey, the primary pistachio types grown and exported are the Kirmizi and Siirt varieties. This study proposes a hybrid machine learning approach for pistachio type prediction that combines Extreme Gradient Boosting (XGBoost) with Random Forest (RF)-based feature selection. The dataset consists of 2148 samples from two pistachio varieties, Kirmizi and Siirt, each described by 16 morphological features. The objective is to improve classification performance by identifying the most influential features and reducing redundancy in the data. Experimental evaluation compares XGBoost trained on all features with XGBoost trained on RF-selected features. The results show that incorporating RF feature selection improves model accuracy, precision, and recall. XGBoost without feature selection achieves an accuracy of 87.63%, whereas the proposed RF–XGBoost pipeline improves accuracy to 90.47%, a gain of 2.84 percentage points. These findings highlight the effectiveness of RF-based feature selection in improving model robustness and classification performance for pistachio type prediction.
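
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The data are a synthetic stand-in with the same shape as the dataset (2148 samples, 16 features), and the 80/20 split, the median importance threshold, and all hyperparameters are assumptions chosen for illustration. If the xgboost package is not installed, the sketch falls back to scikit-learn's GradientBoostingClassifier, which is a different gradient boosting implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Fallback (assumption): scikit-learn gradient boosting, not XGBoost itself.
    from sklearn.ensemble import GradientBoostingClassifier as Booster

# Synthetic stand-in for the pistachio data: 2148 samples, 16 features, 2 classes.
X, y = make_classification(
    n_samples=2148, n_features=16, n_informative=8, n_redundant=4, random_state=0
)
# The split ratio is an assumption; the abstract does not state one.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Baseline: gradient boosting trained on all 16 features.
baseline = Booster(random_state=0)
baseline.fit(X_train, y_train)
acc_all = accuracy_score(y_test, baseline.predict(X_test))

# RF-based selection: keep features whose importance is at or above the median.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0), threshold="median"
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)
n_selected = X_train_sel.shape[1]

# Hybrid: gradient boosting trained only on the RF-selected features.
hybrid = Booster(random_state=0)
hybrid.fit(X_train_sel, y_train)
acc_sel = accuracy_score(y_test, hybrid.predict(X_test_sel))

print(f"all features ({X.shape[1]}): accuracy = {acc_all:.4f}")
print(f"RF-selected ({n_selected}): accuracy = {acc_sel:.4f}")
```

The selector is fitted on the training split only, so the held-out samples do not influence which features are kept. On synthetic data the two accuracies need not reproduce the ordering reported in the abstract.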

Published

2025-12-19

How to Cite

Ravi Kumar, G., & Thippanna, G. (2025). A Hybrid Feature Selection and Extreme Gradient Boosting Model for Binary Classification of Pistachio Varieties. Journal of Intelligent Data Analysis and Computational Statistics (p-ISSN: 3049-3056 E-ISSN: 3048-7080), 2(3), 48–54. https://doi.org/10.46610/JoIDACS.2025.v02i03.005