A Comparative Study of Traditional and Hybrid Models for Text Classification

Muhammad Sabir; Talha Farooq Khan; Muhammad Azam

Authors

Muhammad Sabir, University of Southern Punjab
Talha Farooq Khan, University of Southern Punjab, Multan, Pakistan
Muhammad Azam, University of Southern Punjab

Keywords:

Text Classification, Machine Learning, Hybrid Models, Deep Learning, NLP

Abstract

Text classification is a fundamental Natural Language Processing (NLP) task that automates the assignment of textual data to a predefined set of categories, with applications such as sentiment analysis, spam detection, and fake news detection. Because of their interpretability and efficiency, traditional machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF) have been widely used for text classification. However, these models struggle to capture the contextual relationships and semantic subtleties needed to handle text with complex structure. Hybrid models that combine traditional and deep learning techniques have therefore emerged as a promising way to address these limitations.

This study reviews and compares text classification approaches relevant to the task, including traditional machine learning models, hybrid models, and deep learning models. The AG News dataset is used for evaluation, and accuracy, precision, recall, and F1 score are used to measure model performance. The results suggest that deep learning based hybrid models, such as the BERT + SVM hybrid model (95.7%) and the CNN + LSTM hybrid model (94.5%), outperform the traditional and ensemble learning models by exploiting contextual embeddings and sequential modeling. Among the ensemble learning models, XGBoost (92.8% accuracy) and the Bagging Classifier (91.5% accuracy) show good generalization and stability compared with standalone learners.
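As an illustration of how the four metrics named above can be computed for a multi-class task such as AG News, the following is a minimal sketch in plain Python. The function name `macro_metrics`, the toy labels, and the choice of macro averaging are assumptions made for this example; the abstract does not state which averaging scheme the study used.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption of this sketch: each class
    contributes equally, regardless of how many samples it has.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels using the four AG News categories (not data from the study).
y_true = ["World", "Sports", "Business", "Sci/Tech", "Sports", "World"]
y_pred = ["World", "Sports", "Sci/Tech", "Sci/Tech", "Sports", "Business"]
print(macro_metrics(y_true, y_pred))
```

With a balanced test set, macro and sample-weighted averages coincide for recall, but they can differ for precision and F1, so reported numbers should always state the averaging scheme.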

Although the hybrid models offer superior classification performance, they do so at the cost of greater computational resources and longer training times, so the right choice depends on the model class and the problem at hand. The study highlights the tradeoffs among traditional, ensemble, and deep learning based hybrid models with respect to their applicability to different text classification tasks. The findings provide a basis for researchers and practitioners to choose the most suitable classification model given their performance requirements and computational constraints.

Author Biographies

Muhammad Sabir, University of Southern Punjab

Dr. Muhammad Sabir is currently working as an Assistant Professor in the Department of Computer Science, University of Southern Punjab, Multan, Pakistan. He holds a degree in Computer Science from The Islamia University of Bahawalpur, Pakistan. His primary research interests include natural language processing, text mining, web mining, and machine learning.

Talha Farooq Khan, University of Southern Punjab, Multan, Pakistan

Dr. Talha Farooq Khan is currently working as an Assistant Professor in the Department of Computer Science, University of Southern Punjab, Multan, Pakistan. He has over nine years of experience in teaching and research. He holds a Ph.D. in Computer Science from The Islamia University of Bahawalpur, Pakistan. His primary research interests include natural language processing, text mining, web mining, machine learning, deep learning, and LLMs. He has published several research articles in well-reputed international journals and conference proceedings. He also serves as a reviewer for various peer-reviewed journals and has contributed to multiple research collaborations.

Muhammad Azam, University of Southern Punjab

Dr. Muhammad Azam is currently working as an Assistant Professor in the Department of Computer Science, University of Southern Punjab, Multan, Pakistan. He has published several research articles in well-reputed international journals and conference proceedings.
