A Hybrid NLP–LLM Framework for Intelligent Classification of Individual and Collaborative Tasks in Learning Environments

  • Amel Douar LRSD, Faculty of Sciences, Setif 1 University - Ferhat Abbas (UFAS1), Sétif 19000, Algeria https://orcid.org/0000-0003-0906-5188
  • Yacine Slimani Artificial Intelligence Laboratory, Faculty of Sciences, Setif 1 University - Ferhat Abbas (UFAS1), Sétif 19000, Algeria https://orcid.org/0000-0003-2775-071X
  • Fairouz Hadi LRSD, Faculty of Sciences, Setif 1 University - Ferhat Abbas (UFAS1), Sétif 19000, Algeria https://orcid.org/0000-0002-0813-7330
  • Adel Alti LRSD, Faculty of Sciences, Setif 1 University - Ferhat Abbas (UFAS1), Sétif 19000, Algeria https://orcid.org/0000-0001-8348-1679
  • Heythem Azzouz Department of Computer Science, Faculty of Sciences, Setif 1 University - Ferhat Abbas (UFAS1), Sétif 19000, Algeria
  • Abdallah Marouki Department of Computer Science, Faculty of Sciences, Setif 1 University - Ferhat Abbas (UFAS1), Sétif 19000, Algeria
Keywords: Deep Learning (DL), Natural Language Processing (NLP), Recurrent Neural Networks (RNNs), Large Language Models (LLMs), Task Classification, Educational Technology

Abstract

Natural Language Processing (NLP) plays a crucial role in automating text classification tasks, particularly in the context of education and scientific experimentation. However, the classification of practical tasks, especially distinguishing between individual and collaborative worksheets in laboratory sessions, remains an open challenge. This work extends our previous research on collaborative virtual laboratories by introducing an intelligent classification model that automatically determines task modality from worksheet specifications, enabling adaptive pedagogical orchestration. This article investigates the performance of Recurrent Neural Networks (RNNs) and their variants, including LSTM, GRU, and bidirectional models, in addressing this problem. For contextual benchmarking, selected transformer-based models were also evaluated to compare performance and computational trade-offs. We aim to determine which architecture best balances classification accuracy and computational efficiency. RNN-based models were selected for their efficiency in sequential text modeling and their suitability for real-time deployment in educational platforms. To enhance data diversity and improve model generalization, a data augmentation step leveraging a Large Language Model (LLM) was employed to synthetically enrich the training corpus while preserving semantic consistency. Multiple RNN architectures were trained and evaluated on a domain-specific dataset of chemistry-related worksheets, using both the original and LLM-augmented data. Performance was assessed using accuracy, precision, recall, and F1-score. Among the models, LSTM achieved the highest accuracy (95.02%), demonstrating superior classification capability. GRU models offered competitive performance at lower computational cost, while bidirectional architectures improved contextual retention but exhibited variable performance depending on dataset characteristics. Although LLM-based data augmentation only marginally improved model performance, the dataset's inherent simplicity ensured strong baseline results across all models. Importantly, classification errors do not compromise learning outcomes but only affect execution modality, making the approach robust for real educational deployment. Overall, the findings highlight the effectiveness of deep learning models in classifying practical educational tasks and underscore the potential of LLM-assisted augmentation to enhance adaptive learning environments.
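The abstract reports accuracy, precision, recall, and F1-score for the binary individual-vs-collaborative classification task. As a minimal illustrative sketch (not the authors' code), these metrics can be computed from true and predicted labels as follows; the label encoding (1 = collaborative, 0 = individual) and the example vectors are hypothetical:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    # Count true positives, false positives, false negatives;
    # the remainder of the samples are true negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical labels: 1 = collaborative worksheet, 0 = individual worksheet
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
# Here tp=3, fp=1, fn=1, tn=3, so all four metrics equal 0.75.
```

In practice a library routine such as scikit-learn's `precision_recall_fscore_support` would typically be used; the hand-rolled version above only makes the definitions behind the reported scores explicit.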
Published
2026-03-05
How to Cite
Douar, A., Slimani, Y., Hadi, F., Alti, A., Azzouz, H., & Marouki, A. (2026). A Hybrid NLP–LLM Framework for Intelligent Classification of Individual and Collaborative Tasks in Learning Environments. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-3268
Section
Research Articles