2Department of Computer Science, College of Computers and Information Technology, Taif University, Taif, 21944, Saudi Arabia
3Department of Computer Science, College of Computer Science and Engineering, Taibah University, Yanbu, 966144, Saudi Arabia
4Department of Computer Science, Faculty of Science, University of Tanta, Gharbia, 31527, Egypt
5Department of Business Analytics, Faculty of Computers and Data Science, Alexandria University, Alexandria, 21526, Egypt
6Department of Computer Science, College of Computer Science and Engineering, Taibah University, Yanbu, 966144, Saudi Arabia; Department of Computer Science, Faculty of Science, University of Tanta, Gharbia, 31527, Egypt
Abstract
In this paper, we introduce a comprehensive Arabic Sign Language (ArSL) recognition system designed to bridge the communication gap for people with hearing disabilities. Through semantic analysis and AI-driven optimization, we address the challenges of incorrect sentence-level recognition and computational inefficiency. Gradient-based adaptive learning (GBL), hyperparameter tuning, and metaheuristic algorithms are integrated to accelerate training convergence and feature extraction, enhancing computational efficiency. Automatic hyperparameter selection enables adaptive learning rates, improving model performance without excessive manual intervention. These AI-driven optimizations reduce processing overhead while maintaining high recognition accuracy. The methodology employs pre-trained transformer models, following best practices established for BERT and GPT, yielding strong contextual understanding and accurate recognition of full sentences in ArSL. Quantization-aware training and model pruning reduce memory consumption by 40% and training time by 29%, confirming the system's suitability for resource-constrained environments. A comparison of optimization methods shows that metaheuristic approaches such as Bayesian Optimization and Genetic Algorithms present computational trade-offs, validating the chosen model configuration. Hardware adaptability is supported by low-power processing methods that make the system deployable on embedded edge devices. Evaluated on diverse datasets comprising more than 100,000 samples, the system achieves state-of-the-art performance with 91% accuracy and a 94% F1-score. Incremental data loading optimizes real-time execution, resulting in faster inference and reduced latency.
Batch normalization and early stopping further improve computational efficiency, reducing training time by 19%. In terms of performance, the system maintains a high frame-processing rate and low latency in applications outside the simulation environment. It also accommodates regional accents and heterogeneous data conditions, demonstrating scalability and applicability to education, public services, and industry. By addressing practical gaps in ArSL recognition, this work presents a robust, effective, and inclusive methodology that transfers reliably to realistic scenarios and supports scalable future development.
