Detection and Recognition of Bilingual Urdu and English Text in Natural Scene Images Using a Convolutional Neural Network-Recurrent Neural Network Combination with a Connectionist Temporal Classification Decoder


Neurotechnology


Sensors, Volume 25, Issue 16

Research Authors: Khadija Tul Kubra, Muhammad Umair, Muhammad Zubair, Muhammad Tahir Naseem, Chan-Su Lee

AIIM Authors: Ridhima Prasad, Owen Anderson

Approved by President Reda Riffi

Publication Date: Aug 19, 2025

Comprehensive Summary

This paper, published in the MDPI journal Sensors, addresses the detection and recognition of bilingual Urdu and English text in natural scene images such as signboards and banners. Scene images are harder to read than scanned documents because of background clutter, poor lighting, blur, and varying text orientation. To tackle these issues, the authors built a large bilingual dataset with the help of data augmentation and designed a recognition system that combines a convolutional neural network (CNN) for feature extraction, a recurrent neural network (RNN) for sequence modeling, and a Connectionist Temporal Classification (CTC) decoder that turns the per-frame predictions into text. Several RNN architectures were tested, and the bidirectional models (BLSTM-512 and BGRU-512) gave the best results, with accuracies of around 98.5% for Urdu characters, 97.2% for Urdu words, and 99.2% for English characters. The study demonstrates a robust approach to multilingual text recognition under complex real-world conditions.
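To make the role of the CTC decoder more concrete, the sketch below shows best-path (greedy) CTC decoding in plain Python: take the top-scoring class in each frame, merge consecutive repeats, then drop the blank symbol. It is an illustration only and not the authors' implementation. The summary does not say which decoding strategy the paper uses, and the function name `ctc_greedy_decode`, the toy alphabet, and the choice of index 0 for the blank are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol in this sketch


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding of per-frame class scores into a string.

    frame_scores: list of per-frame score lists (one score per class).
    alphabet: list mapping class index to character; the entry at the
              blank index is never emitted.
    """
    # Step 1: pick the highest-scoring class in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Step 2: merge runs of the same class into a single label.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # Step 3: drop blanks and map the remaining indices to characters.
    return "".join(alphabet[idx] for idx in collapsed if idx != blank)


# Toy example: a hypothetical 4-class output (blank, 'c', 'a', 't').
alphabet = ["", "c", "a", "t"]
frames = [
    [0.1, 0.7, 0.1, 0.1],  # c
    [0.1, 0.6, 0.2, 0.1],  # c (repeat, merged in step 2)
    [0.8, 0.1, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.7, 0.1],  # a
    [0.9, 0.03, 0.03, 0.04],  # blank
    [0.1, 0.1, 0.1, 0.7],  # t
    [0.2, 0.1, 0.1, 0.6],  # t (repeat, merged in step 2)
]
decoded = ctc_greedy_decode(frames, alphabet)
print(decoded)  # prints "cat"
```

The blank is what lets a CTC model emit a genuinely doubled letter: the frame sequence `l, blank, l` decodes to "ll", whereas `l, l` collapses to a single "l". In a bilingual recognizer the alphabet would simply contain both Urdu and English characters.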

Outcomes and Implications

Although the research was not conducted in a medical context, its findings have potential applications in healthcare. Hospitals and clinics in multilingual regions often rely on signage, instructions, and safety notices presented in multiple languages. A system capable of accurately reading and interpreting both Urdu and English could improve patient navigation, accessibility, and understanding of critical information. Public health campaigns using posters and banners could also be more effectively documented, translated, and evaluated for outreach. Beyond signage, such technology could support telemedicine by helping clinicians interpret images of bilingual medication labels or patient instructions. It could also enhance accessibility for patients with low literacy or visual impairments by enabling text-to-speech systems in their preferred language. In emergency or crisis settings, accurate recognition of multilingual warnings and instructions could aid responders in providing timely and safe care. However, deploying such systems in healthcare would require adaptations for handwritten notes, stylized fonts, and packaging surfaces, where recognition errors could have serious consequences. With further development, bilingual text recognition could become a valuable tool for improving communication, safety, and patient-centered care in medical environments.

Our mission is to

Connect medicine with AI innovation.

No spam. Only the latest AI breakthroughs, simplified and relevant to your field.
