“Watson Discovery can natively understand documents in Hindi and extract meaning, without having to resort to any translation to English,” says Dasgupta. “This is an outcome of our collaboration with IIT Bombay’s Center for Indian Language Technology as a part of IBM’s AI Horizon Network.”
India has thus taken positive steps towards establishing a leadership role in AI. The IBM-IIT Bombay AI Horizon Network (AIHN) project is an important step, both for science and technology and for the country’s progress.
Prof Pushpak Bhattacharyya of the Department of Computer Science and Engineering, IIT Bombay, said the collaboration is focused on automation of cognition and perception. There is an emphasis on cutting-edge research, high-quality publications and open, widely usable AI resources. Many social and commercial needs can only be served through user interaction and information dissemination in Indian languages (IL). However, the complexity of Indian languages, limited corpora and other constraints make it very difficult to adopt and adapt English-centric NLP for IL-NLP.
“Through our partnership with IBM, we have been able to use machine learning for IL-NLP and address challenges related to the low resource, understanding of Hindi language sense, intent, sentiment and more,” says Bhattacharyya. “In two years, the endeavour has seen stellar publications coming out of the collaboration and creation of quality software (in NLP, speech and multimodal AI).”
These capabilities would help enterprises get more insights from their data and have immediate use in different areas. These include customer care, where the frequent use of casual language has made accurate understanding, classification and fine-grained analysis difficult.
At a time when Covid-19 has wreaked havoc on humans and businesses globally, IBM’s Dasgupta also sees great applications of natural language processing technology to address the challenges posed by the pandemic.
The National Health Mission, under the Government of Andhra Pradesh, has collaborated with IBM to deploy Watson Assistant for citizens. It provides Covid-19-related information for citizens on the response efforts and measures by the Andhra Pradesh Government. The Watson virtual agent on the IBM public cloud brings together Watson Assistant, natural language processing capabilities from IBM Research, and enterprise AI search capabilities with Watson Discovery. This helps to understand and respond to common questions about Covid-19 in English, Telugu and Hindi.
The Indian Council of Medical Research (ICMR) also collaborated with IBM to implement Watson Assistant on its portal to respond to specific Covid-19 queries from front-line staff and data entry operators at testing and diagnostic facilities across the country. The queries could relate to the nature and process of the data to be captured by test labs. This includes how to record inventory of test kits and reagents, the process of reporting to various Government agencies and references to the latest guidance, in addition to general queries on Covid-19. “We helped their frontline agents actually respond to a deluge of requests that they were getting on the Covid testing procedures,” says Dasgupta.
The innovation is important because less than 10 per cent of the country’s population knows English, which leaves the majority of the population out of the benefits of the technology.
“If you're serious about Indians, you have to be serious about Indian languages,” says Karthik Sankaranarayanan, senior manager, AI for Interaction, IBM Research India. “Hindi which is being spoken by almost like 50 to 60 per cent of the population, was naturally our first goal post to go after.”
With the new technology, Watson can natively understand Hindi written in Devanagari, including such information available on the internet. This data includes text messages, emails and medical reports. The firm will also focus on other Indian languages, including Bengali, Kannada and Gujarati. Understanding a language natively is very different from learning a foreign language such as Mandarin or Spanish, where one first translates it into English and then interprets the meaning.
Watson can presently understand Hindi only in written form. Sankaranarayanan said IBM is advancing its AI technologies to understand spoken language as well. “We're advancing Watson to understand spoken Hindi,” he said. “You'll see announcements related to advancements next year.”