CGNet Swara, IIIT Naya Raipur, and Microsoft Research Lab made progress during lockdown on a new interactive neural machine translation app that they hope will help the Gond Adivasi community rejuvenate youth engagement in the Gondi language.
Local Indian organizations and Microsoft Research Lab have jointly developed an interactive neural machine translation (INMT) app that translates between Hindi and Gondi. Spoken by over two million people, the Gondi language is used primarily in several Indian states, including Madhya Pradesh, Gujarat, Telangana, Maharashtra, Chhattisgarh, Karnataka, and Andhra Pradesh.
Although the language is prevalent in many of India’s central regions, it remains a predominantly spoken language. Furthermore, the language varies based on the state, with many dialects passed down orally over centuries.
The absence of written literature in the language, along with a shortage of local teachers who can instruct in it, has led the non-profit CGNet Swara to develop ways for Gondi speakers to stay informed despite a lack of access to Hindi services.
CGNet Swara, an Indian voice-based online portal, has worked with communities in tribal central India by providing them a platform to report local news through phone calls. Seeking more streamlined communications, the organization partnered with the International Institute of Information Technology Naya Raipur and a Microsoft Research Lab to create the app that translates between Gondi and Hindi.
“We did one workshop at the Microsoft Research Lab office in Bengaluru in 2019. But the app was developed during the lockdown and is likely to be released later this month,” said Shubhranshu Choudhary, a former journalist who co-founded CGNet Swara. Since the onset of the pandemic, the organization has worked with community members from several demographics to collect language data. To date, researchers have collected more than 3,000 words and 35,000 sentences and phrases. The efforts come as organizations around the world work to deliver translated resources to communities in need.
“Gondi is a very good language to use as a case study as it has a substantial speaker base across six states. It is not endangered and yet zero resources are available for the same. Through CGNet Swara, we became aware of the various issues that the Gond Adivasis face, and how access to the language could help the cultural identity of the community,” says Kalika Bali, principal researcher, Microsoft Research Lab.
In addition to discussing the translation app, Choudhary mentioned the Chhattisgarh government’s announcement that it will shift education in the state to include instruction in tribal languages. To contribute to the transition, CGNet Swara has worked with Pratham Books on an ongoing project to translate 400 children’s books.