KUDO embraces machine interpreting – a new paradigm for RSI platforms?


2023-01-24 09:10 Nimdzi Insights


Article by Rosemary Hynes.

Language technology providers are scrambling to jump on the speech-to-text bandwagon, which means users can view machine-generated live subtitles (translated from the original) as well as multilingual captions (monolingual transcripts available for different languages) of speeches in their preferred language. While this sounds great, some providers are taking it a step further. How so? By offering speech-to-speech translation (S2ST), otherwise known as machine interpreting (MI).

The product release

Today, January 24, 2023, the first remote simultaneous interpreting (RSI) platform is set to release its very own MI feature. The household name, KUDO, is best known for providing the technology (and, if needed, the interpreters) that facilitates the use of RSI in video conferences and at large events.

The release of its proprietary MI feature is a smart move by KUDO. Video conferencing platforms with an RSI feature, like Zoom and Microsoft Teams, dominate the multilingual online meeting sphere. To stay relevant in this space, RSI platforms need to innovate and constantly improve the services they provide.

That being said, although KUDO might be the first in the RSI space to add its own MI solution, KUDO AI is certainly not the first MI solution out there. In fact, there are as many as 23 MI technologies listed in Nimdzi’s 2022 Language Technology Atlas, but only a handful provide MI for online events and conferences. The majority of MI solutions in the image below are handheld devices used for two-way communication. In the conference and events space, Wordly is currently the most specialized and well-known solution on the market.

KUDO’s MI feature

So what does KUDO AI entail? KUDO’s MI solution is available on the KUDO platform and on partner events platforms like On24 and Hopin.
In the coming months, the company will also integrate its MI solution with video conferencing platforms such as Microsoft Teams and Zoom. KUDO AI will be publicly available by the end of the first quarter of 2023. The solution uses a cascade model and is built on a combination of open-source technology and in-house building blocks. It has been tested to the point where KUDO was satisfied with the quality of its output, both for accuracy and fluency. Speakers can already choose the gender of their synthetic voice before entering the meeting. Moreover, a voice cloning feature will be included in the coming months, meaning the original speaker’s voice will be retained in the synthetic, translated output.

As it stands, there are five languages available in KUDO’s MI feature: French, Spanish, German, Portuguese, and English. KUDO intends to add a new language every two weeks to cover the majority of European languages by the end of the summer. During a meeting with KUDO AI, listeners can select their preferred language from a drop-down menu or choose to listen to the original. Meeting participants will then hear the live translation of the original speech via a synthetic voice in the language they selected. In addition to the synthetic audio output, participants can also choose to see the written translation in the form of machine-generated live subtitles (or turn them off).

Currently, one disadvantage of the subtitles is that they do not tag the speaker, so it can be difficult to follow speaker changes. The same applies to the MI output: if two male participants speak in turn, the MI does not indicate a change of speaker. This, combined with the considerable lag, can make it difficult to follow a conversation between two or more speakers. Hence, in its current state, the feature is more likely to be used for one-to-many webinars or online training courses, where the exchange is limited.
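
For readers unfamiliar with the term, a cascade model chains separate stages: speech recognition, then machine translation, then speech synthesis. The Python sketch below illustrates only that general idea. It is not KUDO's implementation; every name in it (`CascadeInterpreter`, the `toy_*` functions, `TOY_GLOSSARY`) is hypothetical, and the stand-in stages simply pass text along so the example runs without any models.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stage signatures; a real system would wrap actual models here.
Recognizer = Callable[[bytes], str]        # speech -> source-language text
Translator = Callable[[str, str], str]     # (text, target language) -> translation
Synthesizer = Callable[[str, str], bytes]  # (text, target language) -> audio


@dataclass
class CascadeInterpreter:
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def interpret(self, audio: bytes, target_lang: str) -> Tuple[str, bytes]:
        """Run one audio segment through the three stages in order.

        Returns the translated text (usable as a subtitle) and the audio.
        """
        source_text = self.recognize(audio)
        translated = self.translate(source_text, target_lang)
        speech = self.synthesize(translated, target_lang)
        return translated, speech


# Toy stand-ins (assumptions for illustration only).
def toy_recognize(audio: bytes) -> str:
    # Pretend the audio bytes already are the transcript.
    return audio.decode("utf-8")


TOY_GLOSSARY = {("hello", "fr"): "bonjour", ("hello", "es"): "hola"}


def toy_translate(text: str, target_lang: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    words = text.lower().split()
    return " ".join(TOY_GLOSSARY.get((w, target_lang), w) for w in words)


def toy_synthesize(text: str, target_lang: str) -> bytes:
    # Label the text instead of producing real audio.
    return f"[{target_lang} voice] {text}".encode("utf-8")


interpreter = CascadeInterpreter(toy_recognize, toy_translate, toy_synthesize)
subtitle, audio_out = interpreter.interpret(b"hello everyone", "fr")
print(subtitle)                   # bonjour everyone
print(audio_out.decode("utf-8"))  # [fr voice] bonjour everyone
```

Because each stage waits for the previous one to finish, delays add up along the chain, which is one plausible contributor to the lag described above. In a design like this, the intermediate translated text could also feed the live subtitles, though the article does not say how KUDO produces them.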

Smart move by KUDO

By providing MI to its clients, KUDO can cater for the increased demand for multilingualism in virtual meetings. The mass shift to online meetings during the COVID-19 pandemic has stabilized somewhat with the return of in-person meetings. However, virtual events have not gone away, and neither has the need for virtual solutions. Event organizers are increasingly looking to add languages to their meetings, but human interpreters are not always a viable solution for short events (such as one-hour meetings) or smaller budgets.

That is where machine-generated live subtitles and MI come into play. Users can select the language in which they want to read the subtitles or listen to the synthetic voice. MI solutions are useful if event attendees want to listen to the meeting while doing something else. MI also serves as an accessibility feature for people who are blind or partially sighted and want to take part in a multilingual event.

So, although this software release by KUDO is not the first of its kind, it does set a precedent that other RSI platforms may follow. KUDO prides itself on being at the cutting edge of language technology, and this product demonstrates the company’s willingness to innovate.

Effect on human interpreters

As for human interpreters, the situation is unlikely to change considerably. Long, more complex meetings will continue to require human interpreters, particularly when the stakes are high or the situation requires a degree of emotional intelligence that machines cannot (yet) provide. MI will most likely fill the gap where interpreters were reluctant to provide their services in the first place: short, relatively simple, and low-budget online meetings.

