BSC develops How2Sign, AI database for Sign Language Translation


2021-04-01 07:00 multilingual



Enlisting the help of artificial intelligence, the Barcelona Supercomputing Center (BSC) has developed How2Sign, a database for the automatic translation of sign language. Led by Amanda Duarte, PhD candidate and researcher at BSC’s Emerging Technologies for AI group, How2Sign is expected to debut this June at CVPR, the Conference on Computer Vision and Pattern Recognition, which Google ranked first among AI conferences in 2020.

Duarte, who completed her master’s in computer engineering in Brazil, outlines the goals of her work on her website: “My research aims at giving sign language users further access to information. Specifically, my work focuses on developing systems that enable automatic translations of online content into sign language representations.”

In 2018 Duarte led Speech2Signs, a project aimed at improving speech-to-sign-language translation that received the Caffe2 Research Award from Facebook. She then spent more than two years compiling the recordings for How2Sign, which will offer 80 hours of sign language videos at its debut. In the videos, professional interpreters translate various types of video tutorials, ranging from crafting to cooking recipes, into American Sign Language (ASL). Three of the 80 hours were recorded at Panoptic Studio, a unique dome-shaped multiview studio at Carnegie Mellon University equipped with 510 cameras. The videos shot at Panoptic will allow researchers to reconstruct and learn from the interpreters’ three-dimensional postures and movements.

How2Sign, referred to by Duarte as “the first large-scale continuous American Sign Language dataset,” is intended to give researchers in both computer vision and natural language processing insight into automatic sign language understanding and production, with the aim of improving technological accessibility for the more than 400 million Deaf or hard-of-hearing individuals around the world.

A lack of adequate subtitling and captioning is endemic to video-sharing platforms such as YouTube and Facebook. With Speech2Signs and How2Sign, Duarte aims to solve an important problem that Deaf people face around the world by providing a system that automatically generates a sign language translation of the speech in any given video.