Hugging Face Raises USD 40m for Natural Language Processing Platform

2021-03-30 18:25 | Slator

Hugging Face, a machine learning startup with headquarters in Paris and New York, raised USD 40m in a Series B round that closed March 11, 2021. As reported by VentureBeat, VC firm Addition led the round, with Lux Capital, A.Capital, and Betaworks participating. Notable individual investors included Dev Ittycheria, CEO of app development database company MongoDB; Florian Douetteau, CEO of AI and machine learning company Dataiku; and Richard Socher, former chief scientist at Salesforce.

Hugging Face plans to use the funds to grow an open-source community for language model development. This broad goal marks a significant departure from the startup’s original vision and product, a chatbot “friend” that can gauge users’ moods and modify its responses accordingly. Hugging Face now creates custom machine learning models for over 100 clients, including Bloomberg and Qualcomm.

The same VentureBeat article quoted Hugging Face CEO Clément Delangue as saying, “We’ve always had acquisition interests from Big Tech and others, but we believe it’s good to have independent companies — that’s what we’re trying to do.”

Since its 2016 founding, Hugging Face has raised a total of USD 60m. The startup was reportedly cash-flow positive in January and February 2021.

The Hugging Face team has grown beyond co-founders Delangue and Julien Chaumond (CTO) to 30 employees. The company describes itself as “growing fast and hiring for every position you can think of.”

New developers and software engineers will likely work on machine translation (MT) in some capacity. Hugging Face made an ambitious first foray into MT with its May 2020 release of 1,000 MT models trained using unsupervised learning and the Open Parallel Corpus (OPUS).

Before that, the company had launched Transformer model libraries for both PyTorch and TensorFlow. The libraries include AI models with the potential to improve MT, such as Google’s open-sourced BERT.

Notably, Hugging Face has monetized its membership structure. “Contributors” can upload public models, access community support, and follow tags for new-model alerts free of charge. “Supporters,” who pay a USD 9 monthly fee, can upload up to five private models. An annual fee of USD 108 might not sound like much, but with 100,000 existing community members and counting, it has the potential to add up.

All this to say that Hugging Face believes at least some participants will be willing to pay for the opportunity to make further progress on key projects, such as the “fine-tuning week for low-resource languages” for speech-to-text.

“The goal is to provide Wav2Vec2 [or ASR] speech models from @facebookai in 60 languages to the community,” the company tweeted on March 17, 2021. “All languages should have access to SOTA!” Hugging Face will award a prize to the creator of the best model for each language.
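For context, the sketch below shows how a pretrained Wav2Vec2 checkpoint is typically used for transcription through the Transformers library. It is illustrative only: the English checkpoint facebook/wav2vec2-base-960h and the transcribe helper are assumptions for the example and do not come from the article, whose fine-tuning week concerns multilingual models in 60 languages.

```python
# Minimal Wav2Vec2 transcription sketch (illustrative; not from the article).
# Assumes `speech` is a 1-D float array of 16 kHz mono audio samples.
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_name = "facebook/wav2vec2-base-960h"  # example English checkpoint
processor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2ForCTC.from_pretrained(model_name)

def transcribe(speech):
    # Convert raw audio to model inputs, run the model, and CTC-decode to text.
    inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```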
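The 1,000 OPUS-based MT models mentioned above are distributed as regular checkpoints in the same library. The sketch below shows the usual loading pattern, with Helsinki-NLP/opus-mt-en-de (English to German) picked as an arbitrary example; it is not code from Hugging Face’s announcement.

```python
# Minimal translation sketch using one of the OPUS-MT checkpoints (illustrative).
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # example English-to-German model
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Tokenize, generate a translation, and decode it back to text.
inputs = tokenizer(["Hugging Face raised USD 40m in a Series B round."],
                   return_tensors="pt", padding=True)
outputs = model.generate(**inputs)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```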