Survey Examines Machine Translation Post-Editing Among Freelancers and LSPs

2019-07-24 05:00 slator

Two recent surveys conducted by PhD student Clara Ginovart have focused on practitioners’ views of post-editing of machine translation (PEMT). Ginovart’s PhD at Pompeu Fabra University, Spain, supervised by Marina Frattino, is in Training in Post Editing Machine Translation. Ginovart’s university directors are Carme Colominas from Pompeu Fabra University and Antoni Oliver from Universitat Oberta de Catalunya. The granting institution is Agaur.

The first survey aimed to look at the state of PEMT from a business perspective. The 66 respondents to this survey each had experience in outsourcing machine translation post-editing, and were mainly working in small to mid-sized LSPs. The survey opened in December 2018 and closed in February 2019.

LSP respondents were mostly based in Spain (17), while the remainder were based elsewhere in Europe (42) and in Russia, Turkey, or elsewhere (7). LSP respondents said the most common source language for post-editing was English. Spanish, English, French, German, Italian, and Dutch were the most common target languages.

The second survey aimed to look at the state of PEMT from a post-editor’s perspective. The 142 respondents were mainly freelance or independent translators (84%). The survey opened in January 2019 and closed in April 2019.

Top locations for post-editor respondents were Spain, Italy, UK, Germany, and France. Top target languages represented in the survey were English, Spanish, Italian, German, French, and Dutch. English was, by far, the most common source language for post-editors, with French, German, Spanish, Italian, and others also represented.

Slator reviewed the survey results and analyzed the findings, comparing the similarities and differences between answers from the two groups of respondents where relevant.

LSPs had a variety of resourcing models for the post-editing task: 85% said they rely on a pool of freelance post-editors to some degree, while a small number (15%) said they do not outsource any post-editing tasks. Over half (58%) of the LSP respondents said they do some form of post-editing in-house.

Most LSP respondents (56%) said their end customers decide whether or not a post-editing workflow is used, while 41% said the decision is made internally. Of 58 LSP respondents, 60% said they inform the client when using MT workflows.

Of 96 freelance respondents, the majority (78%) said that most of the post-editing work they do is for LSPs, while only 20% said that most of the post-editing work they do is for direct customers. Three quarters (75%) said the requester is responsible for deciding whether it is appropriate to use PEMT, while 18% said they themselves choose whether or not to use PEMT.

73% of LSP respondents and 62% of post-editors said that post-editing accounts for 25% or less of all translation work. A similar share of LSP respondents (21%) and post-editors (18%) said that post-editing accounts for between 26% and 50% of all translation work.

100 post-editors said they post-edit high-visibility content (for public consumption) and 67 post-editors said they post-edit low-visibility content (for limited dissemination).

The majority of LSP respondents (73%) said post-editing to a human-professional standard is the most commonly used form of PEMT, while 21% said the service most commonly required is light or “good enough” PEMT.

There was a discrepancy between the level of output quality observed by LSP respondents and post-editors: 73% of LSP respondents and just 42% of post-editors said the task of post-editing involves “improving medium (acceptable) quality raw output to publishable quality,” while 56% of post-editors said they are asked to improve poor quality output to either a publishable or acceptable quality.

There was also a discrepancy around how much feedback post-editors were asked to provide on the quality of the MT output: 70% of post-editors said they are not asked for feedback on MT output quality, yet 73% of all LSP respondents said they ask the post-editor for feedback.

Another discrepancy between the two groups was in the level of instructions respondents said were provided: 46% of post-editors said they are not provided specific guidelines for PEMT, while most LSP respondents (87% of 45) said they give detailed instructions to the post-editor, whether standardized for the company or tailored to the content type or language.

Productivity levels observed by LSPs and post-editors were similar, although post-editors generally reported higher levels of productivity, based on words post-edited per hour, than LSP respondents.

55% of post-editors said that they track their PEMT productivity. Of 64 post-editors, around half use Excel to track their productivity, while around 10% use proprietary tools. Fewer than 10% use third-party project management software. Of 44 LSP respondents, some use internal productivity tracking tools (13) and others use Excel (10). Still others use project management software (9) and some use no tools at all (7).

Both groups reported that the dominant payment model was per source word. 57% of post-editors and around 40 out of 56 LSP respondents said they use this model. Fewer than 10 LSP respondents said they pay per hour based on editor-reported time spent, while 25% of post-editors said they charge per hour.

Post-editors were asked what MT systems they used most. Popular MT systems for post-editors included Google, DeepL, SDL (Adaptive MT, Language Cloud, or ETS), Amazon Translate, and SYSTRAN.

The most commonly used productivity tool (CAT) for post-editors and LSPs is SDL Trados Studio. Other popular productivity tools for both LSPs and post-editors include memoQ, Memsource, Wordfast, MateCat, Smartcat, Across, Transifex, Localize, and GlobalSight, as well as proprietary solutions. Post-editors also used other tools in addition to those mentioned.

The top QA tool for post-editors and LSPs is Xbench, followed by “none.” Next preferred options for LSPs are Verifika and QA Distiller, while post-editors employ a wider range of QA tools.

Both LSPs and post-editors had mixed opinions on the quality of existing PEMT training courses: 41 post-editors felt they are not adequate, 45 said they are adequate, and 56 do not know. Some post-editors (23%) had a training course provided by their company, a few had one provided externally (12%), or said they had done one at university (8%). 42% of LSP respondents felt that current PEMT training courses are not adequate, while 35% said they are.

53% of post-editors said they had never attended a PEMT training course, while two-thirds (67%) of LSPs said they had not yet organized specific training on PEMT.

For complete survey results and additional information, contact Clara Ginovart at clara.ginovart@upf.edu.