
What I Wish Translators Would Know about MT

November 24th, 2012 | by LTD

by Rubén de la Fuente

I find it very disheartening that every time translators discuss MT, it is either to insist it can’t replace humans or to laugh at its flaws, instead of exploring its potential (to boost productivity and profitability) and its threats (shifts in business and compensation models are taking place now, and it will not be in our best interest if we don’t get involved as soon as possible). MT is a game changer, and in order to adjust, here are a few things every translator should know:

  • It’s a mistake to speak about MT in general: output can be very different depending on the system used (and hence more or less useful). Google Translate is not as representative of the state of the art as most people think. Google is generic (not customized per domain) and cannot be customized by users, while other systems can be.

    Rule-based MT systems have built-in grammars and dictionaries. They are easier to customize for people without a technical background. Their output is generally grammatically correct, but less fluent. Statistical MT systems analyze large bilingual corpora and use probability theory to suggest the most likely translation. Their output is generally more fluent, but they are more difficult to customize for people without a technical background.

  • Don’t be fooled by first impressions: many translators claim it takes less to translate from scratch than to post edit, but I don’t think many have actually done the experiment. In order to have a good idea of MT potential, you should use a domain-customized system, edit a sample of significant size (500 words or more), time yourself and see the amount and kind of edits (you can use open source SymEval for that). Then you can properly analyze if you have a business case for using MT or not.
  • MT will not steal jobs; instead, it can bring more work. Companies will only use unedited MT for content they would not send to human translation anyway (internal emails, highly technical content in knowledge bases). Some companies re-invest the savings obtained with MT in buying more translations. MT developers and companies using MT will also need to hire linguists for several tasks: evaluating engines, post-editing, and helping to improve engines.
  • It can be fun to PE. The post-editor is not only there to clean up the machine’s mess. S/he can give very valuable feedback to help the system improve. I find it rewarding when I manage to tweak a system so that it properly handles a structure it used to choke on.
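
The measurement suggested in the list above (time yourself, then look at the amount of edits) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how SymEval works: the function names are made up for this example, and the edit rate is a plain word-level Levenshtein distance divided by the length of the post-edited text, a simplified stand-in for established post-editing metrics.

```python
def word_edit_distance(mt_words, pe_words):
    """Word-level Levenshtein distance: insertions, deletions, substitutions."""
    previous = list(range(len(pe_words) + 1))
    for i, mt_word in enumerate(mt_words, start=1):
        current = [i]
        for j, pe_word in enumerate(pe_words, start=1):
            cost = 0 if mt_word == pe_word else 1
            current.append(min(
                previous[j] + 1,         # drop a word from the MT output
                current[j - 1] + 1,      # add a word the post-editor wrote
                previous[j - 1] + cost,  # keep the word or substitute it
            ))
        previous = current
    return previous[-1]


def edit_rate(mt_text, pe_text):
    """Edits per post-edited word (hypothetical metric; 0.0 means MT was left untouched)."""
    mt_words = mt_text.split()
    pe_words = pe_text.split()
    if not pe_words:
        raise ValueError("post-edited text is empty")
    return word_edit_distance(mt_words, pe_words) / len(pe_words)


def words_per_hour(word_count, seconds):
    """Throughput for a timed sample, e.g. a 500-word post-editing test."""
    return word_count * 3600 / seconds


if __name__ == "__main__":
    mt = "the cat sat on mat"
    pe = "the cat sat on the mat"
    print(f"edit rate: {edit_rate(mt, pe):.2f}")
    print(f"throughput: {words_per_hour(500, 1800):.0f} words/hour")
```

Running the same timing on a from-scratch translation of comparable content gives the baseline to compare against; what counts as a good enough edit rate or throughput is for each translator to decide.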

Rubén R. de la Fuente has a BA in translation and interpreting from the University of Granada. He has over 10 years of experience in localization in various capacities, including as a freelance and in-house translator, reviewer, project manager, and machine translation specialist. He is currently taking a graduate course on computational linguistics. He has taught several courses and workshops about translation tools for the Universidad Alfonso X and organizations such as the Institute of Localisation Professionals, ProZ, and ecpdwebinars.co.uk. He has written articles on translation tools for ATA’s Language Technology Division. You can reach Rubén at rubo@wordbonds.es.

7 Responses to “What I Wish Translators Would Know about MT”

  1. Michael Says:

    Thank you, Rubén, for your contribution. There is one point you make that I strongly disagree with, though. As a translator, not a techie, I cannot see how it would ever be “fun to PE.” If language is your business, not algorithms and automation, the task of post-editing machine translated text will drive you to jump off a bridge. See http://goo.gl/nr1yT with links to the opinions of others.

  2. Ruben de la Fuente Says:

    Thanks for the comment and the link, Michael. Maybe the wording is not right. For me it is fun to find ways to improve machine translation output; it’s like solving a puzzle. I’m very techie-inclined though, so I think translators of a similar profile will share that thought and the rest will not.
    I’ve done PE for a while and did not find it that terrible, but the quality of MT I was receiving was good.

  3. Ana Guerberof Says:

    Thank you, Rubén, for your perspective on this. I do believe that translators and customers need to know more about the different variables affecting MT in order to have constructive discussions and agreements. I just wanted to say that what puts a lot of translators off is the fact that customers send raw output without any evaluation and expect high discounts. So maybe the frustration lies in the fact that there is a mismatch between expectations and the quality of the output. Possibly from both sides as well. We definitely need more research in translation!

  4. Maria Carolina Says:

    Hi, Ruben.
    I do think “it takes less to translate from scratch than to post edit” because while typing (and I do type very fast) I already THINK about what I’m translating and then I only have to check for mistypings, whereas if I MT the text I have to stop and analyse word by word what the machine did (sometimes GT ignores some words from the text, for instance).

  5. Ruben de la Fuente Says:

    @Ana, great to have you here, thanks for stopping by :).
    @Maria Carolina, I understand you can think that, but have you tested it properly? Impressions can be deceiving. And also, Google is generic non-customizable MT, very different from domain or client-customized MT.

  6. Martín Ariano Says:

    Hi Ruben:

    I’ve just recently finished my M.A. studies and I based my dissertation on MT-oriented pre-editing rules for the Spanish > English pair. Since that moment, I’ve become particularly interested in MT.
    I agree with most of what you say. However, one of the problems I often find in experiments on post-editing is that researchers expect to find significant results by post-editing only short texts (sometimes fewer than 1000 words). You say at least 500 words would suffice to decide whether there is a business case for MT. I think you need much more than that to get an idea of the potential MT tools may have. I would suggest at least 2000 words.

    I also consider myself a technology-oriented translator. However, after taking part in a 6-month experiment on post-editing, I think it’s not healthy to post-edit over long periods of time. Ideally, you should alternate it with some translation work.

  7. Ruben de la Fuente Says:

    Hi Martín,

    Thanks a lot for your feedback. 500 words could be enough depending on how uniform the content is. But maybe it is a good idea to ask for a bigger sample and spot-check randomly.

    It’s a very good point that post-editing full-time might not be healthy, and it’s true that ideally you should mix PE with other kinds of work. But translators should not turn PE down on principle; it can be an interesting source of additional revenue.
