Google Translate

April 9, 2008

Let’s get this straight. I’m a big machine translation fan. Many of my colleagues think machine translation (or MT, as they call it) is the devil. They worry that MT might be after their jobs, or that MT might be giving a bad name to the general concept of translation. As a specialist, I see MT the way that a litigation attorney might see a website that will write your Last Will and Testament for $9.95 — it’s in the same general field, but it’s hardly a threat to my business.

If anything, in the world of patent translation, MT has increased the demand for human translation. Because of MT, foreign documents that nobody would have looked at in the past are being read. Some of these documents turn out to be significant. The significant ones get sent to people like us for a real translation.

You can imagine, then, that I was pretty pumped to hear that Google, the most beloved resource of all translators, is offering something new in the MT world. The truth is that Google has been offering machine translation for some time, but they recently revamped their systems in a way that has been causing a stir.

The new system works with a modified version of the currently fashionable statistical machine translation (SMT) method. Although I worked in MT development in the late 80s, and Patent Translations Inc. even offered a machine translation service for a while, I don’t have much expertise with SMT. I guess, as a child of the 80s, I still think of MT in terms of rules. I did, however, see a presentation on the state of the art of SMT at the 2006 ATA Translation Companies Division Conference in New Jersey. I’ve got to say, I wasn’t impressed.

SMT sounds great on paper. It works by looking at how human translators have translated things in the past and applying what it finds to new translations that it is asked to perform. The problem is that, in doing so, because rules are subordinate to statistical trends, it tends to forget about the grammar that bound the words in the original sentence together. Worse, some words can be left out of the translation altogether, just because they were not used in the translation corpus that the program is basing its decisions on. The translated sentences often appear to make sense, but when compared to the source, it is clear that the original meaning was very different.
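
To make that failure mode concrete, here is a deliberately tiny sketch in Python. It is not how Google or any real SMT engine works; the toy corpus, the word-by-word alignment and the function names are all made up for illustration. It only shows how a system that picks the most frequent past rendering can favor a bland choice and silently drop a word it has never seen.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-aligned toy corpus: (source word, target word) pairs
# taken from imaginary past human translations.
ALIGNED_PAIRS = [
    ("manifestation", "event"),
    ("manifestation", "event"),
    ("manifestation", "demonstration"),
    ("nouvelle", "new"),
    ("une", "a"),
]

def build_table(pairs):
    """Count how often each source word was rendered as each target word."""
    table = defaultdict(Counter)
    for src, tgt in pairs:
        table[src][tgt] += 1
    return table

def translate(words, table):
    """Pick the most frequent past rendering; skip words never seen."""
    output = []
    for word in words:
        if word in table:
            output.append(table[word].most_common(1)[0][0])
        # Unseen words are dropped without warning: the flaw being shown.
    return output

table = build_table(ALIGNED_PAIRS)
result = translate(["une", "nouvelle", "manifestation", "mardi"], table)
print(" ".join(result))
```

In this toy, "mardi" never appears in the corpus, so it vanishes from the output, and the more specific rendering of "manifestation" loses to the more frequent one.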

When I heard about Google’s new system, I gave it a French patent as a test. French is usually the best language for MT, and patents are fairly MT friendly. At first pass I was really impressed. There were some wacky bits, of course, but the overall readability was very good for MT, and the system had done what looked like an outstanding job at translating multi-word technical terminology — something that rule-based systems are notoriously poor at. Within a few minutes, however, my happy surprise had turned to dismay. Upon closer examination, I saw that identical technical terms were being parsed and translated in radically different ways at different places in the document. Completely extraneous material, which had not been so much as suggested in the source, was to be found here and there throughout the translation. And time and time again, important lexemes were missing. Not much headway has been made since 2006.

I’m not going to post the patent (it’s too long, for one thing), but we can get an idea of both the advantages and disadvantages of the system by taking a look at how it handles the front page of Libé (a popular French paper). Google translates the first headline as, “The students’ struggle to save their parade profs.” One is left wondering what a “parade prof” is but, as a liberal arts major, it’s not too hard for me to imagine, and the sentence flows well enough, so I’m likely to be willing to go with the flow and hope that I’m getting the gist. Unfortunately, the original French was, “Des lycéens «en lutte» défilent pour sauver leurs profs.” Systran’s rule-based MT translates the same sentence as “High-school pupils “in fight” ravel to save their Profs.” While it is clearly more common to see students unravel than ravel, at least the original grammatical meaning of the sentence is more or less preserved. And if we don’t know what the computer means by “ravel,” at least we know that we don’t know. (We could even go and ask a real translator, who would translate “défiler” in this context as “march.”)

The next bit of text, the lead, shows Google Translate in a better light. The source is “Plusieurs milliers de lycéens se sont rassemblés à Paris cet après-midi pour protester contre les suppressions de postes d’enseignants. Une nouvelle manifestation doit avoir lieu mardi.” Systran (rule-based) gives this as, “Several thousands of high-school pupils gathered in Paris this afternoon to protest against the removals of posts of teachers. A new demonstration must take place Tuesday.” While the meaning is clear enough, it is definitely unpleasant to read. The readability is much better with Google, which gives it as, “Several thousand students gathered in Paris this afternoon to protest against the abolition of posts of teachers. A new event is scheduled to take place Tuesday.”

The last sentence is particularly impressive. It’s smooth. It’s slick. It conveys the gist of the source text. It even sounds like it was written by a native speaker of English. There is only one problem: that’s not what the source text said. There was no specific mention of scheduling, and the thing that was to take place was not a generic event — it was very specifically a demonstration.

Some readers may think that I am splitting hairs, and I would be the first to admit it. I wrote an entire chapter on hair splitting for the ATA Patent Translator’s Handbook. That is because patent practice (the prosecution and litigation of patents) is all about splitting hairs. You get whole teams of lawyers working through the night on arguments such as, “You said circle, but this is an oval,” or “Moving something is totally different from transporting something.”

That makes Google Translate woefully inappropriate for patent attorneys. When reviewing prior art (which is the only thing that patent attorneys use MT for), what an attorney wants to know is whether certain technical ideas have been described. To do that, they need to know if specific elements have been mentioned. Google Translate, while producing relatively readable output, adds (“scheduled”) and removes (“demonstration”) specific elements according to the whims of its statistical heart. And that just won’t do.
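
One rough way to picture the check an attorney is implicitly making: list the elements that matter and see whether each one surfaces in the translation. The sketch below is in Python, and the glossary, the function name and the plain substring matching are assumptions for illustration, not a description of any real tool. The sentences are the ones quoted from Libé above.

```python
# Hypothetical glossary: source term -> acceptable English renderings.
GLOSSARY = {
    "manifestation": ["demonstration", "protest march"],
    "lycéens": ["high-school pupils", "high-school students", "students"],
}

def missing_elements(source, translation, glossary):
    """Return source terms that occur in the source text but have no
    acceptable rendering in the translation (naive substring matching)."""
    src = source.lower()
    tgt = translation.lower()
    missing = []
    for term, renderings in glossary.items():
        if term in src and not any(r in tgt for r in renderings):
            missing.append(term)
    return missing

source = "Une nouvelle manifestation doit avoir lieu mardi."
google = "A new event is scheduled to take place Tuesday."
systran = "A new demonstration must take place Tuesday."

print(missing_elements(source, google, GLOSSARY))   # ['manifestation']
print(missing_elements(source, systran, GLOSSARY))  # []
```

A check this crude only catches removals. Spotting additions such as “scheduled” still takes someone who can read the source.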

For now, patent practitioners are better off sticking to the built-in MT engines available on the EPO and JPO websites and the rule-based Systran system.

For other people, Google Translate is certainly going to be a useful offering. The readability and the (somewhat unreliable) capacity to convey gist will make things written in foreign languages more accessible. Google Translate also comes with some pretty nifty bells and whistles. My favorite is the ability to see what the source text actually said, just by moving your mouse over the text. It also gives you the capacity to search the (literally) World Wide Web in foreign languages by typing in a word in English and letting Google translate that for you before using it in a search and then returning the hits in translated form.

So while the new kid on the block is unlikely to rock the world of patent practitioners, it at least makes a cool toy to play with.


ATA Patent Translator’s Handbook

February 19, 2008

After a lot of hard work, especially on the part of Alison Carroll, our editor, The Patent Translator’s Handbook is now on sale. The book is a compendium of knowledge by some of the world’s most experienced and knowledgeable patent translators and translation managers and includes an introduction by a patent attorney who is also a translator.

Although there are a number of books written in Japanese on the subject of patent translation, and the ATA publishes the Japanese Patent Translation Handbook, which is written in English, The Patent Translator’s Handbook is the first book written in English on the subject of patent translation from all languages into English. A number of the contributors translate from German and there are even some German-English glossaries, but the book’s authors also translate from French, Japanese and Spanish and the chapters are written with translators of any language in mind.

The book is, first and foremost, a how-to guide for patent translators, with ample introductory information for the novice, but it also provides a formal definition of literal translation in the context of patents and a methodology for producing and evaluating translations according to this definition. We can expect this to become a touchstone for law firms that manage their own translation and a reference in litigation where translation is an issue. There is also a chapter on managing translation for litigation, which every patent paralegal who works on multinational cases should read. In fact, I have already had calls from a few large law firms asking where they could buy a copy.

To answer that question, the best way is to order it online from the ATA. If you are wary of entering your credit card information online, you can also call the ATA at 03-683-6100.

My own chapter, which is on literal translation, can be viewed online in the Resources section of the Patent Translations Inc. web site. I also contributed the glossary of patent terminology, and I hope to make that available online soon.

I am planning to blog in more detail about the individual sections and chapters but, for the moment, I’ll just post the table of contents to pique your interest — assuming that you are the sort of person whose interest is piqued by this sort of thing.

 

Contents

v Preface
vii Introduction
 
 PART I: THE ART AND PRACTICE OF PATENT TRANSLATION
3 Approaches to Patent Translation: Many Ways to Build a Mousetrap Kirk Anderson
11 An Introduction to Patent Translation Nicholas Hartmann
19 Literal Translation of Patents Martin Cross
29 Industrial Property Considerations for Patent Translators R. Vivanco Cohn
 
 PART II: TOOLS AND RESOURCES FOR PATENT TRANSLATORS
41 Internet Resources for the Translation of Patents Into English Steve Vlasta Vitek
49 Developing a Lean, Mean Patent Translation Memory Suzanne Friis Gagliardi
 
 PART III: PATENT LITIGATION
57 Managing Patent Litigation Projects Alison Carroll and Lillian Clementi
 
 PART IV: INDUSTRY-SPECIFIC RESOURCES FOR PATENT TRANSLATORS
75 Translating Biotech Patents Alice M. Berglund
85 Intellectual Property and Biotechnology Patents Patricia Thickstun
97 Translating Automotive Patents Gabe Bokor
 
 PART V: CONCLUSION
115 Live and Learn: Lessons from a Veteran Patent Translation Team: An interview with Jan McLin Clayberg and Olaf Bexhoeft, conducted by Alison Carroll and Lillian Clementi
 
 
119 Glossary of Patent Terms Martin Cross
129 German-English Glossary of “Patentese” Jan McLin Clayberg
135 Biotechnology Glossary for Patent Translators Patricia Thickstun
151 Index

Attorneys Working Smarter

February 8, 2008

What once was a rare event has become a common occurrence. Attorneys now call us up several times a week to ask us to answer specific questions about what is or is not disclosed in a foreign publication or to have just specific sections of a document translated.

In days of yore, 97% of the requests we got were for complete translations and the other 3% were for translations of just the claims. Something is changing. I know I have done my part by sending out newsletters and making suggestions when I get an attorney on the phone. But the change is the result of more than advertising alone. A fair number of the people who contact us have never dealt with us before. They just call up saying that they want to find out what is in some documents and that they would rather not order a full translation if they don’t have to.

I’ve been asked by colleagues whether I resent this trend. After all, a full translation can bring in thousands of dollars, while summary reports and partial translations tend to run a couple hundred at most. In the long term, however, these types of orders always lead to more work. Firstly, because the attorney can gain access to foreign texts without racking up large disbursements, they are simply more likely to use foreign material. And when these cost-effective means turn up something of real value to the case, full translations follow. The second reason why smart-translation orders mean good business is that working this way requires trust. You cannot take this sort of request to a harried project manager at a translation factory that outsources texts of every description to students and housewives. These assignments require expertise, and being able to provide that expertise pays off in both customer loyalty and word-of-mouth referrals.

There are still plenty of situations in which the attorney knows from the start that they will need a full translation or knows that their fastest route to understanding is by perusing the complete document. That said, it is nice to see that those same attorneys are coming to realize that they have options.


Their worst work is my best work

January 10, 2008

If you ever get a translation of a published patent from me that is full of run-on sentences, inconsistent terminology and weak logic, you may just be looking at my best work.

When translating patents for information or litigation support, our job is like that of a court interpreter: we reproduce what was said without omission or embellishment and strive to make ourselves invisible. Our clients would not be well served if we added matter to fill in the gaps in an incomplete disclosure, or if we took on the role of editor so that the translated claims seemed better supported by the specification than they were in the original. And though it might be tempting to unify disparate terminology, by doing so, we could be denying our client a useful argument against the patent or, if our client is on the other side, producing a false sense of security that risks being shattered by a more accurate translation when used in court.

To the non-translator, turning bad writing in one language into bad writing in another might appear to be a simple task, perhaps even easier than producing a polished final product from a high quality original, but nothing could be further from the truth.

Imagine, if you will, a carpenter who is given a bookshelf and asked to produce a copy of it. If the joints are square and the screws are driven straight, then all that will be required of the craftsman is good carpentry. But if the original workmanship is shoddy, the task of faithful reproduction becomes much more difficult. The slope of a crooked shelf must be exactly matched. Nails that were carelessly bent over by a badly wielded hammer must be meticulously bent into that same shape with pliers. Finally, the overall structure of the copy, which is the sum of the individual flaws, must be just as rickety as the original, but no more so.

It’s the “no more so” part that is really hard. The translation must be readable. The information must be conveyed as clearly as possible. But as with any other form of reproduction, translation necessarily results in signal-to-noise loss, and part of the translator’s job is to compensate for this, so that the clarity of communication in the translation is comparable to that in the original. As anyone who has photocopied a faded fax will know, the loss always tends to be more pronounced if you start with a low-quality original. That is why people who are new to translation sometimes produce gobbledygook that sounds as if it were written by the inmate of a mental hospital and, when questioned, reply with the familiar words, “But that’s what it says in the original.” The true skill of the translator lies in being able to reproduce the original content, without omission or embellishment, while maintaining the clarity and internal logic of the original, no matter how sketchy that may be. If you are interested in a methodology for doing this, read on.

Matters are made worse by the fact that sloppy writing is often the handmaiden of sloppy thinking. Badly worded specifications are also likely to include conclusions that do not follow from their premises, internal contradictions, misclassifications and straightforward misstatements of fact. A good translator will be extremely reluctant to reproduce such problems without first carefully double-checking and then seeking a second opinion from a colleague, to be sure that the error is, in fact, in the original writing and not in their understanding of it. In addition, when the technology being described is complex, these problems can make it much more difficult for the translator to fully grasp the invention being described.

Unfortunately, for monolingual purchasers of patent translations, it is very hard, if not impossible, to determine whether the badly written document on their desk is a well written specification that was badly translated, or a badly written specification that was expertly translated. In this regard, one can rely only on long relationships and trust. But if it comes off my desk, and it is less than elegant, rest assured that it is a carefully crafted labor of love.

Martin Cross
Japanese Patent Translation


There goes Europe

September 13, 2007

It looks more and more likely that European countries will soon be forgoing the requirement for translating EPO patents into the language of each of the member states that you file in. That is, of course, sad news for all those European translators who earn their living from this activity, but ultimately it frees up funds for more aggressive and comprehensive IP activities, which in turn will lead to more research and more litigation, which should give those translators plenty to do.


Too many cooks

August 21, 2007

I once translated a priority document together with another translator. We used a translation of the priority document, which had long ago been filed with the USPTO, as our common terminology reference (that way both of us would use the same terms in our translations).

The translation published by the USPTO was probably done by a junior staff member in a Japanese law firm. I say that because the wacky English could never have been produced by a native speaker. I say ‘junior’ staff member because the translation was not consistent. In places, really bad expressions had been fixed, but they had been fixed into English that is still idiomatically incorrect, which suggests review by a senior member of the Japanese firm.

Where it got interesting was in the claims, where the corrections had clearly been made by a US attorney. While you could see how what was written in the translation could have led a monolingual reader to imagine that was what was being said, the corrections didn’t actually match up with the original Japanese.

The end result was an application with claims for something other than what the client had originally claimed. What is more, the claims as drafted in English were not supported by the translated description in the specification as filed.

This was a job that was being done at a distance and I know nothing about the party who requested the translation, so I don’t know whether they were challenging the patent (and therefore delighted by what they saw in the accurate translation) or defending the patent (and therefore crestfallen).

At PTI we have a multi-person editing system as well, sometimes with as many as four people (lead translator, checker, technical expert, legal expert, proofreader, etc.) making changes to a translation, but as the second-last step in the editing process is a translation checking step that includes verifying consistency of both terminology and substantive meaning, problems like these could never occur.

Japanese patent law firms could achieve a similar effect just by having the Japanese attorney who reviewed the junior staff member’s translation read the US attorney’s version of the claims against the original. It would be interesting to know how often this happens.


I can’t help myself

August 16, 2007

I know I promised to stop posting the silly offers that were sent to me but you have to see this one from a proofreading service.

All you have to do is send us your document as a word attachment with the deadline and we will guarantee delivery of a perfectly written document to give you complete confidence when you submit your work.

I guess they meant a “Word attachment.” Love it!



10%

August 13, 2007

Some law firms are keen on saving as much as they can on translation costs. A difference of 10% can often be the deciding factor in choosing a translation provider. But I wonder how many compare the ratio of the number of words in the source text to the number of words in the target text.

Recently I shared a translation with another translator. It was one of those frantic rush jobs for a court submission deadline and the law firm requested that, in order to save time, the two of us work independently without a common editor. When it came time to paste the finished translations into the template, I was surprised to see that the other translator’s work would not fit at the specified font size. After a bit of poking around, I found that the other translator was producing a target text with about 10% more words per unit of Japanese source text than mine. It was just a matter of writing style, and the problem was easily solved by reducing the font size, but it was interesting to note that, while we were probably both charging the same rate, one of us was more expensive than the other.
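
The arithmetic is simple enough to sketch in a few lines of Python. The figures below are made up, and the sketch assumes billing per English target word, which is common but by no means universal.

```python
def job_cost(source_units, target_words_per_source_unit, rate_per_target_word):
    """Cost of a job billed by the target (English) word."""
    return source_units * target_words_per_source_unit * rate_per_target_word

# Hypothetical numbers: 10,000 Japanese characters, same per-word rate.
SOURCE_CHARS = 10_000
RATE = 0.20  # assumed price per English word

concise = job_cost(SOURCE_CHARS, 0.50, RATE)  # assumed expansion ratio
wordy = job_cost(SOURCE_CHARS, 0.55, RATE)    # 10% more words per character

print(f"concise: {concise:.2f}")
print(f"wordy:   {wordy:.2f}")
print(f"difference: {(wordy / concise - 1):.0%}")
```

Under those assumptions, an identical rate still produces a 10% gap in the invoice, which is the same margin that often decides which provider a firm chooses.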



Resources

August 9, 2007

At long last the resources section is back online in our new website. There are a couple of new things, including PowerPoint presentations from some talks I gave and an article on managing translation costs that every patent attorney really should read.



Not a Patent!

August 7, 2007

I don’t know how they do it — ordinary translators that is. I had to translate some draft technical specifications (as in, specs, as opposed to a spec) today for a device that shall remain nameless. I say “remain” nameless because that’s the way it started out. I got 11 pages of technical description and a model number, but no hint as to what the device was and, of course, no helpful sections like, “Technical Field” or “Background.” Nor did I have the option of looking up the patent family. Nothing! Nada! Just a bunch of words on a piece of paper.

In the end I figured out what the mystery device was by Googling combinations of terms used in the document until I hit on a class of device that matched the description, but golly gee willikers — as Donald Rumsfeld would say — that sure is a lot of work.

And then there is this business of the author saying things that are not explicit and surrounded by comforting redundant phrases. And no reference numerals!

I understand that I am spoiled. What I cannot figure out is why the gates to the patent translation specialization are not overrun by eager candidates fleeing the field of general technical translation.
