LexWorks http:// Machine Translation that Works Sat, 07 Mar 2015 12:47:01 +0000 en-US hourly 1 http://wordpress.org/?v=3.5.1 The Languages of Love http:///translation-blog/the-languages-of-love/ http:///translation-blog/the-languages-of-love/#comments Sun, 26 Jan 2014 02:39:57 +0000 Lori http:///?p=3656 Love is international, so here are more than 400 ways to say that one, all-important phrase: I love you. From the Eskimo language of Aleut (fewer than 300 speakers left) to Zulu (tens of millions of speakers but a small internet footprint), this collection shows the richness and diversity of human communication – and [...]

The post The Languages of Love appeared first on LexWorks.

Love is international, so here are more than 400 ways to say that one, all-important phrase: I love you. From the Eskimo language of Aleut (fewer than 300 speakers left) to Zulu (tens of millions of speakers but a small internet footprint), this collection shows the richness and diversity of human communication – and why it’s important to preserve our languages. After all, what more basic human need could there be than to be able to say “I love you” and be understood?

Crowd-Sourcing “I Love You” in Different Languages

Many sources came together to create what I believe is the world’s most extensive crowd-sourced collection:

  • the translators, post-editors and international staff
  • our partner translation companies in
  • (TWB), a non-profit founded by Lexcelera
  • the fine young translators working out of the in Nairobi, Kenya
  • the on Facebook
  • internet
  • .

Not all the languages below are real languages. Some are invented: including two Elvish languages from and one angel dialect.

This is a work in progress. Discrepancies in spelling may be due to a lack of standardization, or an alternate system of transliteration. Or it could, quite simply, be an error. Leave your comments below if you have corrections to suggest, or language translations you would like to add. If you’re interested in knowing more about the languages themselves, each name links to a Wikipedia page.

Language Translation of I Love You
Mon ko lo fon
U’m wloloho
Aembu (or ) Ningwendete
Ko kiciyoh
Ek het jou lief
Të dua
Txin Yaktakuq
Ich hoan dich gear
or Sugpiaq) Nakɫekamken
Ewedishalehu — እወድሻለሁ/አፈቅርሻለሁ (to a female)
Ewedihalehu — እወድሃለሁ/አፈቅርሃለሁ (to a male)
Sheth shen zhon
Arabic ()

Ana baħibbak – ٲنَا بحِبَّك (to a male) / Ana baħibbik – ٲنَا بَحِبِّك (to a female)

Arabic ()

Ouhibouka — أُحِبُّكَ (to a male) / Ouhibouki — أُحِبُّكِ (to a female)

Arabic ()

Uḥibbuk – أحبك

Arabic ()

Bəḥibbək – بحبك

Arabic () Ni ridiki
Arabic () Nhebbik
T’amo
Yes kez siroum em – Ես Քեզ սիրում եմ
Ti voi
Moi tumak bhaal pau – মই েতামাক(আেপানাক) ভালপাওঁ
Ono grohamno lakh (to a female) / Ono grohamno lokh (to a male)
Quiérote
Ki sakihitin
Min bou la yé
Avallaen (constructed language) Vüväloiek dü quuo
Men seni sevirem
Mengweswe
Né bi fè
mi klôa
Min khine yaratau – мин хинэ яратау
Maite zaitut
Me gwes wè
Holong rohangku di ho
Ya tabe kahayu
Ndikufuna
Ami tomake bhalo bashi – আমি তোমাকে ভালো বাসী
Hamlagh-kem (to a female) / Hamlagh-k (to a male)
Ta ole be
(Star Trek dialect) Imzadi
Ham tose pyār karila – हम तोसे प्यार करीला
Namumutan ta ka
Binary (computer dialect) 011010010010000001101100011011110111011001100101 00100000011110010110111101110101
Nahigugma ako kanimo
Ma kia bé nà
Anin sijalad
Volim te
:..:| ..:| |..-.. .::”:.., :.:;
Karout a ran ac’hanout or Da garan or Da garout a ran
Nkwaagala
& Kinyala (Ndombi) (Luhya dialects) Nakhusima or Ndakhukhera
Običam te – Обичам те
, Tsotso & Khisa (Nyole) (Luhya dialects) Ndakhuyanza
Chit pa de
(Ndyuka) Mi lobi you
Mi aime jou
Ti vogliu beni
or Khmer Soro lahn nhee ah
T’estimo
Gihigugma ko ikaw
Ai ranam dai (to a female) / Dai ranam ai (to a male)
Hu guaiya hao
Tere bina main ji nahi pauga
Sun ho ez (to a female) / Sun ho vez (to a male)
Gvgeyui – ᎬᎨᏳᎢ
Ne mohotatse
Ndimakukonda
Ngo oiy ney a
Chinese (or Traditional Chinese) Ngo oiy ney – 我愛你
Chinese (or Simplified Chinese) Wo ai ni – 我爱你
Chinese (dialect) Gwa ai li
Chinese (dialect) Ngo ai nong
J’t'aquiers
Ai tong ngonuk
Ich liibe-dich
U kamakutu nu
My a’th kar
Ti tengu caru (to a male) / Ti tengu cara (to a female)
Kisakihitin
Mi aime jou
Men Seni süyemen мен сени севем
Volim te
Miluji tě
Chantechiciya
Jeg elsker dig
(Persian dialect) Male tu ra dost darom
Yin nhier
M’bi fê
Na tondi wa
(lifou) Eni a hnimi eö
(Gansu dialect) Veh ai ni – Вә э ни
Dungan (dialect) Ngeh ai ni – Ңə э ни
Siuhang oku dia
Ik hou van jou
Nga cheu lu ga
Mma fi
(constructed by ) Amin mela lle
I love you
(angelic dialect) Olani hoath ol
Mi amas vin
Ma armastan sind
Me te wa ding
Me lonwo
Ma ding wa
Ma dzing wa
Eg elski teg
or Persian Doset daram
au lomalei iko
Minä rakastan sinua
(Western) ‘k zien je geeren
Un nyi wan nu we
Je t’aime
(North) Ik hääw de liif
(Saterland) Iek hääb die ljoo
(West) Ik hâld fan dy
O ti vuei ben
Mido yidouma
Ti ami
Me sumar bho
Ta gra agam ort
Quérote
J’sea un diot do tae
Mi ko me
Me shen miqvarkhar – მიყვარხარ
Ich liebe dich
German (dialect) I mog di
German (dialect) Ik hou fan dei
German (dialect) Isch habb disch libb
German ( dialect) Ick heb di leev
German (dialect) Isch liebdsch
German (dialect) I mog di ganz arg
German (dialect) I liäbe di
S’agapo – Σ’ αγαπώ
Philo se
Asavakkit
Ik hol van die
Rohiyu
Hun tane prem karun chhun – હું તને પ્રેમ કરુ છું
Mwen renmen’w
Ngai oi nyi
Ina sonki (to a female) / Ina sonka (to a male)
Aloha Au Ia`oe
Ani ohev otach –אני אוהב אותך (to a female) / Ani ohevet otcha – אני אוהבת אותך (to a male)
Guina higugma ko ikaw
maiṅ tumse pyār kartā hūṅ – मैं तुमसे प्यार करता हूँ (male speaker) / maiṅ tumse pyār kartī hūṅ – मैं तुमसे प्यार करती हूँ (female speaker)
Kuv hlub koj
Wa ai lu
Nu’ umi unangwa’ta
Szeretlek
Pip-piyan tana
Pip-piyan taha
Aku sayau nuan
Mmu ma fien
A hurum gi nanya
Ég elska þig
(Luhya dialect) Nakhuyanza
A huru m gi n’anya
Ayayatenka
Ay-ayaten ka
Palangga ko ikaw
– Bahasa (informal) aku cinta kamu
– Bahasa (poetic) Saya cinta padamu
(constructed language) Mi esthe philo tu
Negligevapse
(Eastern) Nagligivagit – ᓇᒡᓕᒋᕙᒋᑦ
(Western) Takuksugusugivagit
Nakuagigikpin
Piqpagiyagit
Inuvialuktun (West ) Nagligivagit
Gráím thú
Ti amo
N’gné kanou
Anata ga suki desu – あなたが好きです
(informal) Aku tresno kowe
(literary) Aku tresno marang sliramu
J’t'aime
Hamlagh-kem (to a female) / Hamlaghk (to a male)
Ningwendete
Laylaydek sik a
Naanu ninna preetisuttene – ನಾನು ನಿನ್ನನ್ನು ಪ್ರೀತಿಸುತ್ತೇನೆ
Kaluguran daka
meh chi chain maai
Myen syeni sooyom – Мен сені жаксы коремін
Nactinra
Bang srolaïgn ôn – បងស្រលាញ់អូន (to a female) / Ôn srolaïgn bang – អូនស្រលាញ់បង (to a male)
Mono ke’ zola nge’
Ningwendete
Ami nkuswele
Ndagukunda
, Nandi, Tugen, Sabaoti & Keiyo (dialects) Achamin
Men seni sueum
Ndagukunda
Nin’ gwachete
Nəsƛ̕éʔ cxʷ
Moo ?ams ni stinta
(constructed, Star Trek) qamuSHa´
I diany gnè
Tu magel moga cho
Saranghaeyo –사랑해요
Nga lungse kom
I walikana
Mi koriou
Ez te hez dikim
Men seni syuem
/Ladin (Dolomites):

Te ei gen – טי אמו

Nga naw hta ha ja
or Teton Sioux Thečhíȟila
Khoi huk chau
Malene datneme eahtsam
Te amo
Es mīlu tevi
Bahibak
Te véuggio bén
Ik hald van dich
Na lingi yo
Aš tave myliu
Mi do prami
Te ami
Ik hou van ju
Nkwagala
Ndakhuyanza
Aheri
Nkwendha
Ech hun dech gäer
Kanyor nanu
(formal) Ve sakam – Ве сакам
(informal) Te Sakam – Те сакам
Hawm ahāṃ se prem karechi – हम अहाँ स प्रेम करेछी
Tia anao aho
Aku cinta padamu
Njan Ninne Premikunnu – ഞന നിണെ പ്രെമികുണു
Inhobbok
Bi shimbe hairambi
Ei nang-bu nung-si
Ma ngal o
Ta graih aym ort
E aroha ana ahau ki a koe
Inchepoyeneimi
(Kalenjin dialect) Achamineny
Me tula prem karto –
Hinenao au ia oe
Yokwe yuk
(creole) Mwen enmen’w
Mo konten twa
‘In k’aatech
Tere del devesteme
Bènan ndjala wè
Mi ding wo
Meng ne nkoung ô
Me ko ou
Ninkwendete
Kesalul
Un lon o
(Alcozauca) Ku toulló ñeloosí
Ka hmalegaih che
A-t vói bèin
Kanbhik
Ngoah mweoku kaua
Te iubesc
Bi chamd khairtai – Би чамд хайртай
Tshemenuadeden
Volim te – Волим те
Mam nong-a fo
Tanbghik تنبغيك
(in English) .. .-.. --- ...- . -.-- --- ..-
Mu zola ngé
Nga nint ko chit dae
Mi tonda wè
Ni mits neki
Te voglio bene
Ayóó ánííníshí
(constructed, Avatar) Oe tìyawn ngenga nìftxavang
Ndebele () Ngiyakuthanda
Ndebele () Ngiyakuthanda
Ma tapainlai maya garchu – म तपाइलाइ माया गर्छु।
Jè t’anor
Norwegian Jeg elsker deg
Norwegian Eg elskar deg
Me ama vu
Han nhok han ji
Inuktitut) Ungagivagit
Miye wawe
Ninatemba
T’aimi
Shichusaa – 好ちゅさ
Old English () Ic lufie þec
Old English () Ic lufie þe
Mu tumoku bhala paye
Ani sin jaladha
Æž dæ waržyn – æз дæ уарзын
(Mezquital) Hmädi
Mujhe Tumse Muhabbat Hai
A Kultoir er Kau
Syota na kita
Inaru Taka
Mi ta stima bo

Ta sara meena kowm – زه ستا سره مينه کوم

Kiyahurata
(Farsi)

Dustat dâram – دوستت دارم

Te ami
(constructed) Iay ovlay ouyay
Ih mwauhkin uhk
(Kalenjin dialect) Achaminyi
Kocham Cię
Portuguese – Te amo
Portuguese – Amo-te
Mbe de yid ma
Mẽ tenū̃ piār kardā hā̃ – ਮੈਂ ਤੈਨੂੰ ਪਿਆਰ ਕਰਦਾ ਹਾਂ।
Nyu rondi
Mung jane’
Quechua – Qanta munani
Quechua – Canda munani
Quechua – Munakuyki
Tye-meláne
Hanga rahi au kia koe
A-t vói bèin
Kamaù tut
Te iubesc
Jau t’am
Ya tebya l’ubl’u – Я тебя люблю
Ndagukunda
Mun rahkistan du
Kaasham iye/Oleng
Ou te alofa outou
Mbi yé mô
Snihyāmi tvayi – स्निह्यामि त्वयि
M’tari
Sardinian () Deu t’amu
Sardinian () Deo t’amo
Tha gaol agam ort
Mô mi dènè
Volim te – волим те
Mi nowinda
Ke a mo rata
Ke a go rata
Zahou mitiya anaou
Ngam hwandzo
Nisuhuvendza
Ndinokuda
Piniqamken
Elvish (constructed by J.R.R. Tolkien) Gi melin
Maa tokhe pyar kendo ahyan
Mama oyata aadareyi – මම ඔයාට ආදරෙයි
Techihhila
Ľúbim Ťa
Ljubim te
Waan ku Jecelahay
Aye ga banin
Na moula
Sotho () Ke a go rata
Te amo
Mi lobi joe
Ira fan ma
Nakupenda
Ngiyakutsandza
Jag älskar dig
I liäbe di
Bhebbek (to a female) / Bhebbak (to a male)
(Filipino) Mahal kita
Ua Here Vau Ia Oe
Nakukunde
(Hokkien) Wa ga ei li
Jigarata bihrum duhtari hola (to a female) / Tra lav dorum (to a male)
Nān unnai kādhalikkiren – நான் உன்னை காதலிக்கிறேன்
Min sini yaratam – мин сини яратам
Nenu ninnu premistunnanu – నేను నిన్ను ప్రేమిస్తున్నాను
Hau hadomi o
Pom rak khun (male speaker) ผมรักคุณ / Chan rak khun (female speaker) – ฉันรักคุณ
Na kirinla gaguidou – ང་ཁྱེད་རང་ལ་དགའ་པོ་ཡོད་
Yfetwekiye – ይፈትወኪ’የ! (to a male) / Yfetwekaye – ይፈትወካ`የ! (to a female)
Mi lavim yu
Ofa atu
Där mi yetix – Даьр ми йетих (to a female) / Där mi etix – Даьр ми этих (to a male)
Ndji mukunanga
Ndi a ni funa
Ke a go rata
N’habbek
Kamina ayong iyong
Seni Seviyorum
Seni söýärin
Men seni ynakshir – мэн сэни ынакшир
Me dor wo
Twi () Me pe wo
Mon tone jaratiśko
Ya tebe kahayu – Я тебе кохаю

mein ap say muhabat karta hoon – میں آپ سے محبت کَرتا ہوں (to a female) / mein ap say muhabat karti hoon – میں آپ سے محبت کرتی ہوں (to a male)

(formal) Men seni sevamale – мен сени севаман
(informal) Men seni yahshi ko’ramale
Na lia
Te vullk
Ndi a ni funa
T’amo
Minä armastan sindai
Anh yêu em (to a female) / Em yêu anh (to a male)
Ik hue van ye
(constructed) Läfob oli
Ma wou ndoune
Eau ofa ia koe
Dji t’veû vol’tî
& Marama (Luhya dialects) Ndakhuchama
‘Rwy’n dy garu di
Mwen enmen’w
Da ma la nope
Ndiya kuthanda
Ndza ku rhandza
Min eyiigin taptyybyn – Мин эйиигин таптыыбын
Gu ba’adag em
Men nkon’ wou
Y hob ti

Ikh hob dikh lieb – איך האָב דיך ליב

Ni wu rondi
Mo nifẹẹ rẹ
(Maya) ‘in k’aatech
Assiramken
Nadxiie lii
Ez tora hesken
Ezhele hezdege
Ngiyakuthanda
Tom ho’ ichema
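
Two of the entries above are encodings rather than languages: the binary entry and the Morse code entry. As a quick check on what they say, here is a minimal Python sketch that decodes both. It assumes the binary entry is plain 8-bit ASCII and that the Morse letters are separated by single spaces; the function names are made up for this illustration.

```python
# Decode the two "encoded" entries from the list: 8-bit ASCII binary and Morse code.
# Assumptions: the binary string is 8-bit ASCII; Morse letters are space-separated.

MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}


def decode_binary(bits: str) -> str:
    """Turn a string of 0s and 1s (whitespace ignored) into ASCII text."""
    bits = "".join(bits.split())
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


def decode_morse(code: str) -> str:
    """Turn space-separated Morse letters into upper-case text."""
    return "".join(MORSE[symbol] for symbol in code.split())


binary_entry = (
    "011010010010000001101100011011110111011001100101 "
    "00100000011110010110111101110101"
)
morse_entry = ".. .-.. --- ...- . -.-- --- ..-"

print(decode_binary(binary_entry))  # i love you
print(decode_morse(morse_entry))    # ILOVEYOU
```

The Morse entry carries no word breaks, so the decoded letters run together.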

Technology Agnostic Case Studies http:///translation-blog/mt-agnostic-case-studies/ http:///translation-blog/mt-agnostic-case-studies/#comments Thu, 14 Nov 2013 22:01:29 +0000 Lori http:///?p=3596 How does it work to be technology agnostic in practice? While there are general rules of thumb – for example, that SMT works better with user-generated content – LexWorks has found it valuable to test assumptions pretty extensively. We often test all three approaches before launch of a major project, whether a customer support site [...]

The post Technology Agnostic Case Studies appeared first on LexWorks.

How does it work to be technology agnostic in practice? While there are general rules of thumb – for example, that SMT works better with user-generated content – LexWorks has found it valuable to test assumptions pretty extensively. We often test all three approaches before launch of a major project, whether a customer support site in 9 languages or documentation in 17.

To give real-world examples of our findings, here are some case studies showing why we chose one engine over another. Being neutral about what technology to use is important to us. As John Papaioannou, the CEO of Lexcelera-LexWorks, says: “Being technology agnostic means using the very best technology for the task, without being bound by a supplier monopoly.”

Machine Translation in the Real World: SMT vs RBMT vs Hybrid

Factory, Novocherkassk, Russia

Challenge: The challenge of this three-year project was to translate two to 200 pages of English to Russian each and every day. The content was mainly technical specifications and contracts.

Constraints: There was no bilingual data available at project start to train engines.

Solution: RBMT. Without data to train an SMT engine, a rules-based engine was the de facto choice. In any case, we often pair a rules-based or hybrid engine with Russian, as it is a morphologically complex language.

Japanese eDiscovery

Challenge: To translate 30,000 pages, mostly emails, technical reports and meeting minutes from Japanese to English in order to identify information that could be considered a smoking gun.

Constraints: Content was written with little attention to grammar or spelling, and was highly colloquial.

Solution: Hybrid. We chose a hybrid engine because the SMT part works best with grammatically incorrect and colloquial sentences, while the RBMT part tends to perform best in the Japanese-English pair.

Response to a 3400-Page Technical RFP in one week

Challenge: To translate 3400 pages in one week from French & English to Brazilian Portuguese for a response to a Request for Proposals (RFP).

Constraints: Limited data at project start, and limited time for training. The content came in many different files, and there were multiple passes on each file as the customer rewrote while the translation process was going on.

Solution: Hybrid. The SMT component of the Hybrid was helpful in allowing us to input TMs as training material and also adapt to changing source text. The RBMT component allowed us to enter key terminology and to save on post-editing time.

Online Customer Support Site

Challenge: To make dynamic content available on a customer support website in 9 languages in order to solve customer issues before they became a call to the help desk.

Constraints: Extremely colloquial user-generated content with little attention to correct grammar and spelling, extensive use of abbreviations and content unlike what is found in product documentation. The server needed 24/7 uptime.

Solution: Online SMT. To ensure that the system was trained with both in-domain and out-of-domain material (the latter including sentence constructs not found in user documentation) and also available online 24/7, we chose the Microsoft Translator Hub widget. The widget was customized with product names, Do Not Translates, and the results of post-editing spot checks.
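
To make the “Do Not Translates” idea concrete, here is a minimal Python sketch of one common way to protect terms: swap each protected term for a placeholder before translation and swap it back afterwards. This is an illustration only, not a description of how the Microsoft Translator Hub widget handles such lists. The product names, the placeholder format and the stand-in `fake_translate` function are all invented for the example.

```python
# Sketch: protect "Do Not Translate" terms with placeholders around an MT call.
# Everything here (term list, placeholder format, fake engine) is hypothetical.

def protect_terms(text, terms):
    """Replace each protected term with a numbered placeholder."""
    mapping = {}
    # Longest terms first, so "AcmeSync Pro" is not split by "AcmeSync".
    for i, term in enumerate(sorted(terms, key=len, reverse=True)):
        if term in text:
            token = f"__DNT{i}__"
            text = text.replace(term, token)
            mapping[token] = term
    return text, mapping


def restore_terms(text, mapping):
    """Put the original terms back in place of their placeholders."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def fake_translate(text):
    """Stand-in for a real MT engine: it only upper-cases the text."""
    return text.upper()


source = "my AcmeSync Pro wont sync with AcmeSync"
masked, mapping = protect_terms(source, ["AcmeSync", "AcmeSync Pro"])
translated = restore_terms(fake_translate(masked), mapping)

print(masked)      # my __DNT0__ wont sync with __DNT1__
print(translated)  # MY AcmeSync Pro WONT SYNC WITH AcmeSync
```

The product names come through untouched even though the rest of the sentence was changed by the stand-in engine.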

Self-Service MT Server for 200,000 Employees

Challenge: One of the top five banks in the world needed an MT system behind their firewall so that their employees would not send sensitive information out to Google for translation.

Constraints: This customer has many different business units such as investment, insurance, construction and automotive leasing, with sometimes competing terminology.

Solution: Hybrid. To manage the very domain-specific terms we needed an RBMT engine to organize and rank terminology by business unit. Having a large amount of bilingual corpora to train the statistical component enabled us to choose a hybrid engine which offers both. The hybrid server is now on the client’s premises, which means we are able to maintain and update remotely.
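
As a rough illustration of what “organizing and ranking terminology by business unit” can look like, here is a small Python sketch of a glossary lookup that prefers the requesting unit’s term and falls back to a general one. It is a simplified assumption, not the data model of any particular RBMT or hybrid engine. The business unit names, the `general` fallback and the English-to-French sample entries are chosen only for the example.

```python
# Sketch: one glossary, with competing translations ranked by business unit.
# The structure and the sample entries are illustrative assumptions.

GLOSSARY = {
    "policy": {"insurance": "police", "general": "politique"},
    "security": {"investment": "titre", "general": "sécurité"},
}


def lookup(term, business_unit, glossary=GLOSSARY, fallback_unit="general"):
    """Return the unit-specific translation, else the general one, else None."""
    entries = glossary.get(term.lower(), {})
    if business_unit in entries:
        return entries[business_unit]
    return entries.get(fallback_unit)


print(lookup("policy", "insurance"))     # police
print(lookup("policy", "construction"))  # politique
print(lookup("security", "investment"))  # titre
```

The same source term resolves differently depending on who is asking, which is the behavior needed when units have competing terminology.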

If you would like more information, or to ask about a particular use case, click here to send me an email.

Machine Translation Made Easy http:///translation-blog/machine-translation-made-easy/ http:///translation-blog/machine-translation-made-easy/#comments Sat, 26 Oct 2013 14:09:17 +0000 Lori http:///?p=3581 Some time ago, Lexcelera (including the LexWorks subsidiary) made the decision that “easy” was our goal. Most language providers nowadays tout quality, but quality is a given! So we asked ourselves what else is unique about us? Our conclusion, based on customer interviews and some internal soul-searching, was that we make it easy for our [...]

The post Machine Translation Made Easy appeared first on LexWorks.

Some time ago, Lexcelera (including the LexWorks subsidiary) made the decision that “easy” was our goal. Most language providers nowadays tout quality, but quality is a given! So we asked ourselves what else is unique about us? Our conclusion, based on customer interviews and some internal soul-searching, was that we make it easy for our customers to do business with us.

And that’s how EASY became our new rallying cry.

What do we mean when we say, “We make machine translation easy”?

One of the areas impacted by our new positioning was our machine translation strategy: making it easy meant not overwhelming anyone with our technology.

For one thing, we don’t ask our customers to be Do-It-Yourselfers. We don’t ask customers to go into our engines and train them for their content. Making it easy means “You give us your files, your translation memories and your glossaries, and we will do the rest.”

Some of our customers don’t even have to do that much. We go into their systems, retrieve the corpora we need, and at the end of each process we go back into their systems to upload finished projects as well as updated TMs and glossaries.

What else? Making it easy means quotes that are clear and RFP responses that list deliverables, timelines and productivity guarantees. And no fancy chit chat.
It means delivering clear metrics to help analyze the results and establish ROI.

Most of all, it means that we master the technology on our side, so our customers don’t have to. After all, they have their core business to manage, and usually, machine translation is not it.

To be clear, we’re not saying that machine translation is easy. Running a file through an MT engine isn’t that hard, but getting good results does require some nuance.

When we say we make it easy we mean that the technology headaches are ours and ours alone. What our customers receive is the deliverable, whether it’s trained engines or fully post-edited content that is ready to deploy.

Not exactly ‘No fuss, no muss’, but pretty darn close.

Who Pays the Price for Poor MT? http:///translation-blog/who-pays-for-poor-mt/ http:///translation-blog/who-pays-for-poor-mt/#comments Thu, 17 Oct 2013 14:03:31 +0000 Lori http:///?p=3559 The first time I heard this, it sent a chill down my spine: “It doesn’t matter if the machine translation output is [insert 4-letter expletive starting with an 'S']: the post-editors will clean it up.” It’s not the MT vendor who suffers when MT is bad, nor is it the end customer. The truth is [...]

The post Who Pays the Price for Poor MT? appeared first on LexWorks.

The first time I heard this, it sent a chill down my spine: “It doesn’t matter if the machine translation output is [insert 4-letter expletive starting with an 'S']: the post-editors will clean it up.”

It’s not the MT vendor who suffers when MT is bad, nor is it the end customer. The truth is that it is the post-editors who pay the price for amateurish MT. By the time MT output gets to the end customer, any major flaws will have been fixed. By humans.

For those who rely on post-editors to fix bad MT, what process you use to generate MT doesn’t matter at all. It doesn’t matter which MT engine you use and whether or not it is trained well, because in the end there is always one secret weapon to make things right: the post-editors.

The post-editors, hired to repair machine translation, are the ones who pay the price for faulty processes. They bear the full brunt of a badly-trained MT engine, or of an engine that does not fit the content thrown into it. Post-editors work as hard as they need to so that customers get the right quality, regardless of what shape their material was in when the MT engine spit it out. And they do this at a steep discount on their normal rates.

The problem with this system is that it is not sustainable. For one thing, every post-editor unfairly paid to fix bad MT errors means that there may well be one less post-editor who will say yes to the next project. I believe that poorly-managed MT is the main reason that the pool of those translators who are willing to be post-editors is not growing as it should.

Respecting the time of post-editors is not the only reason, however, that companies should strive for quality MT output. Given that the quality of the raw output determines how quickly a post-editor can progress, ensuring higher quality from the get-go means the kind of productivity gains – and thus cost savings – that make machine translation the most attractive technology today for lowering overall translation spend.

If we go beyond the idea that with MT engines, “One Size Fits All”, and treat MT as an industrial process that aims at identifying the highest-performing engine in a given context and trains the heck out of it, we can reduce the pain felt by post-editors. And that will usher in a new era of quality MT.

Do You Have an End-to-End Playbook for Your Content? http:///translation-blog/do-you-have-an-end-to-end-playbook-for-your-content/ http:///translation-blog/do-you-have-an-end-to-end-playbook-for-your-content/#comments Tue, 15 Oct 2013 19:42:10 +0000 Lori http:///?p=3547 We hear a lot about content strategy today, but what is it exactly and why should it matter to you? A content strategy is your end-to-end process for how you plan, create, translate, deliver and manage your digital content. Here are some signs that you may need a multilingual content strategy: People across your organization [...]

The post Do You Have an End-to-End Playbook for Your Content? appeared first on LexWorks.

We hear a lot about content strategy today, but what is it exactly and why should it matter to you?

A content strategy is your end-to-end process for how you plan, create, translate, deliver and manage your digital content.

Here are some signs that you may need a multilingual content strategy:

  • People across your organization are writing similar types of content
  • Similar content is translated at different times
  • Your content is not easy to find and share within your organization
  • Online searches in other languages cannot easily find your content
  • Your international customers don’t see the same quality of content that your domestic customers do

A good content strategy aims at achieving your business goals by maximizing the impact of content that is easy to read and understand, consistent, easy to find and in the right language.

A good content strategy not only increases the impact of your content – it also increases the impact of your budget by ensuring that you write it once and translate it once.

Today, human translation and human-optimized machine translation both have a place in the enterprise global content strategy.

To help you optimize your multilingual content – and your budgets – we offer:

  • Tools that ensure you will only translate your content once, regardless of where in your organization the translation request comes from
  • Adaptation of the same content for different audiences
  • Centralized glossaries of your terminology to ensure consistency across your organization
  • Translation dashboard indicators so you can track your translations as well as your re-use, total expenditure, requesters within your organization, real-time progress and more
  • Multilingual search engine optimization (SEO)
  • Human translation
  • Human optimized translation automation

The Founder’s Dilemma http:///press/press-releases/the-founders-dilemma/ http:///press/press-releases/the-founders-dilemma/#comments Tue, 24 Sep 2013 22:16:34 +0000 Lori http:///?p=3529 A Personal Essay from the Founder of Lexcelera-LexWorks How do you know when is the best time to step down from managing a company you have founded? Lori Thicke founded the translation company Lexcelera (the LexWorks parent company) in France in 1986. Recently John Papaioannou, a long-time Lexcelera customer while he was a director at [...]

The post The Founder’s Dilemma appeared first on LexWorks.

A Personal Essay from the Founder of Lexcelera-LexWorks

How do you know when is the best time to step down from managing a company you have founded? Lori Thicke founded the translation company Lexcelera (the LexWorks parent company) in France in 1986. Recently John Papaioannou, a long-time Lexcelera customer while he was a director at Bentley Systems, took the helm of both Lexcelera and LexWorks. In this highly personal essay, Lori Thicke describes the process of growing your company by passing on the baton.

Click here for the pdf of the full article as published in the October 2013 issue of Multilingual Magazine: The Founder’s Dilemma – by Lori Thicke

Top Requests for Machine Translation ‘A La Carte’ http:///translation-blog/machine-translation-trends/ http:///translation-blog/machine-translation-trends/#comments Tue, 24 Sep 2013 16:54:07 +0000 Lori http:///?p=3469 What’s the top trend in machine translation this year? For LexWorks, the answer, hands down, is diversity. Diversity of engines, diversity of services. In years past, our major request was for machine translation as a turnkey solution. That is, providing the gains of machine translation (turnaround speed, quality and, of course, price) but without the [...]

The post Top Requests for Machine Translation ‘A La Carte’ appeared first on LexWorks.

What’s the top trend in machine translation this year? For LexWorks, the answer, hands down, is diversity. Diversity of engines, diversity of services. In years past, our major request was for machine translation as a turnkey solution. That is, providing the gains of machine translation (turnaround speed, quality and, of course, price) but without the need for our customers to be hands-on with the tools. Translated and post-edited documents were the deliverables, and machine translation merely the enabling technology.

#1 Request: Building Engines

The most unexpected trend this year is that enterprises with a great deal of their own MT expertise, and who are themselves pioneers in machine translation implementations, have been asking us to build their next engines. The reasons for this are as varied as the enterprises themselves:

  • Adding new languages where they don’t have internal resources.
  • Releasing the internal resources who, up until now, have been dedicated to training and maintaining the engines.
  • Building a set of engines and providing the training so that a company with no prior MT experience can manage MT themselves.
  • Establishing a set of engines where they don’t have expertise: for example, a company well versed in Moses that would like to add some Hybrid technology.

#2 Benchmarking Existing Approaches Against Other Engines

Asking us to pilot engines is nothing new. We’ve been doing that for years. But what is new this year is asking us to pilot a second approach for new content. For example, a customer that has been using the Systran Hybrid for their documentation might ask us to pilot an SMT approach such as Microsoft Translator Hub for their customer support site. What we are seeing in the market is that mature MT customers are becoming technology agnostic.

#3 Managing Engines Remotely

Where using Google Translate could potentially expose confidential information, we are being asked to set up servers robust enough to handle the translation needs for tens of thousands of employees; this involves our staff working remotely to build and maintain engines that are safely behind company firewalls.

Other A La Carte Machine Translation Services

Other standalone services we’re seeing growing requests for include:

  • Data cleaning
  • Creation and maintenance of TMs
  • Creation and maintenance of glossaries
  • Authoring assistance
  • Pre-Editing
  • Post-Editing
  • Quality assessment
  • Engine updates

The Market Matures

It’s interesting to see that as the market matures, the range of MT services has gone from an all-in-one service, to mix and match packages of services à la carte.

If you would like more information, or to ask about a particular service, click here to send me an email.

RBMT – SMT – Hybrid Engines Compared http:///translation-blog/rbmt-smt-hybrid-compared/ http:///translation-blog/rbmt-smt-hybrid-compared/#comments Fri, 13 Sep 2013 22:04:19 +0000 Lori http:///?p=3431 “Being technology agnostic means using the very best technology for the task, without being bound by a supplier monopoly” — John Papaioannou, CEO of Lexcelera-LexWorks. Here’s what I would add to that: In order for machine translation to make any sense at all, it has to yield the highest quality that is ‘machinely’ possible. This [...]

The post RBMT – SMT – Hybrid Engines Compared appeared first on LexWorks.

“Being technology agnostic means using the very best technology for the task, without being bound by a supplier monopoly” — John Papaioannou, CEO of Lexcelera-LexWorks.

Here’s what I would add to that: In order for machine translation to make any sense at all, it has to yield the highest quality that is ‘machinely’ possible.

This is why our approach to machine translation is not tied to any particular engine. Years of working in a variety of environments have taught us that MT’s benefits rely on coaxing high performance out of your engines. And one size doesn’t fit all! A big part of succeeding with MT is having an open mind and relying on objective measures to match the right engine to the right content.

Being technology agnostic means not promoting just one engine or just one approach. Since LexWorks doesn’t sell any particular technology, we can be totally objective in choosing the best-of-breed solution for the particular content type. Rules-based (RBMT), statistical (SMT) or hybrid (HMT): each system has advantages and disadvantages and will perform better in certain situations. Language pair, content type and the corpus available for training will all impact engine suitability. The only way to be sure you have the best-of-breed is to benchmark all three approaches. And yes, that means before starting any large-scale, long-term project, we build three test engines. Once the best-performing engine has been selected, we measure and improve on an ongoing basis. In a nutshell, that’s the secret of our success.
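
To give a feel for what benchmarking three engines on an objective measure can involve, here is a minimal Python sketch. It scores each engine’s output against a human reference using a word-level edit rate (edits needed divided by reference length) and picks the lowest. This is a simplified stand-in for established metrics, not the measurement we actually use, and the sample sentences and engine outputs are made up for the illustration.

```python
# Sketch: compare three engines by word-level edit rate against a reference.
# The metric is a simplification and all sample data below is invented.

def word_edit_distance(hyp, ref):
    """Minimum number of word insertions, deletions and substitutions."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, 1):
        cur = [i]
        for j, rw in enumerate(r, 1):
            cur.append(min(prev[j] + 1,                # drop a hypothesis word
                           cur[j - 1] + 1,             # add a reference word
                           prev[j - 1] + (hw != rw)))  # match or substitute
        prev = cur
    return prev[-1]


def edit_rate(hyps, refs):
    """Total edits divided by total reference words (lower is better)."""
    edits = sum(word_edit_distance(h, r) for h, r in zip(hyps, refs))
    words = sum(len(r.split()) for r in refs)
    return edits / words


def pick_engine(outputs_by_engine, refs):
    """Return the engine with the lowest edit rate, plus all the rates."""
    rates = {name: edit_rate(hyps, refs)
             for name, hyps in outputs_by_engine.items()}
    return min(rates, key=rates.get), rates


references = ["press the red button to restart the device"]
outputs = {
    "rbmt": ["press the red button for restart the device"],
    "smt": ["push red button to restart device"],
    "hybrid": ["press the red button to restart the device"],
}

best, rates = pick_engine(outputs, references)
print(best)   # hybrid
print(rates)  # {'rbmt': 0.125, 'smt': 0.375, 'hybrid': 0.0}
```

With real test sets the ranking will vary by language pair and content type, which is the reason for running the comparison in the first place.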

Below is a brief discussion of the three engine types.

RBMT, SMT and Hybrid Engines Compared

Rules-based (RBMT) systems come “off the shelf” with grammatical rules hard-coded for the source and target languages, so customization of RBMT systems aims to embed specific terminology through user dictionaries. Linguistic skill is required to tune RBMT systems. RBMT can be tuned to perform best in narrow (e.g. product-level) domains with set terminology. RBMT systems respond particularly well to post-editing because the errors are predictable. Since the terms in the user dictionary will always prevail over any other terminology, post-editing RBMT focuses on improving sentence structure. Significant productivity gains are possible when controlled language is applied. Improvement cycles in RBMT can be implemented weekly, and even daily, as corrections from the post-editors are fed back into the system in near real time.

Statistical (SMT) systems are particularly well suited to languages not covered by a rules-based engine because SMT systems are trained on a language pair and domain at the same time. Engineers are mainly responsible for tuning SMT systems. Based on algorithms that parse millions of segments of bilingual and monolingual text to find the most probable translations, SMT is less predictable in what terminology it will deliver, and thus in what kind of errors will result, making it less easy to post-edit. However, SMT sentences tend to be more fluid than RBMT sentences. A big advantage of SMT for user generated content, including FAQs, forums and so on, is that spelling and syntax errors don’t throw SMT off. In fact, if it has been well trained with sufficient in-domain and out-of-domain data, SMT outperforms RBMT for uses such as online customer support, which tends to rely on informal language. SMT improvement cycles tend to be infrequent – once or twice a year – as a large amount of data is needed to (re)tune the system.

Hybrid (HMT) systems tend to combine the best of both approaches. Terminology is predictable and sentences more fluid. Training of a hybrid engine is based on both customizing terminology and processing large quantities of training data. For optimal hybrid quality, two skill sets/profiles are needed for training — linguistic and engineering. Hybrid engines may be improved frequently, without the need to wait for extensive new data sets before being able to improve output with retraining, which is another advantage.

I hope this is helpful. Contact Lori Thicke if you have any questions.

What Does a Technology Agnostic Offer Look Like? http:///translation-blog/mt-packages/ http:///translation-blog/mt-packages/#comments Tue, 27 Aug 2013 08:39:03 +0000 Lori http:///?p=3342 Today’s buyers of translation and localization are more savvy than ever before. And they are looking beyond straight Translation-Editing-Proofreading (TEP) to a full panel of language services that will help them tap into global opportunities. As for that game-changing advance, machine translation (MT), today our tech-savvy customers are asking for mix-and-match services that can integrate [...]

The post What Does a Technology Agnostic Offer Look Like? appeared first on LexWorks.

Today’s buyers of translation and localization are more savvy than ever before. And they are looking beyond straight Translation-Editing-Proofreading (TEP) to a full panel of language services that will help them tap into global opportunities. As for that game-changing advance, machine translation (MT), today our tech-savvy customers are asking for mix-and-match services that can integrate seamlessly with their internal business processes. Here are some of the packages we offer for rules-based (RBMT), statistical (SMT) and hybrid systems:

Hosted MT (Software as a Service)

This is our full MT package offering of engine selection, training, hosting of the custom-built engines and ongoing updates and maintenance.

Managed MT Service

This package suits customers who for whatever reason prefer to host their own engines. In this type of installation, LexWorks provides engine training and maintenance remotely.

Post-Editing

With one of the largest pools of experienced post-editors in the industry, LexWorks provides post-editing as part of other packages or as a stand-alone service. Post-editing can be light or heavy, depending on whether the quality objectives are ‘understandable’ or ‘publication-ready’.

Annual Maintenance

For enterprise customers who have already built strong MT engines, LexWorks offers an annual maintenance subscription service to update terminology and product lines.

MT Pilots

For enterprises wishing to test one or more MT technologies, LexWorks conducts impartial pilots. Deliverables include post-edited and/or raw output along with a full ROI analysis, report and QA benchmarking. Pilots can be conducted on one engine or on a range of engines, within a single approach or comparing all three: RBMT, SMT, Hybrid.

Data Cleaning

Data cleaning is important for training both statistical and hybrid engines, and is offered as a stand-alone service or as part of one of the above packages.

Platform

We integrate with existing platforms or provide our own for real-time translation management and reporting.

Bing Customization

Bing, also known as Microsoft Translator Hub, is a fully customizable solution of particular interest for online content such as customer support sites and forums. LexWorks offers both engine customization and annual maintenance. By accessing Bing through LexWorks, it is possible to have full data protection and confidentiality.

 

Post-Editing Best Practices Identified by ACCEPT http:///translation-blog/3-sources-for-post-editing-best-practices/ http:///translation-blog/3-sources-for-post-editing-best-practices/#comments Thu, 27 Jun 2013 02:26:43 +0000 Lori http:///?p=3278 The EU-funded ACCEPT Project brings together five leaders in machine translation: the University of Edinburgh (inventors of the Moses engine), the University of Geneva, Symantec, Acrolinx and Lexcelera, the LexWorks parent company. A recent ACCEPT report, prepared by Symantec, identifies three sources of post-editing best practices. The resulting guidelines for post-editors are summarized below. The [...]

The post Post-Editing Best Practices Identified by ACCEPT appeared first on LexWorks.


The EU-funded ACCEPT Project brings together five leaders in machine translation: the University of Edinburgh (inventors of the Moses engine), the University of Geneva, Symantec, Acrolinx and Lexcelera, the LexWorks parent company. A recent ACCEPT report, prepared by Symantec, identifies three sources of post-editing best practices. The resulting guidelines for post-editors are summarized below.

The Challenge of Post-Editing

Post-editing machine-translated content is rapidly becoming an industrial process, and as MT usage grows in the translation industry, so does the need for post-editing services. However, the process is not well understood, the tools are limited, and relatively little work has been published outside the use cases where post-editing is carried out by professional translators.

Post-Editing Best Practices

The ACCEPT (Automated Community Content Editing PorTal) project has identified three sources of post-editing best-practice guidelines.

I. NIST (National Institute of Standards and Technology)

Goal: Make the MT output have the correct meaning, using understandable English, in as few edits as possible.

1. Make the MT output have the same meaning as the reference human translation. No more and no less.

2. Make the MT output be as understandable as the reference.

3. Capture the meaning in as few edits as possible using understandable English. If words/phrases/punctuation in the MT output are completely acceptable, use them (unmodified) rather than substituting something new and different.

4. Punctuation must be understandable, and sentence-like units must have sentence ending punctuation and proper capitalization. Do not insert, delete, or change punctuation merely to follow traditional optional rules about what is “proper.”
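The "as few edits as possible" goal can be checked after the fact by counting how many words the post-editor actually changed. The sketch below is one way to do that, using a plain word-level Levenshtein distance. It is an illustration only, not the scoring method used by NIST or ACCEPT. The function names and the sample sentences are made up for the example.

```python
def word_edit_distance(raw_mt, post_edited):
    """Count word-level insertions, deletions and substitutions (Levenshtein)."""
    source = raw_mt.split()
    target = post_edited.split()
    # previous[j] is the distance between the source words seen so far
    # and the first j words of the target.
    previous = list(range(len(target) + 1))
    for i, source_word in enumerate(source, start=1):
        current = [i]
        for j, target_word in enumerate(target, start=1):
            cost = 0 if source_word == target_word else 1
            current.append(min(
                previous[j] + 1,         # delete source_word
                current[j - 1] + 1,      # insert target_word
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def edit_rate(raw_mt, post_edited):
    """Edits divided by the length of the post-edited text, in words."""
    length = len(post_edited.split())
    return word_edit_distance(raw_mt, post_edited) / length if length else 0.0


# Invented example sentences, for illustration only.
raw = "The printer not is connected to network"
edited = "The printer is not connected to the network"
print(word_edit_distance(raw, edited), round(edit_rate(raw, edited), 2))
```

A low edit rate on text that still reads correctly suggests the post-editor reused acceptable MT output, as rule 3 asks. A high rate on similar content may point to over-editing or to a weak engine.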

II. Translation Automation User Society (TAUS)

1. Aim for semantically correct translation.

2. Ensure that no information has been accidentally added or omitted.

3. Edit any offensive, inappropriate or culturally unacceptable content.

4. Use as much of the raw MT output as possible.

5. Basic rules regarding spelling apply.

6. No need to implement corrections that are of a stylistic nature only.

7. No need to restructure sentences solely to improve the natural flow of the text.

III. (Midori Tatsumi)

Goal: The post-edited text needs to be easily understandable by the readers. In order to achieve that goal, the text needs to convey the correct meaning of the source text, and conform to Japanese grammar.

However, speed is another important requirement for post-editing processes. Therefore, it is not necessary to spend time aesthetically refining the text; please avoid editing for stylistic sophistication.

A. What needs to be fixed:

1. Non-translatable items, such as command and variable names, that have been translated. Please put them back to English.

2. Inappropriately translated general IT terms.
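The first item in this list, non-translatable items that were translated anyway, lends itself to an automatic check before the text reaches the post-editor. Here is a small sketch, assuming you maintain your own list of protected command and variable names. The function name, the term list and the sample sentences are hypothetical, and the French target is used only to keep the example readable.

```python
import re


def missing_protected_terms(source, target, protected_terms):
    """Return protected terms that occur in the source but not in the target."""
    missing = []
    for term in protected_terms:
        # Match whole tokens only, so "ls" is not found inside "tools".
        # re.ASCII keeps \w to ASCII letters, digits and underscore, so a term
        # directly followed by non-Latin script still counts as a whole token.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(term) + r"(?!\w)", flags=re.ASCII
        )
        if pattern.search(source) and not pattern.search(target):
            missing.append(term)
    return missing


# Hypothetical example: the engine has translated a command name.
source = "Run backup_now to save the value of MAX_RETRIES."
target = "Exécutez sauvegarde_maintenant pour enregistrer la valeur de MAX_RETRIES."
print(missing_protected_terms(source, target, ["backup_now", "MAX_RETRIES"]))
```

Any term the function returns is a candidate for the "put them back to English" fix described above.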

Guidelines for both Monolingual and Bilingual Post-Editing

From the guidelines expressed above, Symantec selected the following minimum set of best practice advice for their forum users.

Guidelines for Monolingual Post-Editing

  • Try to make the text clearer and more fluent, based on how you interpret its meaning.
  • For example, try to correct word order and spelling when they make the text difficult or impossible to understand.
  • If words, phrases, or punctuation in the text are completely acceptable, try to use them (unmodified) rather than substituting them with something new and different.

Guidelines for Bilingual Post-Editing

  • Aim for semantically correct translation.
  • Ensure that no information has been accidentally added or omitted.
  • If words, phrases, or punctuation in the text are completely acceptable, try to use them (unmodified) rather than substituting them with something new and different.
