Tech
multilingual.com
New Book Coming Soon
We wrote a book, Move the World with Words, to put a spotlight on the human element of global commerce: translators. This is our first-ever publication, and to celebrate its release in September, we’re hosting events in New York and London that are entirely focused on translation. Together, we will discover what it means to #movetheworldwithwords.
Rockstar. Translator. Create electric content with Flavio, a translator by day and rockstar by night.
Smartling.com/book
2019 SR1
Post Editing
info@multilingual.com
https://multilingual.com
MultiLingual (ISSN 1523-0309), Copyright © 2019 by MultiLingual Computing, Inc., is published bimonthly: Jan/Feb, Mar/Apr, May/Jun, Jul/Aug, Sep/Oct, Nov/Dec by MultiLingual Computing, Inc., 319 North 1st Avenue, Suite 2, Sandpoint, ID 83864-1495. Business and Editorial Offices: 319 North 1st Avenue, Suite 2, Sandpoint, ID 83864-1495. Accounting and Circulation Offices: 319 North 1st Avenue, Suite 2, Sandpoint, ID 83864-1495. Call (208) 263-8178 to subscribe. Periodicals postage paid at Sandpoint, ID and additional mailing offices.
POSTMASTER: Send address changes to MultiLingual, 319 North 1st Avenue, Suite 2,
Sandpoint, ID 83864-1495.
Contents
Focus: Tech
26 Is AI everywhere in the language services industry?
Hélène Pielmeier
30 Intelligent terminology
Francois Massion
36 Predicting the unpredictable
Olga Melnikova
41 A horse with two legs
Jerome Selinger
44 Fluent: Firefox’s new localization system
Jeff Beatty & Staś Małolepszy
48 Wikidata gets wordier
Christian Lieske & Felix Sasaki
54 IT context of human language translation
David Filip
59 Emerging technologies and the cost of video localization
George Zhao & Sharon Lian
62 Past is prologue all over again
Jim Compton
6 September/October 2019
Social media
LinkedIn Update
8
Libor Safar
Up Front Columns Global Digital Marketing | Localization
12
Recaps
For those who are worried about the future, “even the dark side” of technology and our modern era presents opportunities, Aguilera said.
Tech was, as always, heavily featured at the conference. Speech technology in particular was an emerging trend. Nishant Rai of Adobe presented June 13 on how his team is improving in-app speech recognition for nonnative speakers by training the system to recognize alternative pronunciations of words.
A panel June 13 on the innovator's dilemma looked at what the adoption of agreed-upon “innovations” is. Wayne Bourland of Dell pointed out that just because something is known as innovative in our industry, it doesn’t mean everyone (or even the majority) is using it. So as advanced as tech gets, adoption is still an open question.
Featured Reader
The third Game Global Summit was held this year in Lisbon, Portugal, at the DoubleTree by Hilton Hotel on June 18-19, 2019. The conference hosted game localizers and testers from some of the biggest names in gaming.
The conference incorporated traditional sessions along with round tables and some hybrid discussions. It began by asking the attendees (developers, publishers, service providers and others) a couple of questions about where they work and what kind of games they like. When it comes to favorite games, Tetris came out first, with Mario Kart as a close second.
The first major session began with a panel of experts including Bjorn Holste from Square Enix, Robert Masella from Rare (Microsoft) and Matt Wilson from Sony Interactive Entertainment. The panel discussed the different ways that each of their teams implements automation in testing. One of the major topics in this panel was whether or not automation can entirely take over the task of testing.
Following the expert panel, there was a series of game-specific sessions and round tables where the speakers and attendees discussed many different topics together. Dinner was held at TodoMundo, a unique venue centered on art, food and music from Portuguese-speaking places around the world. After this, there was an after-party in the club portion of TodoMundo.
The second day had a heavier focus on strategy, including a panel on strategy as well as a session on global content strategy. Additionally, there were round tables discussing game localization followed by open sessions.
Conference speakers and attendees came from all different kinds of development companies, ranging from big AAA creators to smaller web-based developers. The topics discussed varied from machine translation to content strategy and culturalization. Whether it was comparing development for touch screens vs. TV screens and how it affects workflows, or discussing the role of automation in the future of game testing, the attendees and speakers were able to leave the conference having discussed new insights.
Game Global Summit 4 will be held this fall in Silicon Valley.
ALC held in DC
News
similarity between source and target, how enterprises can use one locale as a pivot for MT to reach similar target languages or dialects, and which locale combinations are suitable.
CSA Research https://csa-research.com

Products and Services

XTM v12.0
XTM International, developers of a translation management system and computer aided translation tool, has released XTM v12.0. The latest version introduces a redeveloped translator environment, two new CMS connectors and a new machine translation engine connector.
XTM International https://xtm.cloud

Language I/O Chat for Zendesk
Language I/O, a provider of software that combines human and machine translation to automate the translation of customer support content, has introduced Language I/O Chat for Zendesk. The plugin allows English-speaking support agents to communicate with customers in more than 150 languages directly inside the Zendesk live chat window.
Language I/O LLC www.languageio.com

InContext Translation
Lingoport, Inc., a provider of software internationalization tools and services, has released InContext Translation as a new addition to Lingoport Resource Manager, a software product intended to bridge the gap between software development and the localization processes.
Lingoport, Inc. www.lingoport.com

Smartling redesigns translation management platform
Smartling, Inc., a provider of language services, has redesigned its translation management platform to include new features such as Dynamic Workflows, Smartling Draft and a feature that automatically selects the best machine translation option from leading engines based on the source and target languages.
Smartling, Inc. www.smartling.com

Memsource Editor for Mobile
Memsource, developers of an AI-powered translation management system, has released a new version of its mobile app that includes the launch of Memsource Editor for Mobile, the first computer-assisted translation tool available in a mobile app.
Memsource www.memsource.com

Clients and Partners

SAS selects GlobalLink
SAS Institute, Inc., a provider of analytics software and services, has selected GlobalLink by Translations.com, a provider of translation-related technology products, to support the launch of its website in Thai.
SAS Institute, Inc. www.sas.com
Translations.com www.translations.com
Calendar
ENTRAD 2019
October 7-11, 2019, João Pessoa, Brazil
Brazilian Association of Translation Researchers, https://bit.ly/2K61lCl
SLSP 2019
October 14-16, 2019, Ljubljana, Slovenia
Institute for Research Development, Training and Advice, http://slsp2019.irdta.eu
White Paper
producers have committed to implementing the standard, and many already have. The standard is also being supported by several CAT tools.

TAPICC
The TAPICC initiative by GALA, the Globalization & Localization Association, also tackles the lack of a common baseline. Both the void and the plethora of APIs undermine interoperability and force investments, such as in code maintenance. The TAPICC initiative aims to create a pre-standardization model to address these challenges and lead to just one API needing to be supported, rather than hundreds. It has an open source legal framework and is supported by major industry players and associations. After all, true digital transformation will only become possible when companies can connect their workflows directly to the translation market.

But however well TAPICC might develop, it will probably not solve one basic question: what is the incentive for a CMS provider to provide a TAPICC host functionality? Without pressure from their clients, there probably isn’t one.

Our Solution
That is why Kaleidoscope and Xillio have teamed up to combine their respective middleware platforms to solve this problem based on the close-to-final TAPICC project status. Xillio has developed a content integration platform, the Localization Hub, with 23 (and growing) connectors. As the Xillio CMS connectors are integrated with a “neutral” middleware platform, your investments in automation can be reused across the entire translation landscape and not just with your current provider. New features needed to serve the latest requirements can be easily added. The Xillio Localization Hub is in itself the TAPICC host. This means there is no need to wait until all the players in the content management industry provide a TAPICC output.

The Xillio Localization Hub is based on a unified API. This is extremely important for implementation, because once your CMS has been integrated for localization purposes using this unified API, you can easily add other creators or “consumers” of content into the mix with no additional connectors, and no further investment, required.

Connecting Content, the localization-facing middleware by Kaleidoscope, takes care of the “specialties” of the translation systems and workflows, in particular for SDL Trados Studio, SDL GroupShare, and SDL WorldServer. Connecting Content acts as a TAPICC client or host and receives TAPICC (or other) calls or packages. It automatically creates projects, pre-translates and analyzes them, sends out translation jobs, or uploads them to GroupShare or WorldServer. Localization workflows can be defined in Connecting Content and triggered through the business metadata contained in the TAPICC call. After the localization process is finished, Connecting Content pushes the result right back to the Localization Hub and on into the source CMS repositories.

Conclusion
So there you have it: the combination of the Xillio Localization Hub and Kaleidoscope’s Connecting Content brings an end to the Babylonian interface confusion and makes both your content and translation system ready for whatever the future will bring.
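
The connector argument in this white paper comes down to simple arithmetic: with point-to-point integration every content system needs its own link to every translation system, while a hub needs one link per system. The Python sketch below illustrates that idea only. It is not the Xillio Localization Hub, Connecting Content or TAPICC API; every class and function name is hypothetical, and the figure of 3 translation systems is an assumption based on the three SDL products named above.

```python
from typing import Callable, Dict

# Hypothetical job record; real TAPICC payloads are not modeled here.
Job = Dict[str, str]


class LocalizationHub:
    """Toy hub: each translation system registers once and is reached only via the hub."""

    def __init__(self) -> None:
        self._translators: Dict[str, Callable[[Job], Job]] = {}

    def register_translator(self, name: str, handler: Callable[[Job], Job]) -> None:
        # One registration per translation system, regardless of how many CMSs exist.
        self._translators[name] = handler

    def submit(self, job: Job, translator: str) -> Job:
        # The content system never calls the translation system directly.
        return self._translators[translator](job)


def point_to_point_connectors(cms_count: int, tms_count: int) -> int:
    """Every CMS integrates with every translation system."""
    return cms_count * tms_count


def hub_connectors(cms_count: int, tms_count: int) -> int:
    """Every system integrates once, with the hub."""
    return cms_count + tms_count


hub = LocalizationHub()
hub.register_translator("demo", lambda job: {**job, "status": "translated"})
result = hub.submit({"id": "1", "text": "Hello"}, "demo")
print(result["status"])

# 23 CMS connectors (from the text) and an assumed 3 translation systems.
print(point_to_point_connectors(23, 3), hub_connectors(23, 3))
```

Under these assumptions the point-to-point approach needs 69 integrations and the hub approach 26, and adding one more content system costs one connector instead of three.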
About Kaleidoscope
Taking your content global — with Kaleidoscope your product
will speak every language! The combination of decades of exper-
tise, our software solutions developed in-house, and select soft-
ware from market-leading technology partners has been making
this a reality since 1996. Coupled with the full-service approach
from eurocom, Austria's largest and most innovative translation
agency, Kaleidoscope offers a unique and unrivaled synergy of
language and technology.
www.kaleidoscope.at
sponsored information multilingual.com/white-papers
Column
Client Talk
RPMGlobal
Terena Bell
Terena Bell is senior director of communica-
tions for Lionbridge. However, this article
was written while she was an independent
reporter covering translation for The Atlantic,
The Guardian, MultiLingual and others.
Welcome to Client Talk, where we ask translation buyers when professional services are worth it and why. By connecting away from the sales environment, we hope to discover what really drives localization purchasing. For the last two years, every issue of MultiLingual has featured a different company. Some companies hire professionals, others don’t, but your challenge is to find the similarities. What patterns do all buyers share? What do their answers tell us about the way clients see our industry as a whole?

The client
Mining software company RPMGlobal is our first Australian profile, based in Brisbane with clients in 125 countries. But don’t let the word “software” fool you: RPM clients don’t mine for data, but rather ore and minerals from the ground. “Like any other industry,” manager of product internationalization Kirsty Taylor says, “mining has its set of industry terminology that can be quite foreign to those not exposed to it.” And this vocabulary can vary by commodity (as in coal vs. gold); local history and technique; above or below ground.
The company operates primarily in Australian English, but because it employs US English-speaking developers, Taylor says she has “to edit out some z’s and add in some u’s.” Her core focus, though, is on overseeing localization into Russian, French and Spanish. “We have some software products in more languages,” she explains, noting one that’s in 12. But as RPM’s first localization manager, it’s Taylor’s job “to prepare for more translation in the future, translating product content which typically hasn’t been translated in the past,” as well as new and forthcoming product acquisitions. Russia, Kazakhstan, South America, Africa and Canada are critical markets with occasional Portuguese and Chinese requirements. Indonesian is also a growing need.
In the past, Taylor oversaw translation for software company ABB.

The need and solution
A LocWorld conference regular, Taylor calls RPM’s needs “moderate” in relation to peers, explaining she
oversees one to two projects per month, each around six files. The company doesn’t use translation management software, so Taylor isn’t able to track average project word count. Right now, she says her work really is concentrated on preparing “to improve [RPM’s] i18n [internationalization] capabilities, so that they are ready to localize when our strategy determines the time is right.”
Strings get localized, but not always software content or online help. “Only the product UI,” she says, and “training material when it was being delivered in a particular region and required the local language.” The company also performs consulting for mines and mining investors, so reports require translation as well.
The solution is to use translation services. Taylor says, “Almost everything that I work on goes to professional translators. Only a couple of our software deliverables are translated in-house, mostly due to having native speakers who are very familiar with the software in the software development teams. These are edge cases and I’m planning to move them to more of a reviewer/subject-matter expert role than an in-house translator role.”

On a scale of 1-5, how important does Taylor rank professional translation?
She says 5. However, as in the April 2018 profile of Papa John’s, Taylor makes the distinction between a true professional and simply working for an agency: “The most critical element of successful translation for us is the experience of the linguists that our vendor uses. They must be very familiar with the mining industry; otherwise our terminology will be largely confusing to them. We have to ensure not only that the translators understand tonne vs ton from a locale and spelling perspective, but that there’s a ‘short ton’ as well.”
In the event that readers are not familiar with these distinctions, in the United States, a short ton of 2,000 pounds is usually known simply as a ton. There is also the tonne (1,000 kilograms or 2,204.62 pounds), also known as the metric ton, not to be confused with the long ton (Imperial ton) of 2,240 pounds. To confuse things further, even in the United States, some applications use tons to mean long tons (for example, naval ships) or metric tons (world grain production figures).

Emerging patterns
Unlike many clients with a small initial spend, Taylor already understands the benefits of optimized translation. But that doesn’t mean she can’t see the industry’s flaws: “Buyers, or rather, other stakeholders in businesses, seem to expect that translators can perform magic,” she says, expecting them to translate “disembodied software strings” accurately or to “derive meaning from file names or the ‘context’ of a string when you’d actually need to be a programmer working on that product to make meaning of it.” She adds that companies also unrealistically expect translators to use buyer-preferred tools.
Vendors don’t get off easy, though, as Taylor has some advice there too: communicate. Buyers need to know how much translation will cost and when projects will be done. Yet Taylor still receives quotes “not including some kind of indicative duration,” she says. “I know it will depend on the resources etc, but I often have internal stakeholders who are interested in two answers: the cost and the duration. Don’t make me chase you for the duration if I’ve asked for it.” [M]
Community Lives
Bridging industry
and academia
Jeannette Stewart
Jeannette Stewart is the former CEO of CommuniCare, a
translation company for life sciences. An advocate for the
language industry, she founded Translation Commons, a
nonprofit online platform facilitating community collaboration.
Teacher/student relationships have taken on many forms throughout history and across the globe. Our younger years are focused on becoming a part of this community, acquiring knowledge and skills... and then what? We transition to the business community and put our knowledge and skills to profitable and satisfying use.
Except the equation is not that balanced, is it? Enterprises may devour the products that education delivers, but is their appetite satisfied? This is a complicated question, and specifically in the language industry, the answer is a tad complicated too. After all, we keep hearing about skills shortages and the difficulties enterprises have in feeding their ravenous machine. So, if this simple process of supply and demand isn’t working, we need to investigate why. We have seen initiatives that attempt to tackle this. For example, under Ulrich Henes’ lead, LocWorld ran the prescient Attracting and Developing Talent (ADT) initiative. Much has been learned from forging links between education and commerce, but the problems are perennial and we need to build stronger and more productive relationships that will satisfy requirements effectively and continue to do so on an ongoing basis.
Many academics attend conferences and language-industry events, and business people are often guest lecturers at their courses. Together they create relationships that are crucial in helping develop language programs and tech-tools that fulfill real-world requirements. One very tangible benefit of such initiatives is that the Middlebury Institute of International Studies (MIIS) is responding by reclassifying its localization program as a STEM degree (science, technology, engineering and math), with curricula that will equip its graduates with skills that our tech-based enterprises require. Effective communication, transparency and a willingness to innovate for change are critical components that will ensure success in this major language-industry venture.
I recently attended the GILT Leaders Forum, hosted by Airbnb in San Francisco. One of the sessions was on the educational needs of our industry. On the panel were Max Troyer, professor at MIIS; Ludmila Golovine, president of MasterWord Services, representing the Professors Group at Translation Commons (TC); and Patrick McLaughlin from Eventbrite sharing his experience of the last three years as adjunct professor at MIIS. The interest and participation of the audience, leaders in tech organizations, indicated the need for communication and establishing permanent links.
TC has also responded to the solid groundwork laid down by ADT in forming a professors’ group tasked to provide a widespread forum for exchanging ideas and experiences that can be adopted to tailor translation and interpreting courses to industry needs. The Professors and Lecturers Group is a space where instructors from different universities all over the world can collaborate and share resources in order to achieve an excellent quality of education in localization, translation and interpretation. This group also provides a bridge between academia and the professional world, by helping
educators promote the profession of translation, interpreting and localization, creating contacts for possible internships, exchanges and opportunities for students to get hands-on training.
This all started in LocWorld38 Seattle, at the TC booth, where trainer and company owner Ludmila Golovine mentioned how much we need professors to share their knowledge, maybe with webinars. Career advisor at BYU Doug Porter, assistant professor Cynthia Jones from Weber State and I jumped in, and after a few minutes of enthusiastic brainstorming, we created the group on the spot. We already had professors volunteering in different groups or directly on the TC website with their students. The power of community was strongly in evidence as we put the idea out there and ended up with a first list that was about 40 strong.
It is interesting to note how the strong symbiotic relationships between academic scientists, engineers and major corporations are becoming de facto. The strong bonds between academics and tech enterprises in Silicon Valley since the 1950s played a critical role in transforming the US economy. Such is the measure of how essential technology has become to our activities. Now that businesses are global, the need for multilingually capable tech is finally ensuring a place at the table for the language industry; maybe not quite the top table yet, but give us time! When enterprises finally understand just how much properly implemented globalization strategies can enhance revenues, we will receive due recognition of our value. At least, we should.
With these thoughts in mind, the need to forge strong bonds between academia and industry is brought sharply into focus. Educators are no longer just providers of raw, young talent. Mutual understanding between widely differing groups in communities is fundamental to their cohesion, health and fitness. This has led to the identification of four objectives for the group. First, the need to facilitate professional communication between professors from different universities and countries will lead to excellent quality of education in translation and interpretation with a more distant goal of understanding a kind of common curriculum. Leading this subgroup is Irina Maas, adjunct professor at Houston Community College and vice president of the translation and interpretation (T&I) program for alumni. She has extensive international experience as a language professional and business administrator, having worked for major automotive and oil and gas companies, and now found a new passion in sharing her knowledge with others. As she says: “Communication is key to our joint success!”
As a second objective, it is a vibrant and given part of language study that travel fosters understanding of cultures as well as the languages they speak. Exchange programs, which have a history that stretches back to classical times (think Attila the Hun being educated in Rome!), are an excellent means of achieving this, even if history does show that diplomacy is not 100% guaranteed. The aim of the Professors and Lecturers Group is to create a consortium that will focus on exchanges. But not just that. Sharing professors’ knowledge, without borders and across institutions, will enable students to have a more complete view of all the potential paths they can follow after they graduate. Leading this subgroup is Natalia Noland, professor and creator of the T&I program at Houston Community College. Noland has a PhD in linguistics and has been teaching in institutions of higher education for over 35 years. She has extensive academic administration experience; has performed governmental, technical and literary translations; and has served as an interpreter in technical, business and government arenas. Noland holds the position of Houston Interpreters and Translators Association president and is a member of Texas Chapter of Women in Localization Leadership Team. Noland shared with me: “Translation/interpretation is a profession like engineering, or nursing, or any other. One must get professional education to produce high quality translation and interpretation. This education must be delivered by higher educational institutions in strategic partnership with the industry to meet its workforce expectations.”
As a third objective, taking opportunities as a catalyst of innovation, hence progress, they are seeking to create solid relationships with industry, companies and professionals to better connect students with opportunities. I can personally attest to the talent among students whose ambitions in the language industry are an asset that is far too valuable to squander on sink or swim situations when they first enter the workforce. This truly is a situation in which everybody wins. Heading this subgroup is Golovine, a TC advisor and the driving force behind MasterWord, a world-ranked top 50 multimillion-dollar company. She serves on various boards and chairs many advisory committees and chapters.
The challenge of getting industry to acknowledge, understand and prize the skills of linguists is long overdue for priority attention. We need to help industry recognize the professionalism of qualified translators and interpreters as well as promote the translation and interpretation profession itself. I’ve often said it: we have fullstack engineers with a plethora
of skilltrees describing their many activities; where is our equivalent? Imagine the benefits if there is a broad, accepted understanding of the full range of language competences that our community possesses.
Is this simply another hyped-up initiative that makes all the right noises, but delivers little more? Are we truly addressing a real-world requirement? We emphatically believe we are and we believe our doubly inclusive approach has massive potential. Industry changes quickly and we as trainers of the new breed of professionals need to adapt rapidly to their needs. At the same time, industry experts work on the front line and have first-hand knowledge of what the real needs are for incoming talent to be able to integrate their academic skillset into enterprise requirements smoothly. Together with these experts, professors can ensure that students receive both academic and professional experience that will help them enter the job market successfully. It is important that those in top globalization positions share their knowledge with professors and academic institutions. By assisting and influencing the academic curriculum with ideas of what current needs are in large enterprises, we ensure that new talent will be appropriately trained. There is no better way to provide candidates with the correct skillset.
We now have an opulent communications tool bench at our disposal that puts the dusty lecture halls of the past into the dimming light of receding history. A lecturer can now reach a global audience thanks to streaming and other online resources. I’d like to take webinars as an excellent example that illustrates how a previously unimaginable networked audience is being reached. We use Google’s generously-donated facilities to deliver guest lectures to any number of participating institutions allowing all manner of content-sharing to keep students up to date. This may well be old hat in the business world, but in education, we’re only getting started.
But why restrict lectures to academics? We are also using the medium to bring in industry professionals to give students the benefit of cutting-edge insights before they even embark on a career. This can be critical in guiding students toward a path that interests them. The talent
brimming in our young student community is there for the shaping.
We are also creating a job-posting page encouraging enterprises to create internship opportunities for current students and new graduates. As part of the TC commitment to mentoring, Doug Porter with a small team of collaborators is creating best practices for internship programs as well as other projects that the community asks for.
TC is a volunteer-managed community. We are proud to be omni-inclusive. No linguist, no language, no language provider, no client is ever excluded. If anyone has an idea for any way in which to help promote strong, healthy communication between all of us, there is a place for that idea. We know that life for some language professionals can seem a bit lonely at times. But languages communicate, and as an Argentinian friend always tells me, it takes two to tango. What is a professor without their students and what are students without their professors? A business with global ambitions will remain mute without us. [M]
Silicon Valley | Nov 6-8, 2019
Fairmont Hotel, San Jose, California
Go Global, Be Global
Keynote Speaker: Vitaly Golomb, Global Tech Investor
Is AI everywhere in the
language services industry?
Hélène Pielmeier
Hélène Pielmeier is a highly accomplished language services industry
executive. Her specialties include project and vendor management,
quality process development and improvement, and sales strategy and
execution. As an analyst, she provides research and advisory services for
CSA Research's language service provider platform.
Artificial intelligence (AI) has become a true buzzword in the language services industry, and just about everywhere else too. From naysayers who frown upon anything related to AI to technology vendors and some language service providers (LSPs) that use AI as a marketing boosterism tool, it is really hard to avoid getting tangled in all the misleading messages you hear out there. When every tool and process is labeled AI, the term loses a lot of its meaning.
Focus
So let’s go back to the basics and explain what AI truly is, and then dive into some data to define the actual depth of AI deployments in the language services industry.

Levels of automation

Automation is about using software to define and execute a series of actions that run when instructed to do so or when a trigger or condition dictates (Figure 1).

• Rule-based systems deliver the most basic automation. That’s what you use when you process files in a rule-based machine translation (MT) engine that replaces each term with its translation and applies grammar rules to it. Similarly, it’s what you use when a translation management system (TMS) applies a preconfigured project plan for a specific client relying on a predetermined price sheet, translation memory and translator list. When the system encounters an undefined situation, such as a new customer, or new language for a client, the preconfigured workflow stops and calls for human intervention.

• Expert systems go a step beyond preloading data for each scenario. Instead, they apply complex rules and conditions to quote rates and turnaround times, choose workflows and select vendors based on project specifications. In this ideally touchless environment, human intervention occurs when the system flags a need for it. For example, the software may discover a shortage of vendor options that can handle the work in the assigned turnaround time.

• Generic artificial intelligence is a more advanced form of automation. It refers to technologies that learn from data to perform tasks that would otherwise require human intelligence, but that require instruction and feedback from humans. In these “supervised” systems, humans tell the machine how to learn from the data. For example, an AI-driven system can predict timelines based on actual translator performance for specific types of content. The software flags the odds of a translation passing a preset quality threshold based on analysis of events such as whether the linguist opened a provided glossary.

• Machine learning is a further subset of artificial intelligence in which the so-called “unsupervised” systems can learn by analyzing the data without being explicitly told how to. What happens is that you use automation to create further automation.

• Deep learning is yet another subfield that uses neural networks that are modeled on brains; for example, today’s neural networks are still most closely modeled on the visual ganglia of insect brains. This is the technology behind neural machine translation.

Figure 1: Form of Automation. Copyright 2019 CSA Research, Inc.

What constitutes AI is constantly changing. 30 years ago, rules-based systems were considered AI, but today they are not. 20 years ago, expert systems were the rage in AI, but are now considered simply software, not AI. And while statistical MT was central to AI ten years ago, some people already no longer consider it part of AI.

Automation through MT

In a survey of language service providers, CSA Research found that 51% of LSPs in our dataset already use some form of MT software, but fully 80% of these companies have tried neural machine translation, a deep-learning-driven form of machine translation. We applied our LSP Metrix capability-competency model to the use of machine translation and found that MT adoption more or less follows the natural maturity curve of LSPs.

• LSP maturity equates to a high level of MT capability. In our analysis, we discovered that 93% of the most mature LSPs are MT-capable, meaning that they have tried out the technology. Those that haven’t experimented with MT are interpreting-centric companies, where automated interpreting software for spoken language is not as market-ready as it is for written language. At the opposite end of the spectrum, when we apply our Metrix model to the least mature LSPs, we find that only about one-third of companies at Stage 0 have some MT capability in-house.

• MT capability may not mean LSPs are using it in production. Deployment at the project level remains minimal for most. We found that 57% of respondents using
[...] smaller LSPs that don’t have the ability to invest in data scientists to benefit from advanced levels of automation. [M]
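The rule-based level of automation described in this article, where a TMS applies a preconfigured project plan and stops for a human when it meets an undefined situation, can be illustrated with a minimal Python sketch. The client name, plan fields and the `NeedsHumanIntervention` exception are hypothetical and chosen only for illustration; they do not describe any particular TMS.

```python
class NeedsHumanIntervention(Exception):
    """Raised when no preconfigured rule covers the request."""


# Hypothetical preconfigured plans, keyed by (client, target language).
PROJECT_PLANS = {
    ("Acme Corp", "fr-FR"): {
        "price_sheet": "acme-2019",
        "translation_memory": "acme-en-fr.tmx",
        "translators": ["translator-a", "translator-b"],
    },
}


def plan_project(client, target_language, plans=PROJECT_PLANS):
    """Return the preconfigured plan, or stop and call for a human."""
    key = (client, target_language)
    if key not in plans:
        # Undefined situation (new customer or new language): the workflow stops.
        raise NeedsHumanIntervention(
            f"No preconfigured plan for {client} / {target_language}"
        )
    return plans[key]


if __name__ == "__main__":
    print(plan_project("Acme Corp", "fr-FR")["translators"])
    try:
        plan_project("Acme Corp", "ja-JP")
    except NeedsHumanIntervention as exc:
        print("Human intervention needed:", exc)
```

The lookup table is the whole "intelligence" here: nothing is learned from data, which is what separates this level from the supervised and unsupervised systems listed in the article.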
Giulia Tarditi is a visionary on a mission to change the way companies conceive of and execute localization. Currently
head of localization for mobile banking app Monese, she has spent the past 11 years advising venture-backed start-
ups on how to unlock the full growth-driving potential of global content. Tarditi’s interest in the relationship between
language and the brain has allowed her to develop a forward-thinking approach that saw her winning the Process
Innovation Challenge at LocWorld Portugal 2019. Having worked on everything from starting off multiple language
sites and apps as a startup to scaling to hundreds of millions of users, she is one of a handful of people who have had
the chance to do localization on this scale. To find out what makes Tarditi’s approach a game-changer for companies
that want to go global, get in touch at www.linkedin.com/in/giuliatarditi.
Intelligent terminology
Knowledge-aware terminology databases can help translators and improve AI
Although the term artificial intelligence (AI) was coined more than 60 years ago at a Dartmouth conference in 1956, it has only recently reached the radar screen of translators, interpreters or language workers. Since then, most language professionals have been struggling to assess which effects AI will have on their professional lives.

It is too early to measure the full extent of this transformation, but we know already that the impact of AI will be twofold. On the one hand, AI will revolutionize the work of language professionals by taking over some of their tasks and assisting them in many others. On the other hand, it will create new demands and service opportunities. Translators and other language specialists will be able to augment AI systems with their unique knowledge and skills. This will be particularly the case in the field of intelligent terminologies, aka onto-terminologies and knowledge-rich terminologies, which have become part of the tool landscape. Intelligent terminologies belong to the family of augmented translation technologies that are inspired or driven by artificial intelligence. They model knowledge by creating conceptual networks using relationships. These intelligent terminologies are of great benefit when it comes to discovering knowledge hidden in documents and translations.

Terminology is an integral part of the work of translators or interpreters. To a great extent the challenge of translation amounts to understanding the meaning of special terms and finding their equivalents in the target language. Terms can be ambiguous, erroneous or have no match in the target language. The reasons can be multiple: a concept does not exist in the target language (take for example the Japanese “capsule hotels,” unknown in many other cultures) or one language structures the reality differently, giving the translator several translation alternatives depending on the context. This is the case with the French verb télécharger, which can be translated either as upload or download. As a result, translators or interpreters spend much of their time researching specific terms and their equivalents. Many organizations or companies are therefore building multilingual terminologies to improve communication and support the translation process.

Today, terminology repositories managed by language service providers or language departments are concept-based, meaning they start from an abstract concept and collect for each language respectively all terms (words, abbreviations or phrases) that describe this concept. The theoretical foundation of this approach is the semiotic triangle of reference, as first published by Ogden and Richards in their book The Meaning of Meaning (1923), a book that still deserves to be read today. This general approach is shared by most terminologists and is well documented in multiple standards such as ISO 704:2009.

Over the years, various schools of thought have criticized the shortcomings of the semiotic triangle from different perspectives, pointing to aspects of cognition, intention and communication. However, they failed to deliver a pragmatic terminology model which could be used by practitioners in their daily work. The semiotic triangle indeed does not offer the flexibility to describe “soft” or variable features of a concept, such as the communication situation, the context, the objectives, the experience, the culture and so on. The semiotic triangle itself is static and represents a frozen definition of a concept. This is why terminology entries do not always reflect the specific situation in which a term is used. It is not uncommon for translators to reject a translation suggested by their terminology tool because it does not fit the context or for quality assurance technologies to report terminology errors in a translation even if the term used by the translator is actually correct (the so-called “false positives”).

Most terminology databases used today do not provide a mechanism to respond to different usage situations. The generally accepted understanding of concept-building among terminologists is that concepts have one definition, and this definition summarizes the key features of the concept. However, realistically, there is more than one way to define a concept. First of all, reality is not perceived the same way by everyone: culture, language and individual experience play an important role here. The purpose of terminology work is also important. While in many cases, terminology entries decontextualize concepts and formulate definitions that are valid for as many users as possible, others pursue specific legitimate goals when they select and define terms. As an example, you can look at a bike as a means of transportation, as a tool to improve your fitness or as a product that you sell to customers. Depending on your intention, the elements of your definition and the equivalents in other languages will vary.

Another factor that influences the use of a concept and its definition is its degree of granularity, or in other words, its degree of precision. In some languages, a concept is more finely structured than in others. An example of this is the classification of cars in the US and in the EU (compare the US “subcompact car” and the European “B-segment small cars”).

Thus, we have to consider a concept as a generalization for a wide range of possible uses for an object or an idea. This is a good starting point, but in the end what matters is how terminology can help us to understand terms in real situations. To do so, we need additional information beyond the definition.

What are the possibilities? It is of course possible to explain the
representative situations in which the term can be used in a comment field, but this kind of information is only available to humans and cannot be effectively processed by software applications. The first option is to work with attribute fields in which standard values represent typical usage contexts of a term. These can be the type of documentation (such as marketing material, user interface, legal documents), the department of a company (sales, production, development) or the intended target audience (such as government, end user, medical specialist), to mention a few. These attributes work well, but also have their limitations.

Relation graph for entry 1 ("income"), differentiating it from entry 2, the same term in English.
Relation graph for entry 1 ("Einkommen").
Relation graph for entry 2 ("income").
Relation graph for entry 2 ("Ertrag").

The good news is that there are more options. The central element of intelligent terminologies is the relations they use to connect concepts. The idea is that people do not understand terms in an isolated way, but always together with other terms. The meaning of a concept depends to a large extent on the context in which it occurs. This is a phenomenon with which machine translation algorithms are struggling because they use statistics and favor
the rule of the largest number. They select the most frequent meaning or translation that yields the highest value in an algorithm, while a less frequent word or meaning may be the best option in a specific context.

For example, try to translate the word container using this definition: “an object for holding or transporting something.” Unless you see it or have a detailed description of the situation, you have no chance of knowing exactly what a container is. It could be a box used to transport a few books, a larger box used to ship goods overseas, or a receptacle for a liquid. Depending on this, the translation will be very different. But if you see it associated with other terms (as in: “Place the container of dough on the table and pour a cup of water into the glass.”) you will probably find the right translation.

This phenomenon has been studied for almost a century by different scientists, whether they come from the cognitive sciences, linguistics, computer sciences or neurosciences. “You shall know a word by the company it keeps,” as English linguist J. R. Firth noted. Recently, neuroscience publications have explained the building of semantic networks in the brain, shaped by the cognitive experience of individuals. There is even a “semantic atlas of the brain” which is the result of research work published in 2016 by a team of UC Berkeley researchers and available online. In the field of cognitive linguistics, Charles J. Fillmore developed his theory of frame semantics in the 1970s by modeling frames as a recurring use of related terms.

Intelligent terminologies are organized as collections of concepts that are interconnected through relations. Different types of relations are used, such as hierarchical relations, part-whole relations and associative relations. The type of relations depends on the subject matter. A medical scientist will need other relations than an automotive engineer. In general, knowledge-based terminology systems will display a concept map with one concept in the core and related concepts around it (see Relation graphs). Translators or interpreters can use this information [...]

It is a challenging task to build semantic relations between concepts as this requires time and in-depth domain expertise. One way is, of course, to have subject specialists use their personal knowledge to connect concepts one by one. This approach is the best in terms of quality, because the knowledge modeled in the terminology database has been hand-picked and validated on-the-fly by specialists. However, this can be very time-consuming and requires a high investment. Companies or organizations with a large number of concepts can combine different methods to achieve the same result in a more affordable time frame and budget.

On the other hand, they can use different tools and methods of natural language processing (NLP) and AI to identify terms that are used together in the same context. Machine learning algorithms analyze word embeddings that are vector representations of words in context. With this type of information, they discover semantic relations between concepts, such as words that influence each other, have a similar meaning or behave in the same way. Co-occurrence matrices are also used in NLP to identify related words.

Four successive steps are required to build intelligent terminologies:
1. In the beginning there is a collection of terms extracted from reference documents.
2. These terms are then merged into concepts. For example, software and application are merged into the common concept of an object used to “(instruct) a computer to do specific tasks” (www.techopedia.com).
3. The concepts are enriched with additional information and metadata: a definition, an illustration, a status or usage attributes.
4. The concepts are linked together according to predefined relation categories. Hierarchical categories usually reflect some sort of classification or taxonomy.

Typical usage contexts can be modeled either with the help of attributes or of relations between the term and other concepts as is the case in situations where more than one translation is possible.

Intelligent terminologies can be used in multiple ways. Here are some examples:
• To discover and visualize knowledge hidden in documents or translations.
• To store knowledge.
• To check the correct use of terms and translations.

Intelligent terminologies are particularly useful in situations where information needs to be extracted, as is the case with a document to be translated. A document as such is only a collection of words. Before they start translating, the translators must analyze the text: identify the subject, spot the ambiguities, recognize the relations between words, understand the concepts transported by the text. This process can be time consuming, especially when dealing with large documents. However, this process can be accelerated with intelligent terminologies. Terminology entries can be regarded as building blocks of knowledge and can therefore automatically make knowledge visible in unstructured text using techniques such as annotation and highlighting. An annotation tool uses terms and metadata from the terminology database to highlight or mark up terms in the document. Relations between the concepts in the database and the term attributes make it possible to do the following:
• Visualize a context for connected relevant terms, such as income > equity > taxes (as opposed to income > sales > goods, which may require a different translation in some languages).
• Highlight terms with a special usage attribute (such as prohibited translation).
• Highlight categories of terms based on their properties (such as: type of task, text classification, UI object or dialog, part of speech).

There are several tools available on the market for text annotation. Some of them can tap intelligent terminology databases directly. An annotated text is very valuable for translators, interpreters or researchers who try to quickly identify and retrieve the most important information and information categories in a document.

Annotation can also add markup to content for further processing by diverse applications. In this way, intelligent applications such as chatbots or smart assistants can “understand” annotated content, recognize the elements with relevant information and output the required results. This can already be seen in areas such as technical support for products or in marketing and sales, when connected products such as flight, hotel and rental car are offered to the user as a package.

Similarly, semantic markup from intelligent terminologies supports authors who wish to check the content they produce. For example, they can see which concepts are linked through hierarchical relations, whole-part relations or cause-effect relations and check whether they have forgotten to mention important information in a technical manual.

Translators or quality assurance technologies connected to intelligent terminology repositories can check context-dependent translation variants and identify mistranslations in a particular context.

Intelligent terminologies are still relatively new. Existing solutions differ in the variety of relations they model and the methods they use to implement them. In addition to the lack of a common denomination for this type of terminology repository, there is currently no standard format for the exchange of data that would ensure the interoperability of intelligent terminologies. The TermBase eXchange standard cannot represent relations, and the Resource Description Framework-based vocabulary organized with the Simple Knowledge Organization System can only be used for a limited range of relations.

There is still some work to be done, but the exciting thing is that intelligent terminologies have emerged and that they are changing the paradigms of terminology work. Ontologies and terminology databases have long lived separate lives. Intelligent systems that try to model and understand natural language (natural language understanding) usually use statistical and probabilistic algorithms and ontologies that are designed to be processed primarily by software applications. On the other side, terminology databases are directed at humans. With the rise of intelligent terminologies, a new category of terminology products has arrived on the market that combines both approaches and offers challenging opportunities for translators and knowledge workers.

Big data and especially language-related big data suffers from the fact that existing algorithms and methods are not very good at processing the many minute facets of natural language. Ontologies can indeed be very efficient in this respect, but they require expert resources like knowledge engineers to build them and this can become complex and prohibitively expensive. Intelligent terminologies can fill a gap here, and create entirely new service opportunities for language specialists while helping them to perform their work as translators more efficiently. [M]
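The idea of concept entries linked by relations, and of using those relations to pick a context-dependent translation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Concept` class, the `pick_concept` and `annotate` helpers and the definitions are hypothetical, and the pairing of "Einkommen" with the equity/taxes context and "Ertrag" with the sales/goods context is assumed for the example rather than taken from the relation graphs.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One concept-based entry: definition, terms per language, typed relations."""
    concept_id: str
    definition: str
    terms: dict                                     # language code -> list of terms
    attributes: dict = field(default_factory=dict)  # usage attributes (status, audience...)
    relations: list = field(default_factory=list)   # (relation type, related English term)


# Hypothetical entries; the context-to-translation pairing is an assumption.
TERMBASE = [
    Concept("entry-1", "income in a tax and equity context (illustrative)",
            {"en": ["income"], "de": ["Einkommen"]},
            relations=[("associative", "equity"), ("associative", "taxes")]),
    Concept("entry-2", "income in a sales context (illustrative)",
            {"en": ["income"], "de": ["Ertrag"]},
            relations=[("associative", "sales"), ("associative", "goods")]),
]


def pick_concept(term, sentence, termbase=TERMBASE):
    """Return the entry for `term` whose related terms overlap most with the sentence."""
    words = {w.strip('.,;:!?"()').lower() for w in sentence.split()}
    candidates = [c for c in termbase if term in c.terms.get("en", [])]
    if not candidates:
        return None
    # Score each candidate by how many of its related terms occur in the sentence.
    return max(candidates, key=lambda c: sum(rel in words for _, rel in c.relations))


def annotate(sentence, term, target_lang="de"):
    """Mark up `term` in the sentence with the translation suggested by its context."""
    concept = pick_concept(term, sentence)
    if concept is None:
        return sentence
    suggestion = concept.terms[target_lang][0]
    return sentence.replace(term, f"[{term} -> {suggestion}]")


if __name__ == "__main__":
    print(annotate("The income from sales of goods rose.", "income"))
    print(annotate("Taxes are due on income and equity.", "income"))
```

A real system would score context with richer signals, such as the word embeddings or co-occurrence matrices mentioned in the article; simple word overlap is used here only to keep the sketch short.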
Predicting the unpredictable
The future of localization
Olga Melnikova
Olga Melnikova has been in the industry for 11 years, first as a Russian linguist and then as a project manager and localization professor. She holds an MA in translation and localization management and is currently pursuing a graduate degree in computer science at a Canadian university.
We all want to know what the future will bring so that we are better prepared for it. However, it becomes harder to reliably predict the future even a few years ahead. This is because technology is progressing faster than ever.

The advancement of technology in the past several decades has changed everyone’s lives. Many of us still remember the time when we did not have high-speed internet, or even the time when we had no internet at all. This seems unbelievable now, but it was only a couple of decades ago. Internet, social media, artificial intelligence (AI), machine learning and artificial neural networks in their modern state are all quite recent developments, and we have witnessed our lives change drastically due to these innovations.

Law of Accelerating Returns

Technology is moving so fast that its growth is not linear anymore; it is exponential, as defined by Ray Kurzweil’s Law of Accelerating Returns.

What does it mean for our industry? As for all other industries, no one knows what to expect even in the nearest future.

This is due to the nature of the exponent. There are many domains that develop in a linear fashion (biology, neuroscience, computer science, physics and so on), then they reach the point where they start empowering and enhancing each other in a way no one could predict. This process results in the emergence of new unforeseen technologies that start dominating the market.

One example is artificial neural networks that emerged at the intersection between neuroscience and computer science to later become a transdisciplinary approach now organically integrated into every industry, including localization.

According to Kurzweil, by the year 2045, this exponential growth will go beyond any control and will culminate in technological singularity. This will be a point of the utmost unpredictability where “our old models must be discarded and a new reality rules,” according to science fiction author and computer science professor Vernor Vinge. It is expected to manifest in the recursive self-improvement of AI that will reach the point where machine intelligence exceeds that of humans.

We are still far from 2045, so let us try to analyze current trends to see what to expect in the near future and what this will mean for the translation industry.

[...] in the twentieth century, which was also exponential. As of 1900, the total population of the world was about 1.6 billion people. In 2000, it reached 6 billion people. As of March 2019, it was at 7.7 billion, and counting.

There are two important factors that add to this: first, the share of educated people increased considerably. Second, scientific knowledge is now much better and much more accurate than 100 years ago.

This means that at the time when Albert Einstein published his famous article about the theory of relativity (1905), considerably fewer people were living on Earth (compared to now), and a very small number of them were educated. Last but not least, the science was very outdated, as opposed to what we have now. Since each new generation of technology improves over the previous one, the pace of progress from version to version speeds up. It means that all technological change becomes exponential. One of the main concepts describing this exponential change is Moore’s Law that, in its current form, states that the performance of computers roughly doubles every two years.

Should we then be surprised by the exponential growth of technology? Probably not, because the number of highly educated engineers, scientists and researchers who simultaneously work on new technologies is so high now that this inevitably results in the emergence of mind-blowing technologies at a dizzying pace. The companies owning this tech are trying to cut time-to-market in order to win the consumer war against their competition.

As a result, many of these new technologies never achieve the point of sufficient maturity. Once an innovation is released to the market, there is already another one waiting in the pipeline.
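To make the exponential claims on this page concrete, here is a small Python sketch of the arithmetic. It uses the doubling rule as the article states it (performance doubling every two years) and the population figures quoted in the text (1.6 billion in 1900, 6 billion in 2000). Treating population growth as a constant annual rate is a simplifying assumption made only for this illustration, and the function names are hypothetical.

```python
import math


def growth_factor(years, doubling_period_years=2.0):
    """Factor by which a quantity grows if it doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


def annual_growth_rate(start, end, years):
    """Constant yearly rate that would take `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


def doubling_time(rate):
    """Years needed to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1 + rate)


if __name__ == "__main__":
    # Doubling every two years means a 32-fold increase after ten years.
    print(growth_factor(10))

    # Population figures from the text, in billions of people.
    rate = annual_growth_rate(1.6, 6.0, 100)
    print(f"average yearly growth: {rate:.2%}")
    print(f"implied doubling time: {doubling_time(rate):.0f} years")
```

Under these assumptions the twentieth-century population figures imply an average growth of roughly 1.3% a year, that is, a doubling about every 52 years, compared with the two-year doubling the article attributes to computer performance.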
[...] be as powerful as human intelligence, super AI would surpass all human intelligence, and its limits will be unknown and barely conceivable by a human mind.

This is exactly what Ray Kurzweil’s technological singularity is about: recursive self-improvement of the strong AI that will result in the super AI being born. Machine intelligence will exceed human intelligence and go beyond our control.

Will this actually happen? No one knows. We do not even know if strong AI is possible, not to mention super AI. Will we reach singularity? Will AI reach superpower, meaning that machines will be smarter than humans? No one can tell. If you are curious, just wait until 2045; there are only 26 years left, after all.

The movie Avatar (2009) was the first digital 3D movie that was going to revolutionize the entertainment industry, which never happened.

BRAIN Initiative: One possible scenario of the future

One of the subjects receiving broad interest now is mapping neural connections of the human brain. This technology is known as connectome. It used to belong exclusively to the medical domain; however, with the advent of artificial neural networks, the brain has become the foremost inspiration for the AI field and all the industries where AI is being used.

Applied to our industry, this technology resulted in a boost of neural machine translation (NMT). Quality of NMT is now substantially higher compared to previously used statistical and rule-based MT models. However, the current state of machine learning is only using one dimension of brain functioning, specifically multilayered neural networks performing transformations of sensory inputs, exactly as our brain does when it processes information through multiple stages of transformation and representation.

One of the examples is how we build visual images: all we see is actually a raw sensory input (photons of light), but our brain then does its complex, multi-layered work that consists in building bigger shapes out of smaller elements. Photons hitting the eye retina send a signal to the visual cortex that starts processing the new information. The brain first builds small dots, it then moves to building edges, then primitive shapes, then object parts and, finally, objects.

This whole process of moving from teeny-tiny photons up to gradually more complex visual shapes is mirrored in machine learning.

If we take image recognition (one of the focal points of AI research), this is exactly how artificial neural networks operate. According to the technical report “Learning Deep Architectures for AI” by Yoshua Bengio, they transform “the raw pixel representation into gradually more abstract representations, e.g., starting from the presence of edges, the detection of more complex but local shapes, up to the identification of abstract categories associated with sub-objects and objects which are parts of the image.”

However, this is the only aspect of brain function that is used in machine learning. Overall, our brain is still a “black box” and our understanding of how it works is still very limited. Not for long, though.

On April 2, 2013, the Obama administration announced the BRAIN Initiative, aimed at full understanding of brain function. This is a medical initiative in the first place: its goal is mapping neural connectivity in the brain in order to treat “neurological and psychiatric disorders, such as Alzheimer’s disease, Parkinson’s disease, autism, epilepsy, schizophrenia, depression, and traumatic brain injury.”

The timeline of the project established in 2014 is as follows:
• 2016-2020: technology development and validation
• 2020-2025: application of these technologies in an integrated fashion to make fundamental new discoveries about the brain

This means that if all the brain mysteries get unlocked, we will be able to extrapolate the laws of brain functioning to artificial neural networks, exactly as we did with the multilayer principle.

This will result in the emergence of strong AI, which means that artificial intelligence will become much smarter and reach the point where it is as capable as the human brain.

Applied to our industry, it will mean that the quality of machine translation output will be comparable to human translation quality. It will also mean other AI-powered technologies, such as automated simultaneous interpretation, will considerably improve and reach human levels of quality.

Which puts us only a step away from singularity.

Conclusions

This was only one of many possible scenarios. As mentioned earlier, no one knows how AI will develop.

Bill Gates named Ray Kurzweil “the best person I know at predicting the future of artificial intelligence.” He is right: according to various sources, the accuracy score of Kurzweil’s predictions varies from 50% to 89%. Does it mean singularity will happen for sure?

No, it does not. Due to the nature of technological progress that we are witnessing now, no one knows what will happen even in five years, not to mention ten or 25 years from now.

We can only speculate about the future. We can assume that artificial neural networks, machine learning and automation will develop further, but no one knows how far AI will go, what form it will take and what will be the next groundbreaking technology to change the world in the same fashion as the internet, social media and big data did.

However, instead of being afraid of what the unknown future holds for us, we can assume an explorative and inquisitive approach. Isn’t it exciting to see so many disruptive technologies going viral one after another, changing our world forever? Isn’t it captivating to be witnessing the rapid growth in technological advancements that science fiction writers could only dream of a couple of decades ago? I think it is quite fascinating, no matter what the future brings, and we are all very lucky in this regard. [M]
Picture yourself 20 years ago. You are chasing your IT engineer for installer CDs, picking massive paper user guides off the shelf, and crawling under your desk to check if your software dongle is in place. To further complicate things, you’ve just updated a translation memory on your computer, so you’re asking your team of internal reviewers to stop working and go for a coffee, while you’re zipping the data and emailing it (hello, US Robotics 56k modem!) to your freelancers. And you’re hoping that all the stakeholders involved in the translation process are using the latest and greatest.

If you’ve been in this industry for as long as I have, these images will resonate with you and bring back fond memories. For others, this seems prehistoric. Either way, they illustrate how quickly technology, tools and processes evolve, along with the best practices that go with them.

What constitutes today’s best practices quickly becomes obsolete in tomorrow's world. Tools and processes are changing and evolving at an increasing speed. In software enterprises, engineering teams adapt faster than localization professionals do, and that includes enterprise localization departments. That creates a gap, which causes workarounds, conversions, delays, and potentially, show stoppers.

Engineering is leading the way

Continuous delivery is a good illustration of that evolution. It has become a best practice in the software industry in recent years and is becoming an established norm. In a continuous delivery environment, months become days, and days become minutes. Continuous delivery is all about speed and flexibility: develop, test, debug, translate, release, rinse and repeat.

How do engineering teams achieve that magic? By maintaining complex, multi-branching environments, and working in virtual teams across the globe, yet using centralized resources: one source code can live in an online repository in multiple instances without any conflict, like coexisting time capsules in an ever-changing environment. Each time capsule, or “branch,” contains a snapshot of the code base. Engineers are assigned to perform specific tasks, like fixing bugs or developing new features, only in that branch. When the branch is complete, it’s ready to be merged to the master code, or “trunk,” and be released at the engineering team’s will. This process is extremely flexible, and lets teams release software products (SaaS or mobile, or both) quickly, in a granular fashion.

Product managers love continuous delivery too, because they can meet their customers’ needs so quickly. A new feature can be live in weeks, if not days. Bugs get fixed and released seamlessly. Continuous delivery reduces risk too; by releasing new features incrementally, bugs and other defects are less likely to be left in the code base. Since quality assurance is performed thoroughly right before each release, not months after coding, engineers can more easily spot errors in their freshly-made code.

Faster time to market, better quality, granular control, continuous innovation, flexibility. On paper, there are only benefits in adopting continuous delivery. However, this well-oiled machine can grind to a halt once resource files hit the localization world. The process becomes something like a horse with only two legs: it can limp along, but not very efficiently.

Translation tools need to catch up

Indeed, continuous delivery methodology presents a challenge for the whole localization ecosystem. And the weakest link is usually found in translation tools. Many computer-aided translation (CAT) tools and translation management system (TMS) technologies are not suited to purpose; too many still operate offline and are file-centric. That means enterprise localization project managers and language service providers (LSPs) waste time on low value-added tasks like handling zips and files on local hard drives, fighting version control issues and converting files in order to bridge the gap by passing plates between systems that do not speak the same language. Online repositories and TMS/CAT tools can hardly communicate, which slows the whole process, while introducing the potential for human error. On top of this, the amount of work necessary to process a translation batch is the same, regardless of the number of words to translate: files must be extracted from the online repository and prepared in one tool. Then they are pushed to translators in another tool to be reviewed and have QA applied. Once that’s done, files have to be converted and committed back to the repository. All the while, dozens of emails have been exchanged to monitor the process.

In the end, this compulsory preparation work sometimes takes more time than the actual translation work does. By following this cumbersome process for each branch, LSPs and the software teams that rely on them lose time, money and sometimes patience. There has to be a better way. What could this new world look like?
The essence of continuous delivery is to process software piece by piece. This creates significant overhead work that should, and indeed must, be automated. That is a pain point for translators and everyone else involved. Workarounds exist but are inefficient and not a good use of everyone’s time. If engineering can catch up in this speed-oriented environment, so should localization. Another way to crack the whip on LSPs, right? Not necessarily, if they’re provided with the right tools.

Minimum charges could become a thing of the past

In light of these obvious limitations, LSPs are on the brink of a revolution too. Minimum charge is a hurdle in this fast-paced environment. What enterprise would pay $30 for five new strings per language several times a month? Who would wait a week for these strings?

Let’s look at the LSP side for a minute. Minimum charge applies for one simple reason: to pay for all those preparation, conversion and file-handling local tasks, which would no longer be necessary with tools that connect translation processes with engineering branches. Bye, bye, minimum charge model. Outdated CAT/TMS technology and the minimum charge model are showing their limits when what matters most is speed and connectivity.

Connectivity could empower LSPs to invest in tech

Along with the development of new continuous delivery-friendly CAT tools, LSPs have a unique opportunity to become more software-oriented. Most freelancers work for multiple agencies or clients. Why should they pay for multiple CAT tools? The challenge represents an opportunity for LSPs to join forces and build a SaaS platform that allows both themselves and freelancers to concentrate on their core competency: providing translation services. Using connectivity and open standards, that platform could also post-process and redeliver files to customers through APIs. If such a technology were available, LSPs would spend less time handling projects, translators would be more productive, and the whole process would be more streamlined and efficient. In a nutshell, you would see savings in time and money all the way down the production line.

Continuous delivery can work efficiently only in a highly connected environment. SaaS, open standards, centralized resources, connectivity and application programming interfaces are the keys to bring the engineering and localization worlds together into a more streamlined ecosystem.

The need for speed demands connectivity, which leads to greater efficiencies, more automation and less monitoring. The result? Faster time to market, and big reductions in costs and overhead for everyone involved. But if we want to see continuous delivery’s true capabilities realized, TMS and CAT technology needs to catch up, and catch up quick! [M]
Jeff Beatty
Jeff Beatty is the head of localization at Mozilla, the makers of the popular open source web browser Firefox. He holds an MS in multilingual computing and localization from the University of Limerick.

Staś Małolepszy
Staś Małolepszy works with hundreds of volunteers around the globe who continue to deliver top quality localization of Firefox in nearly 100 locales to over 400 million users worldwide.
One of the constant challenges in developing global software is reducing technical debt and legacy code. Ruthless prioritization takes place when it becomes clear that an organization needs to replace its legacy code with something more efficient and modern. Very often, legacy code that affects internationalization (i18n) and localization (l10n) is one of the last areas of the codebase to be prioritized in this effort. This is the situation we found ourselves in at Mozilla.

Firefox and its rendering engine, Gecko, had become bloated and filled with legacy code that needed refurbishing. Thanks to the Firefox Quantum release in 2017, this is no longer the case. However, part of that legacy codebase was i18n/l10n. In fact, prior to 2018, this part of the codebase hadn’t been altered or updated in nearly 20 years. As a result, Firefox had a number of significant i18n/l10n problems:

• Yellow Screen of Death (YSOD): users were confronted with a YSOD XML parsing error when a translated string was malformed, effectively rendering their browser useless.
• English fallback: if a string was untranslated, it would appear in English, whether the user understood English or not.
• Single-locale builds: users struggled to find Firefox in the right language due to there being over 100 different builds of Firefox to choose from for download and install.
• Source strings had global impact: monolingual developers were expected to craft source language strings in syntax that, while natural-sounding in English, affected all target language translations and produced unnatural-sounding translations.
• No pseudolocalization: as a practice, pseudolocalization was nonexistent. I18n problems were discovered manually, often post-release.
• Multiple string formats in one product: requiring developers and localizers to know how to form correct strings in both .dtd and .properties files for one single product introduced high onboarding costs and a high risk for errors (which would produce YSOD).
• Long wait time for localization updates: users had to wait for the next version of Firefox (between 6-18 weeks) before localization errors would be corrected.

While some of these challenges are unique to Mozilla, many of them plague every company out there creating global software. With almost 100 supported languages, Firefox faces many unique and common industry localization challenges. Using traditional localization solutions, these are difficult to overcome. We’ve found that software localization has been dominated by some outdated paradigms, which introduce significant problems.

1. Translations map one-to-one to the source language.
2. Users receive localization updates in the form of new executable builds.
3. User language preferences are binary, with English as the default fallback locale.

With a broad, long-term vision, we began working on Fluent, a modern localization system that not only addresses Firefox’s legacy i18n/l10n code, but also aims to overturn these paradigms for everyone who develops global software.

Problem paradigm: Translations map one-to-one to the source language

The grammar of the source language, which at Mozilla is English, imposes limits on the expressiveness of the translation. Consider the following message that appears in Firefox when the user tries to close a window with more than one tab:

    tabs-close-warning-multiple =
        You are about to close {$count} tabs.
        Are you sure you want to continue?

The message is only displayed when the tab count is two or more. In English, the word tab will always appear as plural tabs. An English-speaking developer may be content with this message. It sounds great for all possible values of $count.

In English, a single variant of the message is enough for all values of $count.

Many translators, however, will quickly point out that the word tab will take different forms depending on the exact value of the $count variable.

In traditional localization solutions, the onus of adapting this message to other languages is on developers. They need to account for the fact that other languages distinguish between more than one plural form, even if English doesn’t. As the number of languages supported in the application grows, this problem scales up quickly, and not well.

• In some languages, nouns have genders that require different forms of adjectives and past participles. In
French, connecté, connectée, connectés and connectées all mean connected.
• Style guides may require that different terms be used depending on the platform the software runs on. In English Firefox, we use Settings on Windows and Preferences on other systems, to match the wording of the user’s operating system. In Japanese, the difference is more stark: some computer-related terms are spelled with a different writing system depending on the user’s operating system.
• The context and the target audience of the application may require adjustments to the copy. In English, software used in accounting may format numbers differently than a social media website. But in other languages, such a distinction may not be necessary.

There are many grammatical and stylistic variations that don’t map one-to-one between languages. Supporting all of them using traditional localization solutions isn’t straightforward. Some language features require trade-offs in order to support them, or aren’t possible at all.

Fluent turns this localization paradigm on its head. Rather than require developers to predict all possible permutations of complexity in all supported languages, Fluent keeps the source language as simple as it can be. We call this idea asymmetric localization, and it makes it possible to cater to the grammar and style of other languages, independently of the source language.

Consider the Czech translation of the “tab close” message discussed above. The word panel (tab) must take one of two plural forms: panely for counts of 2, 3 and 4, and panelů for all other numbers.

    tabs-close-warning-multiple = {$count ->
        [few] Chystáte se zavřít {$count} panely.
              Opravdu chcete pokračovat?
       *[other] Chystáte se zavřít {$count} panelů.
              Opravdu chcete pokračovat?
    }

Fluent empowers translators to create grammatically correct translations and leverage the expressive power of their language. With Fluent, the Czech translation can now benefit from correct plural forms for all possible values of the $count variable.

In Czech, $count values of 2, 3 and 4 require a special plural form of the noun.

At the same time, no changes are required to the source code or the source copy. In fact, the logic added by the Czech translator to the Czech translation doesn’t affect any other language. The same message in French is a simple sentence, similar to the English one:

    tabs-close-warning-multiple =
        Vous êtes sur le point de fermer {$count} onglets.
        Voulez-vous vraiment continuer ?

The concept of asymmetric localization is the key innovation of Fluent, built upon 20 years of Mozilla’s history of successfully shipping localized software. Many key ideas in Fluent have also been inspired by XLIFF and ICU’s MessageFormat. Asymmetric localization doesn’t stop at plurals, however. Fluent translations can vary depending on the gender, the grammatical case, the operating system and many more variables. All of this happens in isolation; the fact that one language benefits from more advanced logic doesn’t require any other localization to apply it. Each localization is in control of how complex the translation becomes.

Problem paradigm: Users receive localization updates in the form of new executable builds

According to the traditional software localization process, a localized product is produced as a result of building static language resources into an executable file, which is then distributed to users. Any update to these language resources requires a new executable file and for the distribution chain to carry that to users. Because of this, most software companies elect to postpone localization updates from the moment they’re available to a time in which they can be bundled with other improvements to the software. While this is a cost- and effort-efficient means of producing software updates, it also treats localization, and users of localized products, as second-class citizens by prolonging the user’s exposure to broken or unintelligible localization.

With Fluent, this process can be decoupled, allowing for localization updates to ship independent of a broader release schedule. Rather than language resources being part of the software package alone, they’re delivered to users via secure API calls when they start up the software. Even better, these API calls make it possible to deliver localization updates without intervention from the user: no need to manually initiate an update or even restart the software. For web apps, the process is even more efficient: users see updates immediately, without even needing to refresh the page.
Wikidata
gets wordier
[Figure 1: Wikidata in the context of Wikimedia galaxy. Diagram labels: DBpedia (enDE, enUS): entities, facts; Wikidata: entities, facts; Wiktionary (enDE, enUS): words; Wikimedia Commons: graphics/images, videos, sounds/audio; Wikipedia (enDE, enUS): knowledge.]
Christian Lieske
Christian Lieske is involved in SAP language technologies. He has worked with the World Wide Web Consortium (W3C) and has contributed to the European Commission’s MultilingualWeb initiative and standards such as the XLIFF. He has a formal education in computer science, natural language processing and AI.

Felix Sasaki
Felix Sasaki's field of interest is the application of web technologies for representation and processing of multilingual information. He has worked for the W3C and DFKI on internationalization AI. He recently joined the German publisher Cornelsen Verlag as a content architect.
Wikidata is a nonprofit knowledge base that anyone can edit and use. Because of this, AI can be shaped to a certain degree by anyone.

Backed by the Wikimedia Foundation, a vibrant ecosystem helps Wikidata to make a mark on modern content processes. Its coverage (56 million items in April 2019), intuitive tools for end users and powerful interfaces for programmers make it a versatile tool for a large variety of usage scenarios, such as knowledge discovery, content enrichment, terminology work and translation. In autumn 2018 Wikidata enhanced its capabilities to capture information related to words, phrases and sentences in many of the world’s languages.

The galaxy

A look at the start page of Wikipedia at www.wikipedia.org and at Figure 1 shows Wikidata in the context of the Wikimedia galaxy. Often, discussions of the Wikimedia galaxy include entities that are not part of the galaxy in a strict sense. A significant example is DBpedia.

Wikidata is like Wikipedia because anyone can consume (read) or modify (write) it. The key differences from Wikipedia are that Wikidata stores information in a structured manner, while information in Wikipedia is stored mostly unstructured (the semi-structured info boxes are the exceptions to this). Additionally, there is only one Wikidata, while there are approximately 300 single-language Wikipedias.

One motivation for Wikidata is the possibility to avoid inconsistencies between single-language Wikipedias (see Figure 2). The idea is simple: entities and their properties (such as demographic data about a country) are stored only once in Wikidata. The different Wikipedia versions (for example, the German and the Chinese one) refer to this single source of truth, for example via special Wikipedia templates. If a fact in Wikidata changes, all referring Wikipedia articles automatically reflect this change.

Wikidata’s license (Creative Commons CC0), governance model and collaboration opportunities hold the promise of synergies and shared quality control. It can therefore be considered a valuable tool for anyone involved in creating and processing knowledge/content. In scenarios where any knowledge can be shared, the need to operate a private knowledge infrastructure is reduced. End user tools such as Reasonator (see Figure 3) provide easy access to Wikidata information, including multimedia content such as images.

Words in Wikidata

Since day one, Wikidata Items could be associated with labels and descriptions in any number of languages. Accordingly, several general Wikidata tools for end users are related to language. Some examples are Ask Wikidata, which provides a chatbot-like interface to questions readers want to ask, and Wikidata Translate, which uses the “power of Wikidata to translate a term between two or more languages.”
Figure 2: Inconsistencies between Wikipedia in different languages. The population listed in the English version differs from
the population listed in the German version.
Figure 3: Reasonator.
Additionally, the presence of ontological information (“subclass-of”) in Wikidata allows the generation of taxonomies and other knowledge organization tools.

Tools like Reasonator, Ask Wikidata, Wikidata Translate and Wikidata Taxonomy realize useful usage scenarios related to Wikidata. The full power of Wikidata, however, is only accessible via SPARQL. SPARQL is a family of standards for programs related to linked data and the Semantic Web, a concept that underpins Wikidata. The examples for the SPARQL query end user interface demonstrate this, and show how to work in domains such as medicine, computer science, art, history or sports. An interesting feature of the interface is the different options for visualizing results, including tables, diagrams, (for certain types of data) timelines, maps (see Figure 4) and so on.

“Wikipedia and Wikidata tools” (see http://arxiv.org/pdf/1602.02506v1) explains how to gather information for a given item. Some examples related to language, which include conceptual knowledge, are:

• Taxonomy (via SPARQL) – all subclasses of computer science http://tinyurl.com/y83uub8h
• Taxonomy (via “Wikidata Taxonomy”) – all subclasses of Knowledge Organization System http://jakobvoss.de/wikidata-taxonomy/?id=Q6423319
• Items in context – computer science and its superclasses https://tinyurl.com/y74j9an3
• Domain specific word lists – labels for diseases https://tinyurl.com/ydercoqn
• Multilingual word lists (1) – labels for diseases in English and German http://tinyurl.com/y74jj6yj
• Multilingual word lists (2) – translations of the term tuberculosis https://tinyurl.com/y7o6e9pm

Looking at the examples from the previous sections, one may wonder “Why in the world was there a need for enhanced lexicographic capabilities?” The brains behind the Wikibase Lexeme extension (the technology that incarnates the enhancement) put it like this: “The Wikibase Lexeme extension provides improved modeling for lexical entities such as words and phrases. While it would be theoretically possible to model these things using Items, a more expressive specialized model helps to reduce complexity, and improve re-use and mappings to other vocabularies.” A statement from Jorge Garcia (lead of the W3C Ontolex Working Group), made in the context of a seminar for the European Commission Directorate-General for Translation, on the topic “Linguistic Linked Open Data for Terminology,” captures the underlying modeling. In the linked data paradigm, any element of the lexicon can become what Garcia dubs a “first class citizen,” becoming the center “of a graph-based structure, which will allow for many other possible arrangements and views on the information. Linked Data has proved to be useful for language resources in general, particularly when it comes to terminologies and dictionaries.”

The 2018 Wikidata enhancement thus facilitates capturing information on words, phrases and sentences, for many languages, described in many languages. A major piece of this enhancement was the introduction of “lexeme” as a third so-called entity type (the existing ones being “item” and “property”). This entity type allows important features of lexemes (such as lemmas, forms or senses) to be captured easily based on the general entity type provisions for properties, qualifiers, references and so on (see Figure 5).

The contribution of lexeme-related information is in full swing. Nearly 45,000 lexemes exist (see Figure 6), ranging from what one could call rudimentary, to advanced (see Figure 7), to stunningly rich. A query to retrieve the “biggest” lexemes yields quite a number of lexemes with information for more than 100 features covering etymology, senses, grammatical information and more.
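
Lexeme queries of this kind can also be prepared programmatically. The following Python sketch is an illustration under stated assumptions rather than a query taken from the article: it builds a SPARQL query that counts lexemes per language and turns it into a request URL for the public Wikidata Query Service endpoint. The function names are hypothetical, and the query assumes the RDF mapping in which a lexeme is typed as ontolex:LexicalEntry and linked to its language via dct:language. The sketch only constructs the URL; sending the request is left to the HTTP client of your choice.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

def lexemes_per_language_query(limit: int = 10) -> str:
    """Build a SPARQL query counting lexemes per language, largest first."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Doubled braces are literal braces inside the f-string.
    return f"""
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?language (COUNT(?lexeme) AS ?count) WHERE {{
  ?lexeme a ontolex:LexicalEntry ;
          dct:language ?language .
}}
GROUP BY ?language
ORDER BY DESC(?count)
LIMIT {limit}
""".strip()

def build_request_url(query: str) -> str:
    """URL-encode the query and ask the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

For example, build_request_url(lexemes_per_language_query(10)) yields a URL that can be opened with any HTTP client. Keeping query construction separate from the network call makes the query text easy to inspect or paste into the query interface.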
Figure 4: SPARQL end user query interface (map with sculptures in Paris).
Figure 5: General data model (below left), and data model for lexicographic data (below right).
Figure 6: Statistics on lexicographic data in Wikidata (via Ordia).

More sample queries (especially for getting started with your own experiments that specially target lexicographic information) include a bar chart of the ten languages with the most lexemes (see Figure 8) at http://w.wiki/3TF. You might also check out the number of lexical entries in Wikidata (http://w.wiki/3Rx); the types of lexical properties in Wikidata (http://w.wiki/3Rz); example statements for a lexeme (http://w.wiki/3R$); composition of an example lexeme (http://w.wiki/3S2); information about lexical forms for an example lexeme (http://w.wiki/3S4); and grammatical features of lexical forms for an example lexeme (http://w.wiki/3S6).

Since SPARQL is a technology for the world of web services, it can be used with any of the programming languages that are used to build web services to create anything from single purpose apps, to powerful, flexible solutions. The possibilities for language-related applications drawing on Wikidata seem endless. Some ideas of what to do are:

a. Match terminology for a domain (e.g. www.agilealliance.org/agile101/agile-glossary/ or www.scrum.org/resources/scrum-glossary) against Wikidata to find existing translations, for instance.
b. Match a text against Wikidata (e.g. via https://tools.wmflabs.org/ordia/text-to-lexemes) to get grammatical information for the text’s tokens or

Where to?

Again, Wikidata has many application areas. Thus, it does not come as a surprise that a search in arXiv (a repository of preprints of scientific papers) provides some hints on machine learning and AI areas for which the use of Wikidata already is being investigated. Examples include human-bot collaboration and generating Wikipedia summaries for underserved languages.

One Wikidata area that is picking up speed is related to import and mapping. Wikidata allows users to “map” data sets (see the Wikidata “Data Import Guide” at www.wikidata.org/wiki/Wikidata:Data_Import_Guide, step 8). Among other things, this enables automatic content enrichment. As an example, Wikidata contains identifiers of the “Gemeinsame Normdatei (GND)” and links them to
[Figure 8: Bar chart of the ten languages with the most lexemes. Vertical axis: count (0 to 16K). Languages shown: English, French, Swedish, Basque, Nynorsk, German, Polish, Czech, Danish, Japanese.]
Wikipedia. The GND identifier for an author thus can be linked to his biography in Wikipedia. Discussions around this touch on terminology, and emoji, for example. Wikidata does already relate to terminological artifacts such as ISO 12620 and ISOcat and its successor DatCatInfo. An example is www.wikidata.org/wiki/Property:P2263.

Within the Unicode consortium, a discussion has been started to use the Wikidata numbering system (“QID”) to create a system of emoji encoding that lies outside core Unicode regulation (see http://twitter.com/jenny8lee/status/1123335017919336451 and www.unicode.org/L2/L2019/19082-qid-emoji.pdf).

Interested constituencies and individuals could become active in Wikidata in a number of ways. For example, they could systematically integrate data categories relevant to a certain domain into Wikidata, or adapt existing Wikidata data categories to the needs of that domain. Perhaps they might systemize the mapping between Wikidata properties and domain-specific data categories. Or perhaps they could explain the added value of mapping for a certain domain (for example, access to multimedia assets). The possibilities are numerous, and as varied as the language industry itself. [M]
IT context of
human language
translation
David Filip
David Filip is a researcher in next generation localization project and process management and an interoperability standardization expert. The underlying research was supported by Science Foundation Ireland as part of the ADAPT Centre at Trinity College Dublin. He is a current member of the MultiLingual editorial board.
Ireland has been perceived since the 1980s as a global capital of industrial translation and localization. This happened largely because global multinationals such as Oracle, Microsoft, Google and Facebook were happy to headquarter their globalization and internationalization efforts (for Europe, the Middle East and Africa if not globally) in a friendly, English-speaking EU country, a trend becoming even stronger with Brexit.

Yet there is a far wider IT context in which we need to look at human languages and translation, well beyond Ireland or the EU. There’s a rich relationship between human natural languages and reasoning automation ideas that ultimately led to the formation of computer science in the industrial era.

Human language has abstract semantics and among other things, it allows us humans to make logical inferences. Inferences, or the ability to relate thoughts and make conclusions, form the baseline for interpersonal communication. Perhaps somewhat surprisingly, it can be argued that all philosophical and logical developments that eventually led to the creation of computer science were initially founded in the systematic study of language. This happened from antiquity, through medieval scholastics, up through the modern industrial and post-industrial era, when the automation ideas of 17th century philosophers gradually became implementable.

Aristotle was the first man known to notice, in his Prior Analytics, that language-based reasoning relies on an abstract inner structure of thoughts. He founded the theory of quantifiers by recognizing statements made in general (about all members of a scope or universe) or in particular (about at least one member of a scope or universe). He noticed that pairs of input thoughts can be evaluated against any single supposed conclusion in a rule-based or automated way (evaluating the abstract thought structure independent of the actual psychological, spoken or written instances). Inferences thus structured are called syllogisms, and from the modern point of view, these cover just a small fraction of possible reasoning structures. But even these can cover a lot more when recursively chained. Additionally, the Stoics recognized the nature of logical connectives such as and, or and either/or. Again, as with Aristotle, they glimpsed the abstract structure behind human language and laid

and resulted in the foundation of modern symbolic logic in the 19th and 20th centuries. Without symbolic logic and formal logical methods, the foundation of computer science by Alan Turing and company in the 1940s is simply unthinkable. Each and every modern programming or data modeling language currently in existence uses a subset of symbolic logic notions stemming from this tradition, such as conditional (if then), biconditional (if and only if), conjunction (all cases at the same time), disjunction (at least in one of the cases), negation (not having a property), predication of properties (attributes) to objects (subjects, individuals, members of the universe). Pretty much all programming is based on testing if an object has an attribute (property) and then doing something based on the outcome. The outcome is typically binary (very often recursively nested) and systems largely differ only in how they treat the undefined cases.

Language is a baseline characteristic of a human as a social animal. Symbolic representations of languages and the unlimited exchange of abstract ideas that this facilitates are what gives us the ultimate advantage over other intelligent mammals such as dolphins or apes. The fact that language encodes interpersonal abstract thoughts underlies not only the human ability to reason, but also the human ability to translate from one human natural language to another.

As a corollary, there is a fundamental difference between a human translating and a computer translating. Even the most advanced neural machine translation (NMT) algorithms, running on the largest graphics processing unit clusters, are performing operations on language as instantiated with a specific syntax. Deep learning (DL) algorithms may glimpse functional dependencies between and among syntactic forms expressing thoughts in languages, so that they sometimes cater to semantic and pragmatic factors. But it’s chance, in the sense that we know the computer did not decode the abstract thought behind the syntax of a specific sentence, and did not consider if the intended thought might have been affected by semantic or pragmatic relationships. It simply calculated that there is a high chance a certain string of characters or sounds in a language means the same as another string of characters or sounds in another language.

The machine did not perform the leap to the semantic and pragmatic levels to reconstruct the source meaning in a certain pragmatic context in the target language. It merely used advanced and opaque statistical methods to perform complex syntactic operations on strings of
the foundation of so-called propositional calculus. characters, never ever leaving the syntactic level.
These efforts were summarized in the 17th century, Human translators of course make mistakes, but the mis-
in unpublished Leibniz works on rational calculus, takes from human translators are fundamentally different.
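The claim that a statistical system only maps strings to strings can be illustrated with a deliberately tiny sketch. It is not how NMT works internally; the corpus, the word pairs and the function names `build_table` and `translate` are all hypothetical, chosen only to show a system that picks the most frequent target string with no notion of meaning or context.

```python
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus: (source string, target string) pairs.
TOY_CORPUS = [
    ("bank", "Bank"),   # financial sense
    ("bank", "Bank"),   # financial sense
    ("bank", "Ufer"),   # river sense
    ("horse", "Pferd"),
]


def build_table(pairs):
    """Count how often each target string was paired with each source string."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate(table, source):
    """Return the most frequent target string and its relative frequency."""
    counts = table.get(source)
    if not counts:
        return None, 0.0  # unseen string: the system has nothing to go on
    target, hits = counts.most_common(1)[0]
    return target, hits / sum(counts.values())


table = build_table(TOY_CORPUS)
best, chance = translate(table, "bank")
print(best, round(chance, 2))
```

In this toy setup every occurrence of `bank` maps to the same target string, with a relative frequency of two in three, whether the sentence is about money or a river. Real systems condition on far more context, but the operation remains a calculation over strings.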
We can say that the machine produces both the correct and the incorrect translations by chance. The algorithm designer tries to make the chance of producing a correct translation as close to 100% as possible, and reduce the chance of producing a wrong translation to 0%. However, achieving these ultimate limits is impossible unless new semantic and pragmatic interferences are excluded.

Language study is indispensable in the latest technological developments. Human language is the preferred interface between humans and technology. Therefore, development of human language based inference, communication, and decision making capabilities in computers is a critical strain in current AI research.

Unfortunately, the current hype around DL and applications of neural networks in general leads to many misconceptions. One of the most dangerous misconceptions among nontechie decision makers is that, somehow, data exchange formats and interoperability standardization are becoming less important with the advent of AI. Nothing could be further from the truth. In fact, strict formalism, standardization and modular additivity of algorithms are what makes it at all possible to have transparent AI and to keep humans in the loop; to augment human capabilities by AI rather than making people involuntary slaves of some opaque machine learning (ML) driven technology.

After all, we have explained above that none of the current ML methods ever leaves the syntactic plane. So it is entirely absurd that such methods would make standardization of formats obsolete. Formats in general are defined by vocabulary (what thought is expressed by each primary component and what those primary components are) and grammar (the syntactic rules for using the vocabulary; among other things, how to construct more complex expressions from the primary components).

It is important to repeat that human language has semantics, and moreover human language is acquired by individuals in a pragmatic context. Thus, human language can be reduced neither to pure syntactic rules nor to pure semantics; human language is always overloaded with pragmatics. This is why ML techniques can never crack it for good, because no matter how deep the neural network, results of a deep learning algorithm are always functional, even if the “designer” of that system might well not be able to tell what the function is or what the function will end up being based on data that will be fed to the machine during its training. In various contexts, neural network based algorithms can have better results in sensing or decision making than a human has, and it is particularly easy to beat a human not trained in a decision-making task. This is because such a system has a bigger and faster capacity to absorb data. However, such a system doesn’t understand the data it was fed; it merely performs statistical calculations on the syntax, and both the semantic and pragmatic levels remain out of bounds for any ML system, deep learning or not. This is the ultimate rationale for the human in the loop and for using ML methods to enhance human capabilities rather than to replace them.
Perhaps the most famous case is centaur chess. After being defeated in chess by Deep Blue, Garry Kasparov took to centaur chess. No unassisted chess computer can beat a human grand master assisted by a comparable AI. Similarly, Go game theory was greatly enriched by interaction between human players and the AlphaGo computer built to beat the South Korean grandmaster Lee Sedol. Fan Hui (2nd dan), who played AlphaGo before Lee Sedol (9th dan), admitted he became a far better Go player with better strategic foresight after playing AlphaGo; indeed, his worldwide ranking jumped from 600s to 300s.

Interestingly, AlphaGo training was a great example of using a technique called generative adversarial network (GAN). This technique is used to a great advantage in making sensing systems more robust facing ever improving fake input data. To explain very simply, AlphaGo acquired its initial amateur level by playing amateur online matches, then trained to master level by playing itself a zillion times. GAN hasn’t been heavily used in machine translation (MT) so far, but first attempts have been published. The question is whether GAN can be as efficient in an open system problem such as human language translation, as opposed to a closed system with a clear win condition that Go is, or a binary sensing problem (horse or not a horse, the authorized user of this computer or not).

To conclude this arc, NMT systems are still statistical systems, albeit statistical systems that are less explainable and require much more hardware to run. Even GAN systems are still statistical, although they can make great leaps toward very low error rates with limited data. Because those systems are statistical and not capable of semantic insight, it is important to standardize the syntax of their input to increase their chance to perform beyond human par. Finally, the standardized format is not only critical at input; even the AI needs a method to store and display its results, again nothing you could address without a format definition, and of course better standardized than proprietary if you want to exchange and display (render) the information.

For instance, Internationalization Tag Set (ITS) 2.0, the W3C internationalization and localization metadata
standard MultiLingual readers know from our biennial series of Localization Standards Readers, has been listed by JTC 1 Big Data Standards Roadmap as a key enabler for automated processing of human language within Big Data and AI architectures:

“The ITS 2.0 specification enhances the foundation [XML and HTML 5] to integrate automated processing of human language into core Web technologies and concepts that are designed to foster the automated creation and processing of multilingual Web content.”

Analytic quality evaluation in translations has been a long-term concern in our industry. I applaud Multidimensional Quality Metrics (MQM) in particular because it seems that it might have finally succeeded in explaining to industry stakeholders that there is no concept of quality without specifying expectations (requirements to be fulfilled in order to serve a specific purpose). It also makes clear that even MQM can only be applied if you subset it based
Emerging technologies and the cost of video localization

You’ve just been asked by a client to dub a short video with a tight turnaround time and a limited budget. How do you tell your client that it’s not possible due to the tight deadline and budget? Or do you?

Emerging technologies are being leveraged to simplify the video localization process. They are making the whole process faster, less expensive and more efficient. Advances in AI and machine learning technologies are now making important inroads in transcription and text-to-speech (TTS) while automation in dubbing can
help expedite the post-production process as well. Transcription, TTS and dubbing are essential in video localization, yet they add to the project cost because traditionally they have been done manually. Automating these aspects of the process can and will make localizing a video much more viable than ever before.

Transcription
Manually transcribing speech into text has always been considered the best way to do audio transcriptions. It is perceived as more accurate because manual transcribers can choose to slow the playback speed of the audio or video files so they can type at their own pace; however, it also takes more time and money as human effort is involved. With the advent of AI, it is now possible for machine (automated) transcription to be a faster, reliably accurate and more economical form of audio transcription.

With automated transcription, the computer listens to and types out what's being said in the audio or video files using speech-recognition technology. Although the accuracy level is not perfect, it is still close enough to be acceptable as the turnaround time is much faster and the cost is much lower than manual transcription. The near-perfect accuracy will invariably facilitate the human editing and reviewing process.

A subfield of computer science and AI, natural language processing (NLP) aims for computers to understand, interpret and manipulate human language. Most NLP techniques rely on machine learning to derive meaning from human languages. NLP entails applying algorithms to identify and extract the natural language rules such that the unstructured language data is converted into a form that computers can understand. When the text has been provided, the computer will use algorithms to extract meaning associated with every sentence and collect the essential data from them (known as text mining or text analytics). The more data the computer collects and analyzes, the more accurate it will become.

Limitations of automated transcriptions
AI and NLP are emerging technologies and there is still much work to be done in order to create a computer that can transcribe human speech with the same accuracy as human transcribers. Obviously, there are limitations to fully accurate automated transcriptions, mainly because spoken language is full of irregularities like pauses, filler words, mispronunciations and nonstandard grammar. This makes it tricky for AI to classify it and understand its patterns.

In addition to the complications the spoken language poses to computers, other elements in the audio or video can affect the accuracy of transcription. AI can’t account for all the ambiguities when there are multiple speakers in the audio: people interrupting each other, some people speaking more loudly than others, several people speaking at the same time and so on. Computers also struggle with audio with background noises and traffic sounds. In addition, NLP research currently focuses primarily on American and British accents. Transcribing speakers with different accents using automated technology may result in inaccuracies.

Although automated transcriptions have limitations, they are becoming more and more accepted as a viable alternative to traditional manual transcriptions. To counter the accuracy issue highlighted as a limitation, combining both manual and automated transcription may be the ultimate solution for accuracy and speed. A human editor can be used as a quality assurance mechanism to ensure accuracy at the time of delivery. Machine-generated transcriptions are now preferred for projects with a tight turnaround time and budget. For extra assurance in accuracy, using both manual and automated methods is the ideal solution.

Text-to-speech
Instead of using human voice to record the voiceover (VO) speech of the video, companies are now exploring the possibility of using TTS to record the VO. The automated voice is becoming more and more natural sounding, less robotic and machine sounding.

For certain usages, like interactive voice response or training videos with no presenter on screen, using TTS is less costly and has a quicker turnaround time. There is no need to hire a professional human voice talent, book studio time and hire sound engineering services; these costs are traditionally quite prohibitive and are the main reason why video localization with dubbing is usually substituted with subtitles instead. Moreover, if changes are needed, a rerecording is not instant: we need to rebook the voice talent’s time and may need to book the studio again. If the client wants to make additional changes, extra costs will be incurred.

With TTS, the cost is minimal, and in some instances, it’s free. And it’s instant: with a click of a button, you’ll get the voice recording in minutes. If there are changes to the script and a rerecording is needed, that’s fine. Just upload the updated script and generate the voice recording again.

TTS does present a couple of challenges that require post-engineering work. Firstly, the pronunciation of a certain word or sound unit may be different between various languages. For example, in the Japanese word genba, the first syllable is pronounced with a hard g sound as in get, not a soft g sound as in gem. To make sure the
automated voice makes the correct pronunciation, a speech synthesis markup language (SSML) formatted tag can be inserted into the script submitted to the TTS engine, like a special instruction. More information can be found on the W3C’s SSML 1.1 specification page.

The second challenge TTS poses involves post-editing adjustment of the voice recording. Sometimes the recorded segment may be a bit longer or shorter and may not sync perfectly with video. Minor post-editing adjustments will then be required to tweak the TTS recording. The cost and time needed will still be less than using the traditional method.

Automated dubbing
When we think of dubbing, we think of movies dubbed in another language, meaning you can hear the audio in that target language with each character using a voice similar to the original one and their lips synced with the target language. Dubbing in movies, TV commercials and TV shows and documentaries demands a more rigid production process and much stricter quality control expectation, which makes automation not the ideal solution, at least not for now. However, automation technology in dubbing has advanced so much that it can be implemented for videos that do not require such strict and exact results, such as short promo, training, eLearning and help videos.

Automated dubbing technology will take the TTS or human voice recording and synchronize it automatically with the video. Traditionally, this post-production stage requires a sound engineer to manipulate the audio recording to match the corresponding segment in the video, a time-consuming and costly part of video localization. It means going through each segment of the recording and making it match with the video, which is not always possible. To fix this synchronization issue, you can either edit the video and audio segment by segment or ask the voice talent to match the length of the video during the recording process. Even worse, you may need to rerecord the whole recording, which means you would need to rebook the voiceover talent as well as the recording studio, and then have the sound engineer redo the synchronizing.

Automation as a solution
As you can tell, this traditional way of dubbing is too rigid and time consuming, and is therefore one area that benefits from automation. There are now proprietary technologies emerging that allow you to upload your audio recording and video files into a cloud-based platform, and in a matter of minutes, a synchronized video in another language will be available for download. This eliminates the need to manipulate the sound waves, manually cutting here and there to make the audio match the video. On top of that, these technologies also include mixing in the background music and sound effects. Automating this synchronization process would reduce the hours and dollars the manual method would usually require, resulting in lower costs and shorter turnaround times.

Technological advances are providing an alternative to the labor-intensive work and high costs of manual processes in video localization. Years ago, to have a script recorded by a machine meant a robotic sounding voice reading in a stilted monotone manner. Now, AI and NLP have made it possible to render machine-recorded speech to sound as humanly natural as possible. With these advancements in technology and innovation, dubbing a video is no longer a luxury, deemed only possible for deep pocket companies. Dubbing is now considered the ideal method for videos to market your product, to train your staff across the globe and to conduct eLearning courses. These emerging technologies will only improve and get more and more sophisticated, so it’s time to stop thinking of dubbing a video as a luxury, but instead, as a necessity. [M]
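As a concrete illustration of the SSML pronunciation tag mentioned in the text-to-speech discussion above, here is a minimal sketch that wraps one word of a script in a `phoneme` element. The `speak` and `phoneme` elements and the `alphabet` and `ph` attributes come from SSML 1.1; the helper name `build_ssml`, the sample sentence and the IPA string are assumptions made for illustration, and the IPA value is only an approximation of the pronunciation.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # SSML 1.1 namespace


def build_ssml(before, word, ipa, after, lang="en-US"):
    """Wrap one word in an SSML <phoneme> element inside a <speak> document."""
    speak = ET.Element(
        "speak",
        {"version": "1.1", "xmlns": SSML_NS, "xml:lang": lang},
    )
    speak.text = before
    phoneme = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": ipa})
    phoneme.text = word   # the written form that stays in the script
    phoneme.tail = after  # the script text that follows the tagged word
    return ET.tostring(speak, encoding="unicode")


# Hypothetical script line; the IPA string is an illustrative approximation.
ssml = build_ssml("Start the ", "genba", "ɡemba", " walk at nine.")
print(ssml)
```

Whether a given TTS engine honors the tag, and which phonetic alphabets it accepts, depends on the engine, so the output should be checked by ear before the recording is synchronized with the video.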
Three years ago in 2016, Erik Vogt (my friend and industry colleague for over 20 years) and I ran a webinar called “Past Is Prologue” where we looked at the state of technology, including localization technology, from 1996. I would describe it as an exercise in hindsight analysis.

What trends were at play then that would ultimately materialize into what would become the present? What predictions were people making then about where we’d be twenty years later, in 2016? How well did those predictions pan out? And most importantly: what insights might we glean about our future from such an exercise?

I think it’s a perfect subject for reexploration here in the technology edition of MultiLingual in 2019, as we find ourselves on the cusp of a new decade.

So, we’ll explore the technology landscape back in 1999, especially as it was applicable to the language industry. We’ll look at what the technologists and pundits of the time were predicting and see how they did. Eerily accurate? Hilariously off-target? Something else entirely? Then, through this process, we’ll try to extract learnings that we can bring along with us as we venture into the next decade.

Welcome back to the future
In 1999, the symbolic significance of the New Year couldn’t have been more potent: not only were we crossing the threshold into a new decade, but into a new century and new millennium! For years, the year 2000 had been used as shorthand to mean, “preposterously far into the future” and was the stock setting for all sorts of futuristic speculation across the optimistic/pessimistic continuum. Either we’d be colonizing outer space, or we’d be reduced to rubble from global thermonuclear war.

But after the New Year’s celebrations, there we were, alive and still on planet Earth. The year 2000, the goalpost for “the future,” was no longer science fiction or even far away. It was barreling toward us.

Part of the feeling of acceleration came from the fact that throughout the 1990s, the world was already experiencing a series of significant and transformative paradigm shifts, fueled by globalization and technological advancements. Personal computers, increasing in power but decreasing in price, went from being niche products to ubiquitous household appliances. The internet was proving capable of opening doors to radically new worlds of possibility, and of creating new business empires.

In 1999, I was earning my paycheck as a localization engineer for a language service provider (LSP), and at that time the industry was reacting to the compound demand of the world’s companies needing to be more global than ever while staying ahead of the curve of rapidly evolving technology.

The pressure to deliver more, cheaper and faster trickled down and drove process innovation and automation in localization. New tools and processes were being developed both commercially and privately to do just that.

In particular, translation memory (TM) was undergoing a significant transformation: changing from a productivity tool used by a few technologically-inclined translators to a standard process expected to be used by everyone across all projects. Cost savings from the application of TM, which were previously enjoyed by translators, were transferred to LSPs and then to localization customers. Concepts like “weighted word counts” and “fuzzy match grids” became baked into the models in which localization services would be bought and sold. One could see the specter of commoditization emerging.

Out in the world, some interesting things were happening.
• The peer-to-peer MP3 sharing service Napster was released.
• Internet Explorer was winning “the Browser Wars” and Microsoft was embroiled in a court battle with the US Department of Justice over allegations of abusing their dominance in the operating system market to create a browser monopoly.
• Apple Computer, Inc., was riding a wave of success under the new leadership of Steve Jobs who restructured the company’s product line. In 1999, Apple began releasing developer previews of their upcoming Unix-based operating system, “OS X.”
• The 802.11 wireless LAN protocol was introduced for home users for the first time under the name “WiFi.”
• Mobile computing and mobile phones were largely considered different categories, the latter being dominated by Nokia. In January 1999, Canadian pager company Research In Motion challenged this paradigm with the introduction of a device called the “BlackBerry.”
• Author and web designer Darcy DiNucci coined the term “Web 2.0” in a magazine article that describes the difference between the current state of the web and the future as “roughly the equivalence of Pong to the Matrix.”
• Jeff Bezos was Time magazine’s “Person of the Year.” Just one year prior, Amazon had purchased the company Junglee and decided to expand their model to sell more than just books.
• There were a record 486 IPOs in 1999 as the “dot com bubble” continued to expand. eToys.com was being valued at $4.9 billion on $100 million of sales, while Toys R Us was making revenues of $11.5 billion and had a real-world valuation of $4 billion.
• Analysts debated the impact that the “Y2K Bug” would have on the world as systems that were designed to
translating their collateral to work in another country needs to be replaced by the practice of designing one’s business to work everywhere in the world, regardless of language, culture or geography.

“The microprocessor revolution… is on the verge of creating a whole new generation of personal digital companions (handhelds, Auto PCs, smart cards, and others on the way) that will make the use of digital information pervasive… The Internet creates a new universal space for information sharing, collaboration, and commerce. One of the most powerful socializing aspects of the Web is its ability to connect groups of like-minded people independent of geography or time zones. If you want to… talk issues with people who share your political views or stay in touch with your ethnic group scattered all over the world, the Web makes it easy to do.”
Bill Gates (1999). Business @ the Speed of Thought. Grand Central Publishing.

There is enough predictive material in Business @ the Speed of Thought that we could have devoted this entire article just to it. It shouldn’t come as a surprise that Gates had a clear vision of a future where businesses and societies would evolve to take advantage of information technology. It was a future for which his company Microsoft was, and still is, playing a driving role.

The processes he describes are still actively transforming the business landscape today, having spun off the cottage industry of “digital transformation,” a class of professional service that helps companies recast their businesses in the Information Age.

The language industry’s success is related to its support of the global growth made possible by digital transformation. I would assert that we have been and will be most successful as a practice when we’re active participants of the digital transformation process, enabling a company not only to be digital and global, but to be “globally digital.”

Some other broad lessons from the book that I believe we should take with us into the next decade are:
• Customer service will become the primary value-added function in every business.
• The middleman must add value.
• Those who treat information as an asset will succeed.

“Keyboards are rare, although they still exist. Most interaction with computing is through gestures using hands, fingers, and facial expressions and through two-way natural-language spoken communication. People communicate with computers the same way they would communicate with a human assistant, both verbally and through visual expression. People read documents either on the hand-held displays or, more commonly, from text that is projected into the ever present virtual environment using the ubiquitous direct-eye displays. Paper books and documents are rarely used or accessed.
The vast majority of transactions include a simulated person, featuring a realistic animated personality and two-way voice communication with high-quality natural language understanding. Often, there is no human involved, as a human may have his or her automated personal assistant conduct transactions on his or her behalf with other automated personalities.
Automated driving systems have been found to be highly reliable and have now been installed in nearly all roads… There are very few transportation accidents.”
Ray Kurzweil (1999). The Age of Spiritual Machines: When Computers Exceed Human Intelligence, Penguin Books

Reading The Age of Spiritual Machines today is good fun, especially the tenth chapter, which is dedicated to predictions about 2019. While the level of maturity and ubiquity of many of the technological advancements Kurzweil describes aren’t there yet, all the big technology subjects we’re talking about today seem accounted for: self-driving cars, virtual assistants, wearable computers, quantum computing and neural networks.

Like Business @ the Speed of Thought, some predictions seem so eerily accurate that one gets the sense that there’s a form of observer effect at play; that Kurzweil’s book has had an influencing effect on the very industries that it is making predictions about. Perhaps those who read his book used it as a blueprint.

“A coalition of telecom and technology companies is pushing for an industry-wide standard to transform Web content from text into voice. The new standard would give people without computers access to online content over phone lines. If VXML is adopted as an industry standard, companies envision making news, weather, and stock quotes available over the telephone, even online sales and purchases. Phone users could also create settings for the news and financial information delivered to them, said Lucent Speech Solutions president Dan Furman.”
Joanna Glasner (March 1999). “Giving Voice to the Web,” Wired magazine.

Kurzweil has long been optimistic about the future of language technology, including machine translation,
secret foresight, but because as a society, we agreed that these futures were desirable and then worked to build them into reality.

In March 1963, Nobel Prize winning physicist Dennis Gabor wrote in his book Inventing the Future, “The future cannot be predicted, but futures can be invented.” He makes an assertion that has likely been made throughout history: we are the masters of our fate.

I have no doubt that if we really want flying cars, that will happen. If we really want to colonize Mars, we’ll do it. As for the language industry, the first step in predicting our future may be to decide what it is that we really want.

It’s in that spirit that I’d like to offer my own prediction.

By 2029, the language industry will no longer be focused on solving problems that world language diversity represents for commerce and collaboration. It won’t have to. Through compounding technological advancements, the concept of the “language barrier” will start to fade away.

Multimodal machine translation, content profiling, content enrichment, machine transcreation and other capabilities made possible through technology will be interoperable microservices everywhere communication happens in real time. The features of language technology will become taken for granted in every platform.

The world’s focus will shift. Instead of being seen as a barrier, the world’s language diversity will be recognized as an untapped human intellectual asset, the place from which compound ideas and inventions can be created through the intermarriage of culturally unique concepts, concepts previously segregated by the “language barrier.”

Imagine a Spanish-speaking research team finding a viable solution to their problem from a research paper published in Chinese. Or a Hollywood screenwriter looking to develop a compelling protagonist finding inspiration from a third-century Syriac fable.

The language industry, empowered by both machine and human intelligence, will be instrumental in making these marriages possible and in shaping the internet to make them possible automatically.

The Tower of Babel will become a glorious new Library of Alexandria.

Does this sound like a desirable future to you? [M]
buyer’s
guide
Associations
Automated Translation
Conferences
Desktop Publishing
Education
Enterprise Solutions
Localization Services
Localization Tools
Nonprofit Organizations
Terminology Management
Translation Mgmt Systems
Translation Services
Translation Tools

Associations

European Language Industry Association (Elia)
Elia is the European not-for-profit association of language service companies with a
mission to accelerate our members’ business success. We do this by creating events and
initiatives that anticipate and serve our members’ needs in building strong, sustainable
companies, thereby strengthening the wider industry. Elia was founded in 2005 and has
since established itself as the leading trade association for the language services
industry in Europe.
Elia Brussels, Belgium
Email: info@elia-association.org
Web: http://elia-association.org
Ad on page 33

Globalization and Localization Association
The Globalization and Localization Association (GALA) is a global, nonprofit trade
association for the language industry. As a membership organization, we support our
member companies and the language sector by creating communities, championing standards,
sharing knowledge and advancing technology.
Globalization and Localization Association Seattle, WA USA
+1-206-494-4686
Email: info@gala-global.org, Web: www.gala-global.org

Automated Translation

SYSTRAN Software, Inc.
For more than four decades, SYSTRAN has been the market leader in language/translation
products and solutions, covering all types of platforms from desktop to internet to
enterprise servers. To help organizations enhance multilingual communication and increase
productivity, SYSTRAN delivers real-time language solutions for internal collaboration,
search, ediscovery, content management, online customer support and ecommerce along with
automatic speech recognition and optical character recognition. SYSTRAN is the leading
choice of global companies, defense and security organizations and language service
providers. SYSTRAN is the official translation solutions provider for the S-Translator, a
default-embedded app on the Samsung Galaxy S and Note series.
Languages: 130+ language combinations
SYSTRAN Software, Inc. San Diego, CA USA
+1 858 457 1900
Email: marketing-americas@systrangroup.com
Web: www.systrangroup.com
Ad on page 38

Conferences

Game Global
Born from LocWorld’s successful Game Localization Round Table, Game Global gathers the
main stakeholders in game globalization (from design to testing) in the same place and
time to share their endeavors, successes, practices and research in a collaborative
manner. The goal of this two-day event is to help improve the gaming industry through
networking, sharing insights and learning. Game Global is steered by an advisory board of
high-level professionals from the industry. Check our website for details on upcoming and
past conferences.
Localization World, Ltd. Sandpoint, ID USA
(208) 263-8178
Email: info@gameglobal.events
Web: http://gameglobal.events
Ad on page 35

LocWorld
LocWorld conferences are dedicated to the language and localization industries. Our
constituents are the people responsible for communicating across the boundaries of
language and culture in the global marketplace. International product and marketing
managers participate in LocWorld from all sectors and all geographies to meet language
service and technology providers and to network with their peers. Hands-on practitioners
come to share their knowledge and experience and to learn from others. See our website for
details on upcoming and past conferences.
Localization World, Ltd. Sandpoint, ID USA
208-263-8178
Email: info@locworld.com, Web: https://locworld.com
Ad on page 25

Desktop Publishing

Global DTP
Global DTP s.r.o., based in the Czech Republic, offers professional multilingual desktop
publishing and media engineering solutions to the localization industry. Over the past 15
years, Global DTP has become one of the leading DTP/multimedia companies. We have been
delivering high-quality and cost-effective services for at least eight of the
Education

Quality Training in Localization & Global Marketing
The Localization Institute is the leader in educational advancement in the field of
localization: the adaptation of products and services for international markets. We
organize comprehensive, vendor-neutral conferences (LocWorld and Brand2Global), seminars
and round tables where participants gain insights that help their companies better succeed
in international business. In addition, The Institute has partnered with top universities
and professional associations to develop comprehensive certification programs in
localization project management, quality management, internationalization and global
digital marketing.
The Localization Institute Madison, WI USA
608-826-5001
Email: kris@localizationinstitute.com
Web: www.localizationinstitute.com
Ad on page 21

STAR Group
Multiple Platforms
STAR is a leader in information management, localization, internationalization and
globalization services and solutions such as GRIPS (Global Real Time Information
Processing Solution), STAR CLM (Corporate Language Management) including Transit
(Translation & Localization), TermStar/WebTerm (Terminology Management), STAR MT
(Corporate Machine Translation), STAR WebCheck (Online Translation Reviewing) and
MindReader (Authoring Assistance). With more than 50 offices in 30 countries and a global
network of prequalified freelance translators, STAR provides a unique combination of
information management tools and services required to manage all phases of the product
information life cycle.
Languages: All
STAR AG (STAR Group headquarters) Ramsen, Switzerland, +41-52-742-9200
Email: info@star-group.net, Web: www.star-group.net
STAR Group America, LLC Lyndhurst, OH USA
216-691-7827, Email: lyndhurst@star-group.net
Ad on page 16

ADAPT Localization Services
ADAPT Localization Services offers the full range of services that enable clients to be
successful in international markets, from translation into all business languages through
linguistic and technical localization services, prepress and publication management.
Serving both Fortune 500 and small companies, ADAPT has gained a reputation for quality,
reliability, technological competence and a commitment to customer service. ADAPT is
certified under ISO 17100. Fields of specialization are the medical, life sciences,
IT/telecommunications and technology sectors. With offices in Bonn, Barcelona, Copenhagen,
Stockholm and a number of certified partner companies, ADAPT is well suited to help
clients achieve their goals in any market.
Languages: More than 50
ADAPT Localization Services Bonn, Germany
49-228-98-22-60
Email: sales@adapt-localization.com
Web: www.adapt-localization.com
Ad on page 56
Ad on page 15

Localization Tools

VideoLocalize.com
Multiple Platforms
Video localization is complicated. It involves not only translation processes and graphic
engineering, but also voiceover and audio/video editing as well. The challenge is how to
keep control of the budget while meeting client expectations. VideoLocalize is the answer.
Videolocalize.com is a cloud-based online platform designed for video localization. It is
the brainchild of Boffin Language, an Asian-language service provider led by cofounder
George Zhao. VideoLocalize’s mission is to make video localization faster and more
cost-effective.
Boffin Language Group Inc. Toronto, Canada
+1 (647) 802 8223
Email: george.zhao@boffin.com
Web: www.videolocalize.com
Ad on page 12

ity powered by translators. We are a volunteer-based online community aiming to help our
language community thrive and bridge all the sectors within our industry. We facilitate
cross-functional collaboration among the diverse sectors and stakeholders within the
language industry and instigate transparency, trust and free knowledge. Our mission is to
offer free access to tools and all other available resources, to facilitate
community-driven projects, to empower linguists and to share educational and language
assets.
Translation Commons Las Vegas, NV USA
(310) 405-4991
Email: krista@translationcommons.org
Web: www.translationcommons.org

Translators without Borders
Originally founded in 1993 in France as Traducteurs sans Frontières by Lori Thicke and Ros
Smith-Thomas to link the world's translators to vetted NGOs that focus on health and
education, Translators without Borders (TWB) is a US nonprofit organization that aims to
close the language gaps that hinder critical humanitarian efforts

Translation Management Systems

Consoltec
Multiple Platforms
Consoltec offers FlowFit-TMS, a web-based translation management system that helps you
simplify and optimize your projects, while reducing your administrative costs. FlowFit can
also be used for many other project types. FlowFit provides fully customizable web portals
for clients, providers and project management. Get an accurate overview of your teams’
workload in real time and select the best available providers. Manage your clients,
contacts and internal/external providers effectively with the new CRM features. Use
Timesheet to track the time spent on projects and tasks. Connect seamlessly to your
favorite CAT tools (memoQ, SDL Studio, LogiTerm) and get comprehensive reports that
provide enhanced insight on production, productivity, costs and translation memory
efficiency.
Consoltec Montreal, Québec, Canada
(+1) 514 312-2485
Email: info@consoltec.ca, Web: www.consoltec.ca
Localize
Localize offers a full-featured, cloud-based content and translation management system
that features advanced translation workflows, allowing content managers and translators to
propose, review and publish translations with ease. For companies without in-house
translators, we provide access to high-quality, on-demand translations through our network
of professional translators. Our easy-to-install plugin fits neatly into your existing
technology stack. The technology powering the Localize Platform was built from the ground
up to minimize the need for engineers in the localization process. This reduces costs by
enabling nontechnical personnel to manage the localization workflow. Getting started is
easy. Start your free trial today!
Languages: All
Localize San Francisco, CA USA
(415) 651-7030
Email: sales@localizejs.com, Web: https://localizejs.com
Ad on page 11

Memsource
Memsource is a leading cloud-based translation management system that enables global
companies, translation agencies and translators to collaborate in one secure, online
location. Internationally recognized for providing an easy-to-use, yet powerful CAT tool
combined with a TMS, Memsource processes two billion words per month from over 200,000
users around the world. Manage your translation projects in real time in an intelligent
platform that accepts over 50 file types and offers REST API, out-of-the-box CMS
connectors and powerful workflow automation to save time and money. Join localization
professionals from around the world who rely on Memsource to streamline their translation
process. To start your free 30-day trial, visit www.memsource.com.
Languages: All
Memsource Prague, Czech Republic
+420 221 490 441
Email: info@memsource.com, Web: www.memsource.com
Ad on page 58

Across Systems GmbH
With its smart software solutions, Across Systems assists enterprises and translators
worldwide in successfully processing their translation projects. Customers from diverse
industries use the Across Language Server and the Across Translator Edition to tackle
their daily localization challenges. The use of the Across translation management system
enables the implementation of transparent translation processes with a high degree of
automation and maximum information security. All who are involved in the project can be
integrated in the overall process and work on the basis of the same data. This saves time
for what matters: the creation of high-quality content in multiple languages.
Languages: All
Across Systems GmbH Karlsbad, Germany
+49 (0) 7248 925 425
Email: info@across.net, Web: www.across.net

Plunet BusinessManager
Multiple Platforms
Plunet develops and markets the business and workflow management software Plunet
BusinessManager, one of the world’s leading management solutions for the translation and
localization industry. Plunet BusinessManager provides a high degree of automation and
flexibility for professional language service providers and translation departments. Using
a web-based platform, Plunet integrates translation software, financial accounting and
quality management systems. Various functions and extensions of Plunet BusinessManager can
be adapted to individual needs within a configurable system. Basic functions include
quote, order and invoice management, comprehensive financial reports, flexible job and
workflow management as well as deadline, document and customer relationship management.
Plunet GmbH Berlin, Germany
+49 (0)30-322-971-340
Email: info@plunet.com, Web: www.plunet.com
Ad on page 53

Smartcat
At Smartcat we believe the translation industry should be better for everyone. We connect
linguists, companies and agencies to streamline the translation of any content into every
language on demand. Our platform helps you build and manage translation teams, and puts
your translation process on autopilot from content creation to payments. The unique
features of Smartcat are our marketplace, where you can find translators for any language
with one click; our CAT tool, for translation using an AI-assisted platform; team
management with full control of your team, suppliers and content; and payment automation,
which lets you pay vendors easily across the globe. You can start experiencing the next
generation of translation technologies and boost your translation business efficiency from
day one.
Languages: All
Smartcat Cambridge, MA USA
Email: support@smartcat.ai, Web: www.smartcat.ai
Ad on page 9

Smartling
Smartling Translation Cloud is the leading translation management platform and language
services provider to localize content across devices and platforms. Smartling’s
data-driven approach and visual context capabilities uniquely position brands for
efficiency. Seamlessly connect your CMS, code repository, and marketing automation tools
to Smartling’s TMS via prebuilt integrations, web proxy, or REST APIs. No matter the
content type, Smartling automation tools help you do more with less. Smartling is the
platform of choice for B2B and B2C brands, including InterContinental Hotels Group, GoPro,
Shopify, Slack, and SurveyMonkey. The company is headquartered in New York, with offices
in Dublin and London. For more information, please visit Smartling.com.
Smartling New York, NY USA
1-866-707-6278
Email: hi@smartling.com, Web: www.smartling.com
Ad on page 2

Wordbee Translator
Web-based
Wordbee is the leading choice for enterprises and language service providers that need to
save money and make their company run more efficiently. Wordbee has the most complete
feature set of any cloud solution: project management, portal, business analytics,
reporting, invoicing and a user-friendly translation editor. Tasks such as project and
workflow setup, job assignment, deadline calculation, multiple phase kick-offs and cost
management can all be automated in the collaborative translation platform. Also, the
Beebox connects CMSs, DMSs or any proprietary database source with the TMS of the
translation vendor or internal translation team.
Languages: All
Wordbee Soleuvre, Luxembourg
+352 2877 1204
Email: info@wordbee.com
Web: www.wordbee.com
XTM: Better Translation Technology
Multiple Platforms
XTM is a fully featured online CAT tool and translation management system available as a
pay-as-you-go SaaS or for installation on your server. Built for collaboration and ease of
use, XTM provides a complete, secure and scalable translation solution. Implementation of
XTM Cloud is quick and easy, with no installation, hardware costs or maintenance required.
Rapidly create new projects from all common file types using the templates provided and
allocate your resources to the automated workflow. XTM enables you to share linguistic
assets in real time between translators. Discover XTM today. Sign up for a free 30-day
trial at www.xtm-intl.com/trial.
Languages: All Unicode languages
XTM International Gerrards Cross, United Kingdom
+44-1753-480-469
Email: sales@xtm-intl.com, Web: https://xtm.cloud
Ad on page 13

Translation Services

birotranslations
Founded in 1992, birotranslations specializes in life science, legal, technical, IT and
automotive translations into all East European languages (Albanian, Bosnian, Bulgarian,
Croatian, Czech, Estonian, Hungarian, Latvian, Lithuanian, Macedonian, Polish, Romanian,
Russian, Serbian, Slovak, Slovenian, Ukrainian). We have a long-term partnership with the
world's top 100 MLVs and many end-clients all around the globe. With our experienced
project managers, extensive network of expert linguists and usage of the latest CAT tool
technology, your projects will be delivered on time, within budget and with the highest
standards of quality. For more information, please contact Mr. Matic Berginc (details
below).
Languages: Eastern European languages
birotranslations Ljubljana, Slovenia
+386 590 43 557
Email: projects@birotranslations.com
Web: www.birotranslations.com
Ads on pages 17, 57

Your Partner in Asia and Beyond!
With our headquarters in Korea, our production offices in Vietnam and China, and our sales
office in the US, we are in an excellent position to be your Asian language localization
partner. For localizing projects from English or German into Asian languages, such as
Korean, Japanese, Chinese, Vietnamese, Thai, Indonesian and Burmese, you can trust our
professional translation services for IT, software, marketing/transcreation and technical
projects. Since our establishment in 1990, we have been at the forefront of the
localization industry as one of the Asia Top Ten and the No. 1 LSP in Korea (by CSA
Research). ISO17100 certified since 2014.
Languages: More than 54 languages including Korean, Chinese, Japanese, Vietnamese, Thai,
Indonesian.
HansemEUG, Inc.
Gyeonggi-do, South Korea
+82-31-226-5042
Email: info@ezuserguide.com
Web: http://hansemeug.com/en
Ad on page 61
Translation Tools
Educators:
We want to help you help your students
Help further your students’ understanding of the intersection of language,
technology and culture via articles written by experts around the world. Provide
them with a digital subscription to MultiLingual with our compliments.
Advertiser Index
Column
Takeaway
Exploring the best
startup hubs
Sean Hopwood
Sean Hopwood is the president and CEO of Day Translations, Inc.
He has a deep love for languages, soccer and new technologies.
He spends whatever time his busy schedule leaves him writing
about business management.
The world’s best startups are not concentrated in one or a few countries or continents.
In the internet age, successful businesses can spring up from anywhere, provided that they
get the needed funding, government support and talented managers and employees. This
provides a ripe opportunity for the localization industry.

Acknowledging the importance of startups in growing and sustaining economies, governments
worldwide have been implementing initiatives that foster startup growth. One of the best
government initiatives is the establishment of startup hubs. Those seeking new places to
establish new businesses may be highly interested in finding these hubs.

The fundamental goal of a startup hub is to provide the ideal conditions for starting and
growing a business. It is a place, usually a city, whose government offers a variety of
perks and support to help new businesses flourish, and which has the right infrastructure
and people to foster business success. A startup hub becomes fertile ground for the
development of new products, services, technologies, systems and workers. Its ultimate
goal is to create economic activity that increases wealth and uplifts everyone.

The following are some of the best startup hubs in the world (in no particular order).

Singapore: This city-state may be small and limited in natural resources, but it is lucky
to have visionary leaders and people who have embraced the goal of achieving progress.
Singapore is notable for its modern infrastructure and highly educated people. When it
realized it had limited human resources, the city did not shy away from bringing in people
from other countries.

Singapore topped Nestpick’s 2017 index for the best startup cities. It may not be the best
when it comes to salary and startup ecosystem, but it excels in terms of quality of life
and cost of living. Other cities offer bigger salaries, but the higher pay is easily
offset by the taxes and living expenses. Singapore has a proficient workforce, especially
in the tech sector. The country doesn’t rely only on local talent but also attracts
skilled labor from neighboring countries to fill vacancies in newly started companies.

More than 25% of the world’s unicorn startups (companies with a current valuation of at
least $1 billion) started in San Francisco. The area in California fondly referred to as
Silicon Valley remains a relevant place for startups. It may no longer dominate rankings
for the best startup hubs, but it continues to serve as a preferred location for setting
up business.

San Francisco: Silicon Valley and other parts of San Francisco are excellent places for
business because of the relatively low tax, suitable infrastructure, the favorable venture
capital ecosystem, proximity to academic institutions and the overall mindset that
promotes risk taking, critical thinking and cooperation, to some extent. It helps a
business succeed when it is surrounded by other businesses that succeeded as they took
risks, pursued
novel ideas and accepted failure as part of the process.

Tel Aviv: Israel’s financial center may be geographically small, but it has a global
relevance that matches or even outperforms other major cities, especially when it comes to
the tech industry. Tel Aviv has one of the highest concentrations of startups in the
world. It has more than 2,500 startups with a city population of less than half a million.

What sets Tel Aviv apart from other startup hubs is its unique culture and sociopolitical
situation. Gal Kalkshtein, the investor-entrepreneur responsible for the Startup Lobby in
Israel’s Parliament, noted that if there are political and security issues in Israel, they
don’t affect the Israeli market. Instead, they help positively shape the investment
culture in the city as Israeli entrepreneurs pursue business opportunities in the IT field
in accordance with the need for powerful intelligence and technology useful in dealing
with conflicts.

Moreover, immigrants in Israel help create a mindset of risk-taking and the desire to do
business. An overwhelming majority of Jewish Israelis are immigrants or immigrants’
children, grandchildren or great-grandchildren.

Berlin: The capital city of the largest economy in the European Union has a vibrant
startup scene. Since the fall of the Berlin Wall, the city has transformed into a viable
location for eager startups. One of the original tech hubs of Europe, it is the birthplace
of a number of well-known companies such as Siemens, Neutron Games, NVIDIA Advanced
Rendering Center and Native Instruments. The online music-sharing site SoundCloud was
established in Berlin.

Berlin has relatively low office rent, which naturally drives many new companies to
operate in the city. Rent for living spaces is also not that expensive, which helps
attract workers and talent to move in. It is projected that by 2020, Berlin’s startup
scene will generate 100,000 new jobs. At present, Germany has the highest percentage of
foreign startups in the European Union.

Stockholm: Skype, Klarna, Spotify and Minecraft were all founded in Stockholm. The capital
city of Sweden is not only known for being the location of the Nobel Prize. It also has
one of the best ecosystems for startups in the world. Its unicorn startup density (ratio
of the number of startups with billion-dollar valuations over the city population) is the
second highest in the world.

Helsinki: Another city that demonstrates the growth of Europe as a startup hub is the
capital and largest city of Finland. Government support for new businesses is a major
factor for this, but it also bears noting that the city is home to risk-taking investors,
talented entrepreneurs, key influencers and accelerators. Helsinki is the origin of
influential brands such as Supercell, Nokia, Rovio, Linux and Clash of Clans.

Bengaluru: India also has a startup hub that can rival those of other countries with
advanced economies. Bengaluru or Bangalore, an industrial city in south central India,
boasts a tradition of tech innovations. It was already regarded as a tech capital even
before the rise of the popular web-based companies known at present. The city provides
excellent support for entrepreneurs. Cultural and practical challenges abound, but the
city can provide great opportunities for innovative ideas.

New York: It’s an expensive city, but it is one that genuinely exemplifies what being a
global city means. It has the prestige, access to money and much of the best talent that
can help drive startup success. The city is a place of opportunities with a good startup
assistance program (START UP NY).

Founding a successful venture

With the right attitude and environment, a highly successful venture can disregard
location and transcend the language barrier. If there’s a language disconnect, it
shouldn’t be difficult to find a good translation provider. Geographical challenges in
communication and operations are addressed by new technologies. The marketing of products
and services to international markets has been made easier with the growing acceptance of
globalization by most countries. Still, as CSA Research concluded in its “Can’t Read,
Won’t Buy” study, an overwhelming majority of customers (75% across the ten countries
where they conducted polls) strongly prefer products advertised in their own language.

So, reaching out to a cross-cultural consultant, dedicated linguists and local marketing
experts is key to marketing effectively at an international level. Globalization involves
sharing some common codes between cultures, but it doesn’t completely erase their
differences. Nowadays, the entrepreneur and the translator are natural allies.

Startup hubs established or facilitated by governments and business groups lend excellent
support to new businesses. However, ultimately, success is still dependent on the
entrepreneurs themselves. It is they who decide to carefully study and plan their actions
to achieve success, or rise as they fail and rise again as they encounter more setbacks. [M]