University of Tartu developing translation programme with Mozilla Firefox


Recently, news of Mozilla's new translation programme Bergamot has spread through international technology news portals, but few readers may know that the team also includes language technologists from the University of Tartu.

ERR's science portal Novaator spoke with the head of the Tartu team, Mark Fišel, professor of natural language processing at the Institute of Computer Science, about the collaboration and the work going on behind the scenes.

The project also involves Charles University in Prague and the universities of Sheffield and Edinburgh in the UK.

Mark Fišel, what is this project about?

It all began with language technologists from four universities wanting to do a European Commission-funded research project together on machine translation. One idea was to fit machine translation into a web browser. Thanks to a contact person at the University of Edinburgh, we asked Mozilla to be our partner, and in January 2019 the project kicked off. This is a research project, which means that most of our activity is exploratory: we are studying how the best existing machine translation methods could be altered to make them even better.

What exactly is machine translation?

The principle of machine translation is easy to explain: a machine or a computer must automatically translate a text from one language into another. It is one of the oldest language processing tasks and has been actively studied since the beginning of the 1950s. Despite this long history, ideal machine translation is yet to be developed. In practice, however, its quality is good enough to be useful. Machine-translated text is mainly used in post-editing, where the automatically translated text is manually corrected. For many topics, the average time needed for post-editing is less than what is required for translating from scratch.

What needs to be done to make the quality of machine translation better? What does your daily work entail?

Our main role in this project is to make machine translation engines flexible and adaptable to the content and style of the text. For example, in a text about nature, the machine should translate the Estonian word aas as 'meadow', but in a text on knitting, aas should be translated as 'loop'. Similarly, when the English source text is formal, the Estonian translation should use the form 'teie' (formal), not 'sina' (informal). In the end, the programme should be able to make these decisions automatically.
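To make the 'aas' example concrete, here is a toy sketch of context-dependent word choice. It is not the Bergamot method: real engines learn such behaviour from data, whereas this sketch uses hand-written keyword lists. The names DOMAIN_KEYWORDS, SENSES, detect_domain and translate_word, and the keyword lists themselves, are assumptions made up for illustration.

```python
# Toy illustration only: hand-written keyword lists stand in for what a
# real translation engine would learn from data. All names are hypothetical.
DOMAIN_KEYWORDS = {
    "nature": {"mets", "lill", "jõgi"},        # forest, flower, river
    "knitting": {"lõng", "vardad", "kuduma"},  # yarn, needles, to knit
}

# Possible English translations of an ambiguous Estonian word, per domain.
SENSES = {"aas": {"nature": "meadow", "knitting": "loop"}}


def detect_domain(context: str) -> str:
    """Pick the domain whose keywords overlap most with the context."""
    words = set(context.lower().split())
    scores = {d: len(words & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    return max(scores, key=scores.get)


def translate_word(word: str, context: str) -> str:
    """Translate an ambiguous word using the domain of its context."""
    return SENSES[word][detect_domain(context)]
```

Calling translate_word("aas", "lõng ja vardad") selects the knitting sense, while a context containing "mets" or "lill" selects the nature sense.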

We are also participating in other stages of the project: for example, we are working on the automatic estimation of translation quality. Its purpose is to decide, after the translation has been generated, whether it was successful or not. This is necessary to warn the user of a low-quality translation.
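As a rough illustration of how such a warning could be triggered, the sketch below uses a simple proxy for quality: the geometric mean of the probabilities the engine assigned to the words it produced. This is an assumption chosen for illustration, not the estimator developed in the project; the function names and the 0.5 threshold are likewise hypothetical.

```python
import math


def confidence_from_logprobs(token_logprobs: list[float]) -> float:
    """Geometric mean of token probabilities, a crude confidence proxy."""
    if not token_logprobs:
        return 0.0  # nothing was generated, so report no confidence
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def should_warn(token_logprobs: list[float], threshold: float = 0.5) -> bool:
    """Return True when the translation looks unreliable.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return confidence_from_logprobs(token_logprobs) < threshold
```

A translation whose words were each produced with probability 0.9 would pass, while one produced with probabilities around 0.2 to 0.3 would trigger the warning.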

What will the final product be if everything goes according to plan?

A large proportion of the project is research and experiments, but a working prototype will also be made. At the moment we plan to make the new technology available in the Firefox browser.

What is the main difference between this and Google's current automatic translation?

The main difference from Google's automatic translation and its machine translation plugin for Chrome is that Google Translate is cloud-based, which means that all text input is sent to Google's servers for translation. Bergamot machine translation will run on the user's own computer and not in the cloud, which ensures the privacy of the texts.

The second difference is that existing translation engines, including Google's and the University of Tartu's, translate single sentences without looking at the context. The contribution by the University of Tartu scientists should ensure that the translation engine adapts to the context and style of the entire web page and takes additional information into account to improve translation quality.

What is the 'shift to client-side translation' that has received a lot of attention in the English-language media?

Our partners at Charles University in Prague are working on so-called client-side translation. The idea is to make it possible to improve translation quality for users who are not fluent in the target language. The purpose of the machine translation system in this case would be to identify parts of the input that are either too complicated or too ambiguous to be translated successfully, and to ask the user to rephrase them.

In conclusion, the researchers at the University of Tartu Institute of Computer Science are working on applications that most readers of this article probably use regularly. It is important to note that all the results of this research, when finished, will be freely released under permissive licences. The project involves translation from English into Estonian, Polish, Czech, German, French and Spanish, and vice versa.

The translation of this article from Estonian Public Broadcasting's science news portal Novaator was funded by the European Regional Development Fund through the Estonian Research Council.


Editor: Helen Wright
