OMEGAT: INTRODUCTION AND TUTORIAL
by Samuel Murray-Smit
OmegaT is a free-to-use-and-distribute, open-source translation memory CAT tool that runs on Windows, Linux and Mac OS.
Disclaimer: This article is not meant as an anti-Microsoft article. The author is a satisfied user of Microsoft Word and Excel, and a happy user of Wordfast.
Section 1. Introducing OmegaT
Translators using Microsoft products are often not aware of two things. The first is that quite a few fellow translators and clients do not use Microsoft Word. The second is that even the same version of Microsoft Word installed on two computers may not be 100% mutually compatible. Because they are unaware of these two facts, many translators are fearful of switching to non-Microsoft products.
There are usually other reasons for sticking to Microsoft Word regardless. One such reason is that translators feel many translation related programs are not available in non-Microsoft Word formats. Take Trados, for example - there is no Trados for WordPerfect. Or take any of the three most well-known Afrikaans spell-checkers - all work only on Microsoft Word.
One serious alternative to Microsoft Word is OpenOffice Writer. The fact that neither Trados nor the more affordable Wordfast can be used with OpenOffice Writer used to be a serious stumbling block for translators accustomed to computer assisted translation (CAT) technologies with translation memory, fuzzy matching and active glossary look-up.
This is where OmegaT comes in. OmegaT is a freeware translation memory program that doesn't run on Microsoft Word. It seamlessly imports and exports plaintext, OpenOffice Writer documents and HTML.
OmegaT is a no-nonsense tool that increases productivity and consistency without taking creativity out of a translator's hands. The only requirement for using OmegaT is a reasonably fast computer with a Java Runtime Environment installed on it.
At the time of writing, the most recent versions of the above-mentioned software are Java 1.5, OpenOffice 1.1.4 and OmegaT 1.4.5. For simplicity's sake I assume the reader has Windows, but these programs are also available for Linux and the new Mac OS X. OmegaT needs Java to work. OpenOffice is optional, but it is useful to have.
Java 1.5
Programs written in Java have one small drawback - the user needs to have Java installed on his computer to use them. The advantage of Java is that a single program can be used on many different types of computers as long as those computers have their version of Java installed. For this reason OmegaT is just as accessible to Linux or Mac users as it is to Windows users.
You can download the latest version of Java at http://java.com/. The recommended version for OmegaT 1.4.5 is Java 1.5.
Java based programs tend to use a little more processing power than ready-compiled programs written in other popular programming languages, which is why using a fast computer is preferable.
Some web browsers such as Mozilla or Internet Explorer have a Java Runtime Environment (also called a virtual machine) embedded in the browser. Unfortunately these Java installations are often capable of running web applications only, and they can't be used for fully fledged programs such as OmegaT.
OpenOffice 1.1.4
OpenOffice is a whole suite of office programs. The bundle includes a word processor (Writer), a spreadsheet (Calc), a presentations editor (Impress), and a graphics editor (Draw). The new version of OpenOffice will also contain a database editor (Base).
Most Microsoft Word and Excel documents can be opened, edited and saved in OpenOffice Writer and Calc. Some features present in Microsoft Office are not included in OpenOffice, and vice versa. OpenOffice will usually prompt you to save in OpenOffice format, but there's always the option to save documents in a Microsoft format.
You can download the latest stable version of OpenOffice at http://www.openoffice.org/. If you have a recent version of Java installed on your computer, the OpenOffice installer will enable extra features in the office suite, but Java is not required for OpenOffice to work.
Although OmegaT can import and export OpenOffice Writer files, OpenOffice itself is not required for OmegaT to work. In fact, OmegaT will work on OpenOffice files even if OpenOffice is not installed on the same computer. But OmegaT cannot convert OpenOffice files to Microsoft format and vice versa - for that you need OpenOffice itself.
OpenOffice has its own macro language, but it cannot import Microsoft's VBA based macro language used in Word and Excel. For this reason macro based programs such as Wordfast, Wordfisher and Ando cannot run on OpenOffice. Similarly, a macro written for OpenOffice will not run in Microsoft Word or Excel.
While macros can add functionality to documents, they are often not explicitly required for viewing the document. Translators using OpenOffice may wish to tell clients not to send them documents that depend on macros (although in most cases, OpenOffice will keep the Microsoft Word macro intact even if it can't execute it).
Free spell-checkers and thesauri for many languages are linked to from the OpenOffice web site itself.
OmegaT 1.4.5
The biggest advantage of OmegaT for translators is consistency. OmegaT generally also speeds up the translation process.
OmegaT automatically finds similar or exact phrases in a translation project which might span several files in different formats. This enables the translator to check the paragraph he is translating at that moment with similar sentences elsewhere in the project.
If those similar sentences have been translated already, OmegaT tells the translator how they have been translated. One could say the translator is consulting himself, and need no longer struggle with the same or similar difficult paragraphs all over again.
The latest version of OmegaT can be downloaded at http://www.omegat.org/omegat/omegat.html. [Note by SiteFounder: The latest version is available at the SourceForge Project Website] You can put questions to fellow OmegaT users at http://groups.yahoo.com/group/OmegaT/.
Although OmegaT aids and automates some of the translation processes, it is not simply a case of importing the source file, pressing a few buttons, and exporting the target file. The translator himself does the translating, and before he can start, he must set up a few things manually.
Briefly, OmegaT first creates a set of project folders. The user then copies the source documents into the source text folder. The user translates those documents using OmegaT. When the user is finished, he uses OmegaT to compile the translation into the final product.
Very brief introduction to CAT
[CAT = computer assisted translation]
Before the translation process begins, OmegaT creates a translation memory based on the individual paragraphs contained in all the source text documents. These paragraphs are called segments in CAT-speak. OmegaT also checks to see if it can find segments that are either moderately similar or almost exactly alike. Moderately similar segments are called fuzzy matches and segments that are more than 99% alike are called exact matches.
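The difference between fuzzy and exact matches can be illustrated with a few lines of Python. This is only a sketch: it scores similarity with the standard library's difflib, which is an assumption and not necessarily how OmegaT compares segments, and the 60% lower bound for a fuzzy match is a made-up value. Only the 99% boundary comes from the description above.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.60  # hypothetical lower bound for a "fuzzy" match
EXACT_THRESHOLD = 0.99  # "more than 99% alike" counts as exact


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two segments."""
    return SequenceMatcher(None, a, b).ratio()


def classify(a, b):
    """Label a pair of segments as 'exact', 'fuzzy' or 'none'."""
    score = similarity(a, b)
    if score > EXACT_THRESHOLD:
        return "exact"
    if score >= FUZZY_THRESHOLD:
        return "fuzzy"
    return "none"


segments = [
    "Click OK to save the file.",
    "Click OK to save the file.",
    "Click Cancel to close the file.",
    "The weather is fine today.",
]

# Compare the first segment with every other segment in the project.
for other in segments[1:]:
    print(classify(segments[0], other), "-", other)
```

A real CAT tool would run this kind of comparison against every segment in the translation memory and show the best candidates to the translator.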
When the user moves his cursor to any segment, OmegaT checks the translation memory for fuzzy or exact matches that have been translated already, and alerts the user to their existence using colour codes. The user can opt to re-use or partially re-use the existing translation.
This is extremely helpful in texts with a lot of repetitive or near-repetitive segments, but it usually speeds up the translation process even in texts that are not repetitive. The main advantage is better overall consistency within the translation project.
Unlike many other CAT tools that segment by sentence or phrase, OmegaT segments by paragraph. There are, however, macros and scripts available to convert real sentences to virtual paragraphs, which enable OmegaT to create smaller segments.
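One way such a script might work is to put every sentence on a line of its own before the file is copied to the source folder, so that each sentence becomes a paragraph in its own right. The Python sketch below is an assumption about the approach, not one of the existing scripts. It uses a naive rule (split at whitespace that follows a full stop, exclamation mark or question mark), which will wrongly split abbreviations such as "e.g.".

```python
import re

# Naive sentence boundary: whitespace preceded by ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def to_virtual_paragraphs(text):
    """Return the text with each sentence on its own line."""
    lines = []
    for paragraph in text.splitlines():
        paragraph = paragraph.strip()
        if not paragraph:
            continue  # skip empty lines between paragraphs
        lines.extend(SENTENCE_END.split(paragraph))
    return "\n".join(lines)


sample = "OmegaT needs Java. OpenOffice is optional! Is it useful? Yes."
print(to_virtual_paragraphs(sample))
```

Keep in mind that the translated file will then also have one sentence per line, so the original paragraph breaks have to be restored afterwards.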
The user can copy his own glossaries to the project's glossary subfolder. OmegaT will then automatically search the glossaries for words that occur in the source segments and alert the translator to their existence. This is especially useful when glossaries contain thousands of entries.
Section 2. Brief tutorial for OmegaT
OmegaT is a tool that helps translators work more consistently in medium to large translation projects. It enables translators to automatically compare paragraphs that are similar. If a similar paragraph has been translated, OmegaT will even tell the translator how it was translated. In this way translators needn't struggle with difficult sentences all over again.
This is a tutorial for OmegaT 1.4.5. To run this version of OmegaT on your computer, you need at least Java 1.4 installed as well. For simplicity's sake I will assume the reader uses Microsoft Windows, but you can use Linux or Mac OS X as well. Download OmegaT at http://www.omegat.org/omegat/omegat.html. Java can be downloaded at http://java.com/.
Installing OmegaT
Download OmegaT and install it like any other program by double-clicking on the install file.
There is one small annoyance in OmegaT and/or Java on Windows machines. When browsing in a dialog box, OmegaT might repeatedly try to access the A: drive. Simply click Cancel when this happens, and if the dialog box seems to have disappeared, use Alt-Tab to find it again.
Creating and opening a project
Create a new set of project folders by clicking File -> New Project. Browse to the folder where you want to save OmegaT's files (for example My Documents), and type in the name of a project. Click Save. OmegaT will automatically create a source, target, glossary and TM (translation memory) folder, but you can change the location of these folders on the screen that pops up right after you click Save.
You should also specify the source and target languages of the current project on the same screen. The language codes consist of a two-letter language code, a hyphen and a two-letter country code. Afrikaans might be AF-ZA and English may be EN-GB. Click OK.
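The shape of these codes is easy to check mechanically. The following Python sketch only tests the pattern described above (two letters, a hyphen, two letters). It does not check that the language or country actually exists, and the function name is made up for this example.

```python
import re

# Two letters, a hyphen, two letters, for example AF-ZA
LANGUAGE_CODE = re.compile(r"[A-Za-z]{2}-[A-Za-z]{2}")


def looks_like_language_code(code):
    """Return True if the code has the shape LL-CC."""
    return LANGUAGE_CODE.fullmatch(code) is not None


for code in ["AF-ZA", "EN-US", "afrikaans", "AF_ZA"]:
    print(code, looks_like_language_code(code))
```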
Minimise OmegaT and browse to the project folder that you specified. Copy your source files into the subfolder called "source".
Now maximise OmegaT again, and click File -> Open Project. Browse to the project folder (which has a little OmegaT icon), and select it, and click Open. If you forgot to add source text files to the source subfolder, or if you forgot to convert Microsoft Word files to OpenOffice format, you'll get an error message. OmegaT will load the first source file and also open a window called Project Files with a list of files and their number of segments.
The translation process
The entire text of the current document is visible at all times. As you translate the segments, the source text will be replaced by the target text. The source text of translated segments will not be visible unless you navigate to those segments.
OmegaT creates a copy of the segment and displays it between an opening and a closing segment tag. Translate the text between the two tags by overwriting or replacing the source text.
Most normal text editing functions work inside the editing screen. You can try Ctrl+C for copy, Ctrl+X for cut, and Ctrl+V for paste. Try Shift+Ctrl+arrow to highlight words and overtype them.
Type your translation, delete any left-over source text, and press Ctrl+N to move on to the next segment. You can at any time move back to previous segments with Ctrl+P. You don't have to translate segments in any given order, and you can change or edit segments you've already translated - simply move to the segment you want to translate or edit by using Ctrl+N or Ctrl+P.
Tip: Pressing ENTER also moves to the next segment (same as Ctrl+N). Ctrl+ENTER moves to the previous segment (same as Ctrl+P).
OmegaT automatically saves the translation memory every five minutes. Saving the file does not compile the target text, however. If you want to see what the target text looks like, click File -> Create Target Documents, and then open the file in the target subfolder using the appropriate viewer.
While you can only work on one file at a time, you don't have to finish one file before moving on to another. Click File -> Project Files to open the Project Files window, and click on the file you want to translate.
Fuzzy matching
OmegaT 1.4.5 supports fuzzy matching using bright colours. Only one fuzzy match is coloured at a time, but you can cycle through other matches by using Ctrl+1, Ctrl+2 etc. You can re-use a fuzzy match by using Ctrl+R or Ctrl+I. Ctrl+R will replace the entire target segment with the fuzzy match text. Ctrl+I will add the fuzzy match text at the cursor position. You can even add more than one fuzzy match's text by using Ctrl+numbers and Ctrl+I repeatedly.
Find functions
OmegaT offers two types of searches, with wild cards. Click Edit -> Find (or Ctrl+F) to do a search. OmegaT 1.4.5 can search for phrases or multiple search terms.
The Exact Match search finds whole words and partial words from both the source and target text of the current project. If you've copied old TMs from previous projects to the current project's TM subfolder, select Search TMs to search them as well. The Keyword search finds whole words only.
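The difference between the two search types can be sketched in Python as follows. This is an illustration only, not OmegaT's own code: ignoring upper and lower case, and requiring every keyword to be present, are assumptions made for the example, and wild cards are left out.

```python
import re


def exact_search(query, segments):
    """Substring search: also finds the query inside longer words."""
    q = query.lower()
    return [s for s in segments if q in s.lower()]


def keyword_search(query, segments):
    """Whole-word search: every keyword must occur as a complete word."""
    keywords = query.lower().split()
    hits = []
    for s in segments:
        words = set(re.findall(r"\w+", s.lower()))
        if all(k in words for k in keywords):
            hits.append(s)
    return hits


segments = ["Open the file.", "The files are open.", "Save your profile."]
print(exact_search("file", segments))
print(keyword_search("file", segments))
```

Here the first search also finds "files" and "profile", because "file" occurs inside them, while the second finds only the segment that contains "file" as a complete word.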
TM and glossary formats
OmegaT uses a variant of the industry standard TMX 1.1 format for translation memory. By using available scripts, you can easily import TM files made by Trados, Wordfast or any other standards compliant CAT tool as long as the TM file was saved in TMX version 1.1 (not version 1.2+). You can also use the scripts to import OmegaT's TMs into those applications.
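For readers who want to look inside a TM file: TMX is an XML format in which each translation unit (tu) holds one variant (tuv) per language, with the text itself in a seg element. The Python sketch below reads such a file with the standard library. The sample is hand-written for this example and much simpler than a real file, and it is not one of the conversion scripts mentioned above.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample for illustration, not real OmegaT output.
SAMPLE_TMX = """<tmx version="1.1">
  <header srclang="EN-US"/>
  <body>
    <tu>
      <tuv lang="EN-US"><seg>This is a house.</seg></tuv>
      <tuv lang="AF-ZA"><seg>Dit is 'n huis.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(xml_text):
    """Return a list of {language: text} dicts, one per translation unit."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.iter("tuv"):
            # Accept either a plain "lang" or an "xml:lang" attribute.
            lang = tuv.get("lang") or tuv.get(XML_LANG)
            seg = tuv.find("seg")
            if lang and seg is not None:
                unit[lang] = "".join(seg.itertext())
        units.append(unit)
    return units


for unit in read_tmx(SAMPLE_TMX):
    print(unit)
```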
The glossary format is a plaintext tab delimited file. The first column contains the source word, the second contains the translation, and a third column contains any comments. Glossaries must be created using a text editor and saved in the project's glossary subfolder before using OmegaT. You can add new terms to the glossary during the translation, but new terms will only be recognised when you reload the current project. OmegaT automatically recognises glossary terms, but contains no internal glossary editor.
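Because the glossary is plain tab delimited text, it is easy to create or check with a small script. The Python sketch below parses glossary text in the three-column layout described above and looks up which terms occur in a segment. The sample entries and function names are invented for this example, and the simple substring lookup is an assumption, not OmegaT's own matching method.

```python
# Hand-written sample: source term, translation, optional comment.
SAMPLE_GLOSSARY = "house\thuis\tnoun\ntranslation memory\tvertaalgeheue\n"


def load_glossary(text):
    """Parse tab delimited glossary text into (source, target, comment) tuples."""
    entries = []
    for line in text.splitlines():
        columns = line.split("\t")
        if len(columns) < 2:
            continue  # skip blank or incomplete lines
        comment = columns[2] if len(columns) > 2 else ""
        entries.append((columns[0], columns[1], comment))
    return entries


def terms_in_segment(segment, entries):
    """Return the glossary entries whose source term occurs in the segment."""
    lowered = segment.lower()
    return [e for e in entries if e[0].lower() in lowered]


glossary = load_glossary(SAMPLE_GLOSSARY)
print(terms_in_segment("This is a house.", glossary))
```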
Working with formatted documents
When translating plaintext files in OmegaT, you don't have to worry about things like bold, italics or changes in the font. With HTML or OpenOffice documents, however, OmegaT has to somehow retain the formatting. This is done using internal tags.
In a sentence like "This is a house.", in which the word "This" is in bold, "a" is in a different font, and "house" is in italics, OmegaT will mark up each of those three words with a pair of short internal tags.
If you translate this as "Dit is 'n huis." and keep the tags around the corresponding words, OmegaT will know which words must be in bold, in a different font or in italics. When the final product is compiled, the translated document will look the same as the original.
Before compiling the final product, click Tools -> Validate Tags to check for any errors in internal tags. Uncorrected errors will corrupt the final OpenOffice or HTML document.
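The kind of check involved can be sketched in Python as follows. The tag syntax used here (<f0>, </f0> and so on) is a placeholder chosen for this example and may not match what OmegaT displays, and the rule itself (source and target must contain the same tags, in any order) is an assumption about what such a validation looks for.

```python
import re

# Placeholder tag syntax for this example: <f0>, </f0>, <f1>, ...
TAG = re.compile(r"</?[a-z]+\d+>")


def tags_match(source, target):
    """Return True if both segments contain the same tags, in any order."""
    return sorted(TAG.findall(source)) == sorted(TAG.findall(target))


source = "<f0>This</f0> is <f1>a</f1> <f2>house</f2>."
good = "<f0>Dit</f0> is <f1>'n</f1> <f2>huis</f2>."
bad = "<f0>Dit is <f1>'n</f1> huis."

print(tags_match(source, good))
print(tags_match(source, bad))
```

The order of the tags is deliberately ignored, because word order often changes in translation.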
An annoying feature present in OmegaT 1.4.5 (but hopefully fixed in later versions) is that the style used for the first word of the source text will also be the style used for the first word of the target text.
Licensing and support
OmegaT is open source software licensed in terms of the GNU General Public License (version 2). In lay terms this means that OmegaT is unrestricted freeware for all types of use (including commercial), and that anyone may modify the source code to suit their own requirements.
OmegaT is an open source project that can't offer direct support for the product. Instead, users may join the mailing list at http://groups.yahoo.com/group/omegat/. There, more experienced users may be able to assist new users.
[Note by SiteFounder: Samuel Murray-Smit is an Afrikaans-English translator living in South Africa. He is part of the development team of the OmegaT Project and a tester for the software.
The OmegaT Project was started by Keith Godfrey in 2002. The official OmegaT website is http://www.omegat.org/
Article republished with permission]
