The Titanic Task of Cleaning Up a Translation Memory

A few months ago we decided to clean up the translation memory of one of our longest-standing and busiest clients. After so many years of work, the memory had filled up with inconsistencies and some errors that confused translators as they worked.

The errors and inconsistencies in our TMs are not there because translators are unprofessional, but because of the nature of the texts and changes in the client’s preferences over time. Times change, contexts change and, of course, so does the best way to translate a particular text. That is why it is worth cleaning up the memory every so often, so that we do not keep carrying obsolete segments that are no longer useful.

In our case, the first step, knowing what we had to correct, was simple. We used Xbench, a tool we always use for our quality checks. We loaded the memory into Xbench and got a report of inconsistencies in the source text, inconsistencies in the target text and spelling errors. The spelling errors were very few, and most were false positives. But the more than sixteen thousand inconsistencies made us think we were facing a never-ending task. To make it easier to move between errors and leave comments or queries, we imported the Xbench report into an Excel file.
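To give an idea of what such a report contains, here is a minimal Python sketch. It is not how Xbench works internally; it only illustrates the two kinds of inconsistency on a tiny, made-up TMX sample. The language codes, the sample segments and the function names (`read_pairs`, `find_inconsistencies`) are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample memory (English to Spanish), invented for this example.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu><tuv xml:lang="en"><seg>Save file</seg></tuv>
        <tuv xml:lang="es"><seg>Guardar archivo</seg></tuv></tu>
    <tu><tuv xml:lang="en"><seg>Save file</seg></tuv>
        <tuv xml:lang="es"><seg>Guardar fichero</seg></tuv></tu>
    <tu><tuv xml:lang="en"><seg>Delete file</seg></tuv>
        <tuv xml:lang="es"><seg>Eliminar archivo</seg></tuv></tu>
    <tu><tuv xml:lang="en"><seg>Remove file</seg></tuv>
        <tuv xml:lang="es"><seg>Eliminar archivo</seg></tuv></tu>
  </body>
</tmx>"""


def read_pairs(tmx_text, src_lang="en", tgt_lang="es"):
    """Return a list of (source, target) text pairs, one per translation unit."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        texts = {}
        for tuv in tu.iter("tuv"):
            lang = (tuv.get(XML_LANG) or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text that sits around inline tags.
                texts[lang] = "".join(seg.itertext()).strip()
        if src_lang in texts and tgt_lang in texts:
            pairs.append((texts[src_lang], texts[tgt_lang]))
    return pairs


def find_inconsistencies(pairs):
    """Group pairs to find both kinds of inconsistency.

    Target inconsistency: the same source has more than one translation.
    Source inconsistency: the same translation is used for different sources.
    """
    by_source = defaultdict(set)
    by_target = defaultdict(set)
    for src, tgt in pairs:
        by_source[src].add(tgt)
        by_target[tgt].add(src)
    in_target = {s: sorted(t) for s, t in by_source.items() if len(t) > 1}
    in_source = {t: sorted(s) for t, s in by_target.items() if len(s) > 1}
    return in_target, in_source


in_target, in_source = find_inconsistencies(read_pairs(SAMPLE_TMX))
for source, targets in in_target.items():
    print(f"Inconsistent target for {source!r}: {targets}")
for target, sources in in_source.items():
    print(f"Inconsistent source for {target!r}: {sources}")
```

A real memory would be read from a file and would need more care with case, punctuation and inline tags, which is exactly why a dedicated tool is more comfortable for this step.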

What seemed more complicated, however, was managing the thousands of entries that made up our memory and modifying them without generating even more errors. Since the translation tools we use do not let us manage the memory the way we needed, we decided to use Olifant, a program that lets us open memories in .tmx or .txt format and edit them. Something to keep in mind is that, by default, Olifant searches the source text, but we can choose to search the target text instead, or pick from several other options. You just have to check the option you need.

Finally, the titanic task of erasing, changing, correcting and so on began. The instructions were quite simple and perhaps obvious: unify the inconsistencies and correct spelling errors and any other kind of error. However, no segments were to be deleted: each segment in the memory is related to the one that precedes or follows it, and erasing one would alter that relationship and, therefore, the memory.
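The rule of correcting without deleting can be illustrated with a short Python sketch. We did this work by hand in Olifant, so this is not our actual procedure; it is only a sketch under assumptions: a made-up TMX sample, plain-text segments without inline tags, and a hypothetical `preferred` dictionary that maps each source text to its approved translation.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample memory, invented for this example.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu><tuv xml:lang="en"><seg>Save file</seg></tuv>
        <tuv xml:lang="es"><seg>Guardar archivo</seg></tuv></tu>
    <tu><tuv xml:lang="en"><seg>Save file</seg></tuv>
        <tuv xml:lang="es"><seg>Guardar fichero</seg></tuv></tu>
    <tu><tuv xml:lang="en"><seg>Open file</seg></tuv>
        <tuv xml:lang="es"><seg>Abrir archivo</seg></tuv></tu>
  </body>
</tmx>"""


def unify_targets(root, preferred, src_lang="en", tgt_lang="es"):
    """Overwrite non-approved targets in place; never remove a translation unit.

    Returns the number of target segments that were changed.
    """
    changed = 0
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                segs[(tuv.get(XML_LANG) or "").lower()] = seg
        src, tgt = segs.get(src_lang), segs.get(tgt_lang)
        if src is None or tgt is None:
            continue
        if len(src) or len(tgt):
            # Segments with inline tags are left for manual review.
            continue
        approved = preferred.get((src.text or "").strip())
        if approved is not None and (tgt.text or "").strip() != approved:
            tgt.text = approved
            changed += 1
    return changed


root = ET.fromstring(SAMPLE_TMX)
units_before = len(list(root.iter("tu")))
changed = unify_targets(root, {"Save file": "Guardar archivo"})
units_after = len(list(root.iter("tu")))
print(f"{changed} segment(s) corrected; {units_before} units before, {units_after} after")
```

The point of the design is that every translation unit stays where it was, in the same order, and only the text of the target segment changes.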

This was just the preparation for our cleaning task. Once all this was in place, we started on the work itself. In our next post, we will tell you how the process went.