Blog Archives

Processing TMX with sed

This entry is a bit complex, but translators with a technical bent might find it useful. Some time ago I wrote about the TMX format for translation memories. These TMs can be very large files, spanning many thousands or even millions of lines. Processing these files with traditional methods can sometimes become a […]

On Homers and Fastballs

My father, who’s Argentine, has the good fortune of being able to read novels written in English in their original language. He prefers, whenever possible, to avoid the middleman and skip the step of literary translation. However, when he set out to read the latest John Grisham novel, Calico Joe, he was faced with more than […]

Character encoding in HTML

For historical reasons, the English alphabet and many of its punctuation marks are encoded in electronic devices in a universal and unique way. This encoding is called ASCII (American Standard Code for Information Interchange). However, as soon as we step outside this narrow character set, problems await the unwary. Any letter that is […]

Indigenous Languages

The official language in Argentina is, without a doubt, Spanish. However, the country has been enriched by a vast number of languages. Though it is difficult to say exactly how many, around 35 indigenous languages are spoken. Currently, only thirteen of these are officially listed, including the following: Toba, Pilagá, Mocoví, Wichí, Nivaclé, Chorote, Ava-Chiriguano, Mbya, Guaraní, Quichua Santiagueño, […]
