Converting Different File Formats for Easier Word Analysis

Nothing is more enjoyable for a project manager than receiving a Word document (or another plain-text document) to quote and work with. Because many programs can determine the word count of such a document, the quotation process is faster, more precise, and much easier in this type of file format. But alas, work life is not always so accommodating. Most of the time we receive .pdf or .jpeg files filled with images, tables, handwritten text, and, worse yet, scanned copies of text. Since we cannot simply make up a word count to send to our clients, we need to take certain steps to find it out. Also, when quoting, we must determine not only how many words are in a document but also how many images, images with text, headers, footers, and other such important properties it contains.

Luckily, we have a couple of programs we use to analyze our files. When we open a .pdf and see that it is a relatively clean copy (i.e., the text is straight and neat, all letters are perfectly clear, and/or there are no images), we use a program called Solid Converter PDF. We load the document into this program and convert it into MS Word format (it can also convert to plain text, .rtf, and .xml). From there, we run the file through Trados, Wordfast, or Memsource to determine a word count (in many cases, we run the word count through at least two programs in order to get an accurate analysis).

In most cases, though, we receive files that are illegible, scanned, or filled with images and tables. For these we use a reliable program that we PMs could not do without: ABBYY FineReader. This application first performs a preliminary digitalization of the file. Then you go through each page of the document and adjust what it has analyzed in preparation for the final digitalization/scanning. ABBYY (as we refer to it in the office) allows you to separate text, images, and tables. You simply highlight each area according to what it is (again, text, image, or table) and have the program “read” the analyzed file. From there you proceed to the other pages, doing the same thing until you have completed the document. ABBYY then converts everything into a Word file so that we can continue with the process.
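To illustrate why running a word count through at least two programs is worthwhile, here is a minimal Python sketch. It does not reproduce how Trados, Wordfast, or Memsource count words; the two counting rules, the function names, and the 2% tolerance are assumptions chosen purely for illustration. It only shows that two reasonable definitions of a "word" can disagree on the same text, which is the kind of discrepancy a second count helps catch.

```python
import re


def count_whitespace_tokens(text: str) -> int:
    """Rule A (assumed): every whitespace-separated token is a word."""
    return len(text.split())


def count_alphanumeric_runs(text: str) -> int:
    """Rule B (assumed): a word is a run of letters/digits, optionally
    joined by an internal apostrophe or hyphen (e.g. "client's", "3-page").
    Stand-alone punctuation such as a lone dash is not counted."""
    return len(re.findall(r"[^\W_]+(?:['’-][^\W_]+)*", text))


def compare_counts(text: str, tolerance: float = 0.02):
    """Return both counts and whether they agree within the tolerance.

    The tolerance is a hypothetical threshold, expressed as a fraction
    of the larger count.
    """
    a = count_whitespace_tokens(text)
    b = count_alphanumeric_runs(text)
    larger = max(a, b)
    agree = larger == 0 or abs(a - b) / larger <= tolerance
    return a, b, agree


if __name__ == "__main__":
    sample = "The client's 3-page contract - signed in 2014 - is attached."
    count_a, count_b, agree = compare_counts(sample)
    print(f"Rule A: {count_a} words, Rule B: {count_b} words, agree: {agree}")
```

When the two counts differ by more than the chosen tolerance, that is a signal to look at the converted file again (stray dashes, table debris, or OCR fragments are typical culprits) before sending the quote.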

With great tools like these giving us more reliable results, we can ensure that the client gets the most accurate quote possible.