Word Count Differences (2)
How can word counts differ within the same tool on different machines?
Have you ever run a word count with the same document on two different machines and received different word counts?
Well, here is what can have an impact on the word count statistics:
- The use of a TM on one machine and no TM on the other can produce different word counts. A project with no TM uses the default counting settings, whereas the TM you actually use may have had those settings adjusted, for example the setting that determines whether hyphenated words count as one word or two.
Example: The same file in the same project is analyzed once without a TM and once with a TM whose default settings had been adjusted (here even the number of segments and characters changes).
- The filters you use to import the file have different settings. If a filter includes or excludes hidden text, hidden layers, comments, hidden rows or columns, embedded objects, etc., this can have a big impact on the number of words counted. I remember one time when a Word document that visibly contained only a few words produced a very large word count because the content of an embedded Excel file was extracted on one machine, but not on the other.
Example: The same file (just with different names) was imported with the default XML filter and with a filter that also imports the content of an attribute for translation.
- The use of different versions of the software. Believe it or not, tool providers do tweak the way words are counted now and then. At one point, a number-measurement combination was counted as two words in one Trados version, but as one word in the next. It took some time to figure that one out, believe me, as it was unfortunately not mentioned in the release notes.
- The analysis settings you use. The analysis might have an option to ignore locked segments. If that is switched off on one machine but switched on on the other, the word counts will differ as well (provided, of course, that there are locked segments in the files, for example in XLIFF files from another tool, or if you run the analysis after file preparation).
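To make the effect of such settings concrete, here is a minimal sketch of a toy word counter in Python. It is not how Trados or any other CAT tool actually counts words. The names `Segment`, `CountSettings`, and `count_words` are hypothetical, and the two switches only imitate the hyphen setting and the "ignore locked segments" option described above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    locked: bool = False  # e.g. a segment locked during file preparation


@dataclass
class CountSettings:
    # Hypothetical switches, loosely modelled on the settings discussed above.
    hyphen_joins_words: bool = True  # "well-known" counts as one word when True
    ignore_locked: bool = False      # skip locked segments when True


def count_words(segments, settings):
    """Count whitespace-separated words under the given settings."""
    total = 0
    for seg in segments:
        if settings.ignore_locked and seg.locked:
            continue
        text = seg.text
        if not settings.hyphen_joins_words:
            # Treat every hyphen as a word break.
            text = text.replace("-", " ")
        total += len(text.split())
    return total


segments = [
    Segment("A well-known state-of-the-art tool"),
    Segment("Do not translate this line", locked=True),
]

machine_a = count_words(segments, CountSettings())
machine_b = count_words(segments, CountSettings(hyphen_joins_words=False))
machine_c = count_words(segments, CountSettings(ignore_locked=True))

print(machine_a, machine_b, machine_c)  # prints: 9 13 4
```

The same two segments yield three different totals purely because of the settings, which is why it pays to compare the TM, filter, and analysis settings on both machines before suspecting the file itself.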