Mike Scott's Web

Handling Greek or Russian

The pictures below refer to Greek, but the same principles apply to Russian and many other languages.

You can process either the Windows (1 byte per character) or the Unicode (2 bytes per character) version of your text, as you prefer. The 1-byte system works for these languages because the Greek and Russian alphabets are small enough to fit in a single-byte Windows code page: unlike Japanese, Chinese etc., they never need 2 or 3 bytes to represent one character. Unicode, on the other hand, can represent virtually any language.
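The difference in size between the two versions can be checked with a few lines of Python. This is only an illustrative sketch: it assumes that the Windows version uses code page 1253 for Greek (1251 for Russian) and that the Unicode version is UTF-16, and the function name `byte_lengths` is made up for this example.

```python
def byte_lengths(text, codepage):
    """Return (characters, bytes in the 1-byte code page, bytes in UTF-16)."""
    one_byte = text.encode(codepage)      # e.g. "cp1253" for Greek (assumed)
    two_byte = text.encode("utf-16-le")   # 2 bytes per character, no byte-order mark
    return len(text), len(one_byte), len(two_byte)


greek = "ΕΛΛΑΔΑ"    # 6 Greek letters
russian = "Москва"  # 6 Cyrillic letters

print(byte_lengths(greek, "cp1253"))    # (6, 6, 12)
print(byte_lengths(russian, "cp1251"))  # (6, 6, 12)
```

Each word takes one byte per letter in its Windows code page and two bytes per letter in UTF-16.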

Your text may have been saved as plain text in what Word calls Greek (Windows) text, which is 1 byte per character:

or in what Word calls Unicode (2 bytes per character):

When you choose the text file in WordSmith you can check this by pressing this button:

and it will list each file under the Unicode column as U or A.
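If you want a rough check of your own outside WordSmith, something like the following Python sketch may help. It is a heuristic under stated assumptions, not WordSmith's actual method: it assumes a Unicode text file starts with a UTF-16 byte-order mark (as files saved by Word as Unicode text normally do), and the function name `guess_flag` is hypothetical.

```python
import codecs


def guess_flag(data):
    """Return "U" if the bytes start with a UTF-16 byte-order mark, else "A".

    Assumption: a file without a byte-order mark is 1-byte text.
    A UTF-16 file saved without the mark would be misreported as "A".
    """
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "U"
    return "A"


# Two in-memory copies of the same word, standing in for two saved files.
unicode_copy = codecs.BOM_UTF16_LE + "ΕΛΛΑΔΑ".encode("utf-16-le")
windows_copy = "ΕΛΛΑΔΑ".encode("cp1253")

print(guess_flag(unicode_copy))  # U
print(guess_flag(windows_copy))  # A
```

For a real file, read the first bytes with `open(path, "rb").read(2)` and pass them to the function.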

I saved another copy in Unicode and for that one I get this:

Now make sure you choose the right Language in Settings:

and go ahead to make your wordlist, concordance or whatever.
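Why the language choice matters for the 1-byte version can be seen in a small Python sketch. It illustrates how code pages work in general, not what WordSmith does internally; the code page names (1253 for Greek, 1252 for Western European) and the function name `decode_with` are assumptions made for this example.

```python
def decode_with(raw, codepage):
    """Interpret 1-byte text using the given Windows code page."""
    return raw.decode(codepage)


# The same bytes on disk, as saved in the Greek (Windows) format.
raw = "ΕΛΛΑΔΑ".encode("cp1253")

right = decode_with(raw, "cp1253")  # read as Greek
wrong = decode_with(raw, "cp1252")  # read as Western European by mistake

print(right)
print(wrong)
print(right == wrong)  # False
```

The bytes are identical in both cases; only the code page used to interpret them changes. With the wrong one, the Greek letters come out as accented Latin characters, so the words in a wordlist or concordance would be unrecognisable. The Unicode version does not have this ambiguity, since each character has its own code.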