This information was kindly supplied by Fumiko Kondo and Daiske Akagi.

Japanese text is written without spaces between words, so it needs to be segmented into words before WordSmith can handle it satisfactorily. You can get a segmenter, ChaSen, at http://chasen.naist.jp/hiki/ChaSen/

ChaSen requires Windows XP or Windows 2000; it does not work with Mac OS or Windows Vista. The steps are:

  1. Save your text files in UTF-8 format using Notepad++ or Word.
  2. Open each file in ChaSen and save a segmented version of it.
  3. Convert the segmented text to UTF-16, using either Notepad++ or WordSmith's Text Converter.
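If you have many segmented files, the UTF-16 conversion in step 3 could also be scripted. The following is only a minimal sketch in Python, not part of WordSmith or ChaSen; the function names and the folder names `segmented` and `utf16` are hypothetical, and it assumes the segmented files are plain `.txt` files saved as UTF-8.

```python
from pathlib import Path


def convert_utf8_to_utf16(src: Path, dst: Path) -> None:
    """Read one UTF-8 text file and write a UTF-16 copy of it."""
    # "utf-8-sig" accepts UTF-8 with or without a leading byte-order mark.
    text = src.read_text(encoding="utf-8-sig")
    # Python's "utf-16" codec writes a byte-order mark at the start of the file.
    dst.write_text(text, encoding="utf-16")


def convert_folder(src_dir: Path, dst_dir: Path) -> int:
    """Convert every .txt file in src_dir; return how many were converted."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob("*.txt")):
        convert_utf8_to_utf16(src, dst_dir / src.name)
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical folder names: segmented ChaSen output in, UTF-16 copies out.
    if Path("segmented").is_dir():
        done = convert_folder(Path("segmented"), Path("utf16"))
        print(f"Converted {done} file(s)")
```

The converted copies are written to a separate folder so that the original segmented files are left untouched.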

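You should be able to get results like this in Concord:

[Screenshot: Concord results for segmented Japanese text]

and for WordList:

[Screenshots: WordList results for segmented Japanese text]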