Collocation Settings



To set collocation horizons and other Concord settings, in the main WordSmith Controller menu at the top, choose Concord Settings.


Collocates are computed case-insensitively (so my in the concordance line will be treated like My).

If you don't want certain collocates such as THE to be included, use a stop-list.

You can lemmatise (join related forms like SPEAK -> SPEAKS, SPOKE, SPOKEN) using a lemma list file.
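As a rough illustration of how case-insensitive counting, a stop-list and a lemma list interact, here is a minimal Python sketch. It is not WordSmith's own code: the names STOP_LIST, LEMMAS and normalise are hypothetical, and the real program reads its stop-list and lemma list from files.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for a stop-list file and a lemma list file.
STOP_LIST = {"THE", "OF"}
LEMMAS = {"SPEAKS": "SPEAK", "SPOKE": "SPEAK", "SPOKEN": "SPEAK"}


def normalise(words, stop_list=STOP_LIST, lemmas=LEMMAS):
    """Upper-case each word, drop stop-listed words, map variants to a headword."""
    result = []
    for word in words:
        key = word.upper()  # "my" and "My" become the same entry
        if key in stop_list:
            continue  # stop-listed collocates are left out entirely
        result.append(lemmas.get(key, key))  # join related forms under one lemma
    return result


counts = Counter(normalise(["My", "my", "the", "spoke", "Spoken", "speak"]))
print(counts)
```

In this sketch "My" and "my" are counted together, "the" is dropped, and the three forms of SPEAK are merged into one entry.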


Minimum Specifications

The minimum length is 1, and the minimum frequency is 1 (default is 10). Here you can specify how frequently a collocate must have appeared in the neighbourhood of the Search Word. Words which occur only once or twice are less likely to be informative, so specifying 5 will show only collocates which occur 5 or more times in the neighbouring context.

Similarly, you can specify how long a collocate must be for it to be stored in memory, e.g. enter 3 to keep only collocates of 3 letters or more.
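The effect of the two minimum settings can be sketched as a simple filter. This is only an illustration under assumed names (filter_collocates, min_frequency and min_length are hypothetical) and made-up counts, not the program's actual implementation.

```python
from collections import Counter


def filter_collocates(counts, min_frequency=5, min_length=3):
    """Keep collocates that are frequent enough and long enough."""
    return {
        word: n
        for word, n in counts.items()
        if n >= min_frequency and len(word) >= min_length
    }


# Illustrative counts only; not taken from any real concordance.
counts = Counter({"MAPS": 7, "OF": 12, "INTEGRATION": 2, "TWO": 5})
kept = filter_collocates(counts, min_frequency=5, min_length=3)
print(kept)
```

Here OF is dropped for being shorter than 3 letters and INTEGRATION for occurring fewer than 5 times.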



Here you specify the collocation horizons: how many words to the left and right of the Search Word are to be included in the collocation search. This is the size of the "neighbourhood" referred to above. The maximum is 25 left and 25 right. The results will later show these positions in separate columns, so you can see exactly how many times a given collocate occurred, say, 3 words to the left of your Search Word.

The most frequent will be signalled in the most frequent collocate colour (default=red).
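To make the idea of horizons and per-position columns concrete, the following Python sketch counts each collocate separately for every offset from the Search Word. The function name positional_collocates and the sample sentence are assumptions for illustration; only the 25-word limit comes from the text above.

```python
from collections import Counter, defaultdict

MAX_HORIZON = 25  # the stated maximum on each side


def positional_collocates(tokens, search_word, left=5, right=5):
    """Count collocates by offset: negative = left of the Search Word, positive = right."""
    left = min(left, MAX_HORIZON)
    right = min(right, MAX_HORIZON)
    target = search_word.lower()
    table = defaultdict(Counter)
    for i, token in enumerate(tokens):
        if token.lower() != target:
            continue
        for offset in range(-left, right + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(tokens):
                continue  # skip the Search Word itself and positions outside the text
            table[tokens[j].lower()][offset] += 1
    return table


tokens = "we saw the old map and then the new map again".split()
table = positional_collocates(tokens, "map", left=3, right=3)
print(dict(table["the"]))
```

Each inner Counter corresponds to one row of the collocate display, with one cell per position inside the horizons.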



These are the collocate breaks settings, which you will see in the bottom right corner of the Controller Concord Settings.

When the collocates are computed, if the setting is to stop at sentence breaks, collocates will be counted within the above horizons but taking sentence breaks into account.


For example, if a concordance line contains


source, per pointing integration times, respectively. However, when we compared these two maps


and the search-word is however,


when we compared these two

will be used for collocates because there is a sentence break to the left of the search word. If the setting is "stop at punctuation", then nothing will come into the collocate list for that line, because there is a more major break than punctuation to the left of the search-word, and no word to the right of it before a punctuation symbol.
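The example above can be reproduced with a small Python sketch. It assumes a right and left horizon of 5, a simple regular-expression tokeniser, and hypothetical sets of sentence-break and punctuation characters; how WordSmith itself recognises breaks may differ.

```python
import re

# Assumed character sets, for illustration only.
SENTENCE_BREAKS = {".", "!", "?"}
PUNCTUATION = SENTENCE_BREAKS | {",", ";", ":"}


def collocates_in_line(line, search_word, left=5, right=5, stop_at="sentence"):
    """Return (left words, right words) within the horizons, stopping at the chosen break."""
    tokens = re.findall(r"\w+|[^\w\s]", line)  # words and single punctuation marks
    breaks = SENTENCE_BREAKS if stop_at == "sentence" else PUNCTUATION
    target = search_word.lower()
    idx = next((i for i, t in enumerate(tokens) if t.lower() == target), None)
    if idx is None:
        return [], []

    def walk(step, limit):
        found = []
        j = idx + step
        while 0 <= j < len(tokens) and len(found) < limit:
            token = tokens[j]
            if token in breaks:
                break  # do not look beyond the break
            if token not in PUNCTUATION:
                found.append(token.lower())  # lesser punctuation is skipped, not counted
            j += step
        return found

    return walk(-1, left)[::-1], walk(1, right)


line = ("source, per pointing integration times, respectively. "
        "However, when we compared these two maps")
by_sentence = collocates_in_line(line, "however", stop_at="sentence")
by_punctuation = collocates_in_line(line, "however", stop_at="punctuation")
print(by_sentence)
print(by_punctuation)
```

With sentence breaks the left side is empty and the right side holds the five words quoted above; with punctuation breaks both sides are empty, because the comma directly after the search-word ends the walk.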


stop at end of text: end of text is by default assumed to be the end of the text file.

stop at heading or section: this works by recognising ends of heading or section which you can specify in the text format box (language settings).





