This index view, sorted in file order, shows the words in the order in which they first cropped up in the BNC Spoken corpus (10 million words).

word — frequency — plot

Near the top of the list most words are high-frequency ones such as "er" and "I". Further below there is a screenshot taken from halfway down the list.

This feature can be used to make an index of a document or a series of documents. The plot marks can be "saved as text", in which case they are written out as numbers.
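To make the idea of an index in file order concrete, here is a minimal Python sketch. It is not how WordList itself works internally; the function name `build_index`, the simple regular-expression tokenizer and the sample sentence are all assumptions made for illustration. It lists each word in order of first occurrence, with its frequency and the numeric positions that a plot would mark.

```python
import re


def build_index(text):
    """Return ({word: [token positions]}, total tokens).

    Words are keyed in order of first occurrence, because Python dicts
    preserve insertion order. The tokenizer is a deliberately crude
    assumption: runs of letters and apostrophes, lowercased.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    index = {}
    for position, token in enumerate(tokens):
        index.setdefault(token, []).append(position)
    return index, len(tokens)


# Hypothetical sample text, standing in for a spoken corpus.
sample = "er I think er the match was er good and I think the goal was good"
index, total = build_index(sample)

# One line per word: word, frequency, positions (the "plot" as numbers).
for word, positions in index.items():
    print(word, len(positions), positions)
```

In this toy sample the frequent items "er" and "I" head the list, just as they do in the real index, while the rarer words "match" and "goal" turn up later.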

WordList Index

The blue lines show the beginning and end of the corpus; you can see an almost continuous red line showing where these sports words first appeared. Of course there aren't any high-frequency items here: those were used up near the top of the list.

There aren't quite as many red marks as each frequency indicates, because some occurrences crop up too near each other to be visible as separate marks.
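The effect can be sketched in Python under a simple assumption: that each occurrence is drawn in one of a fixed number of plot columns, scaled by its position in the corpus. The function `visible_marks`, the plot widths and the positions below are hypothetical, not taken from the program.

```python
def visible_marks(positions, corpus_length, plot_width):
    """Count the distinct plot columns hit by the given token positions.

    Assumes each position is scaled linearly onto plot_width columns;
    occurrences that land in the same column merge into one mark.
    """
    columns = {pos * plot_width // corpus_length for pos in positions}
    return len(columns)


# Five occurrences, three of them close together near the start.
positions = [100, 105, 110, 5000, 9000]

narrow = visible_marks(positions, corpus_length=10_000, plot_width=200)
wide = visible_marks(positions, corpus_length=10_000, plot_width=2_000)
print(len(positions), narrow, wide)
```

With the narrow plot the three neighbouring occurrences share a column, so fewer marks are visible than the frequency suggests; with the wider plot each gets its own column.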

