The British National Corpus is a valuable resource, but it has certain problems as it comes straight off the CD-ROM:
•it is in Unix format
•it has entities like &eacute; to represent characters like é
•its structure is opaque and file-names mean nothing
You will find it much easier to use if you
•convert it to Unicode
•filter the files to make a useful structure
as explained at http://lexically.net/wordsmith/Handling_BNC/index.html
The easiest way to do that is in two stages.
Conversion:
After choosing the texts, press OK and you'll be asked something like this:
After the work is done you will see the BNC texts copied to a similar structure (in our case stemming from j:\temp)
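For readers who want to see roughly what the conversion stage amounts to, here is a minimal Python sketch. It is not WordSmith's own code, only an illustration under stated assumptions: that "Unix format" refers to LF line endings, that the source files can be read as Latin-1, and that UTF-8 is an acceptable Unicode output. The function names and the encoding parameters are hypothetical. The entity table comes from Python's standard html.entities module, which will not cover every entity used in the BNC, so any unknown entity is left untouched.

```python
import re
from pathlib import Path
from html.entities import name2codepoint

# Entities that stay escaped so the corpus markup remains parseable.
KEEP = {"amp", "lt", "gt", "quot"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_entities(text):
    """Replace known named entities (e.g. &eacute;) with the character."""
    def repl(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or markup entities alone
        return chr(name2codepoint[name])
    return ENTITY.sub(repl, text)


def unix_to_windows(text):
    """Turn LF line endings into CRLF without doubling existing CRLFs."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


def convert_tree(src_root, dst_root,
                 source_encoding="latin-1",   # assumption, not confirmed
                 target_encoding="utf-8"):    # assumption, not confirmed
    """Copy every file under src_root to the same relative path under
    dst_root, decoding entities and line endings on the way."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    written = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        text = src.read_bytes().decode(source_encoding)
        text = unix_to_windows(decode_entities(text))
        dst.write_bytes(text.encode(target_encoding))
        written.append(dst)
    return written
```

Called as convert_tree(source_folder, target_folder), this would leave the originals untouched and produce a mirrored folder tree of converted copies, much as the conversion stage above copies the texts to a similar structure.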
Filtering:
Choose the converted texts in the first window:
de-activate conversion,
and choose filtering like this:
Eventually you should get folder structures like this: