
Handling the British National Corpus 


Header Tags


In the BNC, the header supplies information about each text: its length, who wrote it and where it was found, who processed it for the BNC and when, copyright issues, etc.

 

The whole text file starts with a <bncDoc tag carrying some form of identification:

<bncDoc id=A0T>

in the case of text file A0T, and ends with

</bncDoc>

 

 

In the XML edition the identifier is given as xml:id and, to conform to XML specifications, each attribute value is enclosed in quote characters (double quotes in this example):

<bncDoc xml:id="A00">
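If you need to read the identifier in a script of your own, a minimal Python sketch along the following lines may help. It is not part of WordSmith; the function name bnc_text_id and the pattern are hypothetical and are based only on the two opening tags shown above, so other variants of the tag may need a looser pattern.

```python
import re

# Assumed pattern: covers <bncDoc id=A0T> (unquoted value) and
# <bncDoc xml:id="A00"> (XML edition, quoted value), and nothing else.
BNC_DOC_ID = re.compile(r'<bncDoc\s+(?:xml:)?id=["\']?([A-Za-z0-9]+)["\']?\s*>')


def bnc_text_id(text):
    """Return the identifier from the opening <bncDoc> tag, or None if absent."""
    match = BNC_DOC_ID.search(text)
    return match.group(1) if match else None
```

Called on the first line of file A0T, this should give back the string A0T whichever of the two tag forms the file uses.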

 

The header section starts with something like this:

<teiHeader type="text" status="update" date.updated="2000-12-13">

and ends with

</teiHeader>

<text  decls='CN001 HN001 QN000 SN000'><body>

 

You might wish to cut out the whole header when processing with WordSmith. To do so, see this section.
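If you are preparing the files outside WordSmith, the same idea can be sketched in Python. This is only an illustration, not WordSmith's own procedure: the function name strip_header is hypothetical, and the sketch assumes each file holds exactly one header running from <teiHeader to </teiHeader> as shown above.

```python
import re

# Assumption: one header per file, opened by <teiHeader ...> and closed by
# </teiHeader>. re.DOTALL lets ".*?" run across line breaks; the trailing
# \s* also removes the whitespace that follows the closing tag.
TEI_HEADER = re.compile(r'<teiHeader\b.*?</teiHeader>\s*', re.DOTALL)


def strip_header(text):
    """Return the text with the first <teiHeader>...</teiHeader> block removed."""
    return TEI_HEADER.sub('', text, count=1)
```

The <bncDoc> wrapper and the <text ...><body> line are left in place, so the result still starts with the identification tag.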