Frequently Asked Questions, WordSmith 3
Installation
Where should I put WS_PART1.EXE and WS_PART2.EXE
and why?
When you download WordSmith over
the Internet your browser will offer you a chance to store WS_PART1.EXE
and WS_PART2.EXE wherever you like. I recommend putting them in
a clean directory where there are no other files. The reason is that
it will later be very easy to delete the lot and recover your valuable
hard disk space, with no risk of deleting other files. A suitable place
is c:\temp. Use this as a kind of rubbish bin
where you keep files which you can later delete without worrying
about whether
they're important. If c:\temp does not yet exist, I suggest you "Create
Directory" (File Manager in Windows 3.1), or "New Folder" (Explorer
in Windows 95) now: it'll be useful for lots of purposes.
What are WS_PART1.EXE and WS_PART2.EXE and
what are they for?
There are about 30 component files in WordSmith
Tools, and these in their normal state after the installation
process is finished take up about 4 megabytes of disk space. To
speed up electronic transmission and reduce risk of files getting
missed out, we have put them all into 2 compressed files, WS_PART1.EXE
and WS_PART2.EXE, which total about 2 megabytes in size and each
of which will fit on one floppy. You simply run these two files
and they unpack themselves, creating all the 30-odd component files
within them on your hard disk, usually into the same directory
that WS_PART1.EXE and WS_PART2.EXE are in at the time. At that point you will
have used up about 6 MB of disk space (about 2 MB compressed plus about 4 MB unpacked). After installation, once
you've checked everything is working properly, you will want to
delete all the files in c:\temp to get this space back.
Should I keep WS_PART1.EXE and WS_PART2.EXE?
You don't need to keep them unless you really
want to, as you can always download a fresh copy, which will usually
have improvements, from the website which is visible in WordSmith and
specified in the readme.txt file which comes with it.
What is SETUP.EXE for?
I seem to have loads of SETUP.EXE files all over the place!
You do. Nearly all software comes with an accompanying
file called setup.exe or install.exe. These sub-programs manage the
installation for you so that the main program you've bought will work
smoothly, and will for example create any necessary sub-directories,
visible icons and program groups. In many cases running setup.exe will
copy files into various parts of your hard disk, often without you
being told about changes made to your whole system! In the case of WordSmith
Tools, there is a setup.exe program which you should use to
manage the installation. It will copy the relevant files to a suitable
directory on your hard disk. I suggest c:\wsmith but you can easily
set it to a different directory. WordSmith Tools' setup.exe
does NOT alter your basic system settings or copy any extra files into
hidden places at all.
So what should I do now?
After you've downloaded WS_PART1.EXE and WS_PART2.EXE,
extracted the component files, and run the setup.exe which WS_PART1.EXE
and WS_PART2.EXE will have extracted with the other files, you will
have a complete installed copy of WordSmith in the right
directory of your hard disk. Now you should run WordSmith itself,
in the \wsmith directory. The main controller is called WSHELL.EXE
and that's what you should run. If you created an icon, it should
automatically run wshell.exe. When you run it, your version will
not yet be a full one (you haven't yet given it your name or
registration code), so it will "complain" that it's in demo mode
and will suggest you Update from Demo.
Why don't you just use
disks? It'd be a lot easier than all this hassle!
Would it? You would have to wait for the disks
to arrive, for a start. Bookshops don't like handling software and
cannot give good support, and even big software shops don't stock much
specialised software like WordSmith Tools. Customers
outside the UK (that is MOST users of WordSmith judging
by the feedback I get) might have to wait a long time, depending on
two postal systems and customs formalities. The cost of WordSmith would
be higher. It wouldn't fit onto one floppy, and not everybody has a CD-ROM
drive available. And how would we make the frequent updates available
to you? I make an updated version at least twice a month. The Internet
is actually a great way of distributing software; it's not so good
for distributing hardware consumer goods like TV sets, but it is designed
for distributing information, which is what software is.
I have the Oxford University Press registration code; how do I update?
There's a menu item visible in WordSmith
Tools, called Update from Demo. This will run UPDATER.EXE
which, as its name suggests, allows you to type in your name
and registration code, converting WordSmith from
demo into full operational use. Make sure you type everything
in exactly as specified by OUP. If it's right, you will be told
so after you've clicked on OK. And you will no longer see the
menu option to update or get bothered by demo mode messages.
My registration code didn't work...
If OUP have mis-spelled your name (they do try
ever so hard not to!) you should register with the mis-spelt name
anyway. WordSmith Tools should work okay with the name
and code as supplied to you. Then, with a working copy of WordSmith,
get back to OUP and ask for a new registration code. Make sure you
give them the correct name legibly! For any other WS3 registration
problems contact Oxford University Press.
Does WordSmith...
Can WordSmith handle Language X?
WordSmith 3 can handle most European languages and any which use a
1-byte-per-character system. A user can define
their own alphabet and alphabetical order, but there can be problems
in getting a Windows PC to show things correctly.
WS3 uses alphabets which can be represented in one byte (a number between
0 and 255), a system which was usual in computers until recently. With
such a one-byte representation system there needs to be
a "codepage" (a table of 256 characters) which contains the
symbols you need.
In practice you can handle English, French, Greek, Russian, etc.
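To make the codepage idea concrete, here is a minimal sketch in Python (not anything WordSmith itself runs). It takes one byte value and looks it up in three different 256-entry tables. The codepage names cp1252, cp1253 and cp1251 are the standard Windows Western European, Greek and Cyrillic tables, chosen here purely as illustrations; the function name is hypothetical.

```python
# One byte value, three codepages: the same number stands for a different
# character depending on which 256-entry table is used to interpret it.
RAW = bytes([0xE9])  # a single byte with the value 233


def decode_with(codepage: str, data: bytes = RAW) -> str:
    """Decode bytes using the named one-byte codepage (illustrative helper)."""
    return data.decode(codepage)


if __name__ == "__main__":
    # Assumed example codepages: Western European, Greek, Cyrillic.
    for codepage in ("cp1252", "cp1253", "cp1251"):
        print(codepage, decode_with(codepage))
```

Byte 233 comes out as an accented e, a Greek iota or a Cyrillic short i depending on the table, which is why the language setting has to match the text.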
Does WordSmith tag texts?
No. You have to tag your own manually or use a tagger to tag them automatically.
Does WordSmith come with a corpus?
No. I cannot legally supply you with a body
of texts. But you can easily build up your own using Internet resources.
There are lots of corpora: some are freely accessible, others
can be purchased cheaply, and others are extremely expensive.
Try a Google search on "test corpus". Or visit newspaper web sites.
Text Handling & Display
Accents are not displayed right!
1) Check the format of 1 or 2 of your texts in both DOS & Windows.
In Windows, look at them using Notepad. You will immediately see whether
the accented characters look right. If they do, you have Windows-format
texts (ANSI). If they don't look right, then go to MS-DOS, and then
try EDIT xxx.txt -- if that works you'll see them in their DOS incarnation.
Look again at the accented characters. If your PC cannot understand
EDIT, then try TYPE xxx.txt -- the text will flash by fast and will
be impossible to scrutinise but if there are accented characters in
the last few lines of xxx.txt you will be able to see them. If they
look right in one of these two ways go to 3.
2) Using Tag File 2 to convert accents on the fly should not be necessary
unless your accented characters are like this: "&eacute;" "&Agrave;" in
the text. (If they are in this sort of format you have HTML or similar
and WILL need to use Tag File 2 correctly; there should be enough
in Help to guide you.)
3) Once you know which format they're in, in Text Characteristics
set the Language to the right one, and the format to Windows or DOS
accordingly.
4) If they're in DOS or HTML format, you could convert all your texts
to the usual Windows format if you wanted, using Text Converter, but
would need to know the correct codes for each conversion needed --
the codes can be found in the Appendix of a DOS or Windows manual,
but it can be a pain finding them accurately. Or, one by one, you might
be able to convert them successfully in MS Word. Whether that is
practical depends on how many texts you have.
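If you would rather script the conversion in step 4 than look up the codes by hand, something along these lines might do it. This is only a sketch in Python, outside WordSmith and not a description of Text Converter. It assumes your DOS texts use codepage 850 (a common Western European DOS table; yours may differ) and that the Windows format wanted is cp1252. The function names are hypothetical.

```python
import html


def dos_to_windows(data: bytes, dos_codepage: str = "cp850") -> bytes:
    """Re-encode DOS-format text bytes as Windows (ANSI) bytes.

    Assumption: the DOS codepage is cp850 and the Windows one is cp1252.
    """
    return data.decode(dos_codepage).encode("cp1252")


def looks_wrong_in_windows(data: bytes) -> str:
    """Show what Notepad-style (cp1252) decoding makes of DOS bytes."""
    return data.decode("cp1252")


def entities_to_text(text: str) -> str:
    """Turn HTML-style entities such as &eacute; into the characters themselves."""
    return html.unescape(text)
```

For example, the DOS byte 0x82 (e-acute in cp850) shows up as a low quotation mark when read as Windows text, and dos_to_windows turns it into the Windows byte 0xE9.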
Concord
Trying to concordance "can", it keeps giving
me "can't" as well. I've told the text settings that I don't want the
apostrophe to be included in the word but the only difference that
seems to make is that the resulting concordance treats them as different
words in the sorting (it doesn't when the apostrophe is 'included in
word').
The problem is one of ambiguity. The apostrophe
is used for at least 3 purposes in English (genitives, irony and quotes).
Here you're concerned with how Concord recognises the end
of a word, since by default if you ask for "can" you want " can ".
But there are other word separators besides the space symbol, including
carriage returns and punctuation symbols such as apostrophes. In other
words, the apostrophe in BROTHERS' must be seen as a word separator,
just like the space after SISTER or the full stop after COUSIN.
One solution -- the easiest -- is to delete
the unwanted concordance lines with "can't". Sort on the search word
(F6) so that they all come together, then delete them all in one
go. Another is to use a stop list. This will be a bit slower than the
first solution, because it forces Concord to check every occurrence of
"can" to see whether it is in the stop list.
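The effect can be imitated in a few lines of Python. This is a rough sketch of the idea only, not how Concord is actually written: the apostrophe is treated as a word boundary, so the "can" in "can't" counts as a hit, and a stop list is then used to throw those hits away. The function names and the sample sentence are made up for the illustration.

```python
import re

SAMPLE = "I can swim but I can't dive. Can you?"


def hits(text: str, word: str) -> list:
    """Find `word` as a whole word, keeping any apostrophe suffix attached.

    Because the apostrophe acts as a word separator, the "can" inside
    "can't" matches too; the suffix is kept so the hit can be recognised.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b(?:'\w+)?", re.IGNORECASE)
    return [m.group(0) for m in pattern.finditer(text)]


def drop_stopped(found: list, stop_list: set) -> list:
    """Remove hits whose full form appears in the stop list."""
    return [h for h in found if h.lower() not in stop_list]
```

Note that every hit has to be compared with the stop list, which is the extra work mentioned above.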
I wanted to concordance phrases like big,
bold and beautiful so I entered * , * and * but
it didn't work. Why not?
Use * *, * and * (in
other words put an asterisk before the comma). By default Concord assumes
anything, even a comma, is a "whole word".
Can the software work with large corpora of over
200M words? Would there be notable delays before concordance lines
start appearing on a high-spec PC?
Yes. No noticeable delays; concordance lines start appearing
the millisecond they're found. You can stop at any time if you have enough.
But it will take a long time to go through 200M words (= 1,200 MB if pure
untagged text). On a fast PC I'd guess about 3 minutes, though this depends
on the search-word etc., and more info is supplied on this in the Help. See also
"Is the corpus indexed?" below.
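The figures in that answer can be checked with a little arithmetic, sketched here in Python. The 6 bytes per word is simply what 200M words = 1,200 MB implies (an average word plus its following space), and the 3 minutes is only the rough guess given above, so treat the throughput as an estimate rather than a measurement. The names are hypothetical.

```python
WORDS = 200_000_000
BYTES_PER_WORD = 6       # implied by 200M words = 1,200 MB
SCAN_SECONDS = 3 * 60    # the rough guess for a fast PC


def corpus_megabytes(words: int, bytes_per_word: int = BYTES_PER_WORD) -> float:
    """Size of a pure untagged text corpus in MB (1 MB taken as 1,000,000 bytes)."""
    return words * bytes_per_word / 1_000_000


def implied_throughput(megabytes: float, seconds: int = SCAN_SECONDS) -> float:
    """MB per second needed to get through the corpus in the given time."""
    return megabytes / seconds
```

On those assumptions the whole scan works out at somewhere under 7 MB per second.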
Is the corpus indexed?
Not usually. But now it can be, if you so choose. If you
do lots of concordances, always on the same very big corpus, you'd be best
advised to make an index of it.
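In case the idea of an index is unfamiliar, here is a toy version in Python. It is a generic inverted index, not WordSmith's own index format, which is not described here. The corpus is read once, and afterwards each search is a direct look-up instead of a pass through all the text. The file names and function names are invented for the example.

```python
import re
from collections import defaultdict


def build_index(texts: dict) -> dict:
    """Map each lower-cased word to a list of (text name, word position)."""
    index = defaultdict(list)
    for name, text in texts.items():
        for position, match in enumerate(re.finditer(r"[A-Za-z]+", text)):
            index[match.group(0).lower()].append((name, position))
    return dict(index)


def lookup(index: dict, word: str) -> list:
    """Return every place the word occurs, without re-reading the texts."""
    return index.get(word.lower(), [])
```

Building the index costs one full pass, which is why it only pays off if you keep coming back to the same corpus.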
I got a "General Protection Fault" message
when running Concord.
The GPF message means that Concord tried to tread
into some memory space it wasn't allowed access to. This is a pain, and even
MS Word, Eudora, Windows Help and other programs occasionally do it.
Possibility 1) A GPF is most likely to happen if there
is a straightforward error in the program. In the current OUP version this
is not very likely in ordinary use; I would have had lots of angry messages
since launch date if this usually happened! Solution: get a new version from
my website and try again.
Possibility 2) A GPF could be caused by a shortage of
memory whilst Concord is trying to do its job. This might happen if there
were other sizeable programs such as MS Word loaded up at the same time,
or else if there was something wrong with the hard disk. Windows uses the
hard disk as a storage area for memory when it runs out of room on the chips
in the machine. Solution: re-boot the machine so as to start with a nice
fresh reset setup. Then run Scandisk to check whether the hard disk is screwed
up at all and correct any faults which appear. Ensure there is at least 25
MB of room on the hard disk. Now run WordSmith again.
Possibility 3) There might be something special about
the machine in question, especially if it is somehow different from the usual
setup commonly found. I developed WordSmith Tools 1-3 using Pentium PCs running
Win 95B, 98, NT and 2000. Things which might be special include: non-standard
operating system, networked setup, unusual non-Intel CPU, very old CPU (e.g.
386) or very latest CPU, laptop machine. Solution: try it on another machine
if possible.
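The free-space check in Possibility 2 can be done by eye in File Manager or Explorer, but for what it's worth here is how it might be scripted in Python. This is a sketch outside WordSmith; the 25 MB threshold is the figure given above, and the function names are hypothetical.

```python
import shutil

MIN_FREE_MB = 25  # the minimum suggested above


def free_megabytes(path: str = ".") -> float:
    """Free space, in MB, on the disk that holds `path`."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def has_enough_room(path: str = ".", minimum_mb: float = MIN_FREE_MB) -> bool:
    """True if the disk holding `path` has at least `minimum_mb` MB free."""
    return free_megabytes(path) >= minimum_mb
```

Point it at the drive WordSmith is installed on, for instance has_enough_room("c:\\").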
WordList
I got a "List Index Out of Bounds" Message when trying
to use the Match List function.
What the error means is that when trying to build
up a list of words, WS3 has done something absurd, such as trying to
read item -1 or item 5 if there are only 4 in the list. That is why
the list index is out of bounds. The result is that the hourglass (which
starts when WS3 begins to go through the list) doesn't disappear as
it would if it managed to finish processing the list. Unfortunately
this bug means that Match List from a text file doesn't work, though
matching from a template does (explained in the Help).
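For the curious, this is the kind of mistake involved, shown in Python rather than in WS3's own code. A list of 4 items only has valid positions 0 to 3, so anything else has to be refused. Python happens to accept -1 as meaning "the last item", so the check below is written out explicitly to match the behaviour described. The function name and the word list are hypothetical.

```python
WORDS = ["alpha", "beta", "gamma", "delta"]  # 4 items: valid positions are 0..3


def item_at(items: list, index: int):
    """Return items[index], or None if index is outside 0..len(items)-1."""
    if 0 <= index < len(items):
        return items[index]
    return None  # out of bounds: refuse rather than crash


def unchecked(items: list, index: int):
    """No bounds check: raises IndexError when index is past the end."""
    return items[index]
```

A program that checks first, as item_at does, can finish its work tidily; one that doesn't stops dead with the error.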