Scripts
This option runs a pre-prepared script. In the case shown, sample_script.txt has requested two concordance operations, a word list, and a keywords analysis, and WordSmith has completed 3% of a further WordList job. The whole process ran without any intervention from the user, using the default settings currently in operation.
The syntax is as shown in the examples below. (There is a sample_script.txt file in your Documents\wsmith7 folder.) Give the tool required first, then the necessary parameters in any order, with each value surrounded by double quotes. Commands are not case sensitive. Each command must be on one line.
examples
concord corpus="x:\text\dickens\hard_times.txt" node="hard" output="c:\temp\hard.cnc"
made a concordance of the hard_times.txt text file looking for the search-word hard and saved results in c:\temp\hard.cnc
concord corpus="x:\text\dickens\hard_times.txt" node="c:\temp\sws.txt" output="c:\temp\outputs.txt" 1_at_a_time="true"
made a concordance of the same text file looking for each search-word in the sws.txt file, counted the number of hits and saved results in c:\temp\outputs.txt
wordlist corpus="x:\text\shakes\oll\txt\tragedies\*.txt" output="c:\temp\shakespeare.lst"
made a word list of all the .txt text files in a folder of Shakespeare tragedies (not including sub-folders) and saved it in its default format.
keywords refcorpus="j:\temp\BNC.lst" wordlist="c:\temp\shakespeare.lst" output="j:\temp\shakespeare.kws"
made a key words list of that word list compared with a BNC word list and saved it.
wordlist-index corpus="x:\text\shakes\oll\txt\tragedies\*.txt" clusters = "3" output="c:\temp\shakespeare_tragedies_3s.lst" excluded_subfolders="*_characters" include_subfolders="true"
computed an index of Shakespeare tragedies, looked at sub-folders but excluded any containing _characters, then computed clusters 3 words in length and saved the word list as shakespeare_tragedies_3s.lst.
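If you generate scripts from another program, the tool-then-parameters pattern in the examples above can be produced mechanically. The following Python sketch is only an illustration: the function names script_line and script_text are hypothetical, and it assumes every parameter is written as name="value" with no double quotes inside the value.

```python
def script_line(tool, params):
    """Build one script command: the tool name, then name="value" pairs."""
    parts = [tool]
    for key, value in params.items():
        value = str(value)
        # Assumption: values cannot contain quotes, and a command must fit on one line.
        if '"' in value or "\n" in value or "\r" in value:
            raise ValueError(f"value for {key} cannot contain quotes or line breaks")
        parts.append(f'{key}="{value}"')
    return " ".join(parts)


def script_text(commands):
    """Join several (tool, params) pairs into script text, one command per line."""
    return "".join(script_line(tool, params) + "\n" for tool, params in commands)


commands = [
    ("concord", {"corpus": r"x:\text\dickens\hard_times.txt",
                 "node": "hard",
                 "output": r"c:\temp\hard.cnc"}),
    ("wordlist", {"corpus": r"x:\text\shakes\oll\txt\tragedies\*.txt",
                  "output": r"c:\temp\shakespeare.lst"}),
]
print(script_text(commands), end="")
```

The printed text can be saved as a script file. Which text encoding WordSmith expects for that file is not stated here, so check a saved script against sample_script.txt before relying on it.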
other output formats (e.g. Excel .xlsx)
TXT_format="true"
Excel_format="true"
RTF_format="true"
XML_format="true"
Each of these saves the output as a .txt, .xlsx, .rtf, or .xml file instead of the tool's default format, for use with other software. RTF_format is slower than the others.
other options: 1_at_a_time, fetch, show
1_at_a_time="true"
fetch="N" (N = a number)
show="N"
If 1_at_a_time is true, WordList exports separate results for each text file. The output must then be a folder, not a file name.
If 1_at_a_time is true, Concord will read search words from a text file and save summary results:
concord corpus="x:\text\dickens\hard_times.txt" node="c:\temp\sws.txt" output="c:\temp\outputs.txt" 1_at_a_time="true"
produced this in c:\temp\outputs.txt:
x:\text\dickens\hard_times.txt
hard 50
soft 3
mean 54
empty 9
fred 0
book 13
north* 4
south* 2
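A summary file like the listing above is easy to read back into another program. The Python sketch below is a hedged illustration, not part of WordSmith: the function name parse_summary is hypothetical, and it assumes the file holds a single corpus path on the first line followed by one search-word and one hit count per line, separated by whitespace.

```python
def parse_summary(text):
    """Split a 1_at_a_time summary into (corpus_path, {search_word: hits})."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty summary")
    corpus = lines[0]          # assumption: first line names the text searched
    counts = {}
    for ln in lines[1:]:
        # Split on the last run of whitespace so the count is always the final field.
        word, hits = ln.rsplit(None, 1)
        counts[word] = int(hits)
    return corpus, counts


sample = r"""x:\text\dickens\hard_times.txt
hard 50
soft 3
mean 54
empty 9
fred 0
book 13
north* 4
south* 2
"""

corpus, counts = parse_summary(sample)
no_hits = [word for word, hits in counts.items() if hits == 0]
print(corpus, counts["hard"], no_hits)
```

With the counts in a dictionary it is straightforward to list the search-words that were not found, as no_hits does here.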
concord corpus="x:\text\dickens\hard_times.txt" node="c:\temp\sws.txt" output="c:\temp\outputs.txt" 1_at_a_time="true" fetch="5" show="2"
using the same text produced this:
fetch tells WordSmith how many concordance lines to find; show tells it how many to show as a sample.
collocate scripts
It is also possible to run a script requesting the collocates of each word in a word-list. This syntax
wordlist collocates of "c:\temp\shakespeare.lst" output="c:\temp\shakespeare\collocates"
tells WordSmith to compute the collocates of each word in the shakespeare.lst word-list and save the results as plain text files, one per word, in the c:\temp\shakespeare\collocates folder. The texts processed are the same text files used when the word list was created, so they must still be present on disk. Settings affecting the process are shown below. The first 6 concern the words from the word-list. Min. in collocate-list is the number of collocates of each word-list word needed (here 10) for processing to be reported. Min. total column refers to the number in the total column of a collocation display.
Results look like this:
Here they're incomplete because I pressed the Stop button.
Each of these lists has the collocates output much as in a collocates display, but with the relationships also computed.
The process only saves results where the settings shown above are met and where the relationships also meet the requirements as in the WSConcgram settings.
list of syntax terms
term | use
1_at_a_time | one text-file at a time
clusters | number of words in a cluster
collocates | computes collocates
Concord | use Concord tool
corpus | specifies your text files
Excel_format | save as .xlsx
excluded_subfolders | do not include specific sub-folders
fetch | limits the number of concordance lines
include_subfolders | include sub-folders of the main corpus folder in getting text files
KeyWords | use KeyWords tool
node | specifies the search-word
output | specifies where to save results
refcorpus | reference corpus
RTF_output | saves as .rtf file
show | limits number of concordance lines to show in output
TXT_format | save output as a .txt file
WordList | use WordList tool
WordList-index | make a WordList index
XML_format | output is saved in .xml format
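Before running a long script it can help to check each line against the terms listed above. The Python sketch below is an assumption-laden illustration rather than WordSmith's own parser: check_line is a hypothetical name, it accepts only the name="value" form (so the "wordlist collocates of" syntax would be reported as unrecognised), and its parameter set is taken from the terms and examples on this page.

```python
import re

TOOLS = {"concord", "wordlist", "keywords", "wordlist-index"}
PARAMS = {
    "corpus", "node", "output", "refcorpus", "wordlist", "clusters",
    "excluded_subfolders", "include_subfolders", "1_at_a_time", "fetch", "show",
    "txt_format", "excel_format", "xml_format",
    # This page shows both spellings for RTF, so the sketch accepts either.
    "rtf_format", "rtf_output",
}

# name, optional spaces around "=", then a double-quoted value
PAIR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def check_line(line):
    """Return a list of problems found in one script line (empty if none)."""
    problems = []
    tool, _, rest = line.strip().partition(" ")
    # Terms are not case sensitive, so compare in lower case.
    if tool.lower() not in TOOLS:
        problems.append(f"unknown tool: {tool}")
    for key, _value in PAIR.findall(rest):
        if key.lower() not in PARAMS:
            problems.append(f"unknown parameter: {key}")
    # Anything left after removing the valid pairs is unquoted or malformed.
    leftover = PAIR.sub("", rest).strip()
    if leftover:
        problems.append(f'text outside name="value" pairs: {leftover}')
    return problems


print(check_line('Concord CORPUS="a.txt" NODE="hard" OUTPUT="b.cnc"'))
print(check_line('concord corpus="a.txt" nod="hard"'))
```

A script file can be checked by calling check_line on each non-empty line and printing any problems together with the line number.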
See also: drag and drop