Contents of this chapter:

words and labels
string variations

words and labels

"Labels" are the tags inserted by the annotators who prepared the corpus (e.g., "IP", "CONJ", "N"). "Words" are the original words of the text that was parsed. Every node in the sentence tree has a label, and in the leaf nodes the label is paired with a word. CorpusSearch can search on labels, on words, or on combinations of the two.
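
The distinction between labels and words can be illustrated with a small Python sketch. This is not CorpusSearch code; the function names (parse, leaves, labels) and the tuple representation are assumptions made for illustration only. It reads a bracketed tree like the ones shown in this chapter (with the node numbers left out) and separates the labels from the label/word pairs found at the leaves.

```python
import re

def parse(text):
    """Parse a bracketed tree into nested (label, word_or_children) tuples.

    Hypothetical helper for illustration; not part of CorpusSearch.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        if tokens[pos] != "(":
            # Leaf node: the label is paired with a word.
            word = tokens[pos]
            pos += 1
            assert tokens[pos] == ")"
            pos += 1
            return (label, word)
        # Internal node: the label is followed by child nodes.
        children = []
        while tokens[pos] == "(":
            children.append(node())
        assert tokens[pos] == ")"
        pos += 1
        return (label, children)

    return node()

def leaves(tree):
    """Yield (label, word) pairs for every leaf node, left to right."""
    label, body = tree
    if isinstance(body, str):
        yield (label, body)
    else:
        for child in body:
            yield from leaves(child)

def labels(tree):
    """Yield the label of every node, leaf or not, in pre-order."""
    label, body = tree
    yield label
    if not isinstance(body, str):
        for child in body:
            yield from labels(child)

TREE = ("(CP-ADV (C that) (IP-SUB (NP-SBJ (PRO$ youre) (N herte)) "
        "(MD shal) (BE be) (VAN pleasyd)))")
tree = parse(TREE)
leaf_pairs = list(leaves(tree))
all_labels = list(labels(tree))
```

Here every node contributes a label to all_labels, but only the leaf nodes contribute a word to leaf_pairs.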

string variations

CorpusSearch uses case-sensitive, character-by-character string matching to match search-function arguments to strings found in the input. Therefore, spelling variations and upper-case/lower-case variations must be listed explicitly (usually with an argument list). For instance, this query searches for a complementizer whose associated text is "that" or "That":

(C iDominates that|That)

and finds sentences such as this:

and he shalle do yow remedy, that youre herte shal be pleasyd. '

    12 CP-ADV: 13 C that

      (12 CP-ADV (13 C that)
                 (14 IP-SUB
                            (15 NP-SBJ (16 PRO$ youre) (17 N herte))
                            (18 MD shal)
                            (19 BE be)
                            (20 VAN pleasyd)))
      (ID CMMALORY,3.47))
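
The effect of case-sensitive matching against an argument list can be sketched in Python. This is a minimal illustration, not the CorpusSearch implementation: the helper names (matches, find_leaves) and the list of (label, word) pairs are assumptions, and the sketch handles only literal alternatives separated by "|".

```python
# Leaf nodes of the example sentence as (label, word) pairs (illustrative data).
LEAVES = [("C", "that"), ("PRO$", "youre"), ("N", "herte"),
          ("MD", "shal"), ("BE", "be"), ("VAN", "pleasyd")]

def matches(argument, text):
    """Return True if text equals one of the |-separated alternatives.

    The comparison is exact and case-sensitive, so "That" and "that"
    are different strings and each must be listed explicitly.
    """
    return text in argument.split("|")

def find_leaves(leaves, label_arg, word_arg):
    """Return the leaves whose label and word both match their argument."""
    return [(label, word) for label, word in leaves
            if matches(label_arg, label) and matches(word_arg, word)]

hits = find_leaves(LEAVES, "C", "that|That")   # lower-case "that" is listed
no_hits = find_leaves(LEAVES, "C", "THAT")     # "THAT" is a different string
```

Under these assumptions, the first call finds the complementizer "that", while the second finds nothing because no alternative matches the spelling in the text exactly.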