Memory based tagger generator
mbtg -T <filename> -s <setting filename>
or
mbtg [options]
This program generates, from a tagged corpus, all the files needed to tag a text with mbt.
-h or --help
show help
-T <tagged training corpus file>
or
-E <enriched tagged training corpus file>
All further options have reasonable defaults, so only experienced users need to set them. See the mbt manual for more details.
-s settingsfile
mbtg creates this file, which can then be used to run mbt with minimal effort (for example: mbt -s settings -T somefile).
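The generate-then-tag workflow can be sketched as follows. The file names (corpus.tagged, corpus.settings) are hypothetical, and the corpus layout used here (one word and its tag per line, each sentence closed by the default '<utt>' delimiter) is an assumption; check the mbt manual for the exact format. The mbtg and mbt commands only run when both programs are installed.

```shell
# Write a toy tagged corpus. Assumed layout: one "word tag" pair per line,
# with every sentence closed by the default delimiter <utt>.
cat > corpus.tagged <<'EOF'
The DT
cat NN
sleeps VBZ
<utt>
Dogs NNS
bark VBP
<utt>
EOF

if command -v mbtg >/dev/null 2>&1 && command -v mbt >/dev/null 2>&1; then
    # Generate all files mbt needs, plus the settings file named by -s.
    mbtg -T corpus.tagged -s corpus.settings
    # Run mbt with the generated settings file (here on the same tagged file).
    mbt -s corpus.settings -T corpus.tagged
else
    echo "mbtg/mbt not found; skipping the run" >&2
fi
```

A corpus this small is only useful for checking that the commands are wired up correctly; real training needs a much larger tagged corpus.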
-p pattern
the pattern for known words (default ddfa)
-P pattern
the pattern for unknown words (default dFapsss)
-% <number>
filter threshold for ambitag construction (default 5%)
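To illustrate what such a threshold could do, here is a small sketch. It assumes that an ambitag is the combined set of tags observed for a word, and that tags accounting for less than the threshold percentage of a word's occurrences are left out. This is an interpretation, not mbtg's actual implementation; the counts file, the "-" separator, and the ">=" comparison are all hypothetical choices made for the example.

```shell
# Hypothetical per-word tag counts: word, tag, number of occurrences.
cat > counts.txt <<'EOF'
run VB 60
run NN 38
run JJ 2
EOF

threshold=5   # percentage, mirroring the documented default of 5%

# Keep only the tags whose share of the word's occurrences reaches the
# threshold, and join the survivors into one ambitag per word.
ambitags=$(awk -v t="$threshold" '
    { word[NR] = $1; tag[NR] = $2; n[NR] = $3; total[$1] += $3 }
    END {
        for (i = 1; i <= NR; i++) {
            w = word[i]
            if (100 * n[i] / total[w] >= t) {
                if (w in ambi) ambi[w] = ambi[w] "-" tag[i]
                else ambi[w] = tag[i]
            }
        }
        for (w in ambi) print w, ambi[w]
    }
' counts.txt)

echo "$ambitags"
```

Under these assumptions, JJ (2% of the occurrences of "run") is filtered out and the remaining ambitag is VB-NN.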
-l <lexiconfile>
-L <file with list of frequent words>
-r <ambitagfile>
-k <known words case base>
-u <unknown words case base>
-K <known words instances file>
-U <unknown words instances file>
-V or --version
show version info
-e <sentence delimiter> (default '<utt>')
-X
keep the intermediate files
-O<timbl options>
(Note: there is NO SPACE between O and the options)
<options>
classifier options for both the known and unknown words case bases
K: <options>
classifier options for the known words case base
U: <options>
classifier options for the unknown words case base
Valid timbl options are: a d k m q v w x -
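A hedged example of passing classifier options per case base. The particular values are only illustrative: in Timbl, -a1 selects IGTREE, -a0 selects IB1, -mM selects the MVDM metric, and -k3 uses three nearest neighbours. The file names are hypothetical, and the exact quoting mbtg expects should be checked against the mbt manual.

```shell
# Classifier options, using only letters from the valid set (a d k m q v w x).
# K: applies to the known words case base, U: to the unknown words case base.
timbl_opts='K: -a1 U: -a0 -mM -k3'

# No space between -O and its value; quoting keeps it a single argument.
opt_arg="-O$timbl_opts"
echo "mbtg -T corpus.tagged -s corpus.settings \"$opt_arg\""

# Only run when mbtg is installed and the (hypothetical) corpus exists.
if command -v mbtg >/dev/null 2>&1 && [ -f corpus.tagged ]; then
    mbtg -T corpus.tagged -s corpus.settings "$opt_arg"
fi
```

Keeping the options in one quoted variable makes it harder to accidentally introduce a space after -O or to let the shell split the value into separate arguments.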
Bugs: possibly
Ko van der Sloot
Antal van den Bosch