pocketsphinx_continuous: Run speech recognition in continuous listening mode

DESCRIPTION

This program opens the audio device and waits for speech. When it detects an utterance, it performs speech recognition on it.

-adcdev: Device name for audio input (platform-specific)
-adchdr: Size of audio file header in bytes (headers are ignored)
-adcin: Input is raw audio data
-agc: Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')
-agcthresh: Initial threshold for automatic gain control
-allphone: Do phoneme recognition
-alpha: Preemphasis parameter
-backtrace: Print back trace of recognition results
-beam: Beam width applied to every frame in Viterbi search (smaller values mean wider beam)
-bestpath: Run bestpath (Dijkstra) search over word lattice (3rd pass)
-bestpathlw: Language model probability weight for bestpath search
-cachesen: Cache senone scores from first pass search
-cep2spec: Input is cepstral files, output is log spectral files
-cepdir: Input files directory (prefixed to filespecs in control file)
-cepext: Input files extension (suffixed to filespecs in control file)
-ceplen: Number of components in the input feature vector
-cmn: Cepstral mean normalization scheme ('current', 'prior', or 'none')
-cmninit: Initial values (comma-separated) for cepstral mean when 'prior' is used
-compallsen: Compute all senone scores in every frame (can be faster when there are many senones)
-ctl: Control file listing utterances to be processed
-ctlcount: Number of utterances to be processed (after skipping -ctloffset entries)
-ctlincr: Do every Nth line in the control file
-ctloffset: Number of utterances at the beginning of -ctl file to be skipped
-dict: Main pronunciation dictionary (lexicon) input file
-dither: Add 1/2-bit noise
-doublebw: Use double bandwidth filters (same center freq)
-dsratio: Frame GMM computation downsampling ratio
-fbtype: Filter bank type: mel_scale or log_linear
-fdict: Noise word pronunciation dictionary input file
-feat: Feature stream type, depends on the acoustic model
-fillpen: Filler word transition penalty
-frate: Frame rate
-fsg: Finite state grammar
-fsgbfs: Force backtrace from FSG final state
-fsgctlfn: finite state grammar control file
-fsgusealtpron: Use alternative pronunciations for FSG
-fsgusefiller: (FSG Mode (Mode 2) only) Insert filler words at each state.
-fwd3g: Use trigrams in first pass search
-fwdflat: Run forward flat-lexicon search over word lattice (2nd pass)
-fwdflatbeam: Beam width applied to every frame in second-pass flat search
-fwdflatefwid: Minimum number of end frames for a word to be searched in fwdflat search
-fwdflatlw: Language model probability weight for flat lexicon (2nd pass) decoding
-fwdflatsfwin: Window of frames in lattice to search for successor words in fwdflat search
-fwdflatwbeam: Beam width applied to word exits in second-pass flat search
-fwdtree: Run forward lexicon-tree search (1st pass)
-hmm: Directory containing acoustic model files
-hyp: Recognition output file name
-hypseg: Recognition output with segmentation file name
-input_endian: Endianness of input data, big or little, ignored if NIST or MS Wav
-kdmaxbbi: Maximum number of Gaussians per leaf node in kd-Trees
-kdmaxdepth: Maximum depth of kd-Trees to use
-kdtree: kd-Tree file for Gaussian selection
-latsize: Lattice size
-lifter: Length of sin-curve for liftering, or 0 for no liftering.
-live: Get input from audio hardware
-lm: Word trigram language model input file
-lmctl: Specify a set of language models
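
The -beam entry above notes that smaller values mean a wider beam, which can look backwards at first. The sketch below is a conceptual illustration only, not the decoder's code: it assumes the beam is a probability ratio applied to natural-log scores, and the state names, scores, and beam values are made up for the example.

```python
import math


def beam_prune(log_scores, beam):
    """Keep hypotheses whose log score lies within log(beam) of the best one."""
    best = max(log_scores.values())
    # log(beam) is negative for beam < 1, so a smaller beam lowers the
    # threshold and lets more hypotheses survive.
    threshold = best + math.log(beam)
    return {name: s for name, s in log_scores.items() if s >= threshold}


# Hypothetical per-frame log scores for three competing states.
scores = {"a": -10.0, "b": -60.0, "c": -150.0}

narrow = beam_prune(scores, 1e-20)  # larger value, narrower beam
wide = beam_prune(scores, 1e-48)    # smaller value, wider beam

print(sorted(narrow))
print(sorted(wide))
```

With these made-up numbers the larger beam value keeps only the best state, while the smaller value also keeps the second one. A wider beam generally trades speed for a lower risk of pruning the correct hypothesis.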

The -hmm and -dict arguments are always required. Either -lm or -fsg is required, depending on whether you are using a statistical language model or a finite-state grammar.
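
The requirement above can be checked before launching the program. The following is a minimal sketch under stated assumptions: the helper name build_command and all file paths are hypothetical, and rejecting a call that passes both -lm and -fsg is a choice made for this example, not behaviour documented here.

```python
import shlex


def build_command(hmm, dict_path, lm=None, fsg=None, extra=()):
    """Assemble an argument list, enforcing the required options."""
    if not hmm or not dict_path:
        raise ValueError("-hmm and -dict are always required")
    # Assumption of this sketch: exactly one of -lm / -fsg must be given.
    if (lm is None) == (fsg is None):
        raise ValueError("give exactly one of -lm or -fsg")
    cmd = ["pocketsphinx_continuous", "-hmm", hmm, "-dict", dict_path]
    cmd += ["-lm", lm] if lm is not None else ["-fsg", fsg]
    cmd += list(extra)
    return cmd


# Hypothetical model paths; substitute the files of your own model.
cmd = build_command("model/hmm", "model/words.dic", lm="model/words.lm")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as is. Further options from the list above can be supplied through extra, for example extra=["-hyp", "result.txt"].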

pocketsphinx_continuous (1)
