Nguyen Huynh edited this page Apr 21, 2015 · 9 revisions

Welcome to the iss wiki!

ISS is a collection of scripts for building hidden Markov models (HMMs), multi-layer perceptrons (MLPs) and the like for speech processing.

Configuration

ISS scripts typically take no command-line arguments. Instead, there is a single top-level configuration file, Config.sh, in which you set environment variables, and a single working configuration file, bin/config.sh, which converts these environment variables into shell variables.

The top level contains Config.sh and a few other scripts that source Config.sh and run things (example top-level scripts are in etc/examples). These "top-level scripts" are the ones you edit to customise ISS for whatever you want to do. Config.sh imposes a consistent feel; e.g., it immediately changes directory into the working directory specified as the first argument, so

Extract.sh my-dir

creates and changes to ./my-dir before doing anything.
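
The working-directory convention can be sketched as below. This is only an illustration of the behaviour described above, not the actual contents of Config.sh; the function name enter_workdir and the fallback to the current directory are assumptions.

```sh
# Hypothetical helper: enter_workdir is not part of ISS. It mirrors the
# described behaviour: create the directory named by the first argument,
# then change into it.
enter_workdir() {
    dir=${1:-.}                  # assumption: default to the current directory
    mkdir -p "$dir" || return 1  # create it if it does not exist yet
    cd "$dir" || return 1        # everything after this runs inside it
}
```

A top-level script could then start with something like `enter_workdir "$1"`, so that all later output lands in the chosen directory.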

The bin directory contains bin/config.sh and other scripts that source bin/config.sh and do the actual work. These "working scripts" should remain task-independent. The working scripts also use zsh functions in lib/zsh. Like Config.sh, bin/config.sh imposes a consistent feel.

The top-level scripts communicate with the working scripts using environment variables, so they can be written in any scripting language. See bin/config.sh for a (long) list of the environment variables that can be used.
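
A minimal sketch of this hand-off follows, assuming two made-up variables. EXAMPLE_N_JOBS, EXAMPLE_OUT_DIR and load_example_config are illustrative names only and not real ISS names; the real list is in bin/config.sh.

```sh
# Hypothetical names throughout; consult bin/config.sh for the real ones.
load_example_config() {
    # Take each value from the environment if set, else use a default,
    # and store it in an ordinary shell variable for the working script.
    nJobs=${EXAMPLE_N_JOBS:-1}
    outDir=${EXAMPLE_OUT_DIR:-.}
}
```

A top-level script in any language would then only need to set EXAMPLE_N_JOBS in the environment before invoking a working script; nothing is passed on the command line.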

Language

ISS is written mainly in zsh, a Bourne-like shell similar to bash or ksh. Its main advantage is that it handles arrays like csh. Other parts of ISS are written in Python or Ruby. If you like Perl, consider Ruby instead.

Exemplar

There is an exemplar called iss-wsj on GitHub that uses the WSJ databases. It trains and tests WSJ systems using both the si-84 and si-284 training sets, and the November '92 (si-et-05) and November '93 (h2-p0) test sets.

We recommend that you check out this exemplar and adapt it to suit your database.

The basic sequence of operations for HMM/GMM training is based on the HTK book:

extract.sh
init-train.sh
flat-start.sh
fix-silence.sh
align.sh
reestimate-mono.sh
init-tri.sh
reestimate-tri.sh
tie.sh
mix-up.sh
synth-full.sh

Then for decoding:

extract.sh
decode.sh
score.sh

Note that there is also a decoder-dependent grammar construction stage.
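
The two sequences above could be chained by a small driver along these lines. It is a sketch rather than anything shipped with ISS: run_stages, run_training, run_decoding and the DRY_RUN switch are hypothetical, and it assumes each stage is an executable on the PATH that takes no arguments. The decoder-dependent grammar construction stage is left out because it is not named here.

```sh
# Hypothetical driver: run each stage in order, stop at the first failure.
run_stages() {
    for stage in "$@"; do
        echo "== $stage"
        # DRY_RUN is an illustrative switch, not an ISS variable:
        # when it is set, stages are only listed, not executed.
        if [ -z "${DRY_RUN:-}" ]; then
            "$stage" || { echo "stage failed: $stage" >&2; return 1; }
        fi
    done
}

# Stage order copied from the training list above.
run_training() {
    run_stages extract.sh init-train.sh flat-start.sh fix-silence.sh \
        align.sh reestimate-mono.sh init-tri.sh reestimate-tri.sh \
        tie.sh mix-up.sh synth-full.sh
}

# Stage order copied from the decoding list above.
run_decoding() {
    run_stages extract.sh decode.sh score.sh
}
```

Running `DRY_RUN=1 run_training` lists the stages without executing them, which is a cheap way to check the order before committing to a long training run.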

Further reading

"Lord Blackadder. Our foremost cartographers have given us a map of the area you'll be traversing."

"But it's blank!"

"Yes, they'd like you to fill it in as you go."

Blackadder, series 2

It's a wiki; content welcome.
