How do I Read Things in BrainScript
Chris Basoglu edited this page Apr 12, 2017 · 5 revisions
The HTKMLFReader, which reads Master Label Files (MLF) produced by the Hidden Markov Model Toolkit (HTK), can be configured to read multiple label streams. Each label stream is declared as its own named block inside the reader section. The example below is taken from TIMIT_TrainMultiTask_ndl_deprecated.cntk in the Examples directory:
reader = {
    readerType = "HTKMLFReader"
    ...
    labels = {
        mlfFile = "$MlfDir$/TIMIT.train.align_cistate.mlf.cntk"
        labelMappingFile = "$MlfDir$/TIMIT.statelist"
        labelDim = 183
        labelType = "category"
    }
    regions = {
        mlfFile = "$MlfDir$/TIMIT.train.align_dr.mlf.cntk"
        labelDim = 8
        labelType = "category"
    }
}
See the description at Understanding and Extending Readers and look for the section describing how to "compose several data deserializers".
Use the composite reader to specify the two files, one for labels and one for features. Make sure that the sequence IDs in the labels file match those in the features file.
reader = [
    …
    deserializers = (
        [
            type = "CNTKTextFormatDeserializer" ; module = "CNTKTextFormatReader"
            file = "$RootDir$/features.txt"
            input = [ features = [...]]
        ]:[
            type = "CNTKTextFormatDeserializer" ; module = "CNTKTextFormatReader"
            file = "$RootDir$/labels.txt"
            input = [ labels = [...]]
        ]
    )
]
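Because the two deserializers are joined on sequence IDs, it can help to compare the IDs in both files before training. The following is a minimal Python sketch and not part of CNTK. It assumes that every line in both files begins with an explicit sequence ID placed before the first `|`, and the helper names `read_sequence_ids` and `compare_sequence_ids` are hypothetical.

```python
from typing import List, Set, Tuple


def read_sequence_ids(path: str) -> Set[str]:
    """Collect the sequence IDs that open each line of a text-format file.

    Assumption: every non-empty line starts with an explicit sequence ID,
    followed by one or more "|name values" fields.
    """
    ids: Set[str] = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Everything before the first '|' is treated as the sequence ID.
            prefix = line.split("|", 1)[0].strip()
            if prefix:
                ids.add(prefix)
    return ids


def compare_sequence_ids(
    features_path: str, labels_path: str
) -> Tuple[List[str], List[str]]:
    """Return (IDs only in the features file, IDs only in the labels file)."""
    feature_ids = read_sequence_ids(features_path)
    label_ids = read_sequence_ids(labels_path)
    return sorted(feature_ids - label_ids), sorted(label_ids - feature_ids)
```

Calling `compare_sequence_ids("features.txt", "labels.txt")` returns two lists. If both are empty, every sequence ID appears in both files. Any ID listed exists in only one of them and is worth fixing before the files are handed to the composite reader.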