State-of-the-art Acoustic Modeling (AM) techniques use long short-term memory (LSTM) networks and apply multiple phases of training on large amounts of labeled acoustic data: initial cross-entropy (CE) or connectionist temporal classification (CTC) training, followed by sequence-discriminative training such as state-level Minimum Bayes Risk (sMBR). Recently, there has been considerable interest in applying