In my experience, Viterbi training alone is often enough to get reasonable accuracy, at least in speech recognition.
Rather than running the more costly Baum-Welch training, you can spend your time better elsewhere, e.g. by using deep neural networks instead of GMMs or by collecting more data.
Interesting, I've never tried Viterbi training. Maybe it is worth implementing after all. I plan to do a hybrid DNN-HMM (or whatever it is called now) with pylearn2 in a follow-up post.
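
For reference, here is a rough sketch of what Viterbi training (sometimes called hard EM) could look like. It is only an illustration under simplifying assumptions: a discrete-observation HMM with random initialization rather than the GMM or DNN emissions used in speech recognition, and the names `viterbi_path`, `viterbi_train`, and the `smooth` parameter are made up for this example. Each iteration decodes the single best state path per sequence and re-estimates the parameters from hard counts along that path, whereas Baum-Welch would use soft posterior counts from the forward-backward algorithm.

```python
import numpy as np

def viterbi_path(log_pi, log_A, log_B, obs):
    """Return the most likely state sequence for one observation sequence."""
    obs = np.asarray(obs)
    T, N = len(obs), len(log_pi)
    delta = np.empty((T, N))            # best log score ending in each state
    back = np.zeros((T, N), dtype=int)  # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: state i -> j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

def viterbi_train(sequences, n_states, n_symbols, n_iter=10, seed=0, smooth=1e-2):
    """Hard-EM estimate of (pi, A, B) for a discrete HMM (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Random row-stochastic initialization (an assumption of this sketch).
    pi = rng.dirichlet(np.ones(n_states))
    A = rng.dirichlet(np.ones(n_states), size=n_states)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)
    for _ in range(n_iter):
        # Start the counts at a small constant so no probability becomes zero.
        pi_c = np.full(n_states, smooth)
        A_c = np.full((n_states, n_states), smooth)
        B_c = np.full((n_states, n_symbols), smooth)
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
        for obs in sequences:
            obs = np.asarray(obs)
            path = viterbi_path(log_pi, log_A, log_B, obs)
            # Hard counts along the single best path.
            pi_c[path[0]] += 1
            np.add.at(A_c, (path[:-1], path[1:]), 1)
            np.add.at(B_c, (path, obs), 1)
        # Normalize the counts into probabilities.
        pi = pi_c / pi_c.sum()
        A = A_c / A_c.sum(axis=1, keepdims=True)
        B = B_c / B_c.sum(axis=1, keepdims=True)
    return pi, A, B
```

Calling `viterbi_train([np.array([0, 0, 1, 1, 0, 1]), np.array([1, 1, 0, 0])], n_states=2, n_symbols=2)` returns an initial distribution, a transition matrix, and an emission matrix whose rows each sum to one. The result depends on the random initialization, so treat it as a starting point rather than a tuned recipe.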