心胸决定格局,眼界决定境界...

如何用kaldi做孤立词识别二

基本模型没有变化,主要是调参,配置:

%WER     65%  下降到了     15%

后面再继续优化...

Graph compilation finish!
steps/decode.sh --nj 1 --cmd utils/run.pl exp/mono0/graph_tgpr data/waves_test exp/mono0/decode_waves_test
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd utils/run.pl exp/mono0/graph_tgpr exp/mono0/decode_waves_test
steps/diagnostic/analyze_lats.sh: see stats in exp/mono0/decode_waves_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,3,9) and mean=3.9
steps/diagnostic/analyze_lats.sh: see stats in exp/mono0/decode_waves_test/log/analyze_lattice_depth_stats.log
score.sh works!
exp/mono0/decode_waves_test
%WER 15.00 [ 3 / 20, 2 ins, 1 del, 0 sub ] exp/mono0/decode_waves_test/wer_11

 

200_001_001 espresso
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_001 is -9.08665 over 118 frames.
200_001_002 lungo
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_002 is -9.24863 over 87 frames.
200_001_003 extralungo
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_003 is -8.80181 over 121 frames.
200_001_004 Cappuccino no
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_004 is -9.10243 over 83 frames.
200_001_005 lattemakiato
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_005 is -9.09944 over 120 frames.
200_001_006 bluemountain
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_006 is -8.92891 over 116 frames.
200_001_007 ok
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_007 is -10.0784 over 94 frames.
200_001_008 yes
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_008 is -9.52974 over 46 frames.
200_001_009 no
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_009 is -9.06832 over 68 frames.
200_001_010 thankyou
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_001_010 is -9.38154 over 73 frames.
200_002_001 espresso
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_001 is -8.87652 over 99 frames.
200_002_002 lungo
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_002 is -8.98032 over 85 frames.
200_002_003 extralungo
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_003 is -9.24635 over 123 frames.
200_002_004 Cappuccino no
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_004 is -9.10968 over 75 frames.
200_002_005 lattemakiato
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_005 is -8.68037 over 117 frames.
200_002_006 bluemountain
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_006 is -9.34412 over 110 frames.
200_002_007
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_007 is -9.84015 over 64 frames.
200_002_008 yes
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_008 is -9.53148 over 77 frames.
200_002_009 no
LOG (gmm-latgen-faster[5.2.124~1396-70748]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance 200_002_009 is -9.62339 over 51 frames.
200_002_010 thankyou

 

posted @ 2017-09-29 16:31  WELEN  阅读(2390)  评论(0编辑  收藏  举报