
Empty Hypotheses with n_best-oracle Decoding in ctc_decode.py #1870

Open
moadel2002 opened this issue Jan 28, 2025 · 2 comments

Comments

@moadel2002

When I decode using the nbest-oracle method in ctc_decode.py on my dataset, I consistently get empty hypotheses, resulting in 100% WER.
I trained a zipformer model on my own dataset; decoding the same checkpoint with streaming decode works fine. I also created HLG.pt using the steps in prepare_lm.sh (3-gram model).
Here is the script I used for decoding.
Instead of using the librispeech data, I added two arguments (--test-set-cut-path, --test-set-name):

./zipformer/ctc_decode.py \
    --test-set-cut-path "data/manifests/MGB3_dev/cuts.jsonl.gz" \
    --test-set-name "MGB3_dev" \
    --epoch 39 \
    --exp-dir "zipformer/exp" \
    --max-duration 100 \
    --context-size 2 \
    --num-paths 100 \
    --on-the-fly-feats True \
    --nbest-scale 0.5 \
    --decoding-method "nbest-oracle" \
    --hlg-scale 0.6 \
    --lang-dir data/lang_bpe_5000 \
    --use-ctc True
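For context, the two custom arguments could be registered with argparse roughly like this. This is a sketch of the change described above, not the actual patch; the helper name `add_test_set_args` is hypothetical, and the lhotse loading call in the comment is an assumption about how the cuts would then be consumed:

```python
import argparse


def add_test_set_args(parser: argparse.ArgumentParser) -> None:
    # Hypothetical helper: register the two custom arguments used in the
    # decoding script above.
    parser.add_argument(
        "--test-set-cut-path",
        type=str,
        help="Path to a lhotse CutSet manifest (e.g. cuts.jsonl.gz) to decode.",
    )
    parser.add_argument(
        "--test-set-name",
        type=str,
        help="Name used for this test set in the result files.",
    )


parser = argparse.ArgumentParser()
add_test_set_args(parser)
args = parser.parse_args(
    [
        "--test-set-cut-path", "data/manifests/MGB3_dev/cuts.jsonl.gz",
        "--test-set-name", "MGB3_dev",
    ]
)
print(args.test_set_name)  # MGB3_dev

# The cuts would then be loaded with lhotse, e.g. (assumption, not shown
# in the issue):
#   from lhotse import load_manifest_lazy
#   cuts = load_manifest_lazy(args.test_set_cut_path)
```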

Here is a snapshot of the output in errs.txt:
%WER = 100.00 Errors: 0 insertions, 30997 deletions, 0 substitutions, over 30997 reference words (0 correct)
Search below for sections starting with PER-UTT DETAILS:, SUBSTITUTIONS:, DELETIONS:, INSERTIONS:, PER-WORD STATS:

PER-UTT DETAILS: corr or (ref->hyp)
comedy_75_first_12min_566.284_574.964-58: (أما بييجي يقول اليوم بيقول اليوم وكان هذا هي نشرة أخبار اليوم->*)
comedy_75_first_12min_133.783_142.442-4: (أ قرود زي بعض قروض زي بعض آه لكن يبقى أرنب أخو قرد ما هو كارتون بقى وكده آه أطفال أطفال مش فاهمين أطفال بقى آه آه->*)
comedy_75_first_12min_218.077_226.666-12: (عامل نفسه كوميدي قال يعني يبص إزاي فتقوم التانية أول ما تشوفه->*)

@moadel2002
Author

@danpovey

@danpovey
Collaborator

danpovey commented Feb 8, 2025

I think you should add various printouts in the function nbest_oracle in icefall/decode.py. Try printing out the ref_texts, the lattice, and the word_table. Check whether the texts in ref_texts can be successfully split into words with .split(), and whether the words are successfully looked up in the symbol table.

Edit: it looks like it's the hyp that's empty, so you might focus on the lattice. Is the lattice nonempty? Try looking up some of the word-ids in aux_labels and see if they point to correct-looking words. You can also check whether doing this on CPU vs. GPU makes a difference: calling nbest_oracle with lattice.to('cpu') in place of lattice may help. (I'm not sure what it's called in the calling code; possibly nbest.)
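The printouts suggested above might look roughly like this when dropped into nbest_oracle. This is a debugging sketch, not a verified patch: `lattice`, `ref_texts`, and `word_table` are the objects already in scope there, and the attribute names (`num_arcs`, `aux_labels`) follow k2's Fsa API but should be double-checked against the installed version:

```python
def debug_nbest_oracle(lattice, ref_texts, word_table):
    """Print diagnostics for empty-hypothesis debugging (sketch)."""
    # 1. Is the lattice nonempty? An empty lattice yields empty hypotheses.
    print("lattice arcs:", lattice.num_arcs)

    # 2. Do the reference texts split into words, and are all words known?
    #    (Assumes the symbol table supports membership tests; otherwise
    #    use word_table.get(w).)
    for text in ref_texts[:5]:
        words = text.split()
        print("num words:", len(words))
        missing = [w for w in words if w not in word_table]
        if missing:
            print("words not found in symbol table:", missing)

    # 3. Spot-check some word-ids on the lattice's aux_labels; for a
    #    ragged aux_labels, .values flattens it (uncomment to try):
    # for word_id in lattice.aux_labels.values[:20].tolist():
    #     print(word_id, word_table[word_id])

    # 4. Rule out a GPU-specific issue by re-running on CPU:
    # nbest_oracle(lattice.to("cpu"), ...)
```

Calling it just before the nbest construction in nbest_oracle should show quickly whether the problem is an empty lattice, OOV reference words, or bad aux_labels.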
