Inspect how each token's value in the output feature map changes across time frames, and check whether any time frames show large swings.

1. Raw speech output

The ground truth of the original utterance is:

t h a t | h a d | i t s | s o u r c e | a w a y | b a c k | i n | t h e | w o o d s | o f | t h e | o l d | c u t h b e r t | p l a c e | i t | w a s | r e p u t e d | t o | b e | a n | i n t r i c a t e | h e a d l o n g | b r o o k | i n | i t s | e a r l i e r | c o u r s e | t h r o u g h | t h o s e | w o o d s | w i t h | d a r k | s e c r e t s | o f | p o o l | a n d | c a s c a d e | b u t | b y | t h e | t i m e | i t | r e a c h e d | l y n d e ' s | h o l l o w | i t | w a s | a | q u i e t | w e l l | c o n d u c t e d | l i t t l e | s t r e a m

For each frame of preoutput, take the letter dimension with the highest probability (per-frame argmax, repeats not collapsed):

...............||..att..||hh.a.d....i.t.......|.s.oouurc.e...||.a.....w.a.y.......|||b.a.ck..|..inn||the.||w..o..dd.s||.off|tthe||.o.l..d....||.c.o.fh.....b.eritt......pl.a.c.e........................................||..i.t..|wwaas.|..rre...p.uu.t....e.d||t.o||b.e||.an..||..ann....ttric...guett....i..||hhead.....ll.o.ng.......||brro.kk..|..i.n.|i.ts||..ear....lhy||..a..||c.oourrse...|through|tho...sss.||.w.o..dd..s....................||wwiithh.....||d.a.rrkk...........||.s.e.c....rreit.ss||.off.....|.p..o..l.....||.andd...||.c.a.st........g.a.ddee.................................||b.u.t.....||bbyy..|tthe.||tt.i.m.e.|.a.t..|.r.eachhe.d..|..l.a.ne....ss..|.h..o.l......oww.....|||.itt.||wwass||.a..|.qquui.......e.t.......||ww.e.l....||c.on....d.u.ct......edd.||l.itt.llee...||sttree.t...........||
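The per-frame argmax above can be sketched as follows. This is a minimal illustration, not the script that produced the string: the `(num_frames, num_tokens)` layout, the `VOCAB` ordering, and the name `framewise_argmax` are all assumptions; only the symbols themselves (`.`, `|`, `'`, and letters) are taken from the decoded output.

```python
import numpy as np

# Hypothetical vocabulary order (an assumption): "." as the blank,
# "|" as the word boundary, then the apostrophe and a-z.
VOCAB = [".", "|", "'"] + [chr(c) for c in range(ord("a"), ord("z") + 1)]


def framewise_argmax(pre_output: np.ndarray) -> str:
    """Return one character per frame: the token with the highest score.

    pre_output is assumed to have shape (num_frames, num_tokens).
    Repeated characters are kept, so the string has one symbol per frame.
    """
    best = pre_output.argmax(axis=1)
    return "".join(VOCAB[i] for i in best)


# Tiny made-up example: 4 frames over the 29-token vocabulary.
demo = np.zeros((4, len(VOCAB)))
demo[0, 0] = 1.0                   # blank
demo[1, VOCAB.index("t")] = 1.0
demo[2, VOCAB.index("t")] = 1.0
demo[3, 1] = 1.0                   # word boundary
print(framewise_argmax(demo))      # prints ".tt|"
```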

The tokens are split into four classes; the max and min values of each class:

/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/preOutput.txt
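One way such per-class max/min curves could be computed is sketched below. The notes do not say which tokens belong to which of the four classes, so the grouping in `groups` is purely hypothetical, as are the function name `class_envelopes` and the `(num_frames, num_tokens)` layout. The sketch also assumes that max and min are taken per frame across the tokens of a class, which yields two curves over time per class.

```python
import numpy as np


def class_envelopes(pre_output: np.ndarray, classes: dict) -> dict:
    """Per-frame max and min over the tokens of each class.

    pre_output: assumed shape (num_frames, num_tokens).
    classes: maps a class name to the list of token indices in that class.
    Returns {name: (max_curve, min_curve)}, each curve of length num_frames.
    """
    envelopes = {}
    for name, token_ids in classes.items():
        block = pre_output[:, token_ids]   # (num_frames, len(token_ids))
        envelopes[name] = (block.max(axis=1), block.min(axis=1))
    return envelopes


# Made-up data and a made-up four-way grouping, for illustration only.
rng = np.random.default_rng(0)
demo = rng.normal(size=(100, 29))
groups = {
    "blank": [0],
    "boundary": [1],
    "apostrophe": [2],
    "letters": list(range(3, 29)),
}
env = class_envelopes(demo, groups)
```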

Smoothed with an 11-step sliding window to make the trend easier to see:

/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/preOutput.txt
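The 11-step smoothing can be sketched as a plain moving average. This assumes a uniform (unweighted) window, which the notes do not specify; `smooth` is a hypothetical helper name. With `mode="same"` NumPy pads the ends with zeros, so the first and last five frames are pulled toward zero and should be read with care.

```python
import numpy as np


def smooth(values: np.ndarray, window: int = 11) -> np.ndarray:
    """Moving average of a 1-D curve with a uniform window.

    The output has the same length as the input; values near the two
    ends are biased toward zero because of implicit zero padding.
    """
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="same")


# Made-up curve: a constant signal stays constant away from the edges.
curve = np.ones(30)
smoothed = smooth(curve)
```

To smooth every token's curve in a `(num_frames, num_tokens)` matrix, the same helper can be applied column by column, for example with `np.apply_along_axis(smooth, 0, matrix)`.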

Zoomed in on the lines:

outputTest

Each line individually:



2. Softmax of the raw output

Standardize the raw output per token, keep only the 26 useful tokens, then apply softmax vertically.

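After these operations (the result is the matrix that actually enters the loss computation and backpropagation), the corresponding letters are: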

essyssededddddeitthattdeeihhhaaddeahitt'sssoddisswoouurceeeessthauleawwhaiyveeeedddaabeacckkeahhinndttherrwwwoourddsssssoffdttheeyoolllddddseeihcooufhffiepbberitttiuiappllaaceeesddyyeeeeereddsdddddddddddddddddyheddeeekmiiitttwwwaassihhrreeffpuuuttttileddettoooobbeehhandeteciannddtsttrickkgguettdehhindthhheaddeedrlllonnggaddddeawbrrookkkeachinnghistsiineearreellhyyttoafficccoourrseeddtthroughhthotewssseeawwoourdddssssdddsssssssddyyyeddhwwwiithhhdeetaad'aarrkkeee'sssssddsassheachsacrreitessstooffesstdaaphooollleeenehhanddthedaachasstteeddiisgaaaddeeeddddedddseedededddddhhedddddddddaabhuttttddddabbyyyeittheesattiimmeeeeoantetooreeachheedddttlleaineed''ssddohhhoolllllehoowwhheddddiiittttwwwassstha''aaqquuiiddeedhetteeddedddawwwelllehudccoondddddouccttddadhedddeallitttlleessdissttreeatheghheedddssa
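The standardize, select, softmax pipeline described in this section might look like the sketch below. Several points are assumptions rather than facts from the notes: the matrix is taken to be `(num_frames, num_tokens)`, "per token" is read as zero mean and unit variance of each token's curve over time, the "vertical" softmax is read as normalizing across the kept tokens within each frame, and the indices in `keep_ids` are placeholders because the notes do not list which 26 tokens are kept.

```python
import numpy as np


def standardize_select_softmax(pre_output, keep_ids, eps=1e-8):
    """Standardize each token over time, keep some tokens, softmax per frame.

    pre_output: assumed shape (num_frames, num_tokens).
    keep_ids: indices of the tokens to keep (placeholder choice here).
    """
    mean = pre_output.mean(axis=0, keepdims=True)
    std = pre_output.std(axis=0, keepdims=True)
    z = (pre_output - mean) / (std + eps)        # per-token standardization
    z = z[:, keep_ids]                           # keep the selected tokens
    z = z - z.max(axis=1, keepdims=True)         # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)      # each frame sums to 1


# Made-up input; the 26 kept indices are a placeholder, not the real set.
rng = np.random.default_rng(0)
demo = rng.normal(size=(50, 29))
probs = standardize_select_softmax(demo, keep_ids=list(range(3, 29)))
```

Because the blank and word-boundary tokens are no longer candidates after the selection, every frame is forced onto one of the kept tokens, which is consistent with the decoded string above containing no `.` or `|`.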

The tokens are split into three classes; the max and min values of each class:

/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/preOutput_0.txt
/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/preOutput_0.txt



3. Output after training, with m*noise overlaid

λ=10, epoch=100000, lr=1.0

/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/lastOutput.txt
/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/lastOutput.txt



4. Output after training, with m*noise overlaid, after softmax (the matrix that actually enters totloss)

/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/lastOutput_0.txt
/Users/markdana/Desktop/draw_pics/CTC_lambda10_100000epoch_1.0lr/lastOutput_0.txt



5. Zero out the 50% of points with the larger m, keeping the points where m is small:

...............||.aan||he.r||.a...h.ad.t|ho.uu..|f.o.m|...||yoou.r.||the.n...y....|..ii.n......a.n.||.att.|.w.i.f....|.t.a.|y.i||h.ow....|ho.r||.e.c......|bb...n....|.g.o..v.e....|.h.e....e...........................||the....c.o.ss....i...|.h.e...||ha.d.|dre....t.a.nntt...l.ing.||l.i.n.|..err.f..||f..a..hm......orr..d|ho.......||..u....|.....||.e...tf.o.......|.c.a.m..|t.o....rf.....||thriit....e.th...||.w.i............................|t....|.to.|..e.r.|.it|.it.|the.||h.e.....||..o.f......i........|...o..ee..||m.a.t.|..e...|.thouv.....n.......|.m.u.pp............................|||.....t...||r.a.||..i....s||s.o.f.e||..f..||the.y..|d.e.....lie.n........||.frrin.....|..oul...........|.l.o.t.......|.h..o....er..and.......||l.o..gee...|...n..|n.o.tt..|..inn...p.re...v.ilt..he.|..a......ol.rr....||
/Users/markdana/Desktop/draw_pics/CTC_50high/preOutput.txt
/Users/markdana/Desktop/draw_pics/CTC_50high/preOutput.txt



6. Zero out the 50% of points with the smaller m, keeping the points where m is large:

...............||..and..||hhaa.d..|.itt......||.s.oo.rrc....|...a....ww.aiy.......||b.a.ft...|.i.n.|the..||.w.o..d.es||.of.|thee|.o.ll.......||h.a.t.......brri.t......pllee.se.........................................||..i.t..|wwass||..er...f.uuutt....it.||too|bbe.|.inn...|..if.....t.o..k|g.u.tt......||.s.i.d...||llo.n........|.w.o...p...|..in.|hiss.||..ear...llhy|..o...|.c.ouurrse...|through|the||wass|||...o..dd.s.....................||w.itth.....||.d.a.rrk.....s.....||.ss.e......r.e.ces||.off......|.p.u..l.....||.a.t.....||c.a.sst.......g.aa.t.ee................................||.n.u.t.....||b.y..||the.||tt..ii...|w.al..||..eaachhe.d....ll.inn....ss...|.h.o.ll.....ooww....|||.i.tt.||waas||.a...|.qquui.......a.t.......||ww.ell....|....n...d.u.ctt...|..et..|l.i.t.llee....||strree.t...........||
/Users/markdana/Desktop/draw_pics/CTC_50low/preOutput.txt
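The two 50% ablations in sections 5 and 6 can be sketched with a single helper that thresholds m at its median. This is an assumption about how the split was made: the notes only say that the larger or smaller half of the points is zeroed, and `keep_half` is a hypothetical name. Entries equal to the median are assigned to the "small" half so that the two masks are exact complements.

```python
import numpy as np


def keep_half(m: np.ndarray, keep: str = "small") -> np.ndarray:
    """Zero out half of the entries of m, split at the median.

    keep="small": zero the larger-m entries (as in section 5).
    keep="large": zero the smaller-m entries (as in section 6).
    """
    threshold = np.median(m)
    if keep == "small":
        selected = m <= threshold
    elif keep == "large":
        selected = m > threshold
    else:
        raise ValueError("keep must be 'small' or 'large'")
    return np.where(selected, m, 0.0)


# Made-up 2x2 mask, for illustration only.
m = np.array([[0.1, 0.9],
              [0.4, 0.6]])
m_small = keep_half(m, keep="small")   # keeps 0.1 and 0.4
m_large = keep_half(m, keep="large")   # keeps 0.9 and 0.6
```

The masked `m_small` or `m_large` would then take the place of m in the m*noise term before decoding.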