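A question about backoff values!
Has anyone here used the language model files generated by SRILM's ngram-count? I've spent several days on this and still can't figure out how the backoff values in them are used. I'm out of options, so I'm asking for help. Here is a trigram language model generated by ngram-count: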
\data\
ngram 1=3615
ngram 2=14108
ngram 3=1913
\1-grams:
-2.324166	The	-0.2515847
-2.27146	European	-0.5990924
-2.604675	Parliament	-0.3154638
-3.639437	wants	-0.2196503
-3.57249	aims	-0.1509919
-2.554265	at	-0.4736579
-2.899074	good	-0.2251422
-4.116558	translations	-0.08273052
-1.456642	.	-1.639534
-1.417588	</s>
-99	<s>	-0.7667612
-4.417771	<unk>
\2-grams:
-0.9586073	<s> The	-0.0006196946
-1.269071	The European	-0.7032914
-0.6837301	European Parliament	0.07154049
-1.480314	aims at
-0.9362461	translations .
-0.01597029	. </s>
\3-grams:
-1.260168	<s> The European
\end\
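In the file above, the last value on each 1-gram and 2-gram line is the backoff value. But I just can't figure out how to use it when computing language model probabilities! Please help, I'm desperate! Thanks in advance!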
This file is in the ARPA-standard format introduced by Doug Paul. The back-off weights are used as follows:

p(wd3|wd1,wd2) = if (trigram exists)               p_3(wd1,wd2,wd3)
                 else if (bigram wd1,wd2 exists)   bo_wt_2(wd1,wd2) * p(wd3|wd2)
                 else                              p(wd3|wd2)

p(wd2|wd1) = if (bigram exists)   p_2(wd1,wd2)
             else                 bo_wt_1(wd1) * p_1(wd2)

All probs and back-off weights (bo_wt) are given in log10 form, so in practice each product above becomes a sum of the stored values.
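To make the rule concrete, here is a minimal Python sketch that applies it to a few entries copied from the model posted above. The dictionary layout and the function name log10_prob are just illustrative choices, not anything SRILM provides. Two assumptions are made: an entry with no back-off weight is treated as having weight 0 in log10 (that is, a factor of 1), and a word missing from the unigrams is mapped to <unk>.

```python
# Values copied from the model excerpt in the question (all in log10).
# Each n-gram tuple maps to (log10 probability, log10 back-off weight).
# Assumption: a missing back-off weight is stored as 0.0, i.e. a factor of 1.
LM = {
    ("The",): (-2.324166, -0.2515847),
    ("European",): (-2.27146, -0.5990924),
    ("Parliament",): (-2.604675, -0.3154638),
    ("aims",): (-3.57249, -0.1509919),
    ("<unk>",): (-4.417771, 0.0),
    ("<s>", "The"): (-0.9586073, -0.0006196946),
    ("The", "European"): (-1.269071, -0.7032914),
    ("European", "Parliament"): (-0.6837301, 0.07154049),
    ("<s>", "The", "European"): (-1.260168, 0.0),
}


def log10_prob(word, context):
    """Return log10 p(word | context); context is a tuple of preceding words."""
    ngram = context + (word,)
    if ngram in LM:
        # The n-gram is listed explicitly: use its stored probability.
        return LM[ngram][0]
    if not context:
        # Assumption: a word with no unigram entry is scored as <unk>.
        return LM[("<unk>",)][0]
    # Back off: add the context's back-off weight (0.0 if the context is
    # not listed) and retry with the context shortened by its oldest word.
    backoff_weight = LM.get(context, (0.0, 0.0))[1]
    return backoff_weight + log10_prob(word, context[1:])


# Trigram exists, so its value is used directly.
print(log10_prob("European", ("<s>", "The")))                # -1.260168
# Trigram missing: bo_wt_2(The European) + p_2(European Parliament).
print(f"{log10_prob('Parliament', ('The', 'European')):.4f}")  # -1.3870
# Trigram and bigram missing: two back-off weights plus the unigram.
print(f"{log10_prob('aims', ('European', 'Parliament')):.4f}")  # -3.8164
```

In the last call, the result is 0.07154049 + (-0.3154638) + (-3.57249): the back-off weight of "European Parliament", then the back-off weight of "Parliament", then the unigram value of "aims". To get an ordinary probability, raise 10 to that power.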
Thank you so much! I had actually already solved it before I saw this, haha, but thanks all the same!