
1000-9825/2005/16(09)1523 2005 Journal of Software Vol.16, No.9

Semantic Analysis and Structured Language Models

LI Ming-Qin 1+, LI Juan-Zi 2, WANG Zuo-Ying 1, LU Da-Jin 1

1 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
2 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
+ Corresponding author: Phn +86-0-6278704, E-mail lmq@thspeetsnghuaeducn, http://www.tsinghua.edu.cn

Received 2004-05-14; Accepted 2004-09-07

Li MQ, Li JZ, Wang ZY, Lu DJ. Semantic analysis and structured language models. Journal of Software, 2005,16(9):1523-1533. DOI: 10.1360/jos161523

Abstract: An integrated semantic analysis system is presented, and structured language models are proposed based on it. The semantic analysis system automatically tags the semantic class of each word and analyzes the semantic dependency structure between words, with precisions of 90.85% and 75.84% respectively. In order to describe sentence structure and long-distance dependency, two kinds of structured language models are examined and analyzed. Finally, these language models are evaluated on the task of Chinese speech recognition. Experiments show that the best semantic structured language model, the headword trigram model, achieves a 0.8% absolute error reduction and an 8.1% relative error reduction over the trigram model.

Key words: semantic analysis; dependency analysis; language model; speech recognition

Supported by the National High-Tech Research and Development Plan of China under Grant No200AA407 (863)

524 Journal of oftware 2005,69 N [],,,,,, [2] rgger [3] [4] ppng [5],, [6,7] [8] [9] [0],,, 9085% 7584%,,,,,, 08%, 8%,, 2 3 4, [], 343,, [2],,,,, 59, 70,,,,,,,,,, /Experencer,,,,,,,, /content,, N W w, w,, w }, { 2 N RL { R, R2,, R N }, R H, R Rsemantc relaton, R, R H, R, ernel word 2, H

Fig.1 Semantic dependency structure of a Chinese sentence

English: These years, Doctor Yang pays a lot of attention to the popularization and application of his invention.

(a) The sentence tagged with semantic classes (format w/s):
/Dd5 /Ae3 /Ca /Ka0 /Gb2 /Aa04 /Hc05 /Da4 /Kd0 /Ie0 /H28

(b) The semantic dependency tree
Words: [Yang] [Doctor] [these years] [a lot] [pay attention to] [his] [invention] [production] [of] [popularization] [application]
Relation labels: experiencer, time, content, degree, restrictive, "de" dependency, coordination, target

Fig.2 Semantic dependency relation list of a Chinese sentence

(a) The sentence tagged with semantic classes (as in Fig.1(a))

(b) Semantic dependency relation list

Modifier                   Headword H                  Semantic relation R
Index  Word                Index  Word
1      Yang                2      Doctor               Restrictive
2      Doctor              5      pay attention to     Experiencer
3      these years         5      pay attention to     Time
4      a lot               5      pay attention to     Degree
5      pay attention to    -      -                    Kernel word
6      his                 8      production           Restrictive
7      invention           8      production           Restrictive
8      production          10     popularization       Content
9      of                  8      production           "De" dependency
10     popularization      5      pay attention to     Target
11     application         10     popularization       Coordination

Notation: the word sequence is $W=\{w_1,w_2,\ldots,w_N\}$, the semantic class sequence is $S=\{s_1,s_2,\ldots,s_N\}$, and the semantic dependency relation list is $RL=\{R_1,R_2,\ldots,R_N\}$.
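The relation list of Fig.2(b) maps naturally onto a small data structure. The following is a minimal sketch, not the authors' implementation: the names `Relation`, `head_index`, `kernel_word` and `path_to_kernel` are hypothetical, the English glosses stand in for the Chinese words, and the head indices are read from the figure on the assumption that each word has exactly one headword and the kernel word has none.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relation:
    """One row of the relation list: a modifier, its headword, and the relation."""
    index: int                 # position of the modifier in the sentence (1-based)
    word: str                  # English gloss of the modifier
    head_index: Optional[int]  # position of the headword; None for the kernel word
    relation: str              # semantic relation label


# Rows as read from Fig.2(b); glosses replace the original Chinese words.
RELATIONS = [
    Relation(1, "Yang", 2, "Restrictive"),
    Relation(2, "Doctor", 5, "Experiencer"),
    Relation(3, "these years", 5, "Time"),
    Relation(4, "a lot", 5, "Degree"),
    Relation(5, "pay attention to", None, "Kernel word"),
    Relation(6, "his", 8, "Restrictive"),
    Relation(7, "invention", 8, "Restrictive"),
    Relation(8, "production", 10, "Content"),
    Relation(9, "of", 8, "De dependency"),
    Relation(10, "popularization", 5, "Target"),
    Relation(11, "application", 10, "Coordination"),
]


def kernel_word(relations: List[Relation]) -> Relation:
    """Return the single word that has no headword (the root of the tree)."""
    roots = [r for r in relations if r.head_index is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one kernel word")
    return roots[0]


def path_to_kernel(relations: List[Relation], index: int) -> List[str]:
    """Follow headword links from a word up to the kernel word."""
    by_index = {r.index: r for r in relations}
    current = by_index[index]
    path = [current.word]
    seen = {index}
    while current.head_index is not None:
        if current.head_index in seen:
            raise ValueError("cycle in dependency relations")
        seen.add(current.head_index)
        current = by_index[current.head_index]
        path.append(current.word)
    return path
```

For example, `path_to_kernel(RELATIONS, 7)` walks from "invention" through "production" and "popularization" to the kernel word "pay attention to", which is the kind of long-distance link a headword-based language model can exploit.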

526 Journal of oftware 2005,69,,,, U,,, P s, P s s, s 2 2 P w, P w s, w 3 P, 3 22 P, W,,, N Pc, W { Pc w,, W Pc s,, W P, } P w, P w w, w 5 c 2 P s, P s w, s 6 c, W, W 3,, 3,,,,,, r hw hs,,, {, / }, 70 /, ;,, NULL NULL r,,,,, 3,,,,, P, P τ, P q,, q,, q 4, 7, q τ,, 3, 2, Restrcton,rght Restrcton,rght P q,, q,, q,, q l,, q l, g P q r, r P q hwl, hs g l, hw, g hs g core q,, q,, q λ log P q wl, s,, log, l w g s g + λ P q rl r g l g 8, λ, λ 03 P q r, r P q hw, hs, hw, hs l g l l g g

527 [0],, Experencer,rght, me,rght, Degree,rght, arget,left, Restrctve,rght, De dedependency,left, Restrctve,rght, Restrctve,rght, Restrctve,rght, Coordnaton,left [Yang] [Doctor] [these years] [a lot] [hs] [producton] [popularzaton] [pay attenton to] [nventon] [ de ] ['de'] [applcaton] Fg3 One of the bnary semantc parse trees for the semantc dependency tree n Fg 3 2,,,, 2 W W P W P, W 9, [0], 9 P Pr W P,, Pr, 0 P W,, Pr, * *,, arg max P, W, 0 * * P W P,, 0

528 Journal of oftware 2005,69 22 2, [6],, 4, 9, rgram model 2 3 4 5 6 7 8 9 0 a radtonal word trgram model a experencer Headword trgram model restrctve 2 3 tme degree restrctve de restrctve dependency 4 5 6 7 8 9 0 b Headword trgram model b Fg4 4,, P 2 2 w W PWP w,, W P,, W, 4 P WP h h,, 2 P WP P WP w, P w, 2 P h, h 2 Pr 2 w W PWP w,, W ρ,, W, ρ,,, N,, W Pr PN,, W 3 4 P, ρ 5, P, Chelba [6] 6 N P, { P s,, W P w, P, }, P, P q,, q,, q τ W,, W, 6,,, τ 6

529,,,, 5 [ ] τ, P s,, W P w, P, J P N 7 7, 4, 7, 3 P N 3, semantc dependency net, DN [2] 9394 DN, 9394 993 ~994 2, 2, DN, 7 2, DN 9394, DN [3] [0] ;, 9394, 9394-DN;, 9394-DN, CR CR, w, w, R,, H w, w, R, H 2 RCR RCR 3 CR CR, em class tagger 2 Dep parser [0],Dep, 00%, 7 8 3 4 W W 4 able Results of semantc analyss system CR % RCR % CR % em class tagger 94 Dep parser 00 6725 7687 W-Parser 9085 6650 7584 W-Parser 9020 6632 7566,W W W [3] W,W

530 Journal of oftware 2005,69 P s s 2, s 2, W W,, 33,W W,,,, 32 HP,HP, 6,5, 20, 25 [6,7],,, P W λp W + λ P 8, W P I W P W P w P w2 w P w w 2, w N 3, P w W λ P w W + λ P w w, w 20 32 I 2 2 2 PPL,,, K PPL exp log PM w / w,, w K 0,, P w w,,, w w, 3 W- 20 λ, 2, λ 04, rgram 28%, 07% able 2 Perplexty of headword trgram model under dfferent nterpolaton weghts 2 λ 0 02 04 06 08 PPL 2763 25034 24820 25332 267 3444 322, 2 [6] 2, h w, ρ, 5 δ x, y D 2 N Pr P d δ D, d ρ, y x, δ x, y ;, δ x, y 0 HP, 5 4% 2, 2733 637, 323, 45 MFCC, 4 MFCC, DDBHMMduraton dstrbuton based hdden Marov model [4],, trgram, 00 ; 9 2
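The interpolation weight λ and the perplexity (PPL) used in this section can be illustrated with a short sketch. It assumes the usual linear interpolation of two word probabilities and the usual definition PPL = exp(-(1/K) Σ log P_M(w_i | w_1, ..., w_{i-1})); which of the two models λ multiplies is an assumption here, and the function names `interpolate` and `perplexity` are hypothetical.

```python
import math
from typing import Sequence


def interpolate(p_headword: float, p_trigram: float, lam: float) -> float:
    """Linear interpolation of two word probabilities.

    Assumption: lam weights the headword trigram model and (1 - lam)
    weights the word trigram model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_headword + (1.0 - lam) * p_trigram


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity of a test text given the model probability of each word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


if __name__ == "__main__":
    # Toy probabilities, not data from the paper.
    mixed = [interpolate(ph, pt, 0.4) for ph, pt in [(0.2, 0.1), (0.05, 0.3)]]
    print(mixed, perplexity(mixed))
```

A model that assigns every word a probability of 0.25 has a perplexity of 4, so lower perplexity means the interpolated model is less surprised by the test text.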

53, 3 3 trgram -best n n-best 06 05 Probablty 04 03 02 0 0 0 2 3 4 5 6 7 8 9 0 Depth Fg5 Depth dstrbuton of headword trgram model 5 able 3 Chnese character error rates of -best and n-best paths of baselne % 3 -best n-best % -best 5-best 20-best 00-best 020 766 624 508 4 3 FDM BDM HM, able 4 Results of all semantc structure language models % 4 % CER relatve error reducton FDM BDM HM W-Parser 956 622 940 780 939 794 W-Parser 987 39 989 299 937 84 FDMBDM,, W-Parser W-Parser 780% 299% BDM HM, 3,, 67%,,, 2,,,,,,, 08%, 84%,,
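The absolute and relative error reductions quoted for the headword trigram model follow from two character error rates (CER). The sketch below shows the arithmetic only; the input values 10.20% (baseline 1-best) and 9.37% (headword trigram model) are an assumed reading of the damaged figures in Tables 3 and 4, and the function name `error_reductions` is hypothetical.

```python
from typing import Tuple


def error_reductions(baseline_cer: float, model_cer: float) -> Tuple[float, float]:
    """Return (absolute, relative) error reduction, both in percent.

    absolute = baseline - model
    relative = 100 * (baseline - model) / baseline
    """
    if baseline_cer <= 0.0:
        raise ValueError("baseline CER must be positive")
    absolute = baseline_cer - model_cer
    relative = 100.0 * absolute / baseline_cer
    return absolute, relative


if __name__ == "__main__":
    # Assumed reading of the tables: baseline 10.20%, headword trigram 9.37%.
    absolute, relative = error_reductions(10.20, 9.37)
    print(f"absolute: {absolute:.2f}%  relative: {relative:.2f}%")
```

Under that reading, the absolute reduction rounds to 0.83 percentage points, in line with the roughly 0.8% absolute reduction stated in the abstract.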

532 Journal of oftware 2005,69 4, [5], 677%,,,, [6,7],Chelba W, Chelba Upenn reeban,,, 2, W W,,, Chelba W 3 7, 6, Chelba 2208 588, W 2733 637, Chelba 008%, W 937% 07 06 05 our Our normalzaton method Chelba's Chelba s normalzaton method Probablty 04 03 02 0 0 0 2 3 4 5 6 7 8 9 0 Depth Fg6 Depth dstrbuton comparson between headword trgram models wth and wthout our normalzaton method 6 5 [0],,, 9085% 7584%,

References

[1] Jelinek F. Self-organized language modeling for speech recognition. In: Waibel A, Lee KF, eds. Readings in Speech Recognition. San Mateo: Morgan Kaufmann Publishers, 1990. 450-506.
[2] Brown PF, DellaPietra VJ, deSouza PV, Lai JC, Mercer RL. Class-Based n-gram models of natural language. Computational Linguistics, 1992,18(4):467-479.
[3] Lau R, Rosenfeld R, Roukos S. Trigger-based language models: A maximum entropy approach. In: Sullivan BJ, ed. Proc. of the Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vol II. 1993. 45-48.
[4] Bellegarda JR. A multi-span language modeling framework for large vocabulary speech recognition. IEEE Trans. on Speech Audio Processing, 1998,6(5):456-467.
[5] Gao JF, Suzuki H, Wen Y. Exploring headword dependency and predictive clustering for language modeling. In: Hajic J, Matsumoto Y, eds. Proc. of the Empirical Methods in Natural Language Processing (EMNLP). 2002. 248-256.
[6] Chelba C. Exploiting syntactic structure for natural language modeling [Ph.D. Thesis]. Johns Hopkins University, 2000.
[7] Xu P, Chelba C, Jelinek F. A study on rich syntactic dependencies for structured language modeling. In: Proc. of the 40th Annual Meeting of the Association for Computational Linguistics (ACL). ACL, 2002. 191-199.
[8] Roark B. Probabilistic top-down parsing and language modeling. Computational Linguistics, 2001,27(2):249-276.
[9] Gao JF, Suzuki H. Unsupervised learning of dependency structure for language modeling. In: Proc. of the 41st Annual Meeting of the Association for Computational Linguistics (ACL). ACL, 2003. 7 2. http://research.microsoft.com/~jfgao/paper/dlm-acl03.pdf
[10] Li MQ, Li JZ, Wang ZY, Lu DJ. A statistical model for parsing semantic dependency relations in a Chinese sentence. Chinese Journal of Computers, 2004,27(12):1679-1687 (in Chinese with English abstract).
[11] Mei JJ, Zhu YM, Gao YQ, Yin HX. Tongyici Cilin (Dictionary of Synonymous Words). Shanghai: Shanghai Cishu Publisher, 1983 (in Chinese).
[12] Li MQ, Li JZ, Dong ZD, Wang ZY, Lu DJ. Building a large Chinese corpus annotated with semantic dependency. In: Ma Q, Xia F, eds. Proc. of the 2nd SIGHAN Workshop on Chinese Language Processing. 2003. 84-91.
[13] Zhang JP. A study of language model and understanding algorithm for large vocabulary spontaneous speech recognition [Ph.D. Thesis]. Beijing: Department of Electronic Engineering, Tsinghua University, 1999 (in Chinese with English abstract).
[14] Wang ZY, Xiao X. Duration distribution based HMM speech recognition models. Chinese Journal of Electronics, 2004,32(1):46-49 (in Chinese with English abstract).
[15] Zhou M. A block based dependency parser for unrestricted Chinese text. In: Proc. of the 2nd Chinese Language Processing Workshop. 2000. 78-84. http://research.microsoft.com/china/papers/robust_dependency_parser_chinese_text.pdf