(6)

cqzong@nlpr.ia.ac.cn
http://www.nlpr.ia.ac.cn/english/cip/cqzong.htm
No.95, Zhongguancun East Road, Beijing 100080, China
http://www.ia.ac.cn
Tel. No.: +86-10-6255 4263
CAS-IA 2004-4-28

6.1 morphology
6.2
Input:
(1) Mr. Green is a good English teacher.
(2) I'll see prof. Zhang home after the concert.

Tokenized output:
(1) Mr./ Green/ is/ a/ good/ English/ teacher/.
(2) I/ will/ see/ prof./ Zhang/ home/ after/ the/ concert/.
(1) Abbreviations: prof., Mr., Ms., Co., Oct.
(2) Let's / let's => let + us
(3) I'm => I + am
(4) {it, that, this, there, what, where}'s => {it, that, this, there, what, where} + is
(5) can't => can + not; won't => will + not
(6) {is, was, are, were, has, have, had}n't => {is, was, are, were, has, have, had} + not
(7) X've => X + have; X'll => X + will; X're => X + are
(8) he's => he + is / has => ?
    she's => she + is / has => ?
(9) X'd Y => X + would (if Y is a base-form verb)
           => X + had (if Y is a past participle)
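Rules (1) to (7) are mechanical enough to sketch in code. The following is a minimal illustration, not the course's reference implementation: the names `ABBREVIATIONS`, `expand`, and `tokenize` are hypothetical, the abbreviation list contains only the items named above, and only ASCII apostrophes are handled. The ambiguous cases (8) and (9) are deliberately left unexpanded, since resolving them needs the following word.

```python
# Abbreviations whose final period is part of the token (rule 1).
ABBREVIATIONS = {"prof.", "mr.", "ms.", "co.", "oct."}
# Words that take "'s" meaning "is" (rule 4) and "n't" meaning "not" (rule 6).
IS_BASES = {"it", "that", "this", "there", "what", "where"}
NEG_BASES = {"is", "was", "are", "were", "has", "have", "had"}


def expand(token):
    """Expand an unambiguous contraction into a list of tokens."""
    low = token.lower()
    if low == "let's":                       # rule 2
        return [token[:3], "us"]
    if low == "i'm":                         # rule 3
        return ["I", "am"]
    if low == "can't":                       # rule 5
        return ["can", "not"]
    if low == "won't":                       # rule 5
        return ["will", "not"]
    if low.endswith("n't") and low[:-3] in NEG_BASES:   # rule 6
        return [token[:-3], "not"]
    for suffix, word in (("'ve", "have"), ("'ll", "will"), ("'re", "are")):
        if low.endswith(suffix) and len(low) > len(suffix):  # rule 7
            return [token[:-len(suffix)], word]
    if low.endswith("'s") and low[:-2] in IS_BASES:      # rule 4
        return [token[:-2], "is"]
    # he's / she's / X'd are ambiguous (rules 8 and 9): leave them as they are.
    return [token]


def tokenize(sentence):
    """Split on whitespace, detach trailing punctuation, expand contractions."""
    tokens = []
    for chunk in sentence.split():
        trailing = []
        while (chunk.lower() not in ABBREVIATIONS
               and len(chunk) > 1 and chunk[-1] in ".,!?;:"):
            trailing.insert(0, chunk[-1])
            chunk = chunk[:-1]
        tokens.extend(expand(chunk))
        tokens.extend(trailing)
    return tokens


print("/ ".join(tokenize("I'll see prof. Zhang home after the concert.")))
```

Checking the abbreviation list before stripping punctuation is what keeps `prof.` and `Mr.` intact while the sentence-final period is split off.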
1. Regular inflected forms
1) -ed
   *ed => * (e.g., worked => work)
   *ed => *e (e.g., believed => believe)
   *ied => *y (e.g., studied => study)
2) -ing
   *ing => * (e.g., developing => develop)
   *ing => *e (e.g., saving => save)
   *ying => *ie (e.g., dying => die)
3) -s
   *s => * (e.g., works => work)
   *es => * (e.g., discusses => discuss)
   *ies => *y (e.g., studies => study)
4) -ly
   *ly => * (e.g., hardly => hard)
5) -er/-est
   *er => * (e.g., colder => cold)
   *ier => *y (e.g., easier => easy)
6) Noun plurals in -s/-ses/-xes/-ches/-shes/-oes/-ies/-ves
   e.g., bodies => body, shelves => shelf, boxes => box, etc.
7) Possessives: X's, Xs'
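The suffix rules above can be sketched as a small rule table whose candidates are checked against a lexicon. This is an illustrative sketch under assumptions: the names `SUFFIX_RULES`, `candidates`, and `lemmatize` are hypothetical, the toy lexicon holds base forms only, and the `ves => f` entry is inferred from the single example shelves => shelf.

```python
# (suffix, replacement) pairs following rules 1) to 6).
SUFFIX_RULES = [
    ("ied", "y"), ("ed", ""), ("ed", "e"),        # 1) -ed
    ("ying", "ie"), ("ing", ""), ("ing", "e"),    # 2) -ing
    ("ies", "y"), ("es", ""), ("s", ""),          # 3) -s and 6) plurals
    ("ly", ""),                                   # 4) -ly
    ("ier", "y"), ("er", ""), ("est", ""),        # 5) -er / -est
    ("ves", "f"),                                 # 6) shelves => shelf (assumed rule)
]


def candidates(word):
    """Return every base-form candidate produced by a matching rule."""
    found = []
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            candidate = word[:-len(suffix)] + replacement
            if candidate not in found:
                found.append(candidate)
    return found


def lemmatize(word, lexicon):
    """Return the first candidate found in the lexicon, else the word itself."""
    if word in lexicon:          # already a base form, e.g. "bed" must not become "be"
        return word
    for candidate in candidates(word):
        if candidate in lexicon:
            return candidate
    return word


lexicon = {"work", "believe", "study", "develop", "save", "die",
           "discuss", "hard", "cold", "easy", "body", "shelf", "box"}
for form in ["worked", "believed", "dying", "discusses", "easier", "shelves"]:
    print(form, "=>", lemmatize(form, lexicon))
```

Generating all candidates and filtering by lexicon avoids having to decide in advance whether, say, `*ed` should map to `*` or `*e`.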
2. Irregular forms
   choose, chose, chosen
   axis, axes
   bad, worse, worst
3. Numbers
1) 1990s => 1990
2) 82th => 82 (strip "th")
3) $200 => $ 200
4) 98.5% => 98.5
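The four number patterns above map naturally onto regular expressions. The sketch below is only an illustration: the function name `normalize_number_token` and the tag strings (`"decade"`, `"ordinal"`, `"money"`, `"percent"`) are assumptions introduced here so that the information carried by the stripped suffix or symbol is not lost.

```python
import re


def normalize_number_token(token):
    """Return (tokens, tag) for a numeric token; tag is None if no rule applies."""
    m = re.fullmatch(r"(\d{4})s", token)               # 1) 1990s => 1990
    if m:
        return [m.group(1)], "decade"
    m = re.fullmatch(r"(\d+)(?:st|nd|rd|th)", token)   # 2) strip the ordinal suffix
    if m:
        return [m.group(1)], "ordinal"
    m = re.fullmatch(r"\$(\d+(?:\.\d+)?)", token)      # 3) $200 => $ 200
    if m:
        return ["$", m.group(1)], "money"
    m = re.fullmatch(r"(\d+(?:\.\d+)?)%", token)       # 4) 98.5% => 98.5
    if m:
        return [m.group(1)], "percent"
    return [token], None


for tok in ["1990s", "82th", "$200", "98.5%", "concert"]:
    print(tok, "=>", normalize_number_token(tok))
```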
4. Compound words
1) e.g., one-fourth
2) e.g., human-computer, multi-engine, mixed-initiative, large-scale
3) e.g., machine-readable, hand-coding, non-adjacent, context-free, rule-based, speaker-independent
4) e.g., job-hunt
5) e.g., co-operate, 7-color, bi-directional, inter-lingua, Chinese-to-English, state-of-the-art, part-of-speech, OOV-words, spin-off, top-down, quick-and-dirty, text-to-speech, semi-automatically, i-th
6.3
6.4
Maximum Matching (MM): Forward MM (FMM), Backward MM (BMM), Bi-directional MM

$S = c_1 c_2 \cdots c_n$, $w_i = c_1 c_2 \cdots c_m$, where $m$ is the number of characters in the longest dictionary word.
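FMM algorithm:
0) Let $i = 0$ and let the pointer $p_i$ point to the start of the input string.
1) Let $n$ be the number of characters from $p_i$ to the end of the string. If $n = 1$, go to 3). Otherwise let $m$ be the length of the longest dictionary word; if $n < m$, set $m = n$.
2) Take the $m$ characters starting at $p_i$ as the candidate word $w_i$:
   i) If $w_i$ is in the dictionary, place a boundary after $w_i$ and go to iii).
   ii) If $w_i$ is not in the dictionary and the length of $w_i$ is greater than 1, drop the rightmost character of $w_i$ and go to 2) i). Otherwise (the length of $w_i$ is 1), place a boundary after $w_i$ and go to iii).
   iii) Advance $p_i$ by the length of $w_i$. If $p_i$ has reached the end of the string, go to 3); otherwise set $i = i + 1$ and return to 1).
3) Output the segmentation.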
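The FMM steps can be written as a short loop. The following is a minimal sketch, assuming the dictionary is a Python set of strings; the function name `fmm_segment` and the toy dictionary are illustrative, not part of the original slides.

```python
def fmm_segment(sentence, dictionary, max_len=None):
    """Forward maximum matching: greedily take the longest dictionary word."""
    if max_len is None:
        # m = length of the longest dictionary word
        max_len = max((len(w) for w in dictionary), default=1)
    words = []
    i = 0                      # pointer p_i into the sentence
    n = len(sentence)
    while i < n:
        m = min(max_len, n - i)            # never look past the end of the string
        # Shrink the candidate from the right until it is a dictionary word
        # or only a single character is left.
        while m > 1 and sentence[i:i + m] not in dictionary:
            m -= 1
        words.append(sentence[i:i + m])
        i += m                 # advance the pointer by the length of w_i
    return words


toy_dictionary = {"研究", "研究生", "生命", "的", "起源"}
print("/ ".join(fmm_segment("研究生命的起源", toy_dictionary)))
```

On this toy input the greedy forward scan commits to the three-character word first, which illustrates why pure FMM can mis-segment strings with overlapping candidate words.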
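The backward variant (BMM) scans from the end of the string and shrinks each candidate from the left. The sketch below is a self-contained illustration under the same assumptions as any dictionary-based matcher (a set of known words); the name `bmm_segment` and the toy dictionary are hypothetical.

```python
def bmm_segment(sentence, dictionary, max_len=None):
    """Backward maximum matching: scan right to left, longest match first."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    words = []
    j = len(sentence)          # pointer just past the last unsegmented character
    while j > 0:
        m = min(max_len, j)
        # Shrink the candidate from the left until it is a dictionary word
        # or only a single character is left.
        while m > 1 and sentence[j - m:j] not in dictionary:
            m -= 1
        words.append(sentence[j - m:j])
        j -= m
    words.reverse()            # words were collected from right to left
    return words


toy_dictionary = {"研究", "研究生", "生命", "的", "起源"}
print("/ ".join(bmm_segment("研究生命的起源", toy_dictionary)))
```

A bi-directional matcher can run a forward and a backward pass and treat any disagreement between the two outputs as a signal of possible ambiguity.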
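N-shortest-path method

Let the string to be segmented be $S = c_1 c_2 \cdots c_n$, where $c_i$ ($i = 1, 2, \ldots, n$) is a single character and $n \ge 1$ is the length of the string. Build a directed acyclic graph $G$ with $n + 1$ nodes $v_0, v_1, v_2, \ldots, v_n$:

$v_0 \xrightarrow{c_1} v_1 \xrightarrow{c_2} \cdots \xrightarrow{c_{i-1}} v_{i-1} \xrightarrow{c_i} v_i \cdots \xrightarrow{c_j} v_j \xrightarrow{c_{j+1}} \cdots \xrightarrow{c_n} v_n$

(1) Between each pair of adjacent nodes $v_{k-1}, v_k$, add a directed edge $\langle v_{k-1}, v_k \rangle$ labeled with the character $c_k$ ($k = 1, 2, \ldots, n$).
(2) If $w = c_i c_{i+1} \cdots c_j$ ($0 < i < j \le n$) is a dictionary word, add a directed edge $\langle v_{i-1}, v_j \rangle$ between nodes $v_{i-1}$ and $v_j$, labeled with $w$.
(3) Repeat (2) until no new edge can be added.
(4) Take the shortest path from $v_0$ to $v_n$ (the one with the fewest words) as the segmentation result.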
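The graph construction and path search can be sketched with dynamic programming over the nodes. This is a simplified illustration that returns only the single shortest path (the case N = 1), not the N best paths; the function name `shortest_path_segment` is hypothetical, and ties are broken in favour of the edge found first.

```python
def shortest_path_segment(sentence, dictionary):
    """Segment by finding a path with the fewest edges from v_0 to v_n."""
    n = len(sentence)
    max_len = max((len(w) for w in dictionary), default=1)
    best = [float("inf")] * (n + 1)   # best[j]: fewest edges from v_0 to v_j
    back = [0] * (n + 1)              # back[j]: predecessor node on that path
    best[0] = 0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            # Edge <v_i, v_j> exists for every single character (step 1)
            # and for every dictionary word spanning i..j (step 2).
            is_edge = (j - i == 1) or (sentence[i:j] in dictionary)
            if is_edge and best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                back[j] = i
    # Follow the back-pointers from v_n to v_0 to read off the words.
    words = []
    j = n
    while j > 0:
        i = back[j]
        words.append(sentence[i:j])
        j = i
    words.reverse()
    return words


print(shortest_path_segment("abcde", {"ab", "abc", "cde"}))
```

Latin letters stand in for characters here; the path `ab / cde` has two edges, while the greedy alternative `abc / d / e` has three.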
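Let the string to be segmented be $S = c_1 c_2 \cdots c_n$, where $c_i$ ($i = 1, 2, \ldots, n$) is a single character and $n \ge 1$ is the length of the string, and let $W = w_1 w_2 \cdots w_k$ ($1 \le k \le n$) be a candidate segmentation. Then

$$\hat{W} = \arg\max_{W} P(W \mid S) = \arg\max_{W} \frac{P(W)\, P(S \mid W)}{P(S)} = \arg\max_{W} P(W)$$

because $P(S)$ does not depend on $W$ and $P(S \mid W) = 1$ for any $W$ that spells out $S$, where

$$P(W) = \prod_{i=1}^{k} P(w_i \mid w_1, \ldots, w_{i-1})$$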
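The argmax over segmentations can be computed with dynamic programming once the history is truncated. The sketch below makes a strong simplifying assumption that is not in the slides: each word is treated as independent of its history (a unigram model), so $P(W) \approx \prod_i P(w_i)$. The function name, the probability table, and the floor probability `unk_prob` for unseen single characters are all hypothetical.

```python
import math


def best_segmentation(sentence, word_prob, max_len=4, unk_prob=1e-8):
    """Viterbi search for argmax_W P(W) under a unigram approximation."""
    n = len(sentence)
    score = [-math.inf] * (n + 1)   # score[j]: best log P for sentence[:j]
    back = [0] * (n + 1)
    score[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            w = sentence[i:j]
            if w in word_prob:
                p = word_prob[w]
            elif j - i == 1:
                p = unk_prob        # unseen single character: small floor value
            else:
                continue            # unseen multi-character string: not a word
            s = score[i] + math.log(p)
            if s > score[j]:
                score[j] = s
                back[j] = i
    # Recover the word sequence from the back-pointers.
    words = []
    j = n
    while j > 0:
        i = back[j]
        words.append(sentence[i:j])
        j = i
    words.reverse()
    return words


probs = {"ab": 0.3, "abc": 0.1, "cde": 0.2, "d": 0.05, "e": 0.05}
print(best_segmentation("abcde", probs))
```

Working in log space avoids underflow when many small probabilities are multiplied; a bigram or higher-order model would need the previous word(s) as part of the search state.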
6.5
Ambiguous string ABC: AB / C or A / BC, where AB and BC are both words.
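[ 1997] For an adjacent character pair $xy$, the mutual information of $x$ and $y$ is

$$I(x : y) = \log_2 \frac{N \cdot r(x, y)}{r(x)\, r(y)}$$

where $N$ is the size of the corpus, $r(x, y)$ is the number of times $x$ and $y$ occur adjacently, and $r(x)$, $r(y)$ are the numbers of occurrences of $x$ and $y$.

For a string $xyz$ with two candidate segmentations $Pt_1$: $xy / z$ and $Pt_2$: $x / yz$:
if $I(x : y) - I(y : z) \ge \alpha$, choose $Pt_1$;
if $I(y : z) - I(x : y) \ge \alpha$, choose $Pt_2$.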
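A small numeric sketch may help make the mutual-information rule concrete. The counts below are invented for illustration, the function names are hypothetical, and the use of a non-strict comparison against the threshold is an assumption; when neither difference reaches the threshold the function returns `None` to signal that another knowledge source is needed.

```python
import math


def mutual_information(r_xy, r_x, r_y, n):
    """I(x:y) = log2(N * r(x,y) / (r(x) * r(y)))."""
    if r_xy == 0:
        return -math.inf        # x and y were never seen adjacent
    return math.log2(n * r_xy / (r_x * r_y))


def choose_split(i_xy, i_yz, alpha):
    """Pick 'xy/z' or 'x/yz' when one binding is stronger by at least alpha."""
    if i_xy - i_yz >= alpha:
        return "xy/z"
    if i_yz - i_xy >= alpha:
        return "x/yz"
    return None                 # difference too small: undecided


# Invented counts: corpus of 10000 characters.
i_xy = mutual_information(r_xy=50, r_x=100, r_y=200, n=10000)   # log2(25)
i_yz = mutual_information(r_xy=8, r_x=200, r_y=50, n=10000)     # log2(8) = 3
print(round(i_xy, 3), i_yz, choose_split(i_xy, i_yz, alpha=1.0))
```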
String AB that can also be segmented as A / B.

1) M + AB -> M + A(q) + B
6.6
Examples: SARS, (cool)
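Step-2: For a candidate name $Cname = X m_1 m_2$, where $X$ is the surname and $m_1 m_2$ is the given name, compute

$$F(X) = \frac{\text{occurrences of } X \text{ as a surname}}{\text{total occurrences of } X}$$

$$F(m_1) = \frac{\text{occurrences of } m_1 \text{ as the first given-name character}}{\text{total occurrences of } m_1}$$

$$F(m_2) = \frac{\text{occurrences of } m_2 \text{ as the last given-name character}}{\text{total occurrences of } m_2}$$

Probability of the candidate $Cname$:

$$P(Cname) = \begin{cases} F(X)\, F(m_1)\, F(m_2) & \text{two-character given name} \\ F(X)\, F(m_2) & \text{one-character given name} \end{cases}$$

Minimum value for the surname $X$:

$$T_{\min}(X) = \begin{cases} F(X)\, \mathrm{Min}(F(m_1)\, F(m_2)) & \text{two-character given name} \\ F(X)\, \mathrm{Min}(F(m_2)) & \text{one-character given name} \end{cases}$$

Evaluation function: $f = \ln P(Cname)$

For each surname $X$ a threshold value $\beta_X$ is set, and the candidate is accepted as a name when $f > \beta_X$, where

$$\beta_X = \alpha_X \ln(T_{\min}(X)), \qquad \alpha_X = \frac{1 + F(X)}{2}, \qquad 0.5 < \alpha_X \le 1$$

For $Cname = X m_1 m_2$:
if $F(X) = 100\%$ ($\alpha_X = 1$), the condition is $\ln P(Cname) > \ln(T_{\min}(X))$;
if $F(X) < 100\%$, the condition is $\ln P(Cname) > \alpha_X \ln(T_{\min}(X))$.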
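The threshold test can be sketched in a few lines. This is an illustration under assumptions rather than the original system: the function names are hypothetical, the frequency values are invented, and $T_{\min}(X)$ is passed in as a precomputed number because the training data it would be estimated from is not available here.

```python
import math


def name_log_prob(f_x, f_given):
    """f = ln P(Cname), with P(Cname) = F(X) times F of each given-name character."""
    p = f_x
    for f_m in f_given:         # one or two given-name characters
        p *= f_m
    return math.log(p)


def threshold(f_x, t_min):
    """beta_X = alpha_X * ln(T_min(X)), with alpha_X = (1 + F(X)) / 2."""
    alpha = (1.0 + f_x) / 2.0
    return alpha * math.log(t_min)


def is_name(f_x, f_given, t_min):
    """Accept the candidate when f > beta_X."""
    return name_log_prob(f_x, f_given) > threshold(f_x, t_min)


# Invented values: a character that is always a surname vs. one that often is not.
print(is_name(f_x=1.0, f_given=[0.5, 0.4], t_min=0.1))
print(is_name(f_x=0.6, f_given=[0.5, 0.4], t_min=0.1))
```

Because both logarithms are negative, a smaller $\alpha_X$ raises the threshold, so a character that is only sometimes used as a surname needs stronger given-name evidence.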
Named Entity (NE)
HMM-based [Zhou, 2002]
Maximum-Entropy [Collins, 2002]

[Nie, 1995]: MM
[Chang, 2002], [Chang, 2003], [, 1997]
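6.7
Let $A$ be the set of words in the reference segmentation, $B$ the set of words output by the system, and $S = A \cap B$ the set of correctly segmented words.

Correct ratio: $C = \frac{|S|}{|B|} \times 100\%$

Recall ratio: $R = \frac{|S|}{|A|} \times 100\%$

F-measure: $F = \frac{(\beta^2 + 1)\, C\, R}{\beta^2 C + R} \times 100\%$, with $\beta = 1$.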
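The three measures can be computed directly from two segmentations of the same string. In the sketch below, words are compared as character spans (start, end) so that identical words at different positions are not confused; that representation and the names `to_spans` and `evaluate` are choices made for this illustration. Values are returned as fractions in [0, 1] rather than percentages.

```python
def to_spans(words):
    """Convert a word list into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for w in words:
        spans.add((start, start + len(w)))
        start += len(w)
    return spans


def evaluate(reference, output, beta=1.0):
    """Return (correct ratio C, recall ratio R, F-measure)."""
    a = to_spans(reference)     # A: words in the reference segmentation
    b = to_spans(output)        # B: words produced by the system
    s = a & b                   # S = A intersect B: correctly segmented words
    c = len(s) / len(b) if b else 0.0
    r = len(s) / len(a) if a else 0.0
    if c + r == 0:
        return c, r, 0.0
    f = (beta ** 2 + 1) * c * r / (beta ** 2 * c + r)
    return c, r, f


reference = ["研究", "生命", "的", "起源"]
output = ["研究生", "命", "的", "起源"]
print(evaluate(reference, output))
```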
2. {he, she}'s => he / she has or he / she is
3. F-measure
6. GB13715
Thanks