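ISSN 1671-9484  CN 32-1687/G
March 2008, Vol. 7, No. 2 (Serial No. 33), pp. 113-126

An Automatic Feature Checking Algorithm for Degree of Formalities in Written Chinese *

Feng Shengli 1,2  Wang Jie 2  Huang Mei 2
1 Department of EALC, Harvard University; 2 Beijing Language and Culture University, Beijing 100083

[…] HSK ([…]) […]
CLC number: H087  Document code: A  Article ID: 1671-9484(2008)02-0113-14

1 […] (2003) […] = […] = […] (2006) […]

Manuscript dates: 2007-11-27; 2008-02-01
* […]
1 […] (1987), (1993) […]
2 […] (199 ) […]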
[…] (2003, 2005, 2006) […] (2006) […] (http://poem.guoxue.com:8080/) […] 2 […] 3, 4, 5 […]

2 […]

2.1 […] (2006) […] (Computational System) […]

i) * [ ] [ ], ( ); ii) [ ] [ ] + [ ], ( ); iii) […]
[…] + […] ( , ) + […] + […]

2.2 […] ( ) + […] ( , ) […] 350 […] Table 1 […]

2.3 […] + […] ( ) […] 500 […] Table 2 […]

2.4 […] 300 […] Table 3 […] (2006)
[…] 2 ( ) ( ) […] v […] a […] v […] n […] v […] (v, a, n, […]) […] ( ), ( ) […] HSK ([…]) […] ( ) […] ( ) […] 4 […]: ( ), ( ) […]
[…] (token) […] (type) […] (1), (2):

(1) […] (token) = […] (token) / […]
(2) […] (type) = […] (type) / […]

[…] (3), (4):

(3) […] (token) = […] (token) / […]
(4) […] (type) = […] (type) / […]

[…] (5), (6):

(5) […] (token) = […] (token) / […]
(6) […] (type) = […] (type) / […]

[…] HSK […] (7), (8):

(7) HSK […] (token) = HSK […] (token) / […]
(8) HSK […] (type) = HSK […] (type) / […]

[…] (9), (10):

(9) […] (token) = […] (token) / […]
(10) […] (type) = […] (type) / […]

[…] ( ) 20% […] (11) ([…] token […] type […]):

(11) […] = ([…] + […] + […] 4 + HSK […] + […]) 3 20%

[…] 5 […] 20% […]
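The ratio formulas above share one shape: a feature count divided by a total, computed once over running words (tokens) and once over distinct words (types). A minimal Python sketch of that computation follows. The function name `feature_ratios`, the toy word list, and the choice of denominators (all tokens in the text, all distinct types in the text) are assumptions for illustration, since the denominators are not legible in the formulas as printed. Real input would be a word-segmented composition and one of the feature word lists.

```python
from collections import Counter


def feature_ratios(words, feature_words):
    """Return (token_ratio, type_ratio) of feature words in a segmented text.

    Assumption: the denominators are the total number of tokens and the
    total number of distinct types in the text.
    """
    counts = Counter(words)
    total_tokens = len(words)
    total_types = len(counts)
    if total_tokens == 0:
        return 0.0, 0.0
    # Token count: every occurrence of a feature word is counted.
    feature_tokens = sum(n for w, n in counts.items() if w in feature_words)
    # Type count: each distinct feature word is counted once.
    feature_types = sum(1 for w in counts if w in feature_words)
    return feature_tokens / total_tokens, feature_types / total_types


# Hypothetical toy data: 8 tokens, 5 types; "a" and "e" stand in for feature words.
words = ["a", "b", "a", "c", "d", "a", "e", "b"]
formal = {"a", "e"}
token_ratio, type_ratio = feature_ratios(words, formal)
print(f"token: {token_ratio:.2%}, type: {type_ratio:.2%}")  # token: 50.00%, type: 40.00%
```

Because a frequently repeated feature word raises the token ratio but not the type ratio, reporting both separates heavy reuse of a few formal items from a genuinely wider formal vocabulary.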
[…] HSK ([…]) […] 2001-2005 HSK […] 3949 […] ([…] HSK […]) […] 2001-2005 HSK […]: 1) […] A, B, C, D […] C […]; 2) […] 65 […] Table 4:

Table 4 […]
Scheme 1   A  B  C  D
Scheme 2   95  90  85  80  75  70  65  60  55  50  45  40

[…] (A - B - C […] 95-65) […] ( ) […] HSK […] (400 […]) […] Tables 5, 6, 7 […] 18 […]

Table 5 […] (scheme 1)
Grade      A      B      C      D
[…]      455    431    390    315
[…]      470    868   1687    924

Table 6 […] (scheme 2)
Score     95    90    85    80    75    70    65    60    55    50    45    40
[…]      472   468   444   441   424   407   378   334   294   265   236   224
[…]       54   162   254   345   523   756   931   617   186    80    30    11

Table 7 […] (scheme 1)
Grade        A      B      C      D
Token (%)    1.95   1.61   1.43   1.44
Type (%)     1.60   1.32   1.17   1.19

[…] D ([…] 315) […] Table 7 (Figure 1) […] (A, B, C […]): A = 1.95%, B = 1.61%, C = 1.43%
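As a consistency check on the two classification schemes, the per-score figures in the 12-band table can be grouped into the four grades. The sketch below assumes the bands A = 85-95, B = 75-80, C = 65-70 (as stated with the later tables) and treats every score at or below 60 as D, which is an inference. It also assumes the second data row of each table is the number of compositions, since the row labels are not legible. Under those assumptions the grouped totals reproduce the A-D row (470, 868, 1687, 924) and the overall 3949 compositions.

```python
# Scheme 2 score bands and the second data row of the 12-band table.
SCORES = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50, 45, 40]
COUNTS = [54, 162, 254, 345, 523, 756, 931, 617, 186, 80, 30, 11]


def grade(score):
    """Map a scheme 2 score band to a scheme 1 grade (D band is an assumption)."""
    if score >= 85:
        return "A"
    if score >= 75:
        return "B"
    if score >= 65:
        return "C"
    return "D"


# Sum the per-band counts within each grade.
totals = {}
for score, count in zip(SCORES, COUNTS):
    g = grade(score)
    totals[g] = totals.get(g, 0) + count

print(totals)                # {'A': 470, 'B': 868, 'C': 1687, 'D': 924}
print(sum(totals.values()))  # 3949
```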
Figure 1 […] (scheme 1)

A, B, C […]:

Table 8 […] (scheme 1)
Grade        A      B      C      D
Token (%)    0.43   0.30   0.25   0.24
Type (%)     0.35   0.25   0.21   0.20

[…] Table 8 […] HSK […] A (85-95 […]) […] 0.43%, B (75-80 […]) 0.30%, C (65-70 […]) 0.25% […] Figure 2:

Figure 2 […] (scheme 1)

[…] Table 9 ([…] 2):

Table 9 […] (scheme 1)
Grade        A      B      C      D
Token (%)    2.01   2.09   1.91   1.91
Type (%)     1.68   1.67   1.52   1.57

[…] HSK […] B […] A […] Figure 3:

Figure 3 […] (scheme 1)
[…] + […]? […] Table 10:

Table 10 […] (scheme 1)
Grade        A      B      C      D
Token (%)    1.98   1.53   1.27   1.28
Type (%)     1.84   1.38   1.16   1.14

[…]: 1.98% > 1.53% > 1.27% […] B […] 0.26% […] C, A […] 0.45% […] B […] Figure 4:

Figure 4 […] (scheme 1)

[…]? Table 11:

Table 11 […] (scheme 1)
Grade        A      B      C      D
Token (%)    1.10   0.89   0.72   0.72
Type (%)     0.83   0.67   0.55   0.57

[…] 0.19% […] A, B, C […] Figure 5:

Figure 5 […] (scheme 1)

[…] Table 12:

Table 12 […] (scheme 1)
Grade        A      B      C      D
Token (%)    4.21   3.25   3.03   3.09
Type (%)     3.31   2.63   2.42   2.46

[…]: A, B, C […] 4.21% > 3.25% > 3.03%
[…]; […] 0.59% […] Figure 6:

Figure 6 […] (scheme 1)

[…] HSK […] 4000 […] A, B, C, D […] ([…] 95, 96, 97 […]) […] 80 […] 90, 90 […] 91 […] Table 13:

Table 13 […] (scheme 2)
Score        95    90    85    80    75    70    65    60    55    50    45    40
Token (%)    2.57  1.91  1.84  1.70  1.55  1.41  1.46  1.50  1.31  1.38  1.43  1.74
Type (%)     2.13  1.58  1.50  1.36  1.29  1.15  1.18  1.21  1.07  1.20  1.28  1.42

[…] 70 […] 60 […] 70 […] Figure 7:

Figure 7 […] (scheme 2)

[…]? 60, 65, 70? […] 60-70 […] A, B, C […] 65-70 ( ) (60, 65, 70 […] 95 […]) […] 60-70? […] Table 14:
Table 14 […] (scheme 2)
Score        95    90    85    80    75    70    65    60    55    50    45    40
Token (%)    0.48  0.46  0.40  0.30  0.30  0.25  0.26  0.25  0.24  0.16  0.22  0.22
Type (%)     0.39  0.38  0.33  0.25  0.25  0.21  0.21  0.21  0.20  0.11  0.22  0.22

[…] 80 […] 75, 75 […] 70 […] 60-70 […] Figure 8:

Figure 8 […] (scheme 2)

([…] 60-70 […]) […]? Table 15:

Table 15 […] (scheme 2)
Score        95    90    85    80    75    70    65    60    55    50    45    40
Token (%)    2.63  1.88  1.97  2.15  2.05  1.93  1.89  1.97  1.80  1.80  1.96  1.60
Type (%)     2.20  1.59  1.63  1.69  1.65  1.53  1.51  1.60  1.47  1.54  1.72  1.51

[…] (Table 9, Figure 3) […] Figure 9:

Figure 9 […] (scheme 2)

[…] 60-70 ( ) […] Table 16:

Table 16 […] (scheme 2)
Score        95    90    85    80    75    70    65    60    55    50    45    40
Token (%)    3.26  1.75  1.86  1.72  1.40  1.17  1.34  1.42  0.80  1.29  1     2.76
Type (%)     2.79  1.69  1.73  1.51  1.29  1.08  1.22  1.23  0.76  1.29  1     1.75

[…] 70-95 […] ( ) […] 60-70 […] 70 […] 65, 65 […] 60 […] Figure 10:
Figure 10 […] (scheme 2)

Table 17 […] 60-70 […]:

Table 17 […] (scheme 2)
Score        95    90    85    80    75    70    65    60    55    50    45    40
Token (%)    1.55  1.03  1.04  0.91  0.87  0.68  0.75  0.74  0.63  0.63  1.14  0.73
Type (%)     1.17  0.79  0.79  0.67  0.66  0.53  0.56  0.57  0.51  0.56  0.91  0.73

[…] 70 […] 65 […] 60 […] Figure 11:

Figure 11 […] (scheme 2)

[…] Table 18:

Table 18 […] (scheme 2)
Score        95    90    85    80    75    70    65    60    55    50    45    40
Token (%)    4.94  4.44  3.92  3.41  3.15  3.02  3.03  3.10  3.09  3.01  2.83  3.39
Type (%)     4.08  3.46  3.04  2.70  2.58  2.42  2.43  2.47  2.39  2.50  2.57  2.90

[…] Table 18 […] 70 […] 65, 65 […] 60 […] Figure 12:

Figure 12 […] (scheme 2)
[…] 60-70? […]? […] 60-70 […] 60-70 ( ) […] 70-90 […] 60-70 […] HSK […] ( ) […] Figure 13 […]: a ≥ x3, x3 > b ≥ x2, x2 > c ≥ x1, x1 > d […]

Figure 13 […] (scheme 2)

[…] Figure 14:

Figure 14 […] a, b, c, d […]

[…] Figure 14 […] (type) […] 6 […]
[…]: […] 1) […], 2) […] HSK […] ( ) […] 7 […] (2005) […] (readability) […]: […]

References
[…] 2003 […] 2, 53-63.
[…] 2005 […]: […]
[…] 2006 […]: […]
[…] 1993 […] 2, 1-3.
[…] 2001 […]: […]
[…] 2005 […] 332-338 […]: […]
[…] 1987 […] 5, 321-329.

7 […]
[…] 1955 5 15 […] 1977 […] 1979 […] 1982 […] 1986 […] 1995 […] 1994-2003 […] 2003 […] Linguistics […] East Asian Linguistics […]
[…] 1980 2 […]
[…] 1979 11 […]

An Automatic Feature Checking Algorithm for Degree of Formalities in Written Chinese

Feng Shengli 1,2  Wang Jie 2  Huang Mei 2
1 Department of EALC, Harvard University, Boston, Massachusetts, USA
2 Beijing Language and Culture University, Beijing 100083

Abstract  Based on Prosodic Grammar, this paper first introduces the formal features of written Chinese, including 1) monosyllabic words used in disyllabic templates, 2) disyllabic words used in disyllabic copulates, and 3) formal patterns in written Chinese. Secondly, an automatic feature checking algorithm is proposed for a quantitative analysis of the formalities in Chinese formal styles. Thirdly, the algorithm proposed in section 2 is verified using nearly 4000 compositions from the HSK (Hanyu Shuiping Kaoshi, or Chinese Proficiency Test), resulting in a precise match between the degree of formality calculated by the algorithm and the levels of the HSK exam. Finally, it is argued for the first time that the automatic feature checking technology can be used in a wide range of related fields, such as formality measuring, composition testing, readability scaling, style gradating, textbook compiling, L2 learning, literacy acquiring, and so on.

Keywords  formal written language; prosodic grammar; formal features; formality measuring