6.1 1. N-gram 2. 3. 4.
6.2 4.11 4.13 4.12 4.14 Rong Jin [7] TF*IDF - SARS SARS SARS SARS
[1] Michele Banko, Vibhu O. Mittal, and Michael J. Witbrock. 2000. Headline Generation Based on Statistical Translation. 38th Annual Meeting of the Association for Computational Linguistics, Hong Kong, China, 1-8 October.
[2] Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2): 263-312.
[3] Brown, Cocke, Della-Pietra, Della-Pietra, Jelinek, Lafferty, Mercer, Roossin. 1990. A Statistical Approach to Machine Translation. Computational Linguistics, 16(2), June.
[4] Kuang-hua Chen and Hsin-Hsi Chen. 2001. The Chinese Text Retrieval Tasks of NTCIR Workshop 2. Proceedings of the Second NTCIR Workshop Meeting on Evaluation of Chinese & Japanese Text Retrieval and Text Summarization (NTCIR 2), pp. 51-72.
[5] G. D. Forney. 1973. The Viterbi Algorithm. Proceedings of the IEEE, pp. 268-278.
[6] Rong Jin and Alexander G. Hauptmann. 2001. Headline Generation using a Training Corpus. Second International Conference on Intelligent Text Processing and Computational Linguistics.
[7] R. Jin and A. G. Hauptmann. 2000. Title Generation for Spoken Broadcast News using a Training Corpus. Proceedings of ICSLP 2000, Beijing, China.
[8] S. Katz. 1987. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech and Signal Processing, pp. 24.
[9] Paul E. Kennedy and Alexander G. Hauptmann. 2000. Automatic Title Generation for EM. Proceedings of the Fifth ACM Conference on Digital Libraries.
[10] G. J. McLachlan and K. E. Basford. 1988. Mixture Models. Marcel Dekker, NY.
[11] M. Mitra, Amit Singhal, and Chris Buckley. 1997. Automatic text summarization by paragraph extraction. In Proceedings of the ACL 97/EACL 97 Workshop on Intelligent Scalable Text Summarization, Madrid, Spain.
[12] Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2001. IBM Research Division Technical Report RC22176 (W0109-022), Yorktown Heights, New York.
[13] Gerard Salton, A. Singhal, M. Mitra, and C. Buckley. 1997. Automatic text structuring and summarization. Information Processing and Management, 33(2): 193-207.
[14] T. Strzalkowski, J. Wang, and B. Wise. 1998. A robust practical text summarization system. In AAAI Intelligent Text Summarization Workshop, pp. 26-30, Stanford, CA.
[15] M. Witbrock and V. Mittal. 1999. Ultra-Summarization: A Statistical Approach to Generating Highly Condensed Non-Extractive Summaries. Proceedings of SIGIR 99, Berkeley, CA, August.
[16] David Zajic, Bonnie Dorr, and Richard Schwartz. 2002. Automatic headline generation for newspaper stories. In Proceedings of the Workshop on Text Summarization, Post-conference Workshop of ACL-02, Philadelphia, PA.
[17] NSC 86-2621-E-002-025T 9
1 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>ctc_tec_0001609</id> <date>1999-01-26</date> <title> Rambus </title> <text> Rambus Rambus DRAM Intel 820 Rambus DRAM SDRAM 820 820 100 MHz 133 MHz CPU Rambus DRAM SDRAM 820 133 MHz 820 133 MHz Rambus DRAM Rambus DRAM RIMM SDRAM DIMM 820 820 RIMM DIMM 820 820 SDRAM DIMM RIMM DIMM RIMM SRIMM SRIMM SRIMM RIMM 133 MHz </text> </doc> </xml>
2 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>cts_int_0000502</id> <date>1998-06-29</date> <title> </title> <text> SPASO-HOUSE </text> </doc> </xml>
3 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>cts_int_0000639</id> <date>1998-07-16</date> <title> </title> <text> (Oscar Wells) (Orson Welles) 1938 15 </text> </doc> </xml>
4 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>cts_int_0001454</id> <date>1998-10-10</date> <title>46 </title> <text> </text> </doc> </xml>
5 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>ctc_sto_0004827</id> <date>1999-01-26</date> <title> </title> <text> SOGO SOGO </text> </doc> </xml>
6 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>ctc_tec_0000370</id> <date>1998-07-14</date> <title> </title> <text> International Mobile Telecommunications 2000 WB CDMA WLL DCS-1800 Mobile Unit 10Mbps High Speed WLAM WLL CDMA WLL WLL </text> </doc> </xml>
7 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>ctc_tec_0000384</id> <date>1998-07-18</date> <title> </title> <text> </text> </doc> </xml>
8 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>cts_soc_0001534</id> <date>1998-07-12</date> <title> </title> <text> 0.00001 </text> </doc> </xml>
9 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>cts_soc_0002226</id> <date>1998-08-16</date> <title> </title> <text> </text> </doc> </xml>
10 <?xml version="1.0" encoding="big5"?> <xml> <doc> <id>cts_int_0001162</id> <date>1998-09-11</date> <title> </title> <text> </text> </doc> </xml>
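The sample documents above share one layout: an `<xml>` root wrapping a single `<doc>` with `<id>`, `<date>`, `<title>`, and `<text>` children, declared as Big5. A minimal sketch of how such a file might be read in Python follows. The function name `parse_news_doc` and the returned dictionary layout are illustrative assumptions, not part of the system described here. The bytes are decoded before parsing because the standard-library parser does not accept multi-byte encodings such as Big5 directly.

```python
import re
import xml.etree.ElementTree as ET


def parse_news_doc(raw: bytes) -> dict:
    """Parse one Big5-encoded news document into a plain dict.

    Hypothetical helper: the field names mirror the tags seen in the
    appendix samples (id, date, title, text).
    """
    # Decode up front; expat rejects multi-byte encodings like Big5.
    text = raw.decode("big5")
    # Drop the XML declaration so the stale encoding label is not re-read.
    text = re.sub(r"^\s*<\?xml[^>]*\?>", "", text)
    root = ET.fromstring(text)
    doc = root.find("doc")
    if doc is None:
        raise ValueError("no <doc> element found")
    return {
        "id": doc.findtext("id"),
        "date": doc.findtext("date"),
        "title": (doc.findtext("title") or "").strip(),
        "text": (doc.findtext("text") or "").strip(),
    }


# Usage with a shortened copy of sample document 1.
sample = (
    '<?xml version="1.0" encoding="big5"?> <xml> <doc> '
    "<id>ctc_tec_0001609</id> <date>1999-01-26</date> "
    "<title> Rambus </title> <text> Rambus DRAM </text> </doc> </xml>"
).encode("big5")
record = parse_news_doc(sample)
```

The title field is the reference headline and the text field is the story body, so a record of this shape is enough to pair each generated title with its reference.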