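Inspecting the Characteristics of the Classical Test Theory and Item Response Theory by Using Test Analysis Results and the Responses of Taiwanese Eighth-Grade Students in the TIMSS 2007 Database
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果

A total of 287 students from TIMSS 2007 were included; R 2.13.1, Excel 2003, SPSS 12.0, and ConQuest 2.0 were used.

Keywords: classical test theory, item response theory, PCM, TIMSS
DOI: 10.3966/199679772014123102003
Received: December 31, 2013; Modified: March 25, 2014; Accepted: April 10, 2014
* E-mail: aguri.su@gmail.com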
新竹教育大學教育學報 第三十一卷第二期
classical test theory (CTT; Gulliksen, 1987; Lord & Novick, 1968); true score model; modern test theory (Lord, 1980); item response theory (IRT; 1991; Embretson & Reise, 2000); Trends in International Mathematics and Science Study (TIMSS); International Association for the Evaluation of Educational Achievement (IEA); TIMSS 2007
新竹教育大學教育學報 第三十一卷第二期 1999 Embretson & Reise, 2000 2006 Embretson Reise 2000 1 1 Item response theory for psychologists, by S. E. Embretson & S. Reise, Mahwah, NJ: Lawrence Erlbaum Associates. 70
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 1 Rasch model Linear logistic test model, LLTM 1 2006 Embretson & Reise, 2000 71
新竹教育大學教育學報 第三十一卷第二期 American Educational Research Association American Psychological Association National Council on Measurement in Education [AERA, APA, NCME] 1999 1999 Lord 1980 standard error, SE IRT 2006 1992b 72
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 Embretson & Reise, 2000 Bond Fox 2007 Raschinfit outfit outfit infit 11 1±0.3 logit joint maximum likelihood, JML marginal maximum likelihood, MML conditional maximum likelihood, CML Embretson Reise 2000 1992a 73
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 TIMSS 2007 TIMSS 2007 1 1 TIMSS 2007 TIMSS 2007 4,046 13.5 150 1 2 75
新竹教育大學教育學報 第三十一卷第二期 booklet 1 287 M01 M02 M01 TIMSS 2003 13 M02 TIMSS 200716 29 45 2 2 TIMSS 2007 M01 M02 M01 M02 M01 M02 M01 M02 M01 M02 2 2 1 2 0 1 0 1 3 6 4 3 0 0 4 1 1 2 9 6 0 0 0 2 1 1 0 1 1 4 6 5 1 4 5 3 1 4 29 1 0 16 1 0 21 0 9 4 1329 TIMSS 2007TIMSS http://timss.bc.edu/index.html 76
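The analyses used R (eRm package), Excel 2003, SPSS 12.0, and ConQuest 2.0 (Wu, Adams, & Wilson, 2007). Responses from the 287 students were modeled with the partial credit model (PCM; Masters, 1982).

For a person with ability $\theta$ and an item $i$ scored $x = 0, \ldots, m_i$ (that is, $m_i + 1$ categories), the probability of responding in category $x$ is

$$P_{ix}(\theta) = \frac{\exp\left[\sum_{j=0}^{x}(\theta - \delta_{ij})\right]}{\sum_{r=0}^{m_i}\exp\left[\sum_{j=0}^{r}(\theta - \delta_{ij})\right]} \qquad (1)$$

where $\sum_{j=0}^{0}(\theta - \delta_{ij}) \equiv 0$ and $\delta_{ij}$ ($j = 1, \ldots, m_i$) is the $j$th step difficulty of item $i$ (Embretson & Reise, 2000).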
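To make the partial credit model concrete, the following minimal Python sketch evaluates the category probabilities for one item. The function name `pcm_probabilities` is hypothetical, and reading the values 4.32 and -0.05 reported for item M022232 as its step 1 and step 2 difficulties is an assumption; this is an illustration, not the ConQuest or eRm estimation routine used in the study.

```python
import math

def pcm_probabilities(theta, step_difficulties):
    """Category probabilities P(X = 0..m | theta) under the partial credit model.

    step_difficulties holds delta_1..delta_m; the j = 0 term is defined as 0.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0.
    exponents = [0.0]
    for delta in step_difficulties:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Assumed step difficulties for item M022232 (step 1 = 4.32, step 2 = -0.05).
probs = pcm_probabilities(theta=0.0, step_difficulties=[4.32, -0.05])
print([round(p, 3) for p in probs])
```

With a single step difficulty the function reduces to the dichotomous Rasch model, which is a convenient sanity check.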
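Items are scored 0, 1, or 2. Item fit (infit and outfit) was examined with the eRm package in R, alongside SPSS 12.0 and ConQuest 2.0.

The item information function is

$$I_i(\theta) = \frac{[P_i'(\theta)]^2}{P_i(\theta)\,Q_i(\theta)}, \quad i = 1, \ldots, n \qquad (2)$$

where $I_i(\theta)$ is the information provided by item $i$ at ability $\theta$, $P_i'(\theta)$ is the first derivative of $P_i(\theta)$ with respect to $\theta$, $P_i(\theta)$ is the probability of a correct response to item $i$ at ability $\theta$, and $Q_i(\theta) = 1 - P_i(\theta)$ (1992b). The test information function determines the standard error (SE) of the ability estimate at each $\theta$ (2006; 1992b).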
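The information function and its link to the standard error can be sketched for the dichotomous Rasch case, where the derivative of the item curve is $P_i(\theta)Q_i(\theta)$, so Equation 2 reduces to $P_i(\theta)Q_i(\theta)$. The sketch below assumes the standard relation $SE(\theta) = 1/\sqrt{\sum_i I_i(\theta)}$; the function names and the five item difficulties are hypothetical, and reproducing the SE values reported for this test would also require the polytomous items.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_information(theta, difficulty):
    """Item information [P'(theta)]^2 / (P(theta) * Q(theta))."""
    p = rasch_probability(theta, difficulty)
    q = 1.0 - p
    p_prime = p * q  # derivative of the Rasch curve with respect to theta
    return p_prime ** 2 / (p * q)

def standard_error(theta, difficulties):
    """SE of the ability estimate: 1 / sqrt(test information)."""
    test_information = sum(item_information(theta, b) for b in difficulties)
    return 1.0 / math.sqrt(test_information)

# Hypothetical item difficulties in logits, for illustration only.
difficulties = [-1.5, -0.5, 0.0, 0.5, 1.5]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(standard_error(theta, difficulties), 3))
```

Because information peaks where ability matches item difficulty, the SE is smallest near the center of the difficulty distribution and grows toward the extremes.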
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 Cronbach's alpha 0.90 Cronbach's alpha 0.90 2 10 2 79
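Cronbach's alpha, reported here as 0.90, is computed from a students-by-items score matrix as $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_X^2}\right)$, where $k$ is the number of items, $s_i^2$ the variance of item $i$, and $s_X^2$ the variance of the total score. The sketch below is a minimal illustration with a hypothetical toy matrix, not the TIMSS data; the function name `cronbach_alpha` is an assumption.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student lists of item scores."""
    k = len(scores[0])
    # Sample variance of each item across students.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 4 students, 3 dichotomous items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(toy_scores), 2))
```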
新竹教育大學教育學報 第三十一卷第二期 1999 2006Cronbach s alpha Embretson Reise 2000 1991 TIMSS 2007 Olson, Martin, & Mullis, 2008 TIMSS & PIRLS International Study Center Science and Mathematics Item Review Committee [SMIRC] TIMSS 2007 Crocker Algina 1986 3 80
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 3 TIMSS 2007 Crocker & Algina 1986 Crocker Algina 1986 TIMSS 2007 1. 2. National Research Coordinators [NRC] 3. 4. NRC 5. SMIRC NRC 6. 45 25 7. 8. 9. NRC 429 10. 1. TIMSS 2007 2. 3. NRC 4. NRC SMIRC 81
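(TIMSS; Crocker & Algina, 1986)

Bartlett's test of sphericity: 2,918.64, p < .001. The first factor has an eigenvalue of 9.14 and explains 31.51% of the variance; the second factor has an eigenvalue of 1.34 and explains 4.64% (Figure 3; Reckase, 1979). The ratio of the first to the second eigenvalue is 6.82 (Reckase, 1979; Hattie, 1985).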
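The unidimensionality figures reported here (eigenvalues 9.14 and 1.34, ratio 6.82, 31.51% of variance) can be checked with simple arithmetic. The sketch assumes the factor analysis used all 29 items, so the total variance equals 29; that count and the function name are assumptions. Small differences from the reported percentages come from rounding of the eigenvalues.

```python
def eigenvalue_summary(first, second, n_variables):
    """Return (ratio of first to second eigenvalue, share of variance of the first)."""
    ratio = first / second
    share_first = first / n_variables
    return ratio, share_first

# Reported eigenvalues; 29 variables is an assumption based on the item count.
ratio, share_first = eigenvalue_summary(9.14, 1.34, 29)
print(round(ratio, 2), round(100 * share_first, 1))
```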
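Cronbach's alpha = 0.90.

Table 4. Item fit statistics (Outfit MSQ and Infit MSQ)

| Item | Outfit MSQ | Infit MSQ |
|---|---|---|
| M022043 | 1.69* | 1.30 |
| M022046 | 0.87 | 0.99 |
| M022049 | 1.44* | 1.29 |
| M022050 | 0.85 | 0.93 |
| M022055 | 0.81 | 0.81 |
| M022057 | 1.47* | 1.35* |
| M022257 | 0.61* | 0.82 |
| M022062 | 0.97 | 0.91 |
| M022066 | 0.33* | 0.70 |
| M022232 | 1.17 | 0.95 |
| M022234A | 1.06 | 1.01 |
| M022234B | 0.91 | 1.01 |
| M022243 | 0.74 | 0.80 |
| M042003 | 1.70* | 1.16 |
| M042079 | 0.78 | 0.82 |
| M042018 | 0.64* | 0.89 |
| M042055 | 0.97 | 1.01 |
| M042039 | 1.31* | 1.31* |
| M042199 | 0.42* | 0.81 |
| M042301A | 0.93 | 1.05 |
| M042301B | 0.64* | 0.79 |
| M042301C | 0.53* | 0.62* |
| M042263 | 0.84 | 0.87 |
| M042265 | 1.13 | 1.11 |
| M042137 | 0.90 | 1.02 |
| M042148 | 0.70 | 0.93 |
| M042254 | 0.76 | 0.98 |
| M042250 | 0.38* | 0.85 |
| M042220 | 1.47* | 1.20 |

Note. * Value outside 1 ± 0.3.

In Table 4, the expected value of both Outfit MSQ and Infit MSQ is 1, and the acceptable range is 0.7~1.3 (Bond & Fox, 2007). Items M022057, M042039, and M042301C fall outside this range on both statistics (Figures 4 to 6). M022057 (Figure 4) and M042039 (Figure 6) exceed the upper bound of 0.7~1.3, whereas M042301C (Figure 5) falls below 0.7.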
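For readers unfamiliar with the two fit indices, the sketch below computes them for a single dichotomous Rasch item using the standard definitions: outfit is the unweighted mean of squared standardized residuals, and infit is the sum of squared residuals divided by the sum of response variances. The responses and abilities are hypothetical, and the code is an illustration rather than the eRm routine; the flagging rule mirrors the 0.7~1.3 range of Bond and Fox (2007).

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def fit_mean_squares(responses, abilities, difficulty):
    """Return (outfit_msq, infit_msq) for one dichotomous item."""
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted, summed squared residuals over summed variances.
    infit = sum(squared_residuals) / sum(variances)
    return outfit, infit

def outside_range(msq, low=0.7, high=1.3):
    """Flag a mean square outside the 0.7 to 1.3 range."""
    return msq < low or msq > high

# Hypothetical data: five students' responses and ability estimates (logits).
responses = [1, 0, 1, 1, 0]
abilities = [-1.0, -0.5, 0.0, 0.8, 1.5]
outfit, infit = fit_mean_squares(responses, abilities, difficulty=0.0)
print(round(outfit, 2), round(infit, 2), outside_range(outfit), outside_range(infit))
```

Outfit is more sensitive to unexpected responses from students far from the item's difficulty, which is why an item can be flagged on outfit alone.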
Figure 4. M022057
Figure 5. M042301C
Figure 6. M042039
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 1991 weak assumption strong assumptions IRT 2006 1 29 140.80 14 0.60~0.80 M022232 0.30 1999 0.50 2928 0.3 1 M022049 0.3 5 85
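Table 5. Item difficulty estimates, TIMSS 2007 (logits)

| Item | Difficulty | Step 1 | Step 2 |
|---|---|---|---|
| M042250 | -2.89 | | |
| M042254 | -2.04 | | |
| M022066 | -1.62 | | |
| M042199 | -1.58 | | |
| M022046 | -1.53 | | |
| M042079 | -1.44 | | |
| M042148 | -1.44 | | |
| M042003 | -1.20 | | |
| M022049 | -1.02 | | |
| M042018 | -0.92 | | |
| M042301A | -0.92 | | |
| M022257 | -0.82 | | |
| M042055 | -0.79 | | |
| M022043 | -0.67 | | |
| M022062 | -0.52 | | |
| M042137 | -0.49 | | |
| M042301B | -0.44 | | |
| M042301C | -0.17 | | |
| M022243 | -0.12 | | |
| M042265 | -0.03 | | |
| M022050 | 0.16 | | |
| M022055 | 0.36 | | |
| M042263 | 0.39 | | |
| M022057 | 0.49 | | |
| M042039 | 0.51 | | |
| M022232* | 2.14 | 4.32 | -0.05 |
| M022234A* | 0.83 | 1.88 | -0.22 |
| M022234B* | 0.69 | 2.45 | -1.08 |
| M042220* | 0.22 | 2.37 | -1.92 |

Note. * Item scored 0, 1, 2; all other items are scored 0, 1.

In Table 5, the difficulties of the dichotomous items lie within about -3 ~ 0.5 logit, and those of the four polytomous items within about 0 ~ 2 logit. For each polytomous item, the step 1 difficulty is higher than the step 2 difficulty (Figures 7 to 10; Wu & Adams, 2007).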
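The values reported for the four items scored 0, 1, 2 can be cross-checked under the usual parameterization in which an item's overall difficulty is the mean of its step difficulties. Reading the three numbers after each starred item as overall difficulty, step 1, and step 2 is an assumption about the table layout; the short sketch below tests that reading and also shows that step 1 exceeds step 2 for every item.

```python
# Assumed layout: item -> (overall difficulty, step 1, step 2), in logits.
polytomous_items = {
    "M022232": (2.14, 4.32, -0.05),
    "M022234A": (0.83, 1.88, -0.22),
    "M022234B": (0.69, 2.45, -1.08),
    "M042220": (0.22, 2.37, -1.92),
}

def mean_step_difficulty(step1, step2):
    """Mean of the two step difficulties of a 0/1/2 item."""
    return (step1 + step2) / 2.0

for item, (overall, step1, step2) in polytomous_items.items():
    mean_step = mean_step_difficulty(step1, step2)
    # Report the mean of the steps next to the reported overall difficulty.
    print(item, round(mean_step, 3), overall, "step 1 > step 2:", step1 > step2)
```

The means agree with the reported difficulties to within rounding, which supports the assumed column reading.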
Generalized partial credit model (GPCM; Muraki, 1992); -0.96

Figure 7. M022232
Figure 8. M022234A
Figure 9. M022234B
Figure 10. M042220
新竹教育大學教育學報 第三十一卷第二期 2006 IRT IRT 11 12 6 88
Item M022057
Content domain: number; Cognitive domain: applying; Maximum points: 1; Key: C
Stem values: 1426; 15%
(A) 200 (B) 300 (C) 1200 (D) 1600 (E) 1700
Figure 11. Item M022057. Source: TIMSS 2007, http://www.dorise.info/der/01_timss_2007_html/index.html

Item M042039
Content domain: number; Cognitive domain: applying; Maximum points: 1; Key: A
Stem values: 60; 30%
(A) 18 (B) 24 (C) 30 (D) 42
Figure 12. Item M042039. Source: TIMSS 2007, http://www.dorise.info/der/01_timss_2007_html/index.html
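Table 6. Option analysis by ability group, TIMSS 2007 (proportion choosing each option)

| Item | Option | Low | Middle | High |
|---|---|---|---|---|
| M022057 | A | 0.17 | 0.25 | 0.00 |
| | B | 0.15 | 0.03 | 0.00 |
| | C* | 0.40 | 0.67 | 1.00 |
| | D | 0.22 | 0.02 | 0.00 |
| | X | 0.06 | 0.02 | 0.00 |
| M042039 | A* | 0.45 | 0.62 | 1.00 |
| | B | 0.06 | 0.01 | 0.00 |
| | C | 0.27 | 0.02 | 0.00 |
| | D | 0.22 | 0.35 | 0.00 |
| | X | 0.00 | 0.00 | 0.00 |

Note. 1. X: any other response. 2. * Key.

For option A of M022057 and option D of M042039, the proportion of students choosing the distractor is higher in the middle group than in the low group (Figures 13 and 14).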
Figure 13. M022057
Figure 14. M042039

ConQuest 2.0; Figures 11 and 12; M022057: 15%, option A; M042039: option D
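Table 7. Raw scores, ability estimates, and standard errors, TIMSS 2007

| Raw score | Ability estimate (logit) | SE |
|---|---|---|
| 3 | -3.01 | 0.64 |
| 5 | -2.34 | 0.53 |
| 6 | -2.08 | 0.49 |
| 7 | -1.85 | 0.47 |
| 8 | -1.64 | 0.45 |
| 9 | -1.44 | 0.44 |
| 10 | -1.26 | 0.42 |
| 11 | -1.08 | 0.41 |
| 12 | -0.91 | 0.41 |
| 13 | -0.75 | 0.40 |
| 14 | -0.59 | 0.39 |
| 15 | -0.44 | 0.39 |
| 16 | -0.29 | 0.38 |
| 17 | -0.15 | 0.38 |
| 18 | -0.01 | 0.38 |
| 19 | 0.13 | 0.37 |
| 20 | 0.27 | 0.37 |
| 21 | 0.41 | 0.38 |
| 22 | 0.56 | 0.38 |
| 23 | 0.71 | 0.39 |
| 24 | 0.86 | 0.40 |
| 25 | 1.02 | 0.41 |
| 26 | 1.20 | 0.43 |
| 27 | 1.40 | 0.45 |
| 28 | 1.62 | 0.48 |
| 29 | 1.87 | 0.52 |
| 30 | 2.17 | 0.58 |
| 31 | 2.55 | 0.67 |
| 32 | 3.16 | 0.93 |
| 33 | 3.82 | NA |

Note. NA: not available.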
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 8 8 Cronbach's alpha Cronbach's alpha M022057 M042039 M042301C 0.6 1 M022057 M042039 M022232 M022234A M022234B M042220 M022057 M042039 93
新竹教育大學教育學報 第三十一卷第二期 Cronbach's alpha 0.90Cronbach's alpha 0.90 TIMSS 2007 0.6 M022057 M042039 M022057 M042039 M042301C M042301C -3 ~ 0.5 logit 1 M022232 M022234A M022234B M042220 1 M022057 M042039 94
從 TIMSS 2007 臺灣八年級學生數學科作答反應檢視古典測驗理論和試題反應理論特性和測驗分析結果 2011 Embretson & Reise, 2000 95
新竹教育大學教育學報 第三十一卷第二期 M022057 M042039 M022232 0.3 IRT 2.14 0 69% 1 3% 2 28% 0.64~0.9 IRT -1.62~0.51 215 M022232 Content domain number Cognitive domain applying Maximum points 2 95 70 5 95 90 90 85 85 80 80 75 75 70 2 10 3 19 4 48 6 55 9 43 95 70 15 M022232 TIMSS 2007 http://www.dorise.info/der/01_timss_2007_html/index. html 96
number sense (National Council of Teachers of Mathematics, NCTM, 1989; 1997; 2006; Gurganus, 2004); Gurganus
新竹教育大學教育學報 第三十一卷第二期 M022057 M042039 M022232 TIMSS 98
GPCM (Generalized Partial Credit Model; Muraki, 1992); Olson et al., 2008; Testlet Response Theory (TRT)

NSC103-2911-I-003-301
(1997). 8, 83-116.
(1991). 8(6), 13-18.
(1992a). 9(1), 5-9.
(1992b). 9(6), 5-9.
(2006). 46(3), 101-110.
(1999).
(2006). http://www.rcpet.ntnu.edu.tw/irt%e5%9c%a8%e9%87%8f%e8%a1%a8%e7%B7%A8%E8%A3%BD%E4%B8%8A%E7%9A%84%E6%87%89%E7%94%A8(%E4%B8%8B)95.1.2.doc
(2011).

American Educational Research Association, American Psychological Association, National Council on Measurement in Education [AERA, APA, NCME] (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association, American Psychological Association, National Council on Measurement in Education.

Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York, NY: Holt, Rinehart and Winston.

Embretson, S. E., & Reise, S. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

Gulliksen, H. (1987). Theory of mental tests. Hillsdale, NJ: Lawrence Erlbaum Associates.

Gurganus, S. (2004). Promote number sense. Intervention in School and Clinic, 40(1), 55-58.

Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9, 139-164.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.

Olson, J. F., Martin, M. O., & Mullis, I. V. S. (Eds.). (2008). TIMSS 2007 technical report. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.

Reckase, M. D. (1979). Unifactor latent trait models applied to multifactor tests: Results and implications. Journal of Educational Statistics, 4, 207-230.

Wu, M., & Adams, R. (2007). Applying the Rasch model to psycho-social measurement: A practical approach. Melbourne, Australia: Educational Measurement Solutions.

Wu, M. L., Adams, R. J., & Wilson, M. R. (2007). ACER ConQuest: Generalized item response modeling software (2nd ed.). Hawthorn, Australia: Australian Council for Educational Research.
Inspecting the Characteristics of the Classical Test Theory and Item Response Theory by Using Test Analysis Results and the Responses of Taiwanese Eighth-Grade Students in the TIMSS 2007 Database

Hsu-Lin Su* Po-Hsi Chen**

Abstract

This study investigated the characteristics of classical test theory (CTT) and item response theory (IRT) by using the responses of Taiwanese eighth-grade students in the TIMSS 2007 database to conduct test and item analyses from these 2 perspectives, with the aim of informing test design and educational practice. A total of 287 students were included, and Booklet 1 was selected as the research instrument. R 2.13.1, Excel 2003, SPSS 12.0, and ConQuest 2.0 were used for data analysis and curve plotting. The results showed that the test difficulty ranged from medium to easy. Under both theories, 1 distractor in each of 2 items was chosen by a larger proportion of average students than of below-average students. Moreover, the item characteristic curves of 4 multiple-choice items were not ordered. In general, the test quality was high, although some items could not differentiate average students and the items were easy for the students. The reliability, construct validity, item parameters, category analysis, and overall scores of test takers in CTT corresponded, respectively, to the test information function, construct validity and model-fit assessment, item parameters, item and category characteristic curves, and ability estimates of participants in IRT. The relative strengths of IRT lie in the test information function, the latent trait assumption, parameter invariance, and test equating. Regarding test design and educational implications, we suggest revising the presentation of the 2 distractors and, in light of 1 particularly difficult item, strengthening the connection to number sense education.
Keywords: classical test theory (CTT), item response theory (IRT), PCM, TIMSS DOI: 10.3966/199679772014123102003 Section editor: Shwu-Ching Young Received: December 31, 2013; Modified: March 25, 2014; Accepted: April 10, 2014 * Hsu-Lin Su, Doctoral student, Department of Educational Psychology and Counseling, National Taiwan Normal University, E-mail: aguri.su@gmail.com ** Po-Hsi Chen, Associate Professor, Department of Educational Psychology and Counseling, National Taiwan Normal University 102