On the Reliability of Oral Production Evaluation by Monitoring
Abstract
Monitoring, as an evaluative technique, has provided a practical solution for wide-scale, routine testing of oral production. However, it still leaves us with a seemingly insoluble problem: the low reliability of evaluations, which results from the different conditions under which each student is monitored. The present investigation attempted to confirm experimentally the effect of changes in monitoring conditions on the reliability of the grades obtained. Fifty-eight or fewer students taking the elementary course in English conversation at Chuo University served as experimental subjects. The investigation covered a period of eleven weeks of regular taped lessons in the language laboratory, where each subject was monitored during actual drill practice and immediately graded on a five-point scale (absolute evaluation). The following drills were selected as items for the statistical analysis: (1) repetition, or imitation of model utterances, for which five grades were obtained for each subject, and (2) question-answering (random questions), for which three grades were recorded during the practice sessions. The former was judged from the broader viewpoint of all features of good articulation, while for the latter, more complicated drill, three criteria were used for grading during monitoring: immediacy of response, appropriateness of response, and pronunciation. Twenty of the questions used in the laboratory work were also asked individually in an interview-type test conducted in the latter part of the course. The correlation coefficients of the ten pairs for repetition, as shown in Table 3, ranged widely from 0.201 to 0.719, and half of them did not reach the 0.05 level of significance (0.4683).
With respect to this table, it should be noted that a significant correlation was found between R_2 and R_3 but not between R_4 and R_5, despite the fact that in each pair the same model utterances were given twice as a whole. The monitoring results for the question-answering drill, as can be seen in Table 5, indicated that the correlation between Q-A_1 and Q-A_2 did not reach the 0.1 level of significance (0.4409) for the criterion of immediacy, while for the other criteria, pronunciation and appropriateness, it was significant at the 0.05 level (0.5139). Between Q-A_1 and Q-A_3, whose practice questions had the least in common, a clearer correlation was found at the 0.05 level for all the criteria. This contrasts sharply with the relationship between Q-A_2 and Q-A_3, where the correlation did not reach the 0.05 level of significance for immediacy and pronunciation, even though about two-thirds of the questions asked in each were common to both. It is apparent from these findings that instability in conditions at the moment of monitoring individuals produces considerable variability in grading, which reduces the reliability of the results. In the interview-type test, as previously indicated, all the subjects were observed under the same conditions, or as nearly the same as possible, and the grade obtained may be considered a standard for the question-answering skill. The relationship between the interview grade and the average of the three grades monitored in the laboratory was then examined by means of correlation coefficients, which yielded the following figures: 0.665 for immediacy, 0.808 for appropriateness, and 0.878 for pronunciation, all significant at the 0.01 level.
This last result implies that accumulating monitoring grades over a period of time is an effective way of obtaining a reliable evaluation of oral proficiency, no matter how low the reliability of each individual grade. However, considering the relatively small number of subjects and the fluctuation in grading by the monitor or interviewer, the reliability of these results is questionable; at any rate, they do not justify firm conclusions about the reliability of monitoring as a means of evaluating performance or about the effect of the cumulative process. The results bear further testing.
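
The comparison described in the abstract, correlating each subject's interview grade with the average of that subject's three monitored grades, might be sketched as follows. This is only an illustration under stated assumptions: the abstract says "correlation coefficient" without naming the formula, so the Pearson product-moment coefficient is assumed here; the function names (`pearson_r`, `mean_grades`, `is_significant`) are hypothetical; and the grade lists are invented five-point grades, not the study's data. The critical value is passed in by the caller because it depends on the number of subjects in each comparison.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


def mean_grades(sessions):
    """Average each subject's grades across sessions.

    `sessions` is a list of per-session grade lists, all in the same
    subject order.
    """
    return [sum(grades) / len(grades) for grades in zip(*sessions)]


def is_significant(r, critical_r):
    """Compare |r| with a tabulated critical value supplied by the caller."""
    return abs(r) >= critical_r


if __name__ == "__main__":
    # Hypothetical five-point grades for six subjects (NOT the study's data).
    qa1 = [3, 2, 4, 5, 1, 3]
    qa2 = [2, 3, 4, 4, 2, 3]
    qa3 = [3, 3, 5, 4, 1, 2]
    interview = [3, 2, 5, 4, 1, 3]

    averaged = mean_grades([qa1, qa2, qa3])
    print("single session vs interview:", round(pearson_r(qa2, interview), 3))
    print("averaged grades vs interview:", round(pearson_r(averaged, interview), 3))

    # Thresholds quoted in the abstract, applied to coefficients it reports.
    print(is_significant(0.201, 0.4683))
    print(is_significant(0.719, 0.4683))
```

Passing the threshold explicitly keeps the sketch independent of any statistics library; a tabulated critical value for the actual sample size would be needed before drawing conclusions from real grades.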
- Paper from 外国語教育メディア学会 (Japan Association for Language Education and Technology)
- 1974-03-31