Item analysis to improve reliability for an internal medicine undergraduate OSCE. Adv Health Sci Educ Theory Pract. 2005; 10(2):105-13.
Using objective structured clinical examinations (OSCEs) for the final assessment of medical students in internal medicine requires a representative sample of OSCE stations. The reliability and generalizability of OSCE scores provide validity evidence for those scores and support their contribution to the final clinical grade of medical students. The objective of this study was to perform item analysis using OSCE stations as the unit of analysis and to evaluate the extent to which OSCE score reliability can be improved using item analysis data. OSCE scores from eight cohorts of fourth-year medical students (n = 435) in a 6-year undergraduate program were analyzed. Generalizability (G) coefficients of OSCE scores were computed for each cohort. Item analysis was performed by treating each OSCE station as an item and computing the corrected item-total correlation. OSCE stations that negatively impacted reliability were deleted and the G-coefficient was recalculated. The G-coefficients of OSCE scores from the eight cohorts ranged from 0.48 to 0.80 (median 0.62). The median number of OSCE stations that negatively impacted the G-coefficient was 3.5 (out of a median of 25 total stations). When these "problem stations" were deleted, the median G-coefficient across the eight cohorts increased from 0.62 to 0.72. In conclusion, item analysis of OSCE stations is useful and should be performed to improve the reliability of total OSCE scores. Problem stations can then be identified and improved.
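The station-level item analysis described above can be illustrated with a minimal sketch. It makes several assumptions that are not from the study: the score matrix is invented toy data, the flagging threshold of 0.0 is hypothetical, and Cronbach's alpha stands in for the G-coefficient (the two coincide for the relative coefficient in a single-facet students-by-stations crossed design, but the abstract does not state the study's actual G-study design). All function names are illustrative.

```python
from statistics import mean, variance

def cronbach_alpha(scores):
    """Alpha for a students x stations matrix (list of rows).

    Assumption: used here as a stand-in for the relative G-coefficient
    of a one-facet crossed design; the study's design may differ.
    """
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(scores):
    """Correlate each station with the total of the *other* stations."""
    k = len(scores[0])
    out = []
    for j in range(k):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]  # total minus this station
        out.append(pearson(item, rest))
    return out

def drop_stations(scores, drop):
    """Return the matrix without the station columns listed in `drop`."""
    return [[v for j, v in enumerate(row) if j not in drop] for row in scores]

# Toy data (invented): 6 students x 4 stations; the last station runs
# against the others, so it should be flagged as a "problem station".
scores = [
    [8, 7, 9, 3],
    [6, 6, 7, 5],
    [9, 8, 9, 2],
    [4, 5, 5, 8],
    [7, 7, 8, 4],
    [5, 4, 6, 7],
]

THRESHOLD = 0.0  # hypothetical cut-off, not taken from the study

correlations = corrected_item_total(scores)
flagged = [j for j, r in enumerate(correlations) if r < THRESHOLD]

alpha_before = cronbach_alpha(scores)
alpha_after = cronbach_alpha(drop_stations(scores, flagged))

print("corrected item-total correlations:", correlations)
print("flagged stations:", flagged)
print("alpha before:", alpha_before, "alpha after:", alpha_after)
```

With real data, each cohort's students-by-stations matrix would replace `scores`, and the before and after coefficients would be compared per cohort, as in the study's median comparison.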