The psychometric development and review of an evaluation system for wind band performance using Rasch measurement theory
Edwards, Andrew Scott
The purpose of these studies, presented here in three manuscripts, was to develop a valid and reliable rubric for the evaluation of large ensemble wind band performances. The three manuscripts seek to demonstrate the advantages of using rubrics for performance evaluations, develop a valid and reliable rubric using Rasch Measurement Theory, and test the newly developed rubric in a real-world performance evaluation. The guiding questions for the rubric development were: (a) What are the psychometric qualities (i.e., reliability and validity) of the scale developed to assess wind band ensemble performance at the high school level? (b) How do the items fit the model and vary in difficulty? (c) How does the structure of the rating scale vary across individual items? (d) How can the rating scale be transferred into an informative rubric? The primary data analysis tool was the Multifaceted Rasch Partial Credit Measurement Model. Music content experts (N = 20) were solicited to evaluate forty wind band performances, with each evaluator listening to four performances. A four-category Likert-type rating scale was used to evaluate each recorded performance. Results indicated good model-data fit and yielded a final rubric containing 24 items, each with two to four performance categories. Implications for classroom teaching and consequential validity are discussed.
The validated rubric was then used in a live-action pilot test. The guiding questions for this study were: (a) What are the psychometric qualities (i.e., reliability and validity) of the evaluation tools used to assess wind performance at the high school level? (b) How do the items fit their respective models and vary in difficulty? (c) How does the condition A evaluation tool compare to the condition B evaluation tool, with special attention to ranking and differentiation of ensembles?
To address these questions, three evaluators at one state’s ensemble performance evaluation used the condition A evaluation tool, which was analyzed using the principles of classical test theory. Their ratings were compared with those of three different evaluators who used the condition B tool, which was analyzed using the principles of the Rasch Measurement Model.