Item Response Theory
March 28, 2024 - 4 min read
The probability of a correct answer should determine the test score.
Have you ever wondered how the difficulty of test questions is determined? Whether it's an IQ test or a personality test, we generally assume that the total score at the end indicates the ability of the test taker. That assumption only holds if every question is equally difficult.
In reality, questions differ from one another, and those differences affect how much each correct answer tells us about the test taker. This article introduces Item Response Theory (IRT), the concept behind world-class tests such as TOEFL, TOEIC, and SAT.
What is Item Response Theory (IRT)?
IRT is a family of statistical models used to design, analyze, and score tests. Its purpose is to determine how well each question, and the test as a whole, measures what it is intended to measure.
How does IRT work?
IRT works by modeling the relationship between the "ability" of the test taker and the "probability" of answering each question correctly. This relationship is usually represented by an S-shaped "logistic curve" as shown in the image below, with ability on the horizontal axis and the probability of a correct answer on the vertical axis.
The characteristics of the question itself determine the shape of the curve in the image. There are three main characteristics:
1. Item difficulty: This is the point on the horizontal axis where the curve rises, that is, the ability level at which people start to answer the question correctly. In the basic logistic model without guessing, it is the ability at which a test taker has a 50% chance of answering correctly. The further to the right this point lies, the more ability the question demands. For example, if the graph starts to curve upwards when the horizontal axis value is -1, that question is easier than a question that starts to curve upwards when the horizontal axis value is 0.25.
2. Item discrimination: This is how steeply the curve rises around the difficulty point, or how sharply the question separates test takers just below that ability level from those just above it. If the probability of answering correctly rises by 70 percentage points when the horizontal axis changes by 1 unit, the question has higher discrimination power than a question where the probability rises by only 40 percentage points over the same interval.
3. Guessing parameter: This is the probability that a person answers correctly even when their ability is very low (i.e., very far to the left on the horizontal axis). It sets the floor of the curve. If the value is high, the question is easy to guess. If the value is low, there is little chance of answering correctly at random. For example, blind guessing on a four-option multiple-choice question succeeds about 25% of the time.
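The three characteristics above map onto the parameters of the three-parameter logistic (3PL) model, one common form of IRT: P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b))). The Python sketch below assumes that model. The function name prob_correct and the parameter names theta (ability), a (discrimination), b (difficulty), and c (guessing) are illustrative choices, not something the article specifies.

```python
import math

def prob_correct(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct answer under the 3PL model (assumed here).

    theta: ability of the test taker (horizontal axis of the curve)
    a: item discrimination (steepness of the curve)
    b: item difficulty (where the curve rises)
    c: guessing parameter (floor of the curve)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Two hypothetical items using the difficulty values from the example above.
easy_item = dict(a=1.0, b=-1.0, c=0.0)
hard_item = dict(a=1.0, b=0.25, c=0.0)

for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p_easy = prob_correct(theta, **easy_item)
    p_hard = prob_correct(theta, **hard_item)
    print(f"ability {theta:+.1f}: easy {p_easy:.2f}, hard {p_hard:.2f}")
```

At every ability level the easy item gives a higher probability than the hard one. Raising a makes the curve steeper around b, and raising c lifts the floor that very low-ability test takers still reach.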
Why is IRT important?
IRT is important because it helps make test scores fairer and more accurate. Because it accounts for the difficulty of each question, it can estimate a test taker's ability more precisely than a raw count of correct answers.
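To make this concrete, here is a minimal Python sketch of ability estimation. It assumes the logistic model with made-up item parameters and finds the ability value that makes the observed answers most likely, using a simple grid search (real scoring software typically uses more refined estimators). The names estimate_ability, easy_form, and hard_form are hypothetical.

```python
import math

def prob_correct(theta, a, b, c=0.0):
    """Logistic item response function (3PL form; c=0 means no guessing)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-probability of the observed right/wrong pattern at ability theta."""
    total = 0.0
    for (a, b, c), correct in zip(items, responses):
        p = prob_correct(theta, a, b, c)
        total += math.log(p if correct else 1.0 - p)
    return total

def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum likelihood estimate of ability on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical test forms: each item is (discrimination, difficulty, guessing).
easy_form = [(1.0, b, 0.0) for b in (-2.0, -1.5, -1.0, -0.5)]
hard_form = [(1.0, b, 0.0) for b in (0.5, 1.0, 1.5, 2.0)]

# Both test takers answer 2 of 4 questions correctly.
responses = [1, 1, 0, 0]
print("easy form:", round(estimate_ability(easy_form, responses), 2))
print("hard form:", round(estimate_ability(hard_form, responses), 2))
```

Both test takers have the same raw score, yet the estimate for the hard form comes out near 1.25 and the one for the easy form near -1.25, because the model knows which questions each person faced.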
How is IRT used?
IRT is used in a variety of settings, including:
Educational testing: IRT is used to score tests such as the SAT, TOEFL, and TOEIC.
Psychological testing: IRT is used to score instruments such as IQ tests and personality tests.
Medical testing: IRT is used to score instruments such as diagnostic tests and patient-reported outcome measures.
Conclusion
IRT is a powerful tool for improving the quality of tests. Because it models how each question behaves instead of simply counting correct answers, it gives a more accurate measure of a test taker's ability. This can lead to fairer and more accurate decisions about things like admissions, hiring, and treatment.