Quality Assurance
What is Inter-Rater Reliability?
Inter-rater reliability measures how closely different evaluators agree when they score the same interactions. High reliability means a score reflects the call itself, not who happened to review it.
Low inter-rater reliability is a red flag: it means an agent’s score depends partly on which evaluator happened to review the call, which erodes trust in the scores and makes coaching and analytics unreliable.
Tracking and improving inter-rater reliability, through calibration and drift analytics, is one of the most important and most overlooked levers in a mature QA programme.