Most people will have seen headlines about hospitals being rated "inadequate" by the Care Quality Commission after an inspection and put into "special measures". This often results in changes of senior leadership, and even in takeovers by other, better-performing hospitals. But how do we know that these very important judgements and ratings are right? Our research into the inspection process shows that while teams of inspectors may often agree in their assessments, there can also be uncertainty and disagreement. The good news is that there are things regulators can do to improve reliability and maintain the confidence of service providers and the public.
Reliability is important
Inspections of services are a key part of most systems of regulation. They provide independent assurance that services are safe and meet basic quality standards. Inspection reports and published ratings can also help service users to make informed choices about which service to use. Inspections are a big deal for service providers. A poor rating can result in a service being closed down, or whole teams of senior managers being replaced. It can adversely affect staff morale and recruitment. And if inspections fail to identify poor practice, service users will suffer.
With such high stakes, it is crucial that the assessments inspectors make are reliable. Moreover, if stakeholders do not trust assessments of performance, the credibility of the whole system of regulation could be undermined. The Care Quality Commission (CQC), which regulates health and social care services in England, has radically changed its inspection regime in recent years, after being criticised in the wake of the Francis Inquiry into failings in care at Mid Staffordshire NHS Foundation Trust between 2005 and 2009. CQC commissioned researchers from Alliance Manchester Business School and The King’s Fund to evaluate its new regulatory model for acute hospitals.
How we assessed reliability
As part of the evaluation we investigated the extent to which different inspectors might agree in their assessments of services. We provided short descriptions of aspects of service provision to inspection team members who had inspected hospitals during late 2013 and early 2014, when CQC was piloting its new approach. The descriptions included:
“Managers are developing a plan to address bullying following concerns reported in the national annual staff survey”
“Complementary therapies are available to patients nearing the end of life to aid relaxation and symptom control”
“40% of staff are not up to date with their mandatory training”
We asked inspection team members to use the CQC’s rating system to say what ratings these descriptions indicated.
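A standard way to quantify agreement on a task like this is a chance-corrected statistic such as Fleiss’ kappa, which compares how often raters actually agree with how often they would agree by luck alone. The sketch below is a minimal Python illustration with invented ratings (not the study’s data), assuming five inspectors each rate three vignettes on CQC’s four-point scale; the study’s own analysis may have used different measures.

```python
from collections import Counter

# Hypothetical illustration only: five inspectors rate three vignettes on
# CQC's four-point scale. These numbers are invented, not the study's data.
CATEGORIES = ["inadequate", "requires improvement", "good", "outstanding"]
ratings = [
    ["good", "good", "outstanding", "good", "good"],
    ["requires improvement", "inadequate", "requires improvement",
     "requires improvement", "inadequate"],
    ["good", "requires improvement", "good", "requires improvement", "good"],
]

def fleiss_kappa(ratings, categories):
    """Chance-corrected agreement among a fixed number of raters who each
    assign every subject to one category (Fleiss' kappa)."""
    n_subjects, n_raters = len(ratings), len(ratings[0])
    # n_ij: how many raters put vignette i into category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]
    # Observed agreement: mean per-vignette proportion of agreeing rater pairs
    p_bar = sum((sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                for row in counts) / n_subjects
    # Expected chance agreement, from the marginal category proportions
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

print(f"Fleiss' kappa across vignettes: {fleiss_kappa(ratings, CATEGORIES):.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance; in practice one would use a vetted implementation such as statsmodels.stats.inter_rater.fleiss_kappa rather than rolling one’s own.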
Agreements and disagreements among inspectors
We found that while inspectors largely agreed about the ratings of some descriptions, there were substantial disagreements about others. However, when inspectors discuss ratings together, as is typical in CQC inspections, agreement should rise to quite high levels (the simulation sketch below illustrates one reason why). When we observed actual inspection teams at work, they did typically reach consensus on ratings. As one inspector said:
“[Rating] actually worked. People did agree, with one exception across risk, but that was across all 40. So 1 out of 40 they disagreed with. Was actually amazing.”
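Why should discussing ratings together raise agreement? One mechanism is simply that pooling independent judgements averages out individual error. The Monte Carlo sketch below is a crude Python illustration in which every parameter is an invented assumption and team consensus is modelled as a bare majority vote rather than a real discussion; even so, two teams of five agree with each other far more often than two lone inspectors do.

```python
import random
from collections import Counter

random.seed(1)
N_CATEGORIES = 4   # CQC's four rating levels
P_CORRECT = 0.6    # assumed chance a lone inspector picks the "true" rating
TEAM_SIZE = 5      # assumed team size; real CQC teams vary
TRIALS = 20_000

def lone_rating(true_rating):
    """One inspector: right with probability P_CORRECT, else a random other level."""
    if random.random() < P_CORRECT:
        return true_rating
    return random.choice([c for c in range(N_CATEGORIES) if c != true_rating])

def team_rating(true_rating):
    """Team consensus, crudely modelled as the modal individual rating."""
    votes = [lone_rating(true_rating) for _ in range(TEAM_SIZE)]
    return Counter(votes).most_common(1)[0][0]

true = 2  # say, "good"
solo = sum(lone_rating(true) == lone_rating(true) for _ in range(TRIALS)) / TRIALS
team = sum(team_rating(true) == team_rating(true) for _ in range(TRIALS)) / TRIALS
print(f"two lone inspectors agree:   {solo:.2f}")
print(f"two five-person teams agree: {team:.2f}")
```

This only illustrates the statistical benefit of aggregating independent views; real consensus discussions are richer than a vote, and can also introduce biases of their own, which is why facilitation matters (see below).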
CQC inspectors produce separate ratings for different aspects of services, termed “domains”: safety, effectiveness, caring, responsiveness and leadership. Inspection team members found it harder to decide which domain a description related to than to rate it. Overall, levels of agreement about domains were not high, even for teams of inspectors. This corresponded with our observations of inspection teams in action – we saw many examples of inspectors being uncertain about which domain was relevant. This could have implications for ratings, as one inspector explained:
“[The ratings] did change quite dramatically when we finally pulled the report together. But the reason why it changed was that debate over which domain does this fit in … Is this safety? Is this responsive? Is this caring? And that’s a greying area”
Making ratings more reliable
Clearly and fully defining rating and domain categories can increase reliability, as can training inspectors in making judgements. Training is particularly likely to be valuable when the categories are highly subjective, as is the case for those used by CQC.
Giving inspectors more time to discuss their judgements should also increase reliability, particularly if there is appropriate training, guidance and facilitation that minimises bias in decision making and enables different perspectives to be heard and taken account of.
Reliability could potentially also be increased by merging some of the domains. This would simplify the task inspectors face and might reduce the resources needed for inspection, which is also an important consideration.
Alan Boyd and Kieran Walshe
Our full research article on the reliability of judgements made by CQC inspectors is freely available and can be downloaded at https://journals.sagepub.com/doi/10.1177/1355819616669736