The MoT Files: The story behind the data
Our MoT data story picks up where it left off, but it hasn’t been without its problems
Indeed, those with good memories will recall that the previous HonestJohn.co.uk MoT Files were published back in 2014, so why has it taken us so long to update the data?
The Vehicle and Operator Services Agency (VOSA), the body responsible for MoTs in the UK, was merged with the Driving Standards Agency (DSA) in 2014 in a bid to improve efficiency. However, while the newly created DVSA (Driver and Vehicle Standards Agency) was praised by those in power for its money-saving endeavours, it also ended public access to MoT data.
At the time, the DVSA remained somewhat coy about why it had ended public access to MoT data. In fact, in 2014, it claimed to be fully committed to the Government’s OpenData policy and rebuffed criticism by saying the data was simply delayed due to IT issues. However, as the months rolled by and the deadlines passed, the excuses began to dry up. And then nothing.
After two years of silence, HonestJohn.co.uk submitted a Freedom of Information (FOI) request to ask why the DVSA wasn't adhering to the Government’s OpenData policy. In response the DVSA refused to release the data, claiming that it was “no longer able” to provide public access due to changes in the way it recorded information.
HonestJohn.co.uk rejected the DVSA's response and demanded a full departmental review, as per the terms and conditions of the FOI request. After all, the publication of anonymised MoT tests and results data is important and in the public interest. Why should it be rejected over an IT issue?
Reluctantly, after many weeks of delays, the DVSA finally relented and provided HonestJohn.co.uk with exclusive access to 400m MoT test results.
As in 2014, we have had a few problems dealing with the 47GB of MoT data provided by the Government. Firstly, it was huge and difficult to work with. Secondly, as it's sourced from thousands of technicians, it was littered with errors. There were plenty of cars registered in the 1800s and a few steam-powered Renault Clios to boot. We've done our best to ensure it's as clean as possible, but with such a huge data set, there may still be the odd error.
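To give a flavour of the sanity checks that catch this kind of error, here is a minimal sketch. It is not the actual process used for the MoT Files: the field names (first_use_date, fuel_type), the 1900 cut-off and the rule that a steam-powered car first used after 1950 is a data-entry slip are all assumptions made for illustration.

```python
from datetime import date

# Assumed thresholds, chosen only for illustration.
EARLIEST_PLAUSIBLE_YEAR = 1900   # anything earlier is treated as a typo
LAST_PLAUSIBLE_STEAM_YEAR = 1950  # "steam" on a newer car is treated as a slip


def is_plausible(record, today=None):
    """Return True if a test record passes some basic sanity checks.

    `record` is a dict with hypothetical keys 'first_use_date' (a date)
    and 'fuel_type' (a string); the real column names may differ.
    """
    today = today or date.today()
    first_use = record.get("first_use_date")
    if first_use is None:
        return False
    # Reject cars apparently registered in the 1800s, or in the future.
    if first_use.year < EARLIEST_PLAUSIBLE_YEAR or first_use > today:
        return False
    # Reject modern cars recorded as steam-powered.
    fuel = (record.get("fuel_type") or "").strip().lower()
    if fuel == "steam" and first_use.year > LAST_PLAUSIBLE_STEAM_YEAR:
        return False
    return True


records = [
    {"make": "RENAULT", "model": "CLIO", "fuel_type": "Steam",
     "first_use_date": date(2005, 3, 1)},
    {"make": "FORD", "model": "FOCUS", "fuel_type": "Petrol",
     "first_use_date": date(1899, 1, 1)},
    {"make": "BMW", "model": "320D", "fuel_type": "Diesel",
     "first_use_date": date(2010, 6, 1)},
]

clean = [r for r in records if is_plausible(r, today=date(2016, 1, 1))]
```

With the three sample records above, only the BMW survives. Rules like these are deliberately blunt, which is one reason the odd error can still slip through a data set of this size.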
Then there was the structure. One person's BMW 320 is another's BMW 3 Series - if we went with the DVSA data as it was, we would have ended up with a huge number of separate BMW 3 Series models (and 1, 5, 6 and 7 Series for that matter) - amongst others. Plus, there were the complications of generations, bodystyles and other variants. We've done the best we can with the data we have to classify these models in a sensible, useful way.
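The BMW example can be sketched in a few lines. Again, this is an illustration under stated assumptions rather than the classification actually used: the function name normalise_model and the two patterns are hypothetical, and a real mapping would need many more rules to cope with generations, bodystyles and other makes.

```python
import re

# Hypothetical patterns for BMW's numbered saloon ranges.
# "320", "320D", "520I SE" -> leading digit is the series number.
BMW_NUMERIC = re.compile(r"^([1-8])\d{2}[A-Z]*\b")
# "3 SERIES", "3SERIES" -> already recorded as a series.
BMW_SERIES = re.compile(r"^([1-8])\s*SERIES\b")


def normalise_model(make, model):
    """Collapse free-text model names into one label per model range."""
    make = make.strip().upper()
    # Upper-case and squeeze repeated whitespace before matching.
    model = " ".join(model.strip().upper().split())
    if make == "BMW":
        match = BMW_SERIES.match(model) or BMW_NUMERIC.match(model)
        if match:
            return f"{match.group(1)} Series"
    # Anything unrecognised is just tidied up, not reclassified.
    return model.title()


labels = [
    normalise_model("BMW", "320"),
    normalise_model("bmw", "320d SE"),
    normalise_model("BMW", "3 series"),
    normalise_model("BMW", "520I"),
    normalise_model("BMW", "X5"),
    normalise_model("Ford", "focus"),
]
```

Here the first three entries all end up under the same "3 Series" label, while models the patterns don't recognise, such as the X5, are left under their own names.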