124 points edent | 5 comments
abigail95 ◴[] No.42726784[source]
Something is missing here, why do batch jobs take 13 hours? If this thing was started on an old mainframe why isn't the downtime just 5 minutes at 3:39 AM?

Exactly how much data is getting processed?

Edit: Why does rebuilding take a decade or more? This is not a complex system. It doesn't need to solve any novel engineering challenges to operate efficiently. Article does not give much insight into why this particular task couldn't be fixed in 3 months.

replies(6): >>42727086 #>>42727097 #>>42727182 #>>42727884 #>>42730222 #>>42732143 #
ajnin ◴[] No.42727182[source]
The batch jobs don't take 13 hours. They're just scheduled to run at night, when the old offices used to be closed and the jobs could be run with some expectations regarding data stability over the period. There are probably many jobs scheduled to run at 1AM, then 2AM, etc., each depending on the previous one having finished, so large gaps are left between them to ensure that a job does not start before the previous one is done.
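To illustrate the scheduling point, here is a minimal sketch comparing fixed clock slots with jobs chained directly on completion. The job names, run times, and one-hour slot are made-up assumptions for illustration, not details of the DVLA's actual batch setup.

```python
from datetime import datetime, timedelta

# Hypothetical nightly jobs with run times in minutes (illustrative values only).
JOBS = [("extract", 12), ("validate", 25), ("post_updates", 8), ("reports", 15)]


def fixed_slot_finish(start, jobs, slot=timedelta(hours=1)):
    """Finish time when job i always starts at start + i * slot."""
    finish = start
    for i, (name, minutes) in enumerate(jobs):
        duration = timedelta(minutes=minutes)
        if duration > slot:
            # The failure mode the padding exists to avoid.
            raise ValueError(f"{name} would overrun its slot")
        finish = start + i * slot + duration
    return finish


def chained_finish(start, jobs):
    """Finish time when each job starts as soon as the previous one ends."""
    finish = start
    for _, minutes in jobs:
        finish += timedelta(minutes=minutes)
    return finish


start = datetime(2025, 1, 1, 1, 0)  # 1AM
fixed_window = fixed_slot_finish(start, JOBS) - start
chained_window = chained_finish(start, JOBS) - start
print(fixed_window, chained_window)
```

With these made-up numbers the same hour of actual work occupies a window of over three hours under fixed slots, which is how a short workload can look like a long outage.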

As to your "not a complex system" remark: when a system has been built up over 60 years, piling up new rules to implement new legislation and needs over time, you tend to end up with a tangled mess of interdependent services that are very difficult to replace piecewise with a shiny new, architecturally pure one. This is closer to a distributed monolith than a microservices architecture. In my experience you can't rebuild such a thing "in 3 months". People who believe that are those who don't realize the complexity and the extraordinary number of specifics and special cases that are baked into the system; any attempt to just rebuild from scratch in a few months hits that wall and ends up taking years.

replies(3): >>42727376 #>>42729588 #>>42731073 #
abigail95 ◴[] No.42727376[source]
The code will be spaghettified and hideous. The queries will be nonsense.

That doesn't change the fact that the ultimate goal of the system is to manage driver's licenses.

> In my experience you can't rebuild such a thing "in 3 months".

Me and my team rebuilt the core stack for the central bank of a developing country. In 3 months. The tech started in the 70s just like this. Think bigger.

replies(4): >>42729887 #>>42732027 #>>42733184 #>>42735854 #
mootothemax ◴[] No.42735854[source]
> Think bigger.

One of the easier parts of this involves addressing, which in the UK is notoriously easy, reliable, and easy to process - especially the best-in-class Ordnance Survey stuff like AddressBase Premium, right?

A quick trawl of Github will shed some light on it - especially how much of a pain it is to get ABP into a usable state - and this is data that's core and integral to the service, the "are you a real user, a typo, a fraudster, a data supply issue, or getting things wrong in good or bad faith?" kind of business logic.

And it's doubly hard, because the government requires people to update their license when they change address - which often enough involves a new-build property, where the address (let alone UPRN - sometimes even the USRN!) is completely new to you.

Thinking bigger: imagine sitting at your desk during the first couple of weeks on the job, database validation checks running merrily in the background while you're staring at a screen. There's a mild frown forming on your face. You'd been scrolling over a list of rejected records in front of you, largely looking good - _how did they miss THAT fraud_ you'd briefly chuckled to yourself - but _this_ one...

It's a valid business entity, trading from the valid address, and you've hand-checked both _and_ got a junior who lives nearby to send you a photo of it, and, well, the wit running the business has decided to trade under the name _FUCKOFFEE_, and... that's... just going to have to be someone else's problem, you shrug.

(to be clear: the hard part of the DVLA project is _not_ implementing the coding, database, and systems design work)

replies(1): >>42736404 #
robertlagrant ◴[] No.42736404[source]
You've sort of identified how to do it: break it up into problems.

Addresses are hard? Use https://postcodes.io or make your own - that's a project in its own right.

Separating out trading names from registered names needs to be an API from Companies House, or an internal service that API-ifies Companies House data.

Fraud detection? That needs to sit somewhere - let's break out all the fraud detection into a separate system that can talk to the other systems, and have it running continuously over the data. It'll need people to update fraud queries and also to make sure the other systems' data stays integrated with it.

Finally you need something on top that orchestrates the services and exposes them via a gov.uk website, and copes with things like "I don't have my address yet; can I use What3Words instead?" and another one with a UI and lots of RBAC and approvals for DVLA users to do lookups and internal admin.

replies(1): >>42737833 #
1. mootothemax ◴[] No.42737833[source]
Heh, you’ve fallen into the exact trap I was trying to expose, which is why I chose addresses as an illustration point :)

The first step with anything address-y is to try and nail down exactly what an address is in the project context. Quick example - property shells: a building at 1-2 Street Name that contains a bunch of flats, but doesn’t itself have residents or its own postal delivery point. They’re mega useful for an address autocomplete (sadly, the vast majority of geocoders are trash for the UK’s addresses), but are they something people should be able to use (without a flat number etc.) for their driving license? Probably not. Commercial venues? Maybe, what about pubs? Ok, so dual-use maybe, but man this stuff gets painful in a hurry.

Next up - historic addresses and how you’re going to link ‘em all together. It’s nasty, edge-case-strewn work - and for the most part, unavoidably so. It’s why people get their backs up when someone dismisses it out of hand, cos if they had worked with it in the past, they’d qualify anything they wrote with: * presuming a well-formed address source + pipeline.

Edit: for what it’s worth, Companies House only lists corporate entities and partnerships as defined in whatever act of parliament. Self-employed etc. can call themselves whatever - and do! - and the only record of it can be as vague as a nondescript line from the VOA.

replies(1): >>42740585 #
2. robertlagrant ◴[] No.42740585[source]
I like this trap. Why would you need historic addresses for this service? In my mind the main reason for the DVLA knowing your address at all is so they know where to post a fine and a new driving licence/car documentation to. Why do they need historic addresses in their core system?
replies(3): >>42742142 #>>42742706 #>>42745131 #
3. mootothemax ◴[] No.42742142[source]
I’m happy you’re taking it in the spirit intended :) it’s a trap I frankly despise but that’s cos I’m old and bitter.

The problem being addressed - if you’ll forgive the pun - is that you’re not storing someone’s current address; what you have is their _most recently known to us_ address, which obvs over time can become a problem, not least if you’re wasting time and money sending undeliverable post. (I have a vague memory of Royal Mail fining bulk delivery users for not pre-screening, not sure if that was a particularly dull dream or not tho).
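A minimal sketch of that distinction, assuming a hypothetical record shape and an arbitrary three-year staleness threshold (neither comes from any real DVLA system): the stored date is when the address was last confirmed to us, not when the person moved.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class KnownAddress:
    """An address as last reported to us, not a claim about where someone lives now."""
    text: str
    last_confirmed: date  # when we last heard this from the holder


def needs_recheck(addr, today, max_age=timedelta(days=3 * 365)):
    """True when the address is old enough that post may be undeliverable.

    max_age is an arbitrary illustrative threshold.
    """
    return today - addr.last_confirmed > max_age


stale = KnownAddress("11 Acacia Avenue", date(2015, 6, 1))
fresh = KnownAddress("11 Acacia Avenue", date(2024, 6, 1))
today = date(2025, 1, 1)
print(needs_recheck(stale, today), needs_recheck(fresh, today))
```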

The thing it’s important to keep in mind is - there is no single nor centrally-held repository of addresses within the UK. I don’t mean “oh, Mr So-and-so lives at 11 Acacia Avenue”. I mean for just the addresses themselves.

Throw in the mad mixture of Scotland having a separate national statistics agency that’s independent of the ONS, plus Northern Ireland having the same -plus- a separate OS in the form of OSNI, the whole landscape’s set up for pain and failure.

4. multjoy ◴[] No.42742706[source]
Now you've got two addresses to handle - the vehicle's keeper and the licence holder.

Further, the DVLA isn't sending correspondence relating to criminal matters. That's coming from the police, who use the Police National Computer, into which the driver and vehicle files are fed along with data from the Motor Insurers' Bureau.

5. daveoc64 ◴[] No.42745131[source]
>Why would you need historic addresses for this service?

The police (and other authorities like councils) who issue penalties need to know who was the registered keeper of a vehicle on the date of an alleged offence.

That's where the DVLA's Keeper At Date Of Event (KADOE) system comes in.

It's currently being transitioned to a modern API:

https://developer-portal.driver-vehicle-licensing.api.gov.uk...
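The keeper-at-date question above is essentially a point-in-time lookup over a history of keeper changes. Here is a rough sketch of that idea; the data shape, keeper ids, and `keeper_at` function are hypothetical and are not the real KADOE interface.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical keeper history for one vehicle, sorted by the date each
# keeper took over: (start_date, keeper_id).
HISTORY = [
    (date(2018, 3, 1), "keeper-A"),
    (date(2021, 7, 15), "keeper-B"),
    (date(2024, 2, 2), "keeper-C"),
]


def keeper_at(history, event_date):
    """Return the keeper whose tenure covers event_date, or None if it predates the records."""
    starts = [start for start, _ in history]
    # Index of the first entry starting strictly after the event date.
    i = bisect_right(starts, event_date)
    return history[i - 1][1] if i else None


print(keeper_at(HISTORY, date(2022, 1, 1)))
```

This only works if old keeper records are retained rather than overwritten, which is why a "current address only" design is not enough for this use.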