669 points danso | 49 comments
1. _bxg1 ◴[] No.23260967[source]
This is the latest in a string of incidents where critical software systems, facing new pressure due to the pandemic, are catastrophically failing their users. I think what's happened in the past is that most public-facing software systems either a) were not really critical (because people had the alternative of doing things in-person), or b) (as in the case of all the ancient COBOL systems underpinning the US gov) had been made reliable over the years through sheer brute force as opposed to principled engineering. But in the latter case, as we saw with New Jersey's unemployment system, that "reliability" was fragile and contingent on the current state of affairs, and had no hope of withstanding a sudden shift in usage patterns.

Now we have various organizations - governmental and otherwise - hastily setting up online versions of essential services and it seems like every single one of them breaks on arrival.

We need some sort of standard for software engineering quality. I don't think this is an academic question anymore. Real people's lives are being impacted every day now by shoddy software, and with the current crisis they often have no alternative. Software that you or I could probably have executed better, but that the people who were hired to do it either a) couldn't, or b) didn't bother. It's nearly impossible for non-technical decision makers in these orgs to evaluate the quality of the systems they've hired people to build. We need quality assurance at an institutional level.

If not governmental, maybe an organization around this could be made by developers themselves. Not the "certified for $technology" certifications we have now, but a certification of fundamental software engineering skills and principles. A certification you can lose if you do something colossally irresponsible. At the end of the day, this dilution of quality is having a negative impact on our job field, so it concerns all of us. It leads to technical debt, micro-management, excessively rigid deadlines and requirements, which we all have to deal with. All of these are either symptoms of or coping mechanisms for management's inability to evaluate engineering quality.

replies(15): >>23261019 #>>23261187 #>>23261210 #>>23261239 #>>23261289 #>>23261414 #>>23261666 #>>23261696 #>>23261835 #>>23261851 #>>23261876 #>>23262059 #>>23262102 #>>23262525 #>>23263763 #
2. commandlinefan ◴[] No.23261019[source]
> who were hired to do it either a) couldn't, or b) didn't bother

Or c) told the decision makers that it would take longer than a few hours to do and were told not to "waste" any time on it.

replies(1): >>23261056 #
3. _bxg1 ◴[] No.23261056[source]
Regardless, there's this contentious relationship between decision makers and engineers because the former can't properly evaluate the work of the latter. Because of this, they either a) let bad engineers get away with stuff they shouldn't, or b) over-compensate and refuse to trust good engineers. It's a lose-lose.
4. karatestomp ◴[] No.23261187[source]
We keep making a bunch of products where protocols and existing software would do just fine, while hitting fewer edge cases.

Know what would be better than the ten goddamn apps and the iPad and shit they're using for our kid's school? Mailed (or emailed) worksheet packets with guidance, recorded lessons on Youtube. Mail back the worksheets, have the food-delivering schoolbuses pick them up, drop them off at the school every week or so, or just do photos-to-PDF on a phone and email them. Or they could just give each kid workbooks and textbooks like they did when I was in school but that's out of fashion now for no reason. eyeroll

Several logins to manage. Apps that erase your work if you hit the wrong thing. Weird interfaces. Jank galore. Just use the fucking basics. You don't need a custom app for every single thing. Email exists. Use it.

replies(3): >>23261651 #>>23262483 #>>23262899 #
5. im3w1l ◴[] No.23261210[source]
I don't know.. I don't think we can do better than we are already. At least not at anything close to current cost.

In the case described in the article it's arguable whether the testing software was even the culprit.

6. AlchemistCamp ◴[] No.23261239[source]
I think you're correct in your assessment that top-down bureaucracies really struggle with software but I don't think the solution is to inject a top-down bureaucratic gatekeeper in the path of every software career.
replies(1): >>23261283 #
7. _bxg1 ◴[] No.23261283[source]
I'm only talking about creating a certification, not enforcing which orgs do and don't use it. A lot of software isn't important enough for such a thing, but a lot of it is. The point is that even when decision-makers do want software to be highly reliable, they have nothing but very blunt instruments for attempting to enforce that, because they're working in the dark.
replies(1): >>23263677 #
8. bobthepanda ◴[] No.23261289[source]
> We need some sort of standard for software engineering quality. I don't think this is an academic question anymore. Real people's lives are being impacted every day now by shoddy software, and with the current crisis they often have no alternative. Software that you or I could probably have executed better, but that the people who were hired to do it either a) couldn't, or b) didn't bother. It's nearly impossible for non-technical decision makers in these orgs to evaluate the quality of the systems they've hired people to build. We need quality assurance at an institutional level.

Even if you were to put this in place today (which I don't necessarily agree with) you would still need bean-counters to sign off on paying for replacement services for their sweat, tears and duct tape solution. A good half of the electorate and the politicians, give or take, whip up into a frenzy if a bureaucrat so much as looks at a dollar bill the wrong way, so I doubt this would gain any traction.

replies(1): >>23261333 #
9. _bxg1 ◴[] No.23261333[source]
New sweat, tears and duct tape solutions are being created every day. Let's start by focusing on the ones coming down the pipe.
replies(2): >>23261968 #>>23264149 #
10. wmf ◴[] No.23261414[source]
18F released a pretty good guide about these topics but I can't shake the feeling that many organizations aren't willing to learn these lessons. https://github.com/18F/technology-budgeting/blob/master/hand...
replies(1): >>23261660 #
11. WrtCdEvrydy ◴[] No.23261651[source]
> Several logins to manage. Apps that erase your work if you hit the wrong thing. Weird interfaces. Jank galore. Just use the fucking basics. You don't need a custom app for every single thing. Email exists. Use it.

Yeah, but if you do that, how will you funnel money from the school system into private companies?

12. _bxg1 ◴[] No.23261660[source]
Guidelines are well and good, but they aren't really helpful when the people who care about them can't enforce them and vice-versa. What we need is accountability when it comes to the engineers who work on systems that are critical to large swaths of society.
replies(1): >>23261986 #
13. majormajor ◴[] No.23261666[source]
> But in the latter case, as we saw with New Jersey's unemployment system, that "reliability" was fragile and contingent on the current state of affairs, and had no hope of withstanding a sudden shift in usage patterns.

"Reliable" and "Can survive a sudden shift in usage patterns" are extremely different things.

I think you have the causality backward. Engineering is about trade-offs. No quality guild will be able to wave those away. As long as the primary pressure is "get something that is functional enough at minimum time and cost" you're gonna have this.

(Software is particularly complicated because engineers, not just managers, have poor understanding of system quality and of each other's contribution quality. There's a combination of "it's not that complicated" complexity-blindness to business requirements and trade-offs that have to be traced through deep call stacks and across networks. We build things like chaos monkey - to prove resilience by seeing how hard it is to break the thing - because we don't have cost-effective techniques for actually understanding the system well enough short of operating it.)

14. HumblyTossed ◴[] No.23261696[source]
They specify they support PNG and JPEG, the two most common formats. Why is it their fault that Apple made HEIC the default?
replies(2): >>23261753 #>>23266592 #
15. prepperpotts ◴[] No.23261753[source]
Considering the market share of iPhone in this demographic, it's inexcusable not to correct for this — even if you disagree with Apple's decision on this.
16. generationP ◴[] No.23261835[source]
Missing HEIC support (and Apple support in general) is not an issue of quality; it's an issue of "knowing your customers". I doubt there could be any certification body for that.
replies(3): >>23262176 #>>23262724 #>>23262836 #
17. ◴[] No.23261851[source]
18. panic ◴[] No.23261876[source]
I just wish these large organizations would be held accountable for failing to perform the single task we rely on them to do. The College Board has one job -- to administer exams. Experian has one job -- to keep financial data. But when they screw up this single job in a fundamental way, nothing seems to happen to them.
19. macintux ◴[] No.23261968{3}[source]
Which would still face the same problem: the government would rather pay multiple times for crappy software at a lower price than one big bill for quality software the first time.
replies(2): >>23262148 #>>23264316 #
20. cybwraith ◴[] No.23261986{3}[source]
You think this was an engineering decision? These failing systems were probably contracted to a politically connected company that subcontracted to the lowest bidder. Not only that, but the fact that these systems were usually written in COBOL means they were likely created a very long time ago and minimally updated as laws/requirements changed to stay compliant, but that's it.

That's not the fault of the engineer(s). A surge in traffic in the 80s, or whenever the system was initially created, may very well have been handled as designed, and its normal traffic in modern pre-COVID times was the equivalent of a constant "surge" by the standards of the original design. It was already on life support and needed a rewrite 10 years ago. Some software engineering certification/quality board wouldn't account for 30-year-old systems design and population. Those are political and budget/prioritization issues. It would be the near equivalent of a bridge that was built, then ignored for 50 years, collapsing when a modern 18-wheeler drives over it.

All the new systems getting spun up ASAP are just quick hacks to try and get some way of addressing the problem. They are bound to be full of failures by the nature of the rapid development cycle and current crisis. In a situation like this, a quality board like proposed would be granting exceptions left and right because theoretically, something is better than nothing.

replies(1): >>23262052 #
21. _bxg1 ◴[] No.23262052{4}[source]
> These failing systems were probably contracted to a politically connected company that subcontracted to lowest bidder.

And what if that government body established a policy that all contractors had to be certified engineers who hadn't lost their certification due to past negligence? Suddenly there's a much higher floor for "lowest bidder".

replies(2): >>23263554 #>>23268511 #
22. jacques_chester ◴[] No.23262059[source]
> If not governmental, maybe an organization around this could be made by developers themselves.

These exist. The ACM and IEEE CS are best-known, but there are also various national bodies (ACS in Australia, BCS in the UK etc).

> Not the "certified for $technology" certifications we have now, but a certification of fundamental software engineering skills and principles.

The IEEE Computer Society has such a thing, maintained in various forms since about 2002[0]. The ACM and IEEE CS also publish a software engineering curriculum that they are prepared to recognise[1]. They also have a jointly-published Code of Ethics[2].

I sincerely agree with you that our profession is mostly a disaster area. But one thing other professions have that we lack is (1) fairly worked-out fundamental theoretical bases, or at least long experience to draw on, and (2) legal enforcement of standards.

[0] https://www.computer.org/education/certifications

[1] https://www.acm.org/education/curricula-recommendations

[2] https://ethics.acm.org/code-of-ethics/software-engineering-c...

replies(1): >>23262344 #
23. ashtonkem ◴[] No.23262102[source]
I think you're being a bit unfair to the unemployment systems: expected RPS is absolutely a design requirement when creating a large computer system. My entire job is easily described as "it's conceptually very easy, but it's really hard when you and a few million of your closest friends try to do it at once".

Unemployment systems typically don't see spikes like this, it's not terribly surprising that some didn't handle demand well outside of the expected range.

24. ashtonkem ◴[] No.23262148{4}[source]
That's not a problem unique to governments, although the consequences in that realm tend to be worse.
25. _bxg1 ◴[] No.23262176[source]
It suggests that they didn't even bother to try the app on an iPhone, which is what probably half of their target users reach for when they need to take photos. That betrays a significant degree of laziness and/or incompetence.
replies(1): >>23262522 #
26. majormajor ◴[] No.23262344[source]
> But one thing other professions have that we lack is (1) fairly worked-out fundamental theoretical bases, or at least long experience to draw on, and (2) legal enforcement of standards.

A world with (2) without (1) would be pretty miserable.

Trying to do this today wouldn't be enforcement of standards, it would be "pray you got it right."

We could build standards for building more-robust software, but every piece of software would become vastly more expensive. We would need massive improvements in tools to avoid that.

And then there's the whole security angle... Is it a failing if your software isn't impervious to attackers? To what degree? You wouldn't expect most bridges to withstand a determined attacker...

27. jimhefferon ◴[] No.23262483[source]
> Mail back the worksheets, have the food-delivering schoolbuses pick them up, drop them off at the school every week or so, or just do photos-to-PDF on a phone and email them. Or they could just give each kid workbooks and textbooks like they did when I was in school but that's out of fashion now for no reason. eyeroll

I'm a college teacher and my wife is a high school teacher. Education is much more complicated than eyeroll suggests.

For one thing, teachers would not accept physical papers in the present state of the disease. Even if a district says papers are OK after they have sat for three days (or whatever), that means that (1) they get picked up and delivered to some repository, (2) they sit there for days, (3) the teachers come and get them (delivering to the teachers would mean more decontamination time), and (4) they take a day or two to grade. So assignments on Monday might be ready the following Monday? Then the teacher writes an email, "John you did the wrong page. Please resubmit." It is just not workable. (On my assignments there was something like a 10% confusion rate, for instance where someone did 1-10 odd instead of 1-10. I sympathise. It is a confusing time.)

I did photos to PDF. After two or three weeks of back and forth with my students we got so that most of them would reliably send legible one-PDF-per-assignment submissions. Again, life is more complicated than, "Any moron can do this."

Finally, email is not a panacea. Having a hundred students emailing their assignments is an invitation for disaster. I was able to go through the college's system (we use Canvas) so it kept track of who sent what and when they sent it. As this article points out any large system has issues, but these systems exist for a reason. I and my students had issues and just had to work around them. With patience and good will we figured it out.

That's what happens a lot in education. People have all kinds of life situations, there are all kinds of tech and comfort with tech, etc. It is complicated.

Folks who are not teachers but are interested in some of the issues could check out the last dozen or so episodes of Mr Barton's Maths Podcast http://www.mrbartonmaths.com/podcast/ which are about teaching from home for primary and secondary teachers. Really good stuff.

replies(1): >>23262982 #
28. generationP ◴[] No.23262522{3}[source]
There is no "app". There's a website, from what I understand. Not everything needs to be tested on mobile. In their place I'd have left an option to upload non-supported formats "just in case" to leave a trail, without guarantee of acceptance; but this is not something I'd have expected them to do.

I'm also wondering how many iPhone users have that .heic setting at its default value, as opposed to having switched to .jpg.

replies(1): >>23262926 #
29. athenot ◴[] No.23262525[source]
> We need some sort of standard for software engineering quality.

I contend we need defined SLAs because ultimately that's what matters.

Create the software in Visual Basic or Rust, I don't care. But it needs to work. Define SLAs with consequences and the rest will sort itself out.

replies(1): >>23263030 #
30. user5994461 ◴[] No.23262724[source]
>>> I doubt there could be any certification body for that.

Plot twist: There is. It's a standard clause in software procurement contracts, especially for government.

"The application must be supported and tested on browsers with at least 10% market share, as defined by the yearly Gartner report"

Replace browser with device or what is relevant for the project.

31. n_e ◴[] No.23262836[source]
> Missing HEIC support (and Apple support in general) is not an issue of quality; it's an issue of "knowing your customers".

The test portal "not responding" when receiving an unexpected file format is a quality issue. Not making it clear to the user that the upload has failed before it is too late is another quality issue.
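As a rough illustration of the second point, here is a minimal sketch of server-side format detection that answers with an explicit, actionable message instead of hanging. It assumes the portal accepts only PNG and JPEG, as stated elsewhere in the thread; the names `detect_format`, `check_upload`, and `UploadResult` are hypothetical and not taken from the actual test portal, and the HEIF brand list is illustrative rather than exhaustive.

```python
from dataclasses import dataclass

# Well-known file signatures ("magic bytes").
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"
# Some major brands used by HEIF-family files (illustrative, not exhaustive).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}


@dataclass
class UploadResult:
    accepted: bool
    detected: str
    message: str


def detect_format(data: bytes) -> str:
    """Guess the image format from the first bytes, ignoring the file name."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    # ISO base media files put the 'ftyp' box type at offset 4
    # and the major brand at offsets 8..12.
    if len(data) >= 12 and data[4:8] == b"ftyp" and data[8:12] in HEIF_BRANDS:
        return "heic"
    return "unknown"


def check_upload(data: bytes) -> UploadResult:
    """Accept PNG/JPEG; reject everything else with a message the user can act on."""
    fmt = detect_format(data)
    if fmt in ("png", "jpeg"):
        return UploadResult(True, fmt, "Upload received.")
    if fmt == "heic":
        return UploadResult(
            False,
            fmt,
            "This looks like an HEIC photo, which is not supported. "
            "Please convert it to JPEG or PNG and upload it again.",
        )
    return UploadResult(
        False, fmt, "Unrecognized file type. Please upload a JPEG or PNG."
    )
```

The point of the sketch is only that the rejection happens immediately and tells the user what to do next, while there is still time to resubmit.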

replies(1): >>23263423 #
32. liveoneggs ◴[] No.23262899[source]
yeah my 6 y/o hates remembering passwords. Fear of getting her password wrong almost made her afraid to use computers at all
replies(1): >>23263608 #
33. _bxg1 ◴[] No.23262926{4}[source]
I was using "app" in the broad sense.

> Not everything needs to be tested on mobile

It does when it requires taking and uploading a photo in the year 2020. Especially when its target users are high school kids. What adult is going to use anything other than their phone to take an off-handed photo to upload somewhere these days? Much less a child who probably almost never uses a traditional computer.

34. karatestomp ◴[] No.23262982{3}[source]
> I'm a college teacher and my wife is a high school teacher. Education is much more complicated than eyeroll suggests.

Wife's a middle school teacher and ~40-50% of the other people in my social circle (not via her & her colleagues, oddly enough) are teachers, too. What they've done here (this state, post NCLB) is get rid of comprehensive curriculums with prepared material (workbooks, sheet packets, textbooks) and now districts and teachers all come up with this stuff themselves, which is clearly wasteful—why have a committee at the state-level do this once when every goddamn district can hire a couple new people to handle curriculum and rope teachers into those same committees, because they don't already have so many friggin' meetings they're starting to overlap?—so yes, hard eyeroll at the trend away from textbook + workbook as a foundation for (middle grade and lower, at least) classes. The state could have made their own such resources several times over for the waste the current system has produced, if they didn't trust a company to provide it (as was usually the case in the past). The whiplash-inducing pointless policy shifts in education, usually implemented by what sure appear to be given their observed behavior certifiable morons, is tiresome and harmful to educators and families alike (we have both perspectives).

Now there are CDC suggestions that kids should have their own resources next year, but gee, we just switched away from textbooks + workbooks, which would have been great, to a mess of shared "learning centers" and junk like that (oh and got rid of all the indoor-recess toys in the kindergarten classrooms statewide to make room for those). It's pure fad-chasing, well-intentioned at best and the school admin version of résumé-driven-development at worst (and it's often the latter). When they accidentally stumble on an idea that might be good they fail to implement it correctly (i.e. they can't even follow simple directions or understand how games or human systems work, these highly-paid jokes of PhDs that run the schools). Very frustrating.

Maybe your schools are doing a better job than ours but there's no possible way the tech support load & assignment screw-up rate here isn't a bigger hassle than if it were on regular ol' paper, including the effort of shuffling that around and disinfecting it, and I think they've actually done a decent job given the tools they're being told to use (webshit and apps) and the time they had to prepare. Hell they could probably buy some kind of UV disinfectant chamber for submitted papers for what they spend on all these stupid apps every year, stick a drop-box just inside the door of the meal-delivery schoolbuses and outside the school, and call it good.

What I know for sure: the only part of this where it felt like my kid was almost getting the kind of education they would in the classroom without a ton of extra effort on our parts, and it felt like we understood what they needed and what needed to be done about 100% of the time, was the first couple weeks when we did have organized packets of paper instructions and assignments they sent home before spring break just in case there were closures (they didn't yet know it'd be the whole rest of the school year, of course). And with the paper we didn't have to deal with "this login isn't working" and "I hit the wrong thing and now my work I just did is gone" and "what the fuck, I, the adult and a software professional, can't even find this thing they say is at the other end of this link (or where in the app this thing is supposed to be, or whatever)", and so on, and so on.

That's for the younger kids. For the older ones, drop-off rates have been... high. Many of those kids weren't even attempting the majority of assigned work, if they were doing any of it, by a week or two into this. Very high levels of effort by some teachers had noticeable but still low effects on keeping kids engaged. We're talking north of half the kids in my wife's school essentially just skipping 4th quarter this year, and a good chunk of the rest getting maybe 10% as much out of it as they would have in school—it's that bad. I think some assignments missed due to logistics or scanning errors or whatever are nothing next to those effects.

35. vb6sp6 ◴[] No.23263030[source]
I have to agree. Adding lawyers to the process will definitely make things better.
36. generationP ◴[] No.23263423{3}[source]
You're right about that; I missed it. It should be transparent about invalid input, and this indeed is a no-brainer to include in a test suite or QA process.
37. cybwraith ◴[] No.23263554{5}[source]
If the software engineers have the legal capability a current professional engineer certification gives to tell the project manager 'no', that might work. It's still less about engineering capability, and more about leverage and protection against retaliation for pushing back on bad ideas/timelines. Even in traditional engineering disciplines, not everyone working on the project is a certified professional engineer; in fact they are usually the minority.
38. ativzzz ◴[] No.23263608{3}[source]
Time to set up a family 1password, or if you're up for teaching your family, keepass.
replies(1): >>23263733 #
39. ativzzz ◴[] No.23263677{3}[source]
The same technologically incompetent leaders who manage failed software projects are going to be the ones to write these standards/certifications.

The real problem is a lack of technologically competent leadership. Many of the skills required to excel at technology do not overlap with the skills required to be a good leader. Then, both technology and leadership are difficult skills to train and develop individually. And lastly, the few people who are competent technological leaders would rather work for big tech where they will get paid so much more and would not have to fight with technologically incompetent leadership to set up good standards.

40. karatestomp ◴[] No.23263733{4}[source]
Not on school-owned devices :-/

(but for something like school app/site passwords, though best practice it may not be, "written on a sheet of paper, kept in a drawer" is in fact totally fine)

replies(2): >>23264012 #>>23264587 #
41. angryrant1727 ◴[] No.23263763[source]
It's because the software industry doesn't respect experience. This issue is the kind of thing an experienced engineer with years of building past systems would notice. And they would know how to talk to management so things are done properly.

But how are experienced engineers treated? Like shit. As soon as we get older and have families to support, we get leetcoded out of positions since we can't keep up with months of studying for what is basically a mental twitch reflex test. That's what it's become: interviewers will consider you a lesser engineer if you vomit out the rote memorized solution a few minutes slower than another candidate. After all, time to write the solution is an "objective" measure, right? So the interview process is now "objective", what a joke.

And if an experienced engineer dares to recommend that hey maybe we shouldn't use the latest fad tech that just got announced on a HN post? They will be ridiculed and laughed at as lazy, not "keeping up with technology", called a bunch of COBOL dinosaurs holding everyone back. For simply daring to say, hey maybe this latest new technology has tradeoffs that don't fit for us and we should stick to what we have since it has a better balance of tradeoffs. Nope, nobody cares about that, stupid dumb old engineer, stop getting in our way, need to make our resumes look good.

And the industry itself? Encourages constant job hopping, so nobody even gets experience maintaining a system for a long time. All those shitty decisions made? Don't care, off to another company.

And within a company? Constant indirect and direct pressure to go to management. Why aren't you an engineering manager? Oh you want to be a principal? Well here's the ridiculous requirements for that, still want it? What's the difference between a principal and senior anyways? Actually why do we even need seniors, let's just get more juniors. Management doesn't know the value of experience, they just want lower costs. And the engineers themselves seem to be saying experience is worthless, so everyone's in agreement right?

We are failing to build good software systems because it requires experience to know how to do it. And this industry does not value experience.

replies(1): >>23272332 #
42. ativzzz ◴[] No.23264012{5}[source]
Sounds like the school owned device needs to come pre-installed with a password manager then, at least for school related activities.

> though best practice it may not be, "written on a sheet of paper, kept in a drawer" is in fact totally fine

This works, until you have to bring the paper to school where your kid's friends will inevitably find the paper and log in and mess with their stuff (source: I was that friend)

replies(1): >>23264167 #
43. bobthepanda ◴[] No.23264149{3}[source]
What matters to the bean counters is if it can be done without expensing something new, everything else be damned. Unless you can get a clear, popular government mandate to spend money to make things more efficient, this is not a palatable solution.

Given how large refactors tend to go in general, this also doesn't necessarily lead to good outcomes; even with a relatively technocratic administration led by Mike Bloomberg (relative to comparable mayors, at least), upgrading NYC's 911 system massively spun out of control: https://www.nydailynews.com/new-york/911-overhaul-2b-disaste...

44. karatestomp ◴[] No.23264167{6}[source]
Oh, yeah, paper should probably be stored with your liquor and only pulled out when needed (or with your guns if you're that sort, I guess). Otherwise siblings will pull some pranks.
45. jimbokun ◴[] No.23264316{4}[source]
Or you can pay one big bill and still get crappy software.

It's difficult to evaluate the quality of software development contractors.

46. liveoneggs ◴[] No.23264587{5}[source]
yes it's easier at home to have this stuff on the fridge for reference (also she can just ask me to login for her) but at actual school it was a real problem
47. Polylactic_acid ◴[] No.23266592[source]
It's their fault that the system did not reject the file, show any actionable error message or allow users to try again. It's also their fault for saying "take the test again in a few weeks" instead of "we fucked up, send us the file again today".
48. astura ◴[] No.23268511{5}[source]
Good engineering can't undo bad management/process. Project management is what we really should work on to improve software quality.
49. subhobroto ◴[] No.23272332[source]
> It's because the software industry doesn't respect experience

Untrue.

They go and found their own companies.

Silicon Valley literally started off with an engineer-manager who left Shockley Semiconductor Laboratory in 1957 to found Fairchild Semiconductor, because William Shockley, while a brilliant academic, was authoritarian and just sucked at managing people.

On the other hand, if the point you are making, is that the software industry simply has to respect experience because a lot of blood, sweat, tears and divorces were weathered by these engineers as they got manipulated and brainwashed - no.

Yes, it sucks that these engineers got manipulated and brainwashed but now they have the experience to detect manipulation and brainwashing and the divorces and health issues were the price they paid to gain this experience.

In summary - the best way to get the value you deserve is to start your own thing. Otherwise, complain all one wants but they will get the minimum someone else can get away with giving.

Everyone wants what is best for them, even your manager and the company they work for, which includes paying as little as possible for the labor they get.

The difference is called profit.