Hacker News | branko_d's comments

From https://kristoff.it/blog/contributor-poker-and-ai/:

"Unfortunately the reality of LLM-based contributions has been mostly negative for us, from an increase in background noise due to worthless drive-by PRs full of hallucinations (that wouldn’t even compile, let alone pass CI), to insane 10 thousand line long first time PRs. In-between we also received plenty of PRs that looked fine on the surface, some of which explicitly claimed to not have made use of LLMs, but where follow-up discussions immediately made it clear that the author was sneakily consulting an LLM and regurgitating its mistake-filled replies to us."


Pretty much sums up the LLM fanbase.

I don't think it's the complete fanbase. However, there are lots of people in the world who live their whole life by vibing. It's a viable way to live and sometimes it's the only way to live. But they have a very loose relationship with truth and reason. Programming was a domain that filtered out those people because they found it hard to succeed at it. LLMs have changed that and it's a huge problem. It's hard to know if LLMs will end up being a net win for the industry. They may speed up the good programmers a little, but those people were able to program anyway without LLMs. They will speed up the bad programmers a lot and that's where the balance sheet goes into the red.

"They may speed up the good programmers a little, but those people were able to program anyway without LLMs."

I don't think this is realistic. I'm a good programmer, and it speeds up my work a lot, from "make sense of this 10 repo project I haven't worked on recently" to "for this next step I need a vpn multiplexer written in a language I don't use" to, yeah, "this 10k line patch lets me see parts of design space we never could have explored before." I think it's all about understanding the blast radius. Sometimes a lot of code is helpful, sometimes it's more like a lot of help proving a fact about one line of code.

Like Simon says, if I'm driving by someone else's project, I don't send the generated pull request, I just file the bug report / repro that would generate it.


> to "for this next step I need a vpn multiplexer written in a language I don't use"

but that acceleration is exactly because you're not good at that language


Can't we reach a compromise where a proven track record of good LLM use by a contributor or a company (e.g. Bun) gets them pre-approved or at least entertained? A blanket ban on a new technology shouldn't be the default option.

Certainly not in the case of asking it to do something you'd be slow at because you are unfamiliar. If you are not familiar enough with the system, how are you confident that what the LLM has produced is valid and complete? IMO the people saying LLMs make them 10x faster were either very bad to start with (like me!) or are not properly looking at the results before throwing them over the wall.

And how do you know if that is the case or the person/team using the LLMs is one of the good ones?

So the safest answer is just "no".


This is the crux of the problem. LLMs make me significantly faster at writing code I was mediocre or bad at. But when I use it to write code in domains I have more knowledge in I see design and correctness problems all over the place and actively fix them and it slows down my output.

Speed is seductive.

The bar isn't "this is a known good contributor". It's "this is a known good contributor working in a space they have knowledge in and has a track record of actually checking and thinking about LLM output before submitting it." It's much higher and I don't see how you can approve people on an organization-wide basis.


  > LLMs make me significantly faster at writing code I was mediocre or bad at. But when I use it to write code in domains I have more knowledge in I see design and correctness problems all over the place and actively fix them and it slows down my output.
I think a very similar phenomenon is called the Gell-Mann amnesia effect: https://en.wikipedia.org/wiki/Michael_Crichton#%22Gell-Mann_...

if they had a good track record, the current submission that led to this article damaged it.

i am reminded of this quote: it takes more cleverness to debug code than it takes to write it. if you write code as clever as you can, by definition you are not clever enough to debug it. using an LLM makes your code many times more clever than what you could write yourself, which means by the same definition the code is too clever for you to understand or debug.


I like the new corollary to that rule, which is that if the AI is the best coder in the room and writes code too clever for itself, then no one including the AI can debug the code. Then where does that leave you?

i love it. just a moment of thought makes clear that LLMs are not capable of debugging their own code, because if they were they would be able to write better code. the LLM code doesn't even need to be clever.

That’s why you don’t use SOTA xhigh models to write your code, so you can use the xhigh model to debug the code.

I kneel to Poe's law.

Why would it be pre-approved? Code is code; whether it's bad-quality LLM code or meatbag code shouldn't matter.

The entire problem is that before, the meatbag code was either not submitted at all (the developer knowing they are not competent enough to do the fix) or the volume of it was low.

With LLMs, people not competent to even review, let alone write, are emboldened to just throw shit at the wall at a rapid pace. So the wall is entirely covered in shit.



> I'm a good programmer, and it speeds up my work a lot

The problem with this line of thinking is the same as with "I'm such a good C developer, my code is so safe!".

And we see what reality tells us instead: yes, there are people for whom these claims are true, but they are not even a decently sized minority.


I use LLMs as a tutor. They tailor their answers exactly to the situation I am in, even if they hallucinate. I can correct them on the fly and that also serves as training. I try not to copy and paste, and instead type every line of code by hand. That doesn't always happen, but I usually understand the code I am writing.

Why are you writing a vpn multiplexer written in a language you don't use?

You can't review it.

Are you relying on your colleagues to do that, or is this riddled with bugs? Or is it code you're producing for personal use only so it's not worth mentioning as it's not sped your work up, it's just let you write a little play program.


No no no it's speed at all costs. Sure. I'm writing junk but the speed of what I'm doing is *impressive* You don't understand.

It's great when I know how the code should look. Sometimes I just can't bring myself to write yet another http handler.

There are already battle-tested libraries for that; why vibe-code a unique solution each time?

yep. as an expert programmer there are things i did not have access to. for example, i have an embedded-lite hardware project that required a one line patch to a linux kernel Module.

i know what a kernel module is and im reasonably certain that the patch is safe, but there is no way in hell i would have found that solution (i would have given up). in a world without llms, the project would have died.


I really hope that you have gone over what the LLM decides to do.

Time and time again I've had a project (such as a DSL-to-SQL compiler, automatic Rust codegen, CSS development) stall because the LLM made a short-sighted decision.

I later found better solutions by querying Reddit and upon consulting the LLM, it basically said "oh shit I'm sorry"


We have all had that experience, that's just the way this new world is.

It's honestly pretty arrogant to tell a senior engineer that you "really hope" they've gone over some code. AI generated or otherwise.


Sorry. I forgot to add the respectful form

I really hope usted checked your code

At this point I'm pretty sure I did the homework for people in college who are now senior engineers


I think your parent didn’t word this correctly.

This is commonplace. So commonplace that most have worked “checking the LLM” into their workflow so deeply that essentially all that’s done is prompt followed by a mini code review.

To suggest a senior engineer blindly accepts modifications without code review kinda hints at you not using LLMs enough to realize how quickly they will make a mess of things if you don't hold their hand.


Lol why is it arrogant? My workplace is evidence that having a senior engineer title or even a computer science degree doesn't mean you are a good engineer. I honestly think some people have fake credentials and got their jobs via nepotism.

i am writing the sw stack for my own pharma startup.

we have 2 very high value DAU, one of whom is me, and probably will max out at 1000 in our wildest dreams.

long term, our biggest concern is a security regression that lets outsiders see our internal information


> I'm a good programmer, and it speeds up my work a lot,

Whenever I see this argument, I'm reminded that most programmers don't know what they do for work


> However, there are lots of people in the world who live their whole life by vibing

Why are they often so desperate to lie and non-consensually harass others with their vibing rather than be honest about it? Why do they think they are "helping" with hallucinated rubbish that can't even build?

I use LLMs. It is not difficult to: ethically disclose your use, double check all of your work, ensure things compile without errors, not lie to others, not ask it to generate ten paragraphs of rubbish when the answer is one sentence, and respect the project's guidelines. But for so many people this seems like an impossible task.


> Why do they think they are "helping" with hallucinated rubbish that can't even build?

Because they can't tell the difference between what the machine is outputting and what people have built. All they see is the superficial resemblance (long lines of incomprehensible code) and the reward that the people writing the code have got, and they want that reward too.


the target audience of the cyber typer terminal [0]

[0] https://hackertyper.net/


"Main character energy". What they're really doing is protecting their view of themselves as smart, and they're making a contribution for the sake of trying to perform being an OSS dev rather than out of need or altruism.

AI is absolutely terrible for people like that, as it's the perfect enabler.


> Why do they think they are "helping"

It's not about helping. It's about the feeling of clout. There are still plenty of people who look at GitHub profile activity to judge job candidates, etc. What gets measured gets repeated.

I believe that most of the ills of social media would disappear, if we eliminated the "like" and "upvotes" buttons and the view counts. Most open source garbage pull requests may likewise go away if contributions were somehow anonymous.


Anything you say back to them calling out their nonsense, they'll feed back into their LLM and it will tell them why you're wrong and they're right.

https://github.com/huggingface/transformers/issues/45246


Holy... that was quite the read.

LLMs are in this case enabling bad behavior, but open source software has always been vulnerable to this. Similarly, people who use LLMs to do this kind of thing are the kind of people who would have done it without LLMs but for the large effort it would have taken. We're just learning now how large that group is.

This is a good thing, it's an opportunity to make open source development processes robust to this kind of sabotage.


> LLMs are in this case enabling bad behavior

Yeah that seems to be their primary use case, if I'm honest. It's possible to use them ethically and responsibly, much in the same way it's possible to write one's own code, and more broadly, do one's own work. Most people however, especially in our current cultural moment and with the perverse incentives our systems have created, are not incentivized to be ethical or responsible: they are incentivized to produce the most code (or most writing, most emails, whatever), and get the widest exposure and attention, for the least effort.

Hence my position from the start: if you can't be bothered to create it, I'm not interested in consuming it.


People who use LLMs responsibly to create high-quality output don't look like they're using AI.

For example, using AI as an editor: it doesn't write anything for you, and you try to avoid its suggestions unless you're stuck.


You're asking why oil doesn't act like water. It's not really an impossible task, it's just not one they agree with.

I think a lot of people who haven't given it more thought might see it as an arbitrary rule or even some kind of gatekeeping or discrimination. They haven't seen why people would want to not deal with the output.

This might not be helped by the fact that there are a lot of seemingly psychotic commenters attacking anything which might have touched an LLM or any generative model at some point. Their slur and expletive filled outbursts make every critical response look bad by vague association.

Having sensible explanations like in TFA for the rules and criticism clearly visible should help. But looking at other similar patterns, I'm not optimistic. And education isn't likely to happen since we're way past any eternal september.


It's the same as cheating in a game. You are given an """advantage""", so lying about it seems like the best option

I wonder how many are account farming.

Tangential side story, but an interesting one nonetheless.

I was a food delivery driver back in the mid 00's to the mid teens. Early on, GPS was rare and expensive, so to do deliveries and do them effectively, you had to be able to read a map and mentally plan out efficient routes from the stochastic flow of orders coming out.

This acted as a natural filter, and "delivery driver" tended to be an interesting class of people, landing somewhere in the neighborhood of "lazy genius". Higher than average intelligence, lower than average motivation.

Then when smartphones exploded in the early 10's, the bar for delivering fell through the floor, and the job became swamped with people who would be best identified as "lazy unintelligent". Anyone who had a smartphone and not much life motivation was now looking to drive around delivering food for easy money.

Not saying the job was ever particularly glamorous, but it did have a natural mental barrier that tech tore down, and the result was exactly as one would predict. That being said, I'm not sure end users noticed much difference.


> That being said, I'm not sure end users noticed much difference.

I have friends who order a lot of DoorDash and UberEats and they complain constantly about how awful the delivery service is.

The problem isn't that they haven't noticed, it's that they keep paying for the terrible service, even as the price goes up.


Sums up pretty much how offshoring works in our industry.

There are cool people on the other side as well; unfortunately those aren't usually the ones who get assigned unless escalations take place.

Most shops are built on juniors who need to build up enough of a CV to go elsewhere as soon as they get some scars.

Yet not only do those projects keep coming, plenty of managers now dream about replacing those juniors with agents.


I love this anecdote. It highlights what our industry continues to forget: The end user doesn't care.

Don't get me wrong, tech is why I am here. But if it works, Alice and Bob don't care one bit about how the product exists.


> The end user doesn't care.

well, they think they don't. until their pii gets leaked all over the internet because whoops our s3 bucket was publicly accessible, or until the service goes down because whoops our llm deleted the prod db...


PII leaks are normalized now. Most people aren't even aware, or just shrug "oh well" and head to the app store to download the latest gacha game or whatever.

That is why Alice and Bob get Electron apps, Webviews on mobile, mostly coded by offshoring teams.

> It's hard to know if LLMs will end up being a net win for the industry. They may speed up the good programmers a little, but those people were able to program anyway without LLMs. They will speed up the bad programmers a lot and that's where the balance sheet goes into the red.

If you will forgive an appeal to authority:

The hard thing about building software is deciding what one wants to say, not saying it. No facilitation of expression can give more than marginal gains.

- Fred Brooks, 1986


Before LLMs we could already see a growing abundance of half-baked engineers in it only for the good pay, willing to work double time to pull things off.

Management, unsurprisingly, deemed those precious. They could be emailed at any time, working weekends to fix problems their kind caused in the first place. Sure, sir.

They excel at communication. Perfecting the art.

Now LLMs are there to accelerate the trend.


You're at least describing someone who sounds hard-working... what's the problem?

I'd be more concerned if I was someone who signed up to play ping pong two hours a day and do a bi-weekly commit.

There was a time not so long ago where I was watching "a day in the life of a software engineer" videos on Youtube and I was wondering if some of these were parodies. I still remember one in particular which I'm pretty sure was a parody, but it was only marginally distinguishable from the others.


I do believe in hardship. As sacrifice. It yields long term benefits for oneself, and for society.

But submission into slavery for immediate gain accomplishes little, and costs society a lot more (physical and mental health issues are a huge burden).

Those parodies you saw were caricatures of elite engineers who sacrificed decades of their lives to become so competent. They can work from home, eat pasta while glancing over a PR, and just hit approve.

That you resent the luxury doesn't make it undeserved privilege.


I've met programmers who severely outclassed me. It was extremely uncomfortable and it took me months to accept that reality and reshape my hurt ego into curiosity and desire to learn from someone clearly superior in the craft.

That being said, most people in the privileged positions you described are there by sheer luck and connections. In the very very best-case scenario that offends them the least: they stumbled upon an opportune position and were smart enough to make full use of it... in the first 6 months (when people pay the most attention and lasting impressions are formed). And then rode the reputation they made for years. Their value as engineers on the team after the initial honest burst of productivity becomes... very unclear from that point and on, shall we say.

Again, I've met engineers who fully deserved their privileges. 2-3 times over 24 years of career though (a good chunk of it as a contractor so I've been around). My anecdotal evidence obviously means nothing but we all develop pattern-matching skills with time, making me think what I saw is generally the statistical curve that would apply almost everywhere. Maybe.


Working long hours due to incompetence is not a good thing.

> It's hard to know if LLMs will end up being a net win for the industry.

True. Regardless of that, with LLMs we are for sure borrowing technical debt like never before.


Why are we not paying it off? I sure am. I refactor code left and right. It is up to you.

> Why are we not paying it off? I sure am. I refactor code left and right. It is up to you.

You work alone, I presume? Everyone is an engineer now. In my department, even managers are "writing code", producing thousands of lines of Ansible code that nobody can review, with pages of docs that nobody will read. It is just a mess.


That's a management problem. If you can't stop non-coders from coding perhaps you can introduce an AI reviewer to take a load off, demand that they be able to defend every line of code, and put them all on pager duty, since they're coders now ;)

"Claude, don't create any technical debt please"

i've been told that it's totally fine because once the codebase turns into spaghetti you can simply tell the agent to refactor it and then everything will be ok

I know this is a tongue-in-cheek response, but this brings me great pain. The spaghetti begins quickly, and your unit/functional tests won't help you unless you hammered out your module API seams before you even began. Oh, your abstractions are leaking? Your modules know too much about each other? Multiply the spaghetti!

the multiple layers of vibe make the dozens of codebases even harder to maintain.

For at least the last 3 decades programming was a field that rewarded utter mediocrity with (relatively to other fields) massive remuneration. It has been filled with opportunists for as long as I remember.

This is an excellent point. LLMs might merely be exposing and amplifying behaviors that were always there. This can be an opportunity, in that shining light on it may allow us to cleanse ourselves of it. It's fundamentally about integrity, and sadly it's becoming clearer how few possess it (if it ever wasn't!). But maybe we'll get better at measuring integrity, and make hiring/collaboration decisions based on it.

You are talking about bad programmers who are at least able to fool their managers for several years. The people OP is talking about could not even do that and most likely would have dropped out in the first week of trying to program full time, since they just don't have the aptitude and patience to get unblocked after their first compilation error. Now they can go very far with an LLM.

Thing is, it's not how incompetent they are, but the opportunism itself. The property I mentioned pulls in opportunists regardless of their competence. So eventually if you work in a field like this, you end up surrounded by them. There's always _some_ around you, of course, everywhere - but across time different fields tended to pull so many of them they would become suffocating to anyone who isn't one. And if you think you can interview your way out of this - an opportunist will often have an easier time to pass a harsh interview process than someone who cares.

IT isn't the only one - finance and law had the issue since forever, AFAIK - but now I'd rather be in a field that's _actively repellent_ to them.


I think worth noting that a more impactful and maybe even bigger proportion of those opportunists is in management.

Regarding quality overall, I agree, it's truly a cursed field. It was bad before; and with LLMs, going against that tide seems more difficult than ever.


wouldn't LLMs do all the tasks that deterministic programs are doing? like ChatGPT filing taxes for you instead of using TurboTax.

> Programming was a domain that filtered out those people because they found it hard to succeed at it.

I think this is a very rosy view of programmers, not borne out by history. The people leading the vibe coding charge are programmers, rather than an external group.

I know it's popular to divide the world into the technically-literate and the credulous, but in this case the technical camp is also the one going all in.


> there are lots of people in the world who live their whole life by vibing. It's a viable way to live and sometimes it's the only way to live. But they have a very loose relationship with truth and reason

This response 1000% was crafted with input from an LLM, or the user spends too much time reading output from llms.


I have never used an LLM to write. Writing forces me to think (and I edited the comment a couple of times when writing it which helped me clear up my thinking). "It's a viable way to live and sometimes it's the only way to live" is a personal realization that has taken me some time to understand. You can go back through my comment history to the time before LLMs to check if my style was different then.

It says a lot that most readers can't distinguish good writing from something an LLM spat out.

Ray Kroc's genius was to make people forget that you get what you pay for.


False equivalency. If you had the humility to run your own writing through an LLM first, it would have caught it. Just saying.

Not picking on you in particular, but most of the anti-AI crowd can’t present their case compellingly and have an utter lack of humility.


If you run your writing through an LLM, it can poke holes in your argument, organize your ideas better, or point out that your tone is hostile/dismissive. It doesn’t need to be a replacement for writing or thinking, especially if you’re learning along the way.

So, in that way, the LLM will be Your mentor: it will shape Your way of thinking according to the algorithms and datasets stuffed into it by its corporate creators.

Do You really want it?

There is also a second face of that: people are lazy. They won't develop their own skills but rather off-load tasks to LLMs, so their communicative abilities will fade away.

That looks like a strong dystopia to me.


> the LLM will be Your mentor: it will shape Your way of thinking according to the algorithms and datasets stuffed into it by its corporate creators

How is this mutually exclusive with teaching better than most humans? Part of these "corporate" datasets include deep knowledge of the world's best literature and philosophy, for instance. Why can't it be both?

> Do You really want it?

If I'm in a hurry, don't know where to start, or don't have money for someone to teach me—sure.

> There is also a second face of that: people are lazy. They won't develop their own skills but rather off-load tasks to LLMs, so their communicative abilities will fade away.

This is a recapitulation of the Luddite argument during the Industrial Revolution. And it's valid, but it has consequences for all technological change, not just this one. There was a world before Google, the Web, the Internet, personal computing, and computers. The same argument applies across the board, and the pre-AI / post-AI cutoff looks arbitrary.


> teaching better than most humans

Ah, so now we get to the "ed tech" question. What is teaching? Is there a human element to it, and if so, what is it? Or is it something completely inhuman? Or do we need to clarify what meaning of "teaching" we're talking about before we have a discussion?


> Part of these "corporate" datasets include deep knowledge of the world's best literature and philosophy

Part of those datasets also include 4chan.


All of which are parts of the writing and thinking skillset, no?

Right. It can enhance that skillset. Are you suggesting it can’t?

This wouldn’t be a plausible position.


Rather, not delegating these tasks to an LLM helps you practice that skill.

That said, I think it depends how you use it. You can learn from explanations, and you'd better avoid the "rewrite this for me and do nothing else" kind of approach.


Right, but the LLM can help you practice the skill too. Without the LLM, you're in a self-guided, autodidactical mode. Obviously, that can have its own advantages, but most people—but especially novices—aren't in a position to assess their skill level or their progress. The average person isn't going to magically get better at thinking or writing without formal training, or at least some direction.

I don't get that impression at all. LLMs would have avoided the stylistic repetition of "live". Asking an LLM to reformulate the sentences you quoted yields this slop:

> There are a lot of people who go through life by vibing. And honestly: that’s not automatically “bad.” Sometimes it’s even the only workable way to get through things. The issue is that “vibe-first” people tend to have a pretty loose relationship with truth, rigor, and being pinned down by specifics. They’ll confidently move forward on what sounds right instead of what they can verify.

I'll finish this post with a sentence containing an em-dash -- just to confuse people -- and by remarking on how sad I find it that people latch onto dashes and complete sentences as the signifiers of LLM use, instead of the inconsistent logic and general sloppiness that's the actual problem.


I'm firmly in the LLM fanbase. Not because I can't type code (was doing it for over 17 years, everywhere from low level hardware drivers in C to web frontend to robot development at home as a hobby - coding is fun!), but because in my profession it allows me to focus more on the abstraction layer where "it matters".

I'm not saying that I'm no longer dealing with code at all though. The way I work is interactively with the LLM and pretty much tell it exactly what to do and how to do it. Sometimes all the way down to "don't copy the reference like that, grab a deep copy of the object instead". Just like with any other type of programming, the only way to achieve valuable and correct results is by knowing exactly what you want and express that exactly and without ambiguity.

But I no longer need to remember most of the syntax for the language I happen to work with at the moment, and can instead spend time thinking about the high level architecture. To make sure each involved component does one thing and one thing well, with its complexities hidden behind clear interfaces.

Engineers who refuse to, or can't, or won't utilize the benefits that LLMs bring will be left behind. It's just the way it is. I'm already seeing it happening.


This mindset is fine (it's mine essentially too).

But it absolutely has to be combined with verification/testing at the same speed as code production.


I generally do have that mindset, but over the past 1y of Claude code I do notice that I’m clearly losing my understanding of the internals of projects. I do review LLM generated code, understand it, no problem reading/following through. But then someone asks me a question, and I’m like… wait, I actually don’t know. I remember the instructions I gave and reviewing the code but don’t actually have a fine-details model of the actual implementation crystallized in my mind, I need to check, was that thing implemented the way I thought it was or not? Wait, it’s actually wrong/not matching at all what I thought! It’s definitely becoming uncomfortable and makes me reconsider my use of Claude code pretty significantly

> I’m like… wait, I actually don’t know.

reminds me of the experience of reading a math text without doing the exercises, thinking that you've understood the material, and then falling flat on your face when you attempt to apply your "understanding" to a novel problem. there's a significant difference between passively reading something and really putting active effort into it. only the latter leads to actual understanding ime


Same experience. I've been writing code for many decades, but that experience doesn't mean I can remember what I read when reviewing generated code. I write small, focused commits, but I have to take a day off each week to make changes by hand just to mentally keep up with my own codebase knowledge, and I still find structures that surprise me. It's not necessarily that the code quality is poor, but it's not how I (thought I) had designed it. It's led to a weakening of my confidence when adding to or changing existing architecture.

I've had this issue too, and I feel it was an important lesson—kind of like the first time getting a hangover.

On the other hand, LLMs comment generated code better than I comment my own, so given a long enough time horizon it could be more understandable at a later time than code I've written myself (we've all had the experience of forgetting how things work).


It's not. Invariably, the code is locally fine and globally nonsense.

  > On the other hand, LLM-generated code comments better than I do, so given a long enough time horizon, it could be more understandable at a later time than code I've written myself (we've all had the experience of forgetting how things work).
Writing and rewriting a piece of software performs what is called "spaced repetition" [1].

[1] https://en.wikipedia.org/wiki/Spaced_repetition

You ask questions about the code when you implement something, and if you cannot answer those questions, you go to the code to find the answers and refresh your understanding of it.

For this to work you have to be interested in understanding the code, and the code should be created at a pace you can keep up with.

Software engineers usually do create code economically because they need to remember and understand it. Vibe coders do not have this particular constraint; they just do not aim for the most understandable code possible, even if there are more comments in the code.


I do think that this is natural. When you use LLM coding tools, you're becoming a lot more like an architect/staff/manager, rather than the direct coder. You're setting out the spec, coming up with the design, and coming up with the high level structure of the project.

However, this comes at the cost of losing track of the minute details of the implementation because you didn't write it yourself. I find it a bit analogous to code I've reviewed vs code I've written.

However, I've found using AI for code structure summary and questioning tends to be a good way to get around it. I might forget faster, but I also pick it up faster.


I've found that for non-trivial features, I typically benefit from 3-4 rounds of: are you sure this isn't tech debt, are you sure this is thoroughly tested for (manually insert the applicable cases, because they aren't great at this, even if explicitly asked), are you sure this isn't re-inventing wheels or adding unnecessary complexity by not using existing infrastructure it should, or that other existing code would not benefit from moving to this, are you sure you can't find any bugs, and, in hindsight, are you sure this is the best design?

Then, after it says "yes, I'm sure this is production ready and we're good to move on", you have Codex and Gemini both review it one last time, and ask it to address their feedback where it's valuable.

After all this, it's the only time I'll look at the code and review it and make sure it's coherent.

Until then, I assume it's garbage.

I'd estimate this still improves velocity by 10x, and more importantly, allows me to operate at a pace I couldn't without burning out.


working this way would drive me nuts

Why? It's not that different from managing engineers.

You're just getting less work done on a slower cadence and asking the questions in design review and in code reviews...


it's very different. LLMs don't behave like people. they don't learn.

i don't mind managing people, but i don't want to manage machines unless i can control them with the precise languages that the command line and programming languages use. prompting an LLM is too vague an interface for me; the outcome is too unreliable, too unpredictable.


One-off tasks and parts of the stack that already have lots of disposable code do not need the same scrutiny as everything else. Just as there is a broad continuum of code importance, there is a broad continuum of testing requirements, and this was the case before AI. Keeping this in mind, AIs can also do some verification and testing, too.

> Engineers who refuse to, or can't, or won't utilize the benefits that LLMs bring will be left behind. It's just the way it is. I'm already seeing it happening.

Any examples how you see some engineers being left behind?


> Any examples how you see some engineers being left behind?

I don't know where you live, but around where I live in Denmark you'd fail a senior interview in a lot of places for not using AI. Even places which aren't exactly AI fans use AI to some extent.

The biggest challenge we face right now is figuring out how you create developers who have enough experience to know how to use the AI tools in a critical manner. Especially because you're typically given agents for various tasks, which are already configured to know how we want things to be written.


Around here in your southern neighbour, everyone is supposed to be doing AI and is evaluated on it, yet in many projects, if clients don't sign off on the use of AI tools, there is no AI to use anyway.

Additionally, there are the AI targets set by C-suites based on what everyone is saying on TV, versus what we can actually deliver given the available data sets, integration points, and naturally those sign-offs for data governance and hallucination guardrails.


I work for a fortune 50 that is heavily tech based.

If you can’t interview without immediately reaching for an LLM you are considered unfit to work here.


Around here C levels have AI adoption goals and are actively pushing it throughout organisations. Even when it doesn't exactly make sense.

> Everyone is jumping off the cliff

> If you don't jump off the cliff you're falling behind


I was just giving them an anecdotal example of what they were asking for. I think the answer is somewhere in the middle, but I'm not in a position to push any form of change on the C levels.

I've noticed that back in Europe everyone's in a panic mode, but that's because of the inferiority complex most people have vs both US and China. It's unwarranted.

Probably in cognitive surrender. I have one such colleague and he is driving me crazy. "Claude said that ..."

I'm starting to notice how those who don't use AI end up having to hand tasks over to people who can get them done quicker.

It is anecdotal for sure, but it's a pattern that seems to be emerging around me that expectations of velocity increases, and those who don't use AI can't keep up.


Why is velocity the overriding goal?

Shit processes. I don't know what places most of those people work at where crap is being merged into production at an insane pace. You would expect any serious piece of software to be important enough to have the code reviewed by at least one human.

Kind of... I don't know. To be handed such requirements from the top down and not fight back, to just take it head on, not even maliciously, not even oppose it on a technical basis, just go "yeah, you've now gotta ship faster or you're left behind, so therefore LLMs must be the future!", with no critical thought attached. Is this shit coming from experienced engineers?

It's preposterous that we're relying on "it's better because I feel like it", "dudes who don't use it are falling behind at work", "they ask for it in job interviews".


Again, I have to point out that AI is not an abstraction layer. It blows my mind that engineers with years of experience somehow don’t understand this.

It would be an honor to be “left behind” by people who practice their craft with such carelessness.

(Frankly, I should probably stop replying to self-professed LLM boosters entirely since there’s a good chance I’m just chatting with an LLM.)


Fanbase, maybe. Software engineers using these projects? Probably forking and updating themselves.

FWIW, I've opened a half dozen PRs from LLMs and had them approved. I have some prompts I use to make them very difficult to tell they are AI.

However if it is a big anti-llm project I just fork and have agents rebase my changes.


Your employer allows/encourages this? Do you run that stuff in production? Would you mind telling us where you work so we can avoid using their products? It is just not possible to trust the software that emerges from the process you've described.

so, they are approved, which means they were most likely reviewed. yet you still think the software cannot be trusted because of that, and you even want to name and shame a company. utterly stupid.

Yes. If a company is running vibeslopped compilers to build their production artifacts I absolutely want to know which one it is, so I can protect myself from their software.

> utterly stupid

That's completely uncalled for.

EDIT: What exactly do you mean by:

> most likely reviewed.

Let's say every line was actually reviewed. That's still nowhere near good enough. The changes are being reviewed by the wrong people. Not the maintainers of the project, just some random folks who have inherited a vibecoded fork.


"I aM someWhAt oF a DeVelOpER MySelF"

Not really - I imagine as with almost everything in life there's a normal distribution, in this case of the quality with which people use AI tools.

The normal distribution doesn't account for things like "huge megacorporations pour billions of dollars into accelerating product adoption" or "other companies force their employees to use AI whether they want to or not" though.

This is a spam problem more than anything else. It's not really an AI problem except that it's AI that is enabling this new type of spam.

Imagine there's no AI, but for some reason you have people hiring armies of cheap overseas devs and using them to produce mediocre quality drive-by PRs. The effect would be the same.

AI can be used to make quality code, but that requires careful use of the tool... like any other tool. This isn't careful contributions made by someone who knows the project and its goals and is good at using the tool. This is spam.


Exactly, people could have "consulted Google" or "consulted stack overflow" and had the same issues. It's about the end result, not how the code got to that end result, and the submitter is responsible to make sure of the quality of the submission regardless of whether AI was used or not.

To reject submissions where the dev "consulted ai" is like rejecting iron ore that was mined by a machine rather than a human. The quality of the ore is what should be measured, not how it was obtained.


I agree, but the problem comes back to how to evaluate quality at scale. That is very hard. It’s easier to just say no AI because that at least turns off the fire hose.

It sounds like they are even rejecting submissions where they even get a whiff of ai being "consulted" though. That's not quite the same as turning off the firehose.

No that’s just reactionary.

The discourse around AI in the arts, and other creative and craft fields, is utterly identical to the discourse around photography when it came out to the point that you could search and replace terms and have the same dialogue.


I'm personally amazed that _Large_ OSS projects don't have the appropriate automation in place to prevent non-compiling or non-linter-passing submissions.

- Hooks (although there's no clean way to enforce they be "installed" on a clone), GHA Workflows (or their equivalents on other forges).

This might be my bias showing, but these are items I would consider table-stakes for a project of a certain size / level of popularity.

It feels like a lot of the "AI is shit at contributing" problems could be addressed in part by better automated checks and balances.


Those things cost resources, and now you're introducing a new attack vector: open up a bunch of shit PRs, burn a lot of cash for the target organization.

You're right. It doesn't solve for all scenarios and doesn't block malicious actors.

I do believe, however, that it would have a meaningful impact on the "drive-by" PRs that keep being used as examples; the thoughtless, throw-spaghetti-at-the-wall PRs that do not have malignant intent behind them.

Many large OSS projects would have the resources to eat that cost with Donors, Sponsors, and OSS hand-outs. That's why I clarified in my original post because I know this is not a general solution.


The problem is you can get the LLM to iterate until it compiles and lints and even passes LLM review, but will that actually improve the quality of the contribution or just produce more line noise to mechanistically meet criteria?

To large complex projects often the kernel of an idea is the core value of a contribution, and it can take a lot of iteration to figure out how to structure it. Token bashing until CI is green does nothing to ensure the best approach is selected.


> The problem is you can get the LLM to iterate until it compiles and lints and even passes LLM review

Worst of both worlds with this, if you're doing it in a github workflow. You wind up effectively paying for the testing/validation layer of someone else's irresponsible LLM use.


For sure, but that's not what I was referring to in my posts. I'm specifically referring to the callout that the contributions are so low quality they don't even pass linting or compile.

I could have been more explicit on that nuance, I suppose.


That's why you sandbox. You can mitigate most low-hanging DoS fruits by running your server side hooks in a per-tenant cgroup that limits CPU and memory usage. One tenant per public key for trusted contributors, and one general-purpose tenant shared by all new/unknown contributors.

Can't you prevent pushing from the client side with pre-commit hooks? I would expect a hook to fire on the developer's computer that prevents them from even committing/pushing (unless they nuke the hook in their local repo copy).

You have to manually install hooks in your local repository. They aren't propagated as part of the repo. Git has intentionally made hooks require a very explicit opt-in.

Oh, good to know. I haven't used them much, so I'm a bit ignorant as to how they work in larger projects.

> Hooks (although there's no clean way to enforce they be "installed" on a clone), GHA Workflows (or their equivalents on other forges).

Git supports pre-receive hooks. But big multitenant forges like GitHub.com don't allow you to configure them because they're difficult to secure well. (Some of their commercial features are likely based on them, though.)

If you self-host a forge, though, you can configure arbitrary pre-receive hooks for it in order to do things like prevent pushes from succeeding if they contain verifiably working secrets, for example. You could extend that to do whatever you want (at your own risk).
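
For example, a minimal pre-receive hook could look something like the sketch below (my assumptions, not any particular forge's setup: the hook sits at hooks/pre-receive in a self-hosted bare repo, Python 3 and git are on the PATH, and "make check" is a hypothetical stand-in for whatever build/lint command the project actually uses):

    #!/usr/bin/env python3
    # Sketch of a pre-receive hook: reject any push whose tip fails a check command.
    # Git feeds the hook one "<old-sha> <new-sha> <refname>" line per updated ref on stdin;
    # exiting non-zero rejects the whole push and relays the printed output to the pusher.
    import subprocess
    import sys
    import tempfile

    CHECK_CMD = ["make", "check"]  # hypothetical stand-in; substitute your real build/lint step

    def tip_passes(new_sha: str) -> bool:
        with tempfile.TemporaryDirectory() as workdir:
            # Bare repos have no worktree, so export the pushed commit into a scratch directory.
            tree = subprocess.run(["git", "archive", new_sha], capture_output=True, check=True)
            subprocess.run(["tar", "-x", "-C", workdir], input=tree.stdout, check=True)
            return subprocess.run(CHECK_CMD, cwd=workdir).returncode == 0

    ok = True
    for line in sys.stdin:
        old_sha, new_sha, ref = line.split()
        if new_sha == "0" * 40:  # ref deletion, nothing to check
            continue
        if not tip_passes(new_sha):
            print(f"rejected {ref}: check failed at {new_sha[:10]}", file=sys.stderr)
            ok = False

    sys.exit(0 if ok else 1)

Of course, running a build per push is exactly the kind of server-side compute cost other comments here worry about, so in practice you'd want to sandbox and rate-limit something like this.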


You're still talking about compute resources that need to be paid for and maintained for that. Spamming AI PR's is going to cost a lot of money.

At the end of the day, LLM slop PR spammers are essentially adversarial actors. Git hooks are ultimately a tool for good faith developers within a given community (your team, your company, your regular contributors) in maintaining good hygiene and avoiding lapses into preventable mistakes. That's true for all CI, too.

And the truth is, too, that it's super easy for an LLM agent to run a build and tests. Good faith contributors using LLMs will never open PRs that don't build not because they're willing to "go the extra mile" and do manual work, but because they give the slightest fuck and have any respect or consideration for the humans they're working with.

LLM spam presents a different problem than any of that stuff was meant to solve. It's a malicious act, and you're right that tooling that burns the defender's compute can't be a solution. :-\


All of my personal projects, many of which will never be publicized, use hooks and GHA to ensure compilation of changes.

It is quite strange that a large project like Zig would not have such a thing. I'm sure it's not trivial but it seems important to invest time into.


One of my pet peeves with git (and systems both similar, and based on it) is that automated tests run after you've made the commit and push.

In my mind the commit (let alone the push to a publicly accessible server) should be done after, and only if, the automated tests are successfully executed. And there's no easy way to implement this, other than having a dirty branch that you discard after rebasing onto a more long-lived one.


There are lots of reasons to commit when things aren't yet working. How else would you share code that you need help with?

The solution is gated merges. No merging to main unless ci passes.

Every org I have worked at bemoaned a flaky release process and refused e2e black box acceptance tests because "they are too slow." And every org I have worked at has realized they were wrong. We got appropriate gates that run in 5 minutes and an ops person is the only person who can force past any gate in case of emergency.

Guardrails like this only become more important with the accelerant that is ai.


You can use a pre-receive hook on a git server to reject pushes that fail compilation. Downside is that it requires admin access on git forges, so you're only able to do this if you self-host.

Pre commit hooks exist. People just don't like being prevented from committing for reasons such as this.

But... this particular project does have such automation in place? It isn’t hard to find:

https://codeberg.org/ziglang/zig/src/branch/master/.forgejo/...


I mean even having linters and everything still creates a whole bunch of noise in their PR section, not to mention that a lot of the changes I make to stuff that's written by codex is not stuff that's caught by linters.

It's just bad/wrong/context-lacking decisions and mental models it introduces that, if you're not careful, will just create a massive mess of a codebase. (I know, because I've tried, and had to deal with it.)

And if someone vibecodes a PR and it works, why don't they just share the prompt so a repo owner could vibecode it themselves?


Vibe coding is often not a single prompt, it's an entire workflow (if you're doing it right).

Don't disagree, but the "if you're doing it right" is a big asterisk for an open source project with people you have no idea what quality bar they're at.

And in my experience it's quite hard to figure that out by quickly looking at it.

Not to mention that contributions on github (almost?) never include the prompt chain anyway, so the status quo is even worse


That's a fair point. I was just speaking generally.

Fake it 'till you make it. Seems like LLMs have caught on to that too.

You can curb an LLM into doing what you want. Unfortunately people don't have the patience or the skill.

People who have skill can do the same without LLMs, maybe slightly slower on average but on a more predictable schedule.

I wouldn’t say slightly slower; LLMs are massively useful for software engineering in the right hands.

For some personal projects I still stick to the basics and write everything by hand though. It’s kinda nice and grounding; and almost feels like a detox.

For any new software engineer, I’m a strong advocate of zero LLM use (except maybe as a stack overflow alternative) for your first few months.


It's significantly slower to use LLMs for some things. The only thing it excels at is generic, broad tasks. Getting the 90% done. I find that it's less cumbersome to get it mostly right and touch it up yourself than to prompt over details like syntax.

The chat UX with a fake-human lying to you and framing things emotionally really doesn’t help. And it is pretty much not possible to get away from it, or at least I haven’t found yet how.

I would love to see a model trained to behave way more like a tool instead of auto-completing from Reddit language patterns…


This is just a footnote in the article, but is incredibly important, IMO:

”There’s a risk that codebases begin to surpass human comprehension as a result of more AI in the development process, scaling bug complexity along with (or perhaps faster than) discovery capability. Human-comprehensibility is an essential property to maintain, especially in critical software like browsers and operating systems.”

This aligns with my own experience, and I believe experience of most practitioners in the field: writing a piece of code is just a beginning of a very long journey.

We should be careful about optimizing that first step at the expense of the journey.


The wording "outside of transaction" irks me. Everything in a relational database is done within a transaction, the only question is whether it's the transaction you think it is, or some other.

I believe this is largely an API design problem. Many client APIs (especially ORMs) will start a transaction implicitly for you if you haven't explicitly specified your own, leading to problems like in the article.

Having implicit transactions is just wrong design, IMO. A better-designed API should make transactions very explicit and very visible in the code: if you want to execute a query, you must start a transaction yourself and then query on that transaction supplied as an actual parameter. Implicit transactions should be difficult-to-impossible. We - the programmers - should think about transactions just as we think about querying and manipulating data. Hiding from transactions in the name of "ergonomy" brings more harm than good.
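
A minimal sketch of what I mean, using Python's sqlite3 purely as a stand-in (the Transaction wrapper and run_in_transaction are hypothetical names to show the shape of the design, not any particular library's API):

    import sqlite3

    class Transaction:
        """Every query goes through an explicit transaction object."""
        def __init__(self, conn: sqlite3.Connection):
            self._conn = conn

        def query(self, sql: str, params: tuple = ()):
            # There is no module-level "just execute this" escape hatch,
            # so a statement can't silently run in some other transaction.
            return self._conn.execute(sql, params).fetchall()

    def run_in_transaction(db_path: str, work):
        """Start a transaction explicitly, run work(tx), commit on success, roll back on error."""
        conn = sqlite3.connect(db_path, isolation_level=None)  # disable implicit transaction handling
        try:
            conn.execute("BEGIN")
            result = work(Transaction(conn))
            conn.execute("COMMIT")
            return result
        except Exception:
            conn.execute("ROLLBACK")
            raise
        finally:
            conn.close()

    # Usage: the caller cannot forget the transaction, because queries only exist on tx.
    result = run_in_transaction("app.db", lambda tx: tx.query("SELECT 1"))

With an API shaped like this, the question "which transaction is this statement running in?" always has an obvious answer in the code.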


I love how this is done in Software Transactional Memory (STM) in Haskell. There, the transaction code happens in its own type (monad), and there is an explicit conversion function called `atomically :: STM a -> IO a`, which carries out the transaction.

This means that the transaction becomes its own block, clearly separated, but which can reference pure values in the surrounding context.

    do
       …
       some IO stuff 
       …
       res <- atomically $ do
          …
          transaction code
          which can reference
          results from the IO above
          …
       …
       More IO code using res, which is the result of the transaction
       …


I think this is especially problematic (from Part 4 at https://isolveproblems.substack.com/p/how-microsoft-vaporize...):

"The team had reached a point where it was too risky to make any code refactoring or engineering improvements. I submitted several bug fixes and refactoring, notably using smart pointers, but they were rejected for fear of breaking something."

Once you reach this stage, the only escape is to first cover everything with tests and then meticulously fix bugs, without shipping any new features. This can take a long time, and cannot happen without the full support from the management who do not fully understand the problem nor are incentivized to understand it.


This isn't incentivized in a corporate environment.

Noticed how "the talent left after the launch" is mentioned in the article? Same problem. You don't get rewarded for cleaning up mess (despite lip service from management) nor for maintaining the product after the launch. Only big launches matter.

The other corporate problem is that it takes time before the cleanup produces measurable benefits and you may as well get reorged before this happens.


This is the root of the issue. For something like Azure, people are not fungible. You need to retain them for decades, and carefully grow the team, training new members over a long period until they can take on serious responsibilities.

But employees are rewarded for showing quick wins and changing jobs rapidly, and employers are rewarded for getting rid of high earners (i.e. senior, long-term employees).


> For something like Azure, people are not fungible

What I've learned from a decade in the industry is that talent is never fungible in low-demand areas. It's surprisingly hard to find people that "get it" and produce something worthwhile together.


I would say "systems design" rather than low-demand.

People who can "reduce" a big system to build on a few simple concepts are few and far between. Most people just add more stuff instead.


I think those people are around, they are just not rewarded by this kind of system. They can propose plans and fixes, they just don't get implemented.


“Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.” - Edsger Wybe Dijkstra


When things become too complicated, no one dares to make new systems. And if you don’t make new systems ofc you have to learn system design the other way around — by fixing every bug of existing systems.


Simple ain’t Easy

- Rich Hickey


There are often retention problems with lean budgets, and after training staff they often do just leave for a more lucrative position.

Loyalty will often not be rewarded, as most have seen companies purge decade long senior staff a year before going public.

It is very easy to become cynical about the mythology of silicon valley. =3


What is a low-demand area?


A geographic area where there's not abundant opportunity for software developers. Usually everywhere outside the major metro areas. It was primarily meant to discount experiences from SF or Seattle where I'm sure finding talent is easy enough, assuming you are willing to pay.


I thought of this not as geographic but in terms of what’s sexy vs not. Low Demand = not


Right, like running a sanitation department for a city. Who wants to do that? No one, but it's pretty important and everyone will raise hell and almost riot when it's not working.


Totally. I’m in insurance. So much is unsexy but critical. And that’s where you see a lot of folks churning on core systems, process, etc that makes insurance actually work vs any headline tech/investment/AI stuff. Don’t get me wrong - wins there too. But 22 year old Harvard grads aren’t going for underwriting assistant jobs (to use an example)


This is a human problem. We humans praise the doctors who can keep patients with terminal illnesses alive for extended periods, but ignore those who tell us the principles to prevent getting those illnesses in the first place. We throw flowers and money at doctors who treat cancers, but do we do the same for the ones who tell us the principles to avoid cancers? No.

The same goes for MSFT or any similar problem. Humans only care when the house is on fire; in modern capitalism that means the stock going down 50%, and only then will they have the will to make changes.

That's also why reforms rarely succeed, and the ones that do usually follow a huge shitstorm when people begged for changes.


> Humans only care when the house is on fire

In a corporate context it's because that is, in theory, an effective use of resources:

If 20 teams are constantly saying "there is a huge risk of fire", a lot of mental energy is wasted figuring out how to stack-rank those 20 and how real each fire risk is. If instead you wait until there is a real fire, you can get 15 teams actually fixing that one.

In practice, you've probably noticed that the most politics-playing & politics-winning teams are the teams which are really effective at:

1) faking fires

2) exaggerating minor fires

3) moving fast & breaking things on purpose (or at least as a nice side effect) to create more fires in their area of ownership*, and getting rewarded with more visibility & headcount to fix those fires.

* As long as they have firm grip of that area... If they don't, they risk having it re-orged to another team.


> If instead you wait until there is a real fire, you can get 15 teams actually fixing that one.

In this case, with Microsoft's really amazing revenue stream, a charismatic management team can distort reality for quite some time and convince the right people within the company that there is no fire.


Yeah the more "honest" side at least tried to fix it after the fire. The demagogue ones like to fake fire and move fast.


This is a capitalism problem.

If you treat people well and give them the means to survive without trying to wring every red cent you can out of them, they'll be more likely to stick around and keep providing value.


> You don't get rewarded for cleaning up mess (despite lip service from management) nor for maintaining the product after the launch

I have never worked at a shop or on a codebase where "move fast & break things, then fix it later" ever got to the "fix it later" part. I've worked at large orgs with large old codebases where the % of effort needed for BAU/KTLO (business as usual / keeping the lights on) slowly climbs to 100%. Usually it's some combination of tech debt accumulation, staffing reduction, and scale/scope increases pushing the existing system to its limits.

This is related to a worry I have about AI. I hear a lot of expectations that we're just going to increase code velocity 5x from people that have never maintained a product before.

So moving faster & breaking more things (accumulating more tech debt) will probably have more rapid catastrophic outcomes for products in this new phase. Then we will have some sort of butlerian jihad or agile v2.


People are still trying to figure out how to use AI. Right now the meme is it's used by juniors to churn out slop, but I think people will start to recognize it's far more powerful in the hands of competent senior devs.

It actually surprised me that you can use AI to write even better code: tell it to write a test to catch the suspected bug, then tell it to fix the bug, then have it write documentation. Maybe also split out related functionality into a new file while you're at it.

I might have skipped all that pre-AI, but now all of it takes 15 minutes. And as a bonus, creating more understandable code allows AI to fix even more bugs. So it could actually become a virtuous cycle of using AI to clean up debt in order to understand more code.

In fact, right now, we're selling technical debt cleanup projects that I've been begging for for years as "we have to do this so the codebase will be more understandable by AI."


Having worked on many long-lived projects for 5+ years at big firms, I think there's an aspect of project management being a dark art which will conflict with the hopes and dreams of AI.

Developer productivity is notoriously difficult to measure. Even feature velocity, cadence or volume improvements are rarely noticed & acknowledged by users for long. They will always complain about speed and somehow notice slowdowns (and invent them in their head as well).

I once joined a team that was in crisis; they couldn't ship for 6 months due to outages. We stabilized production, put in tests, introduced a better SDLC, and started shipping every 1-2 weeks. I swear to you that it was not more than a few months before stakeholders were whinging about velocity again. You JUST had zero, give me a break.

If you get a 3x one-off boost by adopting AI and then that’s the new normal, you’ll be shocked how little they pat you on the back for it. Particularly if some of that 3x is spent on tickets to “make the code easier for AI to understand”, testing, and low priority tickets in the backlog no one had bothered doing previously (seen a lot of these anecdotes). And god help you if your velocity slips after that 3x boost, they will notice the hell out of that.


Problem is that if you want to be a serious cloud provider, you have to do exactly that. I slowly move my apps off of any Microsoft services, because they tend to be slow and buggy.

Also they too often remove features of their products and I have no desire to migrate working stuff because MS wants to move people to other products.

And this has been getting worse recently. Power Automate is a prime example for me. Theoretically it's a neat tool that is well integrated into the cloud landscape; practically, you cannot implement reliable workflows with it, for numerous reasons.

> If you’re running production workloads on Azure or relying on it for mission-critical systems, this story matters more than you think.

Well, it doesn't explode, but I really question how reliable some of these systems are. In my experience, not at all. There was, or is, some genuinely good engineering beneath some of these systems, but I think all the buggy fluff built upon it really introduces friction.


Meanwhile, failure to clean up this particular mess was a key factor in losing a trillion dollars in market cap, according to the author.


Perhaps an important question is: why is it not incentivized in corporate environments?

I think, however, that perhaps I'm asking in the wrong arena. Unless there are people here reading this who work in the areas of a corporate environment at the level at which those decisions are made, it would really amount to guessing and stereotypes. Generally, I like to think that just about anyone can grasp that a well-made product will sell better due to its nature. I think that there must be some kind of mutual disconnect between both sides where one continues to see improvements important, and the other fundamentally does not (or does not have a functional means to measure and verify it).


It's a cool talent filter though: if you hire people, the set of people who quit doomed projects, and how fast they quit, is a really great indicator of technology-evaluation skills.


It’s also a customer problem.

In a product where a customer has to apply (or be aware of updates), it’s easier to excite them about new features instead of bug fixes.

Especially for winning over new customers.

If the changelog for a product’s last 5 releases are only bug fixes (or worse “refactoring” that isn’t externally visible), most will assume either development is dead or the product is horribly bug ridden - a bad look either way.


> This isn't incentivized in a corporate environment.

Course it is. But only by the winners, who reward the employees who do the valuable work. Microsoft has all sorts of stupid reasons why they have lots of customers - all basically proxies for their customers' IT staff being used to administering Microsoft-based systems - but if they mess up the core reasons to use a cloud enough, they will fail.


You do, but then you make a career out of it: you become the fixer (and it can be a very good career, either technical or managerial).


No joke, I worked at a place where, in our copy of the system headers, we had to #define near and far to nothing. That was because (despite not having supported any systems where this was applicable for more than a decade) there was a set of files, considered too risky to make changes in, that still had DOS-style near and far pointers which we had to compile for a more sane linear address space. https://www.geeksforgeeks.org/c/what-are-near-far-and-huge-p...

Now, I'm just a simple country engineer, but a sane take on risk management probably doesn't prefer de facto editing files by hijacking keywords with preprocessor magic over, you know, just making the actual change, reviewing it, and checking it in.


Once you reach this stage, the only escape is to jump ship. Either mentally or, ideally, truly.

You're in an unwinnable position. Don't take the brunt for management's mistakes. Don't try to fix what you have no agency over.


Unfortunately, what you will find is that unless you get lucky, the next ship is more of the same.

The system/management style is ingrained in the corporate culture of large-ish companies (I would say if it has more than 2 layers of management from you to someone owning the equity of the business and calling the shots, it's "large").

It stems from the fact that when an executive is entrusted by the shareholders with the responsibility of managing a company, the responsibility is diluted, and the principal-agent problem rears its ugly head. When several more layers of this start growing in a large company, the divergence increases, and the path of least resistance is to have zero trust in the "subordinates", lest they make a choice that is contrary to what their managers want.

The only way to make good software is to have a small, nimble organization, where the craftsman (doing the work) makes the call, gets the rewards, and suffers the consequences (if any). That aligns agent and principal.


Hierarchy is the enemy of succeeding projects and of information flow. The more important and complex hierarchy is in a culture, the less likely it is to have a working software industry. Germany's and Japan's endless "old vs young, seniority vs new, internal vs external, company-wide management vs project-local management" come to mind. It's guerrilla vs army, startup vs company all over...


As someone in the DACH space, the internal/external split goes to the extreme of externals not being allowed to use any company infrastructure meant for internals, including basic stuff like the coffee machine or the canteen.

I had team lunches that only happened because, naturally, the team couldn't care less about the regulations and found workarounds, like meeting by "chance" at the same place, where apparently no other set of tables was available.


> I would say if it has more than 2 layers of management from you to someone owning the equity of the business and calling the shots, it's "large"

By that metric, my 50 employee company is "large".


Well, does this company have more than 2 layers of management? Why do you need that many for only 50 people, instead of empowering those people to make choices (after training them and providing guidance on what makes for a good choice in various circumstances)?


I was once in such a position. I persuaded management to first cover the entire project with an extensive test suite before touching anything. It took us around 3 months to get "good" coverage, and then we started refactoring the parts that were 100% covered. 5 months in, the shareholders got impatient and demanded "results". We were not ready yet, and in their minds we were doing nothing. No amount of explanation helped; they thought we were just adding superficial work ("the project worked before and we were shipping new features! Maybe you are just not skilled enough?"). Eventually they decided to scrap the whole thing. The project was killed and the entire team sacked.


I’m a developer and if a team spent five months only refactoring with zero features added I would fire you too.

Refactoring and quality improvements must happen incrementally and in parallel with shipping new features and fixing bugs.


I'm a director and one of our teams just spent 8 months doing just that and it was totally justified. They're finally coming up for air and the foundation is significantly improved.

There's nuance here. Every project/team/org is different.


Welcome to Microsoft! Enjoy the ever-growing backlog of bugs to fix!


> first cover everything with tests

Beware this goal. I'm dealing with the consequences of TDD taken way too far right now. Someone apparently had this same idea.

> management who do not fully understand the problem nor are incentivized to understand it

They are definitely incentivized to understand the problem. However the developers often take it upon themselves to deceive management. This happens to be their incentive. The longer they can hoodwink leadership, the longer they can pad their resume and otherwise play around in corporate Narnia.

It's amazing how far you can bullshit leaders under the pretense of how proper and cultured things like TDD are. There are compelling metrics and it has a very number-go-up feel to it. It's really easy to pervert all other aspects of the design such that they serve at the altar of TDD.

Integration testing is the only testing that matters to the customer. No one cares if your user service works flawlessly with fake everything being plugged into it. I've never seen it not come off like someone playing sim city or factorio with the codebase in the end.


Customers don’t care about your testing at all. They care that the product works.

Like most things, the reality is that you need a balance. Integration tests are great for validating complex system interdependencies. They are terrible for testing code paths exhaustively. You need both integration and unit testing to properly evaluate the product. You also need monitoring, because your testing environment will never 100% match what your customers see. (If it does, your system is probably trivial, and you don't need those integration tests anyway.)


Integration tests (I think we call them scenario tests in our circles) also tend to test only the happy paths. There is no guarantee that your edge cases, and anything unusual such as errors from other tiers, are covered. In fact the scenario tests may just be testing mostly the same things as the unit tests, only from a different angle. The only way to be sure everything is covered is through fault injection and/or single-stepping, but that's a lost art. Relying only on automated tests gives a false sense of security.


Unit tests are just as important as integration tests as long as they're tightly scoped to business logic and aren't written just to improve coverage. Anything can be done badly, especially if it is quantified and used as a metric of success (Goodhart's law applies).

Integration tests can be just as bad in this regard. They can be flakey and take hours, give you a false sense of security and not even address the complexity of the business domain.

I've seen people argue against unit tests because they force you to decompose your system into discrete pieces. I hope that's not the core concern here, because a well-decomposed system is easier to maintain and extend, as well as to write unit tests for.


The problem with unit tests these days is that AI writes them entirely and does a great job at it. That defeats the purpose of unit tests in the first place, since the human doesn't have the patience to review the reams of over-mocked test code produced by AI.

The end result of this is things like the Claude Code leak, presumably caused by AI-generated CI/CD packaging code nobody bothered to review, since the attitude is: who reviews test or CI/CD code? If it breaks, big deal, AI will fix it.


“Premature abstraction” forced by unit tests can make systems harder to maintain.


It can but more often it’s the opposite.

Code that’s hard to write tests for tends to be code that’s too tightly coupled and lacking proper interface boundaries.


The problem is people make units too small. A unit is not an isolated class or function (it can be, but usually isn't); a unit is one of those boxes you see on architecture diagrams.


Inability to unit test is usually either a symptom of poor system structure (e.g. components are inappropriately coupled) or an attempt to shoehorn testing into the wrong spot.

If you find yourself trying to test a piece of code and it’s an unreasonable effort, try moving up a level. The “unit” you’re testing might be the wrong granularity. If you can’t test a level up, then it’s probably that your code is bad and you don’t have units. You have a blob.


If you're writing the tests after writing the code, you're not doing TDD though.


> Once you reach this stage, the only escape is to first cover everything with tests and then meticulously fix bugs

The exact same approach is recommended in the book "Working effectively with legacy code" by Michael Feathers, with several techniques on how to do it. He describes legacy code as 'code with no tests'.


"Show me the incentives, and I will show you the outcomes" - Charlie Munger

I once worked in a shop where we had high and inflexible test coverage requirements. Developers eventually figured out that you could run a bunch of random scenarios and then `assert true` in the finally clause of the exception handler. Eventually you'd be guaranteed to cover enough to get by that gate.

Pushing back on that practice led to a management fight about feature velocity and externally publicized deadlines.


It is so hard to test those codebases too. A lot of the time there's IO and implicit state changes through the code. Even getting testing in place, let alone good testing, is often an incredibly difficult task. And no one will refactor the code to make testing easier because they're too afraid to break the code.


> I submitted several bug fixes and refactoring, notably using smart pointers, but they were rejected for fear of breaking something.

And that, my friends, is why you want a memory safe language with as many static guarantees as possible checked automatically by the compiler.


Language choices won't save you here. The problem is organizational paralysis. Someone sees that the platform is unstable. They demand something be done to improve stability. The next management layer above them demands they reduce the number of changes made to improve stability.


Usually this results in approvals to approve the approval to approve making the change. Everyone signed off on a tower of tax forms about the change, no way it can fail now! It failed? We need another layer of approvals before changes can be made!


Yeah I've seen that move pulled. Funnily enough by an ex-Microsoft manager.


Hence the rewrite-it-in-Rust initiative, presumably. Management were aware of this problem at some level but chose a questionable solution. I don't think rewriting everything in Rust is at all compatible with their feature timelines or severe shortages of systems programming talent.


In a rewrite you can smuggle in a quality lift


I had a memory management problem so I introduced GC/ref counting and now I have a non-deterministic memory management problem.


Ref counting is deterministic. Rust memory management is also deterministic: the memory is freed exactly when the owner of the data gets out of scope (and the borrow checker guarantees at compile time there is no use after that).


Cool now use the reference on another thread.


If you used Rust, you would know that problem is solved too.

Rust solves a lot of problems, and introduces others

The promiscuous package management, chiefly. It's not unusual for building a small programme in Rust to bring in 200+ crates from unknown authors on the Internet...

What could possibly go wrong?


They could have started with simple Valgrind sessions before moving to Rust though. Massive number of agents means microservices, and microservices are suitable for profiling/testing like that.


Visual Studio has had quite some tooling similar to it, and you can have static analysis turned on all the time.

SAL also originated with XP SP2 issues.

Just like there have been tons of tools trying to fix C's flaws.

However the big issue with opt-in tooling is exactly it being optional, and apparently Microsoft doesn't enforce it internally as much as we thought.


> However the big issue with opt-in tooling is exactly it being optional,

That's true, and that's a problem.

> and apparently Microsoft doesn't enforce it internally as much as we thought.

but this, in my eyes, is a much bigger problem. It's baffling considering what Microsoft does as their core business: operating systems and other high-impact software.

> Visual Studio has had quite some tooling similar to it, and you can have static analysis turned on all the time.

Eclipse CDT is not as capable as VS, but it is not a toy and has the same capability: always-on static analysis plus Valgrind integration. I used both without any reservation, and this habit paid dividends at every level of development.

I believe in learning the craft and the tool more than in the tool itself, because you can always hold something wrong. Learning the capabilities and limits of whatever you're using is a force multiplier, and considering how fierce competition is within companies, leaving that kind of force multiplier on the table is unfathomable from my PoV.

Every tool has limits and flaws. Understanding them and being disciplined enough to check your own work is indispensable. Even if you're using something which prevents a class of footguns.


I think the core business of MSFT has always been building a platform, grabbing everyone in, and seeking rent. Bill figured that out back in 1975, and it has been super successful.

The OS was that platform, but in Azure it is just the lowest layer, so maybe management just doesn't see it, as long as the platform works and the government contracts keep coming in. Then you have a bunch of yes-man engineers (I'm surprised that any principal engineer, who should be financially free, could push out plans like the ones described by the author in this series) who give management false hopes.


One reason why Windows is a mess is that Satya sees Azure as effectively Azure OS, the Windows version of OS/360.

Ideally everyone would be using it via services hosted there, with the browser or mobile devices as thin clients.

Just two months ago,

https://blogs.windows.com/windowsexperience/2026/02/26/annou...


It’s org-dependent. On Windows, SAL and OACR are kings, plus any contraption MSR comes up with that they run on checked-in code and files bugs on you out of the blue :) Different standards.


I was waiting for that comment :) Remember that everybody, eventually, calls into code written in C.


If 90% of the code I run is in safe rust (including the part that's new and written by me, therefore most likely to introduce bugs) and 10% is in C or unsafe rust, are you saying that has no value?

Il meglio è l'inimico del bene. Le mieux est l'ennemi du bien. Perfect is the enemy of good.


That is an unexpected interpretation. Use the best tool for the job, also factoring what you (and your org) are comfortable with.


Depends on which OS we are talking about.

I know a few where that doesn't hold, including some still being paid for in 2026.


If you're sufficiently stubborn, it's certainly possible to call directly into code written in Verilog, held together with inscrutable Perl incantations.

High-level languages like C certainly have their place, but the space seems competitive these days. Who knows where the future will lead.


If you want something extra spicy, there are devices out there that implement CORBA in silicon (or at least FPGA), exposing a remote object accessible using CORBA


You didn’t miss the smiley, did you? :)


I didn't miss the smiley =)


It's worse than that. Eventually everybody calls into code that hits hardware. That is the level at which the compiler (ironically?) can no longer make guarantees. Registers change outside the scope of the currently running program all the time. Reading a register can cause other registers on a chip to change. Random chips with access to a shared memory bus can modify memory that the compiler deduced was static. There be dragons everywhere at the hardware layer, and no compiler can ever reason correctly about all of them because, guess what, rev 2 of the hardware could swap in a footprint-compatible chip clone that has undocumented behavior. So even if you gave all your board information to the compiler, the program could only be verifiably correct for one potential state of one potential hardware rev.


Sure, but eliminating bugs isn't a binary where you either eliminate all of them or it's a useless endeavor. There's a lot of value in eliminating a lot of bugs, even if it's not all of them, and I'd argue that empirically Rust does actually make it easier to avoid quite a large number of bugs that are often found in C code in spite of what you're saying.

To be clear, I'm not saying that I think it would necessarily be a good idea to try to rewrite an existing codebase that a team apparently doesn't trust they actually understand. There are a lot of other factors that would go into deciding to do a rewrite than just "would the new language be a better choice in a vaccuum", and I tend to be somewhat skeptical that rewriting something that's already widely being used will be possible in a way that doesn't end up risking breaking something for existing users. That's pretty different from "the language literally doesn't matter because you can't verify every possible bug on arbitrary hardware" though.


The hardware only understands addresses and offsets, aka pointers :)


All the more reason to have memory safety on top.


Did you miss the part that writes about the "all new code is written in Rust" order coming from the top? It also failed miserably.


That was quite interesting, and now I will look at the stuff I shared previously from another point of view.

However, given how the Windows team has been against anything that isn't C++, it is not surprising that it actually happened like that.


It came from the top of Azure, and for Azure only. Specifically, the mandate was for all new code that cannot use a GC, i.e. no more new C or C++.

I think the CTO was very public about that at RustCon and other places where he spoke.

The examples he gave were contrived, though, mostly tiny bits of old GDI code rewritten in Rust as success stories to justify his mandate. Not convincing at all.

Azure node software can be written in Rust, C, or C++; it really does not matter.

What matters is who writes it: it should be seen as "OS-level" code requiring the same focus as actual OS code, given its criticality, and therefore should probably be written by the Core OS folks themselves.


I have followed it from the outside, including talks at Rust Nation.

However the reality you described on the ground is quite different from e.g. Rust Nation UK 2025 talks, or those being done by Victor Ciura.

It seems more in line with the rejections that took place against previous efforts regarding Singularity, Midori, the Phoenix compiler toolchain, Longhorn... only to be redone with WinRT and COM, in C++ naturally.


Because neither C nor C++ creates friction.

The whole memory safety chapter is a human problem first and foremost.

Some humans haven’t written a memory-safety bug in decades, but it requires a discipline the recent hire never acquired.

I always advocated fixing issues at their root. Humans write bugs, fix the humans. Somehow this was always regarded as taboo ever since I started at Microsoft in 2013.


May I ask, what kind of training do new joiners of the kernel team (or any team that effectively writes kernel-level code) get? Especially if they haven't written kernel code professionally -- or do they ONLY hire people who have written a non-trivial amount of kernel code?


There is no formal training (like a bootcamp or classes), but the larger org has extensive documentation (osgwiki) and you are expected to learn and ramp up by yourself.

I don't think there is any kernel-code-writing experience requirement, but the hiring bar is sky-high; you have to demonstrate that you are a programmer.


Once you reach this stage, honestly the only escape is real escape. Put your papers in and start looking for a job elsewhere, because when they go down, they will go down hard and drag you with them. It's not like you didn't try.


Though this doesn't make much sense on its surface - a bug means something is already broken, and he tells of millions of crashes per month, so it was visibly broken. 100% chance of being broken (bug) > some chance of breakage from fixing it

(sure, the value of current and potential bug isn't accounted for here, but then neither is it in "afraid to break something, do nothing")


I've experienced a nearly identical scenario where a large fleet of identical servers (Citrix session hosts) were crashing at a "rate" high enough that I had to "scale up" my crash dump collection scripts with automated analysis, distribution into about a hundred buckets, and then per-bucket statistical analysis of the variables. I had to compress, archive, and then simply throw away crash dumps because I had too many.

It was pure insanity, the crashes were variously caused by things like network drivers so old and vulnerable that "drive by" network scans by malware would BSOD the servers. Alternatively, successful virus infections would BSOD the servers because the viruses were written for desktop editions of Windows and couldn't handle the differences in the server edition, so they'd just crash the system. On and on. It was a shambling zombie horde, not a server farm.

I was made to jump through flaming hoops backwards to prove beyond a shadow of a doubt that every single individual critical Microsoft security patch a) definitely fixed one of the crash bugs and b) didn't break any apps.

I did so! I demonstrated a 3x improvement in overall performance -- which by itself is staggering -- and that BSODs dropped by a factor of hundreds. I had pages written up on each and every patch, specifically calling out how they precisely matched a bucket of BSODs exactly. I tested the apps. I showed that some of them that were broken before suddenly started working. I did extensive UAT, etc.

"No." was the firm answer from management.

"Too dangerous! Something could break! You don't know what these patches could do!" etc, etc. The arguments were pure insanity, totally illogical, counter to all available evidence, and motivated only by animal fear. These people had been burned before, and they're never touching the stove again, or even going into the kitchen.

You cannot fix an organisation like this "from below" as an IC, or even a mid-level manager. CEOs would have a hard time turning a ship like this around. Heads would have to roll, all the way up to CIO, before anything could possibly be fixed.


Yeah, long periods of total dysfunction get ingrained.

Though just to ref my original point

> burned before, and they're never touching the stove again

Except they are sitting on the stove with their asses burning, which cuts all the needed cooling off their heads!


The better analogy is that they ran out of the kitchen in a panic, and left the pots on the burners. Some time later there is smoke curling up from under the kitchen door, but they’re used to the burning smell by now so it’s “not that big a deal”.


> Once you reach this stage, the only escape is to first cover everything with tests and then meticulously fix bugs, without shipping any new features.

Isn't this where Oracle is with their DB? Wasn't HN complaining about that?


Or to simplify the product and rebuild.


“Rebuild” is also a four-letter word though at this stage too. The customer has a panel of knob-and-tube wiring and aluminum paper-wrapped wire in the house. They want a new hot tub. They don’t want some electrician telling them they need to completely rewire their house first at huge expense, such that they cannot afford the hot tub anymore. They’ll just throw the electrician out and get some kid in a pickup truck (“You’re Absolutely Right Handyman LLC”) to run a lamp cord to their new hot tub. Once the house burns to the ground, the new owners will wire their new construction correctly.


Exactly. But he's right about management: first the problem must be acknowledged, and that may make some people look bad.


writing tests and then meticulously fixing bugs does not increase shareholders' value.


Dave Cutler and his team are a clear counter-example. They famously shipped Windows NT with zero known bugs, which clearly brought enormous shareholder value.

The problem, of course, is that this sort of thing doesn’t bring value next quarter.


Once you reach that stage, the only escape is to give up on it and move on.

Some things are beyond your control and capabilities.


if the service is so shitty, why are people paying so much fucking money for it?

is microsoft committing an accounting fraud?


I worked at a startup that was using Azure. The reason was simple enough - it had been founded by finance people who were used to Excel, so Windows+Office was the non-negotiable first bit of IT they purchased. That created a sales channel Microsoft used to offer generous startup credits. The free money created a structural lack of discipline around spending. Once the startup credits ran out, the company became faced with a huge bill and difficulty motivating people to conserve funds.

At the start I didn't have any strong opinion on what cloud provider to use. I did want to do IT the "old fashioned way" - rent a big ass bare metal or cloud VM, issue UNIX user accounts on it and let people do dev/test/ad hoc servers on that. Very easy to control spending that way, very easy to quickly see what's using the resources and impose limits, link programs to people, etc. I was overruled as obviously old fashioned and not getting with the cloud programme. They ended up bleeding a million dollars a month and the company wasn't even running a SaaS!

I ended up with a very low opinion of Azure. Basic things like TCP connections between VMs would mysteriously hang. We got MS to investigate, they made a token effort and basically just admitted defeat. I raged that this was absurd as working TCP is table stakes for literally any datacenter since the 1980s, but - sad to say - at this time Azure's bad behavior was enabled by a widespread culture of CV farming in which "enterprise" devs were all obsessed with getting cloud tech onto their LinkedIn. Any time we hit bugs or stupidities in the way Azure worked I was told the problem was clearly with the software I'd written, which couldn't be "cloud native" - as if it were, it would obviously have worked fine in Azure!

With attitudes like that completely endemic outside of the tech sector, of course Microsoft learned not to prioritize quality.

We did eventually diversify a bit. We needed to benchmark our server software reliably and that was impossible in Azure because it was so overloaded and full of noisy neighbours, so we rented bare metal servers in OVH to do that. It worked OK.


"Basic things like TCP connections between VMs would mysteriously hang"

This is like a car that can't even get you two blocks from home. Amazing.


I have had bad experiences across all major vendors.

The main reason I used to push for Azure instead during the last years was the friendliness of their Web UIs, and having the VS Code integration (it started as an Azure product after all).


Friendliness?

VSCode integration out of the box, that I can understand. But I have a really hard time calling Azure UI "friendly". Everything is behind layers of nested pointy-clicky chains with opaque or flat out misleading names.

To make things worse, their APIs also follow the same design. Everything you actually would want to do is behind a long sequence of pointer-chasing across objects and service/resource managers. Almost as if their APIs were built to directly reflect their planned UI action sequences.


Yes, some of us grew out of the 1970s approach to the command line, unless there is no other way.

GCP is the worst: some options are only available on the CLI, without any visual feedback on the dashboard.


Corporate inertia. Sibling comment uses the term "hostage situation" which I admit is pretty apt.

Microsoft is an approved vendor in every large enterprise. That they have been approved for desktop productivity, Sharepoint, email and on-prem systems does not enter the picture. That would be too nuanced.

Dealing with a Large Enterprise[tm] is an exercise in frustration. A particular client had to be deployed to Azure because their estimate was that getting a new cloud vendor approved for production deployments would be a gargantuan 18-to-24 month org-wide and politically fraught process.

If you are a large corp and have to move workloads to the cloud (because let's be honest: maintaining your own data centres and hardware procurement pipelines is a serious drag) then you go with whatever vendor your organisation has approved. And if the only pre-approved vendor with a cloud offering is Microsoft, you use Azure.


The US government’s experts called Azure “a pile of shit”; they got overruled.

https://www.propublica.org/article/microsoft-cloud-fedramp-c...


Because Azure customers are companies that still, in 2026, only use Windows. Anyone else uses something else. Turns out, companies like that don't tend to have the best engineering teams. So moving an entire cloud infrastructure from Azure to, say, AWS is probably either really expensive, really risky, or too disruptive for the type of engineering team that Azure customers have. I would expect MS to bleed from this slowly for a long time until they actually fix it. I seriously doubt they ever will, but stranger things have happened.


Turns out that outside of companies shipping software products and aspiring to be the next Google or Apple, most companies that work outside the software industry also need software to run their business, and they couldn't care less about the HN technology cool factor.

They use whatever they can to ship their products onto trucks, outsourcing their IT and development costs, and that is about it.


Agreed, though only up to a point. Companies that need software to run their business, need that software to run.

When your operations are constantly hampered by Azure outages, and your competitors' are not, you're not going to last if your market is at all competitive. Thankfully for many companies, a lot of markets aren't, I suppose, at least for the actors who have established a successful rent and no longer need to care how their business operations are going.


I have worked at two retail companies where AWS was a no-no. They didn't want anything depending on a competitor (Amazon). So they went the Azure route.


CFOs love it because Microsoft does bundle pricing with Office. Plus they love to give large credits to bootstrap lock-in.


You're assuming the alternatives don't have just as many issues. There's been exactly one "whistleblower", who is probably tiptoeing the line of a lawsuit. Just because there isn't a similar disgruntled GCP or AWS engineer doesn't mean they don't have similar problems.


This made me look into how cloud hypervisors actually work at the hardware level. They all offload it to custom hardware (smart NICs, FPGAs, DPUs, etc.). The CPU does almost nothing except tenant work. AWS -> Nitro, Azure -> FPGA, NVIDIA sells DPUs.

Here is interactive visual guide if anyone wants to explore - https://vectree.io/c/cloud-virtualization-hardware-nitro-cat...


VM management does not run on the FPGA; it's regular Win32 software on Windows, with aspirations to run some equivalent, someday, on the SoC next to the FPGA on the NIC. The programmable hardware is used for network paths and PCIe functions, where it can project NICs and NVMe devices to VMs to bypass software-based, VMBus-backed virtual devices, all of which end up being serviced on the host, which controls the real hardware. Look up SR-IOV for the bypass. So yes, that's I/O bypass/offload, but the VM management stack offload is a distinct thing that does not require an FPGA, just a SoC.


Most of the upper management of companies who use them don't have the technical competence to see it (e.g. banks, supermarket chains, manufacturing companies).

Once they are in, no one likes to admit they made a mistake.


Depending on the space you work in, you have almost no choice at all. If you're building for government then you're going to use Microsoft, almost "end of story".


It’s more of a hostage situation.


Yeah it’s entirely business people and executives who make these decisions in most companies. Not the ones who use it or implement on it.


Because the alternatives are also in a similar state.

AWS and GCP are also pretty crap. Use any of them and you'll hit just enough rough edges. The whole industry is just grinding out slop; quality is not important anywhere.

I work with AWS on a daily basis, and I'm not really impressed. (Nor did GCP impress me in the short encounter I had with it.)


I don't know about AWS or the rest of GCP, but in terms of engineering, my experience of GCE was at least an entire order of magnitude better than what the article alleges about Azure. Security and reliability were taken extremely seriously, and the quality of the engineering was world-class. I hope it has stayed like this since then. It was a worthwhile thing to experience.


This isn't it at all. AWS does not have the same sorts of insane cross-tenancy exploits that Azure has had, for example.

The reason that Azure has so many customers is very simply because Azure is borderline mandated by the US government.


If memory serves, Windows 2000 was the last version where search worked reliably. It was a simple linear search through files which could take a while on larger folders, but was reliable and predictable since it did not rely on a background indexing service which seems to get stale or just plain wrong most of the time.

If I search for “foo”, I’d like to get all files containing “foo” please, without a shadow of a doubt that some files were skipped, including those that I have recently created. I still can’t get that as of Windows 11!


> It was a simple linear search through files which could take a while on larger folders, but was reliable and predictable since it did not rely on a background indexing service which seems to get stale or just plain wrong most of the time.

It would be easy to have your cake and eat it too. Have the file search default to the index. Allow frustrated users to then click a button that says "search harder" which would initiate the full enumeration of the relevant filesystems. Of course some UX professional will tell me I'm wrong, they don't like anything they didn't think of themselves.


My guess would be bad hashing, resulting in too many collisions.


> often you have to inject data into new tables or columns

No tool can help you with that, simply because this kind of data migration depends on your particular business logic that the tool has no way of knowing about.

While SQL Server Data Tools has its warts, it has been immensely useful for us in making sure every little detail gets handled during migration. That doesn't usually mean that it can do the entire migration itself - we do the manual adjustments to the base tables that SSDT cannot do on its own, and then let it handle the rest, which in our case is mostly about indexes, views, functions and stored procedures.

After all that, SSDT can compare the resulting database with the "desired" database, and reliably flag any differences, preventing schema drift.
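To make that concrete, here is a minimal, hypothetical sketch (the table and column names are made up, not from our actual schema) of the kind of manual pre-deployment step we write by hand, leaving the rest of the schema sync to SSDT:

    -- Hypothetical pre-deployment script: business-specific data motion
    -- that a schema-compare tool cannot infer on its own.
    ALTER TABLE dbo.Orders ADD Region nvarchar(50) NULL;
    GO
    -- Backfill the new column from existing data according to business rules.
    UPDATE o
    SET    o.Region = c.Region
    FROM   dbo.Orders o
    JOIN   dbo.Customers c ON c.CustomerID = o.CustomerID;
    GO
    -- Only once the backfill is done can the column be tightened to NOT NULL;
    -- indexes, views, functions and stored procedures are left to SSDT.
    ALTER TABLE dbo.Orders ALTER COLUMN Region nvarchar(50) NOT NULL;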


Why use a string as status, instead of a boolean? That just wastes space for no discernible benefit, especially since the status is indexed. Also, consider turning event_type into an integer if possible, for similar reasons.

Furthermore, why have two indexes with the same leading field (status)?


Boolean is rarely enough for real production workloads. You need a 'processing' state to handle visibility timeouts and prevent double-execution, especially if tasks take more than a few milliseconds. I also find it crucial to distinguish between 'retrying' for transient errors and 'failed' for dead letters. Saving a few bytes on the index isn't worth losing that observability.


> Boolean is rarely enough for real production workloads. You need a 'processing' ... 'retrying'... 'failed' ...

If you have more than 2 states, then just use an integer instead of a boolean.

> Saving a few bytes on the index isn't worth losing that observability.

Not sure why having a few well-known string values is more "observable" than having a few well-known integer values.

Also, it might be worth having better write performance. When PostgreSQL updates a row, it actually creates a new physical row version (for MVCC), so the less it has to copy the better.
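As a rough sketch (hypothetical table and code values, not taken from the article), a smallint status with documented well-known values keeps both the row and the index entries small:

    -- Hypothetical: compact integer status codes instead of text.
    -- 0 = pending, 1 = processing, 2 = retrying, 3 = failed, 4 = done
    CREATE TABLE tasks (
        id      bigserial PRIMARY KEY,
        status  smallint NOT NULL DEFAULT 0 CHECK (status BETWEEN 0 AND 4),
        payload jsonb
    );

    CREATE INDEX tasks_status_idx ON tasks (status);

The trade-off is that the mapping from code to meaning has to live in comments or application code rather than in the data itself.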


Postgres supports enum that would fit this use case well. You get the readability of text and the storage efficiency of an integer. Adding new values used to require a bit of work, but version 9.1 introduced support for it.
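A minimal sketch of what that can look like (the type and table names here are just illustrative):

    -- Readable labels, stored internally as a compact value.
    CREATE TYPE task_status AS ENUM ('pending', 'processing', 'retrying', 'failed', 'done');

    CREATE TABLE tasks (
        id     bigserial PRIMARY KEY,
        status task_status NOT NULL DEFAULT 'pending'
    );

    -- Since PostgreSQL 9.1, new labels can be added without rewriting the table.
    ALTER TYPE task_status ADD VALUE 'cancelled';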


Postgres does index de-duplication. So it's likely that even if you change the strings to enums, the index won't be that much smaller.

> Furthermore, why have two indexes with the same leading field (status)?

That indeed is a valid question.


That's true for seeks into the clustered (primary) index because that index includes all fields, so you don't need to "jump" to the heap to get them.

However, seeking into a secondary index, and then reading a column not included in that index incurs an additional index seek (into the clustered index), which may be somewhat slower than what would happen in a heap-based table.

So there are pros and cons, as usual...


I have found very minimal penalty on secondary index reads in practice, such that it has never made a difference.

Remember that some databases always use a clustered index internally (SQLite, MySQL), such that even if you have no primary key they will create a hidden one for use with the index.

https://www.sqlite.org/rowidtable.html

It is nice to have the choice of which way to go, and it would be nice if PG implemented this. It can have significant space savings on a narrow table with one primary index, and performance advantages.


For inserts, you cannot escape writing into the base table and all indexes. However, my understanding is that for updates PostgreSQL has a write amplification problem due to the fact that each time a row is updated this creates a new row (to implement MVCC), and a new physical location in the heap, so all indexes need to be updated to point to the new location, even those not containing the updated columns.

OTOH, with a heap-less (aka. clustered, aka. index organized) table, you would only have to update the indexes containing the columns that are actually being updated. You don't need to touch any other index. Furthermore, only if you are updating a key column would you physically "move" the entry into a different part of the B-tree. If you update an included column (PK columns are automatically "included" in all secondary indexes, even if not explicitly mentioned in the index definition), you can do that in-place, without moving the entry.

Here is how this works in SQL Server - consider the following example:

    CREATE TABLE T (
        ID int,
        NAME nvarchar(255) NOT NULL,
        AMOUNT int NOT NULL,
        CONSTRAINT T_PK PRIMARY KEY (ID)
    );
    GO
    CREATE INDEX T_I1 ON T (NAME);
    GO
    CREATE INDEX T_I2 ON T (AMOUNT);

Now, doing this...

    UPDATE T SET AMOUNT = 42 WHERE ID = 100;

...will only write to T_PK and T_I2, but not T_I1. Furthermore, T_PK's entry will not need to be moved to a different place in the B-tree. SQL Server uses row versioning similar to PostgreSQL, so it's conceivable that PostgreSQL could behave similarly to SQL Server if it supported clustered (index-organized) tables.

