I interpret YAGNI to mean that you shouldn't invest extra work and extra code complexity to create capabilities that you don't need.
In this case, I feel like using the filesystem directly is the opposite: doing much more difficult programming and creating more complex code, in order to do less.
It depends on how you weigh the cost of the additional dependency that lets you write simpler code, of course, but I think in this case adding a SQLite dependency is a lower long-term maintenance burden than writing code to make atomic file writes.
The original post isn't about simplicity, though. It's about performance. They claim they achieved better performance by using the filesystem directly, which could (if they really need the extra performance) justify the extra challenge and code complexity.
Honestly, at this point, if I had a design that required making atomic changes to files, I'd redo the design to use SQLite. The other way around sounds crazy to me.
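To make the trade-off concrete, here's a minimal sketch of what "atomic changes" look like once you've taken the SQLite dependency. The table and column names are made up for illustration; the point is that multi-row atomicity comes for free from the transaction, with no temp files or rename dance:

```python
import sqlite3

# Hypothetical key/value settings table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO settings VALUES ('theme', 'light'), ('lang', 'en')")
conn.commit()

# Using the connection as a context manager opens a transaction:
# both updates land together on success, or neither does on error.
with conn:
    conn.execute("UPDATE settings SET value = 'dark' WHERE key = 'theme'")
    conn.execute("UPDATE settings SET value = 'de' WHERE key = 'lang'")

rows = dict(conn.execute("SELECT key, value FROM settings"))
print(rows)  # {'theme': 'dark', 'lang': 'de'}
```

Crash midway through the `with` block and SQLite rolls the partial write back for you; that's the machinery you'd otherwise be hand-rolling on the filesystem.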
"Why use spray paint when you can achieve the same effect by ejecting paint from your mouth in a uniform high-velocity mist?" If you happen to have developed that particular weird skill, by all means use it, but if you haven't, don't start now.
That probably sounds soft and lazy. I should learn to use my operating system's filesystem APIs safely. It would make me a better person. But honestly, I think that's a very niche skill these days, and you should consider if you really need it now and if you'll ever benefit from it in the future.
Also, even if you do it right, the people who inherit your code probably won't develop the same skills. They'll tell their boss it's impossibly dangerous to make any changes, and they'll replace it with a database.
> They'll tell their boss it's impossibly dangerous to make any changes, and they'll replace it with a database.
This, 100%. Development today is driven by appearances, though, and you can take advantage of that. Give it a cute name, have AI generate an emoji-rich README for it, publish it as an open source npm package, then trigger CI a few thousand times to get a pretty download count. They will happily continue using it without fear!
If you start a new job and on your first day they go "Yeah the last guy said we don't need a database, so he rolled his own." are you gonna be excited, or sweating?
Exception being perhaps "The last team chose to build their own data layer, and here's the decision log and architecture docs proving why it was needed."
Serious question: why are people here acting as if formatted files are somehow more reliable than a DB? That simply isn't true. For most of software development's history, using flat files to persist data was, with good reason, considered the wrong thing to do. Flat files are easily corrupted, and that happens much more often than a DB getting corrupted. The reason you might think otherwise is just sampling bias.
I do believe that you are missing a healthy dose of sarcasm. Such as faking downloads to give yourself inflated statistics so that your employer will trust untested and AI-written garbage.
That said, there really are good use cases for readable formatted files. For example configuration files that are checked into source control are far more trackable than a SQLite database for the same purpose. For another example, the files are convenient for a lot of data transfer purposes.
But for updateable data? You need a really good reason not to simply use a database. I've encountered such edge cases. But I've encountered a lot more people who thought that they had an edge case, than really did.
The problem is that most of the time when you want "atomic changes to files," the only safe API is to copy the file, mutate the copy, then rename it over the original. And that doesn't even factor in concurrent writers or advisory locks.
If that kind of filesystem traffic is unsuitable for your application then you will reinvent journaling or write-ahead logging. And if you want those to be fast you'll implement checkpointing and indexes.
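For reference, here's a sketch of that write-temp-then-rename pattern in Python. It relies on rename being atomic on POSIX filesystems, and, as noted above, it deliberately ignores concurrent writers and advisory locks:

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` so readers never observe a partial write.

    Sketch of the write-temp-then-rename pattern only; it does not
    handle concurrent writers, advisory locks, or directory fsync.
    """
    # Temp file must live on the same filesystem for the rename to be atomic.
    dirpath = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirpath)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ensure bytes hit disk before the rename
        os.replace(tmp, path)  # atomic: readers see the old file or the new one
    except BaseException:
        os.unlink(tmp)  # clean up the temp file if anything above failed
        raise
```

And that's the easy part: the journaling, checkpointing, and indexing layers on top are where the real reinvention effort goes.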
Advice like this turns almost everybody's normal state into a disorder.
"Go to sleep only when you are very tired" is a child's approach to sleep, it's what we all want to do, and by adulthood we learn that it's counterproductive. But we still want it so much that we regularly test it and are reminded why we don't operate that way.
It reminds me of the intuitive eating folks who say, "Ignore standard diet advice, just listen to your body and feed it what it knows you need," but then when you overeat, they say, "You aren't listening properly, you aren't in tune with your body." Then if you ask, "How will I know when I'm in tune with my body and listening to it properly?" they say, "When what it asks for matches standard diet advice."
If my Oura ring can be trusted, alcohol doesn't interfere with my total amount of sleep or my REM sleep, but it reduces my deep sleep drastically and can even result in me getting zero deep sleep, which hasn't happened a single time without alcohol.
Honestly, if I could quit my job for six months and work in a codebase like yours, I'm extremely curious what I could accomplish with AI.
We have a codebase at work that was "stuck." We'd consistently done minor library upgrades, but no major upgrades in several years, and it had been recognized as a major piece of technical debt / minor disaster for almost two years: we urgently needed to dedicate an engineer to it for a month or more to bring it up to date. We also suspected that framework upgrades would improve performance enough to save us a little bit in operating costs. I got curious, created a branch, and threw Claude at it. Claude knocked it out in a couple of days while I mostly worked on other things. Then we dedicated several engineer-days to extra manual testing. Done and deployed. Now we're ready to experiment with giving it fewer resources to see if the performance improvement holds up in practice.
This codebase was only about 200k lines of code, so probably smaller than yours. Really curious how it would go with a larger codebase.
EDIT: Claude may only have taken a couple of days because I was only checking in occasionally to give it further instructions. I don't know how fast it would have been with my complete attention.
You couldn't go far on those early Prius batteries. I had a circa-2009 Prius and semi-intentionally ran out of gas to see what happened. I was able to drive a couple of miles to a gas station, but the battery was depleting extremely quickly, and I doubt it would have lasted ten minutes.
These interactions really don't get the testing they need.
When they aren't designed, how do you know how to test?
Over the weekend, I was directed to file a police report with a chatbot and could not complete it because it was asking for information that did not exist and did not apply to my case.
(I'm sure somebody is going to say that this can be solved by having LLMs role play as victims and have an LLM observe and decide what's a failing test case and what isn't.)
This is exactly it. That's why the glasses have the same basic form (stem, bowl, and tapered rim) as wine glasses and snifters. The liquid sits in the bowl, and the aroma is captured in the empty space between the liquid and the rim.
I think it's important to think about architectural and domain bounds on problems and check if the big-O-optimal algorithm ever comes out on top. I remember Bjarne Stroustrup did a lecture where he compared a reasonably-implemented big-O-optimal algorithm on linked lists to a less optimal algorithm using arrays, and he used his laptop to test at what data size the big-O-optimal algorithm started to beat the less optimal algorithm. What he found was that the less optimal algorithm beat the big-O-optimal algorithm for every dataset he could process on the laptop. In that case, architectural bounds meant that the big-O-optimal algorithm was strictly worse. That was an extreme case, but it shows the value of testing.
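A toy re-creation of that kind of crossover test is easy to write (this is an illustrative Python approximation, not Stroustrup's actual C++ benchmark): insert N random values in sorted order, once into a linked list and once into a flat array, and time both.

```python
import bisect
import random
import time

class Node:
    """Singly linked list node; O(n) search to find the insert point."""
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def linked_insert_all(values):
    head = None
    for v in values:
        if head is None or v <= head.value:
            head = Node(v, head)
            continue
        cur = head
        while cur.next is not None and cur.next.value < v:
            cur = cur.next  # linear pointer-chasing scan
        cur.next = Node(v, cur.next)  # O(1) splice once found
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

def array_insert_all(values):
    arr = []
    for v in values:
        bisect.insort(arr, v)  # O(log n) search + O(n) contiguous shift
    return arr

values = [random.randrange(10**6) for _ in range(2000)]
assert linked_insert_all(values) == array_insert_all(values) == sorted(values)

for fn in (linked_insert_all, array_insert_all):
    start = time.perf_counter()
    fn(values)
    print(fn.__name__, time.perf_counter() - start)
```

The list's theoretical O(1) splice never pays off because the linear search to find the splice point dominates, while the array's "expensive" shift is a cache-friendly contiguous move. That was the heart of the lecture's result.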
Domain bounds can be dangerous to rely on, but not always. For example, the number of U.S. states is unlikely to change significantly in the lifetime of your codebase.
Anecdotally, what we found in Austin was a combination of two factors:
First, awareness of the futility and selfishness of "growth elsewhere" as a solution is much higher in younger people — and by younger, I mean currently under fifty. Generational turnover in Austin had been eating away at the NIMBY majority, and conversations about housing in Austin have long been polarized more by age than by left/right political sentiment. There's a caricature, with a strong vein of truth, of the old Austin leftist who has Mao's little red book on their shelves and thinks apartment buildings are an abomination, and Austinites of that generation are experiencing mortality. At the same time, younger people are adopting more and more urbanist mindsets compared to their parents.
However, I think a much, much bigger factor was the influx of younger people, especially young people with experience of larger cities, diluting the votes of the older NIMBYs. Austin has been shaped by growth for half a century, but its "discovery" in the 2000s and very brief status as a darling of coastal hipsters (remember that term?) have had a lasting effect on Austin's popularity and its demographics. It's been twenty years since it was the "it" place for Brooklynites to visit, but in those twenty years, it's had a lot of exposure to young urban dwellers, and some of them discovered they liked it and moved here, bringing their comfort with dense living and their appreciation that growth can bring a lot of positives.
Personally, every homeowner I know in Austin has seen their house depreciate significantly this decade, and I don't think it changed a single person's mind about Austin's housing policy. People who opposed the reforms are bitter about the outcome, and people who supported the reforms say it sucks for us personally, but it's what we set out to accomplish, and we're glad that it worked.
People see lower property taxes as a silver lining for short-term swings in the market, but I don't know anybody who thinks this is a short-term swing that they can ride out.
Nobody is happy about their property values going down long term. It exposes them to the risk of a big loss if they're forced to sell because of events in their life.
> Austinites of that generation are experiencing mortality.
This is such a funny and novel way of saying "old people in Austin are dying" I just had to point it out.
Also, I like the way this comment is written in general. Felt easy to read for its length, and most importantly the tone stayed fun and personal while still being informative and on topic.