zulban's comments | Hacker News

Curious. Not my experience whatsoever.

I tried Claude recently and it was able to one-shot fixes for 9/9 of the bugs I gave it in my large, older Unity C# project. Only 2/9 needed minor tweaks for personal style (functionally the same).

Maybe it helps that I separately have a CLI with very extensive unit tests. Or that I just signed up. Or that I use Claude late in the evenings (off hours). I also give it very targeted instructions, and if it's taking longer than a couple of minutes, I abort and try a different or more precise prompt. Maybe the backend recognizes that I use it sparingly and I get better service.

The author describes what sounds like very large tasks that I'd never hand off to an AI to run wild in 2026.

Anyway I thought I'd give a different perspective than this thread.


People hired to do jobs they cannot do have had many, many more methods than that, for thousands of years.

You're presenting this as legally clear but it's not. To the detriment of your point.

If I download all BSD-licensed software, count how many times "if" appears, and distribute that total, I've not violated the BSD license. AI-generated code is different from that, but not totally different.

Ignore nuance and the adults will ignore you.


Nobody can be bothered to make my cat out of Lego at the size of Mount Everest, but if an AI did, I'd sure love to see it.

Your quip is pithy but meaningless.


I'm not saying it's worthless to you; it's worthless to me as a viewer. AI content is great for your own usage, but there is no point in posting and distributing AI generations.

I could have generated my own content, so just send the prompt rather than the output to save everyone time.


And when the distilled knowledge/product is the result of multiple prompts, revisions, and reiterations? Shall we send all 30+ of those as well so as to reproduce each step along the way?

Maybe reread my comment. Would you not want to see a Mount Everest-sized Lego cat? Even if it were my cat?

Again - your quip sounds good but when you think about it, it's flatly wrong.


This doesn't make sense; if I want to see a lego-cat slopimage, I can just prompt a model myself (and have it be of my own cat). There's no reason for you to be involved in any part of that process, because the point of this stuff is that you are not doing anything.

The claim is that people don't / shouldn't want to see something if humans can't be bothered to make it. I provided a counterexample. So the claim is nonsense.

Nothing unreasonable about wanting to live healthy and longer. It's not likely tho.

A guess?

That’s what people always say until science progresses. I remember when we believed HIV would not be treatable.

Science advances one funeral at a time.


Fun. I've upgraded my game a few times over the years. The project started in 2018, so I began with a version slightly older than that. Some of these changes seem familiar to me. I had a fairly similar experience, as my game has also always been C# and simple. I have always carefully avoided any fancy new Unity features and just use the core engine to deliver my game to many platforms. Neat to hear the author worked through the deprecation renames, which I also remember.

Ridiculous. They are clearly not trying at all. A hard wall preventing going over budget by 100x in a couple hours is not some devilishly complicated decentralized system problem.

Don't toe the party line.

Same reason why Azure AI only has easy rate limits by minute, not by day or week or month. Open source proxy projects do it easily tho. Think about the incentives.

Going over a hard cap by 3% would be a reasonable failure to make; going over by 30,000% is not.
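The core check is tiny; something like this (a hypothetical sketch in Python, not any provider's actual billing API) would do:

    # hypothetical sketch: refuse any request that would push spend past a hard cap
    class BudgetGuard:
        def __init__(self, hard_cap_usd: float):
            self.hard_cap_usd = hard_cap_usd
            self.spent_usd = 0.0

        def charge(self, cost_usd: float) -> None:
            if self.spent_usd + cost_usd > self.hard_cap_usd:
                raise RuntimeError("hard cap reached, refusing request")
            self.spent_usd += cost_usd

    guard = BudgetGuard(hard_cap_usd=100.0)
    guard.charge(0.37)        # fine
    # guard.charge(10_000.0)  # rejected outright, never billed

Metering lag in a distributed backend explains drifting a few percent past the cap, not 100x past it.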


Generally, published papers don't give a damn about reproducibility. I've seen it identified as a crisis by many. Publishers, reviewers, and researchers mostly don't care about that level of basic rigor. There are no professional repercussions or embarrassment.

Agreed - if I were a reviewer for LLM papers, not listing the model versions and prompts used would be an instant rejection.


I'm not so sure about that opinion on reproducibility. The last peer review I did was for a small journal that explicitly does not evaluate for high scientific significance, merely for correctness, which generally means straightforward acceptance. The other two reviews were positive, as was mine, except I said that the methods needed to be described in more detail and ideally the code placed somewhere. That was enough for a complete rejection of the paper, without asking for the simple revisions I requested. A very serious action, taken merely because I asked for better reproducibility!

(Personally, I think the lack of reproducibility comes back mostly to peer reviewers who haven't thought enough about the steps they'd need to take to reproduce the work, and instead focus on the results...)


I'm not sure how one example contradicts documented huge overall trends, but okay.


I think publishers care about this a lot, but most researchers do not seem to care as much about reproducibility.


> and instead focus on the results...

This points to (and everyone knows this) an incentive misalignment between the funders of research and the public. Researchers are caught in the middle.


Eh, I'm not so sure about the funding side there; researchers are not really caught at all and are fully responsible, IMHO. Peer reviewers exist to enforce community standards and are not influenced by funding sources to avoid reproducibility concerns. The results are always more interesting than reproducibility, of course, and I think that's why they get the attention! Also, there needs to be greater involvement of grad students (who do most of the actual work) in peer review, IMHO, because most PIs spend their day in meetings reviewing results, setting directions, and writing grants; they have little time for actual lab work and are thus disconnected from it.

There needs to be more public naming and shaming on science social media and in conference talks, but especially at the social gatherings at conferences where people are able to gossip. There was a bit of this with Google's various papers, as they got away with figurative murder on lack of reproducibility for commercial purposes. But eventually Google did share more.

Most journals have standards for depositing expensive datasets, but that's a clear yes/no answer. Reproducibility is a very subjective question in comparison to data deposition, and must be subjectively evaluated by peer reviewers. I'd like to see more peer review guidelines with explicit check boxes for various aspects of reproducibility.


> Reproducibility is a very subjective question in comparison to data deposition

Yeah, I can definitely see why this is the case, because it isn't real until someone actually tries to reproduce the results. At that point it leaves the realm of subjectivity and becomes a question of cost.


The comment is wrong -- model versions are clearly specified in the supplement.


The same goes for surveys and polls. I know no one who has ever been polled or surveyed. When will we stop this fascination with a made-up infographics crisis?


> Generally, published papers don't give a damn about reproducibility

While this is sadly true, it's especially true when talking about things that are stochastic in nature.

LLM outputs, for example, are notoriously unreproducible.


> LLM outputs, for example, are notoriously unreproducible.

Only in the same way that an individual in a medical study cannot be "reproduced" for the next study. However, the overall statistical outcomes of studying a specific LLM can be reproduced.
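A rough sketch of what I mean (the model call here is a hypothetical stand-in; a real study would pin and report the exact model version, prompts, and parameters):

    import random
    import statistics

    # hypothetical stand-in for one call to a pinned model version;
    # individual answers vary from run to run
    def call_llm(prompt: str) -> bool:
        return random.random() < 0.8  # pretend the model is right ~80% of the time

    # the aggregate pass rate over many runs is stable, and thus reproducible
    def pass_rate(prompt: str, n_runs: int = 500) -> float:
        return statistics.mean(call_llm(prompt) for _ in range(n_runs))

    print(pass_rate("same benchmark prompt"))

Any single output is not reproducible, but the distribution of outcomes is.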


Do they reproduce any submitted papers at all?

Does this happen?

I can remember that room-temperature superconductor guy whose experiments were replicated, but this seems rare?


Yes, those are the only papers worth reading at all.


Not comfortable. But making choices in the real world is about choosing the best option, not the perfect option.


I've learned a bit today about how often people on HN read the article before commenting. Or potentially bots who are way off. The title alone isn't enough to totally grasp what happened here, or the methods used.

Extremely conservative detection. The real number must be much higher.

