Hacker News | andrekandre's comments

yes, this is an issue i see too... also fixing it up takes a lot of time (sometimes more than if i had just 'one-shotted' it myself)... idk these tools are useful, but i feel like we are going too far with 'just let the ai do everything'...

  > more tractable like tickets completed
easy, just make tickets smaller in size (i see people doing this with prs already now to hit 'ai kpis')...

  > This AI rollout has been fundamentally rushed and fucked from the very beginning
fake it till you make it has been the modus operandi for tech for almost as long as i've been alive... i feel like this is the apotheosis of this kind of thinking...

  > Nope, it’s just “ModelGPClaude can make mistakes! Better be careful!”
"use at your own risk" and "no guarantees warranted or expressed" are basically in every single eula from tech as well... it's not a new trend sadly...

  > Random people cure cancer for their dog
this is not a serious comment


  > Planning someone's agenda, preparing relevant documents, arranging and coordinating things, translations (speech or text), narration, grammar checking
the issue is, these things "lie" subtly and not so subtly (they make up issues, rename agendas, forget questions and change meanings all the time) and for me that is a deal-breaker for a business tool that i need to rely on

Yes, for me as well, but large chunks of these tasks seem within the realm of what they can do when you break it up into small enough bits and control the prompt very tightly

Particularly machine translations are no worse than what an untrained native speaker would come up with, and much better than traditional translators (due to some level of context "understanding" - or simulation thereof, at least). At 50x human speed, the energy consumption is also lower than keeping a human alive for that time. There is no scenario in which this capability goes unused

Or grammar checking, if you catch 98% (as even some of the weaker models seem to achieve), the editor who'd otherwise do this can do more intellectually stimulating things

It's not that there's no downsides but it also seems silly to dismiss it altogether


> Particularly machine translations are no worse than what an untrained native speaker would come up with, and much better than traditional translators

Sometimes. I use Google Translate (literally the same architecture, last I heard), and when it works, great. Every single time I've tried demonstrating that it can't do Chinese by quoting the output it gives me from English-to-Chinese, someone replies to tell me that the translated text is gibberish*.

Even with an easier pair, English <-> German, sometimes I get duplicate paragraphs. And there are definitely still cases where even the context-comprehension fails, as you can see by opening a random German website, e.g. https://www.bahn.de/, in e.g. Chrome, translating it into English, and noticing the out-of-place words: the destination is "goal", and the tickets are "1st grade" and "2nd grade" instead of class.

* I'm curious if this is still true, so let's see:

这是一个简单的英文句子,需要翻译成中文。上次我翻译的时候,有人告诉我译文几乎无法理解。

我不懂中文,所以需要懂中文的人告诉我现在是否仍然如此。


(not the downvoter)

I'm not sure if we're on the same page. I mean LLMs, right? Not whatever Google Translate and DeepL use. The latter was better than gtrans when it launched, nowadays it's probably similar idk, and both are clearly machine learning, but the products(' quality) predate LLMs. They're not LLMs. They haven't noticeably improved since LLMs. Asking an LLM produces better output (so long as the LLM doesn't get sidetracked by the text's contents). Presumably also orders of magnitude higher energy consumption per word, even if you ignore training

I agree that Google Translate, now on par with DeepL's free product afaik (but I'm not a gtrans user so I don't know), is decent but not a full replacement for humans, and that LLMs aren't as good as human translations either (not just for attention reasons), but it's another big step forwards right?


I'm not sure what DeepL uses, but Google invented the Transformer architecture, the T in GPT, for Google Translate.

IIRC, the original difference between them was about the attention mask, which is akin to how the Mandelbrot and Julia fractals are the same formula but the variables mean different things; so I'd argue they're basically still the same thing, and you can model what an LLM does as translating a prompt into a response.


I didn't know that! I had heard they made transformers and (then-Open)AI used it in GPT, but that explains how come Google wasn't then first to market with an LLM product when the intended application was translation

  > It's not that there's no downsides but it also seems silly to dismiss it altogether
definitely silly to dismiss them altogether, but the issue is using it for everything, even where it's not appropriate or it's unreliable; so in the context of my posting, i can't rely on it for the things i outlined, that's all

> these things "lie" subtly

Do you think they have intent?


I assume that's just a manner of speaking, like a judgmental form of hallucination

I remember HN piling on me for saying something along the lines of evolution causing a property (am I stupid, do I not understand that it's not intelligently chosen) rather than some unwieldy statement about a property having a positive selection pressure. I'm also much more familiar with the English phraseology of this non-tech topic now (so I can actually say that in the few words I just used). Do we even have that vocabulary for LLMs?


pr comments from a human that are generated by ai have got me feeling the same... like why is this person even here? it's totally disrespectful; i want a person to interact with, not a machine with a meatsuit.

  > Whenever I see a blog post that starts with an obvious AI hero image, when it has the "It's not X, it's Y" framing, when it has anything that smells like AI
yes, n=1 (ok n=2 i guess) but noticing that is an immediate back button press for me; it's getting harder and harder to avoid, though, as search results become inundated with this stuff

  > But then I'm like "hmm actually let me try this real quick" and I prompt Claude for 3 minutes, and 30 minutes later it has one-shotted the whole "two weeks project". It then gets reviewed and merged by the "non-believers". This happens repeatedly.
this is a nice anecdote but i think the real issue is the forcing and kpi-nization of llms top-down for nearly everything

there are still code-quality issues, prompting issues for long-running tasks, some things are just faster and more deterministic with normal code generators or just find-and-replace etc

people are annoyed at the force-feeding of llms/ai into everything even when it's not needed

some things can be one-shotted and some things can't, and that is fine and perfectly normal, but execs don't like that because it's not the new hotness


> some things can be one-shotted and some things can't

True but my point is that people vastly underestimate what is one-shottable.

In my experience, 80% of the time an average "non-believer" SW engineer with 7 years of experience says something is not one-shottable, I, with my 15 years of experience, think it is in fact one-shottable. And 20% of the time, I do verify that by one-shotting it in my free time.


  > if no-one is forced to bake it.
not to take away from the point too much, but i think the whole idea of market economies is nobody needs to be forced to do anything, no?

Well, that is the point :) we don't fret about where the bread comes from too much, or talk about how we need to act now lest we never have bread again. People want bread, and the price goes up until someone is willing to make bread.

  > It is as if you had a group of execs determining what IDEs people could use.
it's worse than that; it's more like determining what ide you use and also mandating how much time you spend in it, and then chewing you out at review time because you used jira and confluence too much instead of writing md files in the blessed ide of their choice
