Nice to have these all collected and shareable in one place. For the amusement of HN, let me add one I've become known for at my current work, for saying to juniors who are overly worried about DRY:
> Fen's law: copy-paste is free; abstractions are expensive.
edit: I should add, this is aimed at situations where you need a new function that's very similar to one you already have. Juniors often assume it's bad to copy-paste, so they add a parameter to the existing function so it abstracts both cases. And my point is: wait, consider the cost of the abstraction. Are the two use cases likely to diverge later? Do they have the same business owner? Etc.
I think the trouble is that the headline is ambiguous and may confuse people about the article's theme, although with a bit of common sense you can reason out that the author can't realistically be asking for "fewer AI agents".
A hyphen would aid comprehension, in this and many other cases. However, while editing Wikipedia, I found that the manuals of style and editor preferences are anti-hyphenation -- I'm sorry, anti hyphenation -- in a lot of cases!
A bit more verbosity would've helped, e.g. "I want AI agents to be less human", but as always, headlines use an economy of words.
Oh I understood the aside was for me. Again, not a thing. This one in particular really bugs the shit out of me because it's brought up as utterly useless pedantry in 100% of cases.
> But for more than 200 years almost every usage writer and English teacher has declared such use to be wrong. The received rule seems to have originated with the critic Robert Baker, who expressed it not as a law but as a matter of personal preference. Somewhere along the way—it's not clear how—his preference was generalized and elevated to an absolute, inviolable rule. . . . A definitive rule covering all possibilities is maybe impossible. If you're a native speaker your best bet is to be guided by your ear, choosing the word that sounds more natural in a particular context. If you're not a native speaker, the simple rule is a good place to start, but be sure to consider the exceptions to it as well.
I'm fond of linguistic bugbears, and have actually sent that same article to people before :D But what you're missing is that the less/fewer debate is over their use as adjectives, and TFA's title uses "less" as an adverb. It's asking for AI agents to be less human, not for them to be fewer in number. Swapping it to "fewer" would make the title's meaning no longer match the article.
Now please sit a moment and reflect on what you've done. :P
What you're asking for is exactly what's in the link you replied to. It collects analysis of each solution (or attempt), and info about whether the AI's solution could be found anywhere in the literature.
The high-order bit for each case is the category it's in and the "Outcome" column, which summarizes whether the solution was full/partial/wrong, whether the AI had assistance, etc. Further discussion of each case is linked from the number.
Then the "Literature result" columns have a citations for where similar published results were found. The ones with no "Literature" column, like in the first section, are cases where no similar published results have been found (implying that the solution would not have been trained on). Note that in some cases a published solution was found but it wasn't similar to the AI's.
(this is all explained with more detail and caveats at the top of the page)
Sorry, I suppose I'm asking for a lot of handholding here, which isn't really fair. I'm actually just sick right now and have crazy brain fog. Thanks for the assistance! I'll read through.
FWIW I've wavered on this topic quite a bit. Not too long ago I leaned more heavily towards "complex cognitive capabilities can be expressed using statistical token generation"; lately I've started leaning the other way, but I'm not committed, so it's great to circle back on the state of things.
Not at all - didn't mean to sound snarky, I just wanted to add that I was omitting details and caveats.
FWIW, personally I think it muddies things to frame the question as if "..using statistical token generation" were a limitation. NNs are Turing-complete, so what LLMs do can just be considered "computation"; the fact that they compute via statistical token generation is an implementation detail.
And if you're like most people, "can cognition happen via computation?" is a less controversial question, which then puts LLMs/cognition topics easily into the "in principle, obviously, but we can debate whether it's achievable or how to measure it" category.
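To make that concrete, here's a minimal sketch (in Python; all names and numbers are my own invention, not any model's actual values) of what "statistical token generation" boils down to at each step: a softmax over scores, then one random draw. Whether cognition can ride on top of this is the interesting question; the sampling itself is just arithmetic.

    import math, random

    def sample_next_token(logits, temperature=1.0):
        # Softmax over the model's scores, then one random draw:
        # this is the whole "statistical" part of token generation.
        scaled = [x / temperature for x in logits]
        m = max(scaled)
        exps = [math.exp(x - m) for x in scaled]
        probs = [e / sum(exps) for e in exps]
        return random.choices(range(len(logits)), weights=probs, k=1)[0]

    vocab = ["the", "cat", "sat"]   # toy vocabulary, purely illustrative
    logits = [2.0, 0.5, -1.0]       # pretend model output
    print(vocab[sample_next_token(logits)])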
> We went from 2 + 7 = 11 to "solved a frontier math problem" in 3 years, yet people don't think this will improve?
All that says is that the speaker thinks models will improve past where they are today. Not that it's a logical certainty (the first thing you jumped on them for), and certainly not anything about "limitless potential for growth" (which nobody even mentioned). With replies like this, invoking fallacies and attacking claims nobody made, you're adding a lot of heat and very little light here (and in a few other threads on the page).
> All that says is that the speaker thinks models will improve past where they are today. Not that it's a logical certainty
Exceedingly generous interpretation in my opinion. I tend to interpret rhetorical questions of that form as “it’s so obvious that I shouldn’t even have to ask it”.
The term of art for that is steelmanning, and HN tries to foster a culture of it. Please check the guidelines link in the footer and ctrl+f "strongest".
It's not a side effect of tokenization per se, but of the tokenizers people use in actual practice. If somebody really wanted an LLM that can flawlessly count letters in words, they could train one with a naive tokenizer (like just ascii characters). But the resulting model would be very bad (for its size) at language or reasoning tasks.
Basically it's an engineering tradeoff. There is more demand for LLMs that can solve open math problems, but can't count the Rs in strawberry, than there is for models that can count letters but are bad at everything else.
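To illustrate (with a made-up subword split and made-up IDs; real BPE merges and vocabularies differ per tokenizer): the model receives opaque token IDs, so the letters inside a subword simply aren't observable, whereas a character-level tokenizer keeps them visible at the cost of much longer sequences.

    # Hypothetical subword split and IDs, for illustration only.
    vocab = {"str": 1012, "aw": 2044, "berry": 8871}
    subwords = ["str", "aw", "berry"]
    print([vocab[t] for t in subwords])  # [1012, 2044, 8871] -- what the model sees
    # A character-level tokenizer keeps letters directly countable...
    chars = list("strawberry")
    print(chars.count("r"))              # 3
    # ...but roughly 3x the sequence length for the same text.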
LLMs are bad at arithmetic and counting by design. It's an intentional tradeoff that makes them better at language and reasoning tasks.
If anybody really wanted a model that could multiply and count letters in words, they could just train one with a tokenizer and training data suited to those tasks. And the model would then be able to count letters, but it would be bad at things like translation and programming - the stuff people actually use LLMs for. So people train with tokenizers and data suited to the language tasks, hence LLMs are good at language and bad at arithmetic.
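Same story for arithmetic (again a made-up split, just for illustration; real tokenizers chunk digits in their own learned ways): the chunk boundaries don't line up with place value, so digit-by-digit carrying has no natural representation.

    # Hypothetical tokenization of "12345 + 678", illustration only.
    tokens = ["123", "45", " +", " 6", "78"]
    print("".join(tokens))  # 12345 + 678
    # The aligned digit columns needed for carrying (5+8, 4+7, ...)
    # are split across chunks, so the model has to learn addition as
    # pattern completion over these opaque units.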
This is like saying chess engines don't actually "play" chess, even though they trounce grandmasters. It's a meaningless distinction, about words (think, reason, ..) that have no firm definitions.
This exactly. The proof is in the pudding. If AI pudding is as good as (or better than) human pudding, and you continue to complain about it anyway... You're just being biased and unreasonable.
And by the way, I don't think it's surprising that so many people are being unreasonable on this issue; there is a lot at stake and its implications are transformative.
We know that chess can be solved in theory. It absolutely isn't solved, and probably never will be in practice: the necessary time and storage space don't exist.
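For a back-of-envelope sense of scale, using Shannon's classic rough estimate of ~35 legal moves per position over ~80 plies (just an order-of-magnitude sketch, not a precise count):

    import math
    # Shannon's rough game-tree estimate: ~35 moves/position, ~80 plies.
    game_tree = 35 ** 80
    print(math.log10(game_tree))  # ~123.5, i.e. a tree of ~10^123 lines
    # For comparison, the observable universe has on the order of
    # 10^80 atoms -- the storage for a full solution just isn't there.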
Chess is absolutely not a solved game, outside of very limited situations like endgames. Just because a best move exists does not mean we (or even an engine) know what it is.
Emoji originally came from NTT DoCoMo phones in Japan around 1999. (Or I think those were the first ones actually called "emoji"; some earlier devices had similar character sets.)