Hacker News | rsfern's comments

It seems like an opportunity for a hierarchical cache. Instead of just nuking all context on eviction, couldn’t there be an L2 cache with a longer eviction time so task switching for an hour doesn’t require a full session replay?
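To make the two-tier idea concrete, here's a toy sketch (all names and TTL values are made up for illustration): instead of discarding an entry when its short L1 TTL expires, demote it to a colder L2 with a longer TTL, and promote it back on a hit. Only a miss in both tiers would force the full session replay.

```python
import time

class TwoTierCache:
    """Toy sketch: hot (L1) entries are evicted quickly but demoted to a
    colder L2 tier with a longer TTL instead of being dropped outright."""

    def __init__(self, l1_ttl=60.0, l2_ttl=3600.0):
        self.l1_ttl, self.l2_ttl = l1_ttl, l2_ttl
        self.l1, self.l2 = {}, {}   # key -> (value, expiry)

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.l1[key] = (value, now + self.l1_ttl)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        self._sweep(now)
        if key in self.l1:
            return self.l1[key][0]
        if key in self.l2:            # L2 hit: promote back to the hot tier
            value, _ = self.l2.pop(key)
            self.put(key, value, now)
            return value
        return None                   # miss in both tiers: full session replay

    def _sweep(self, now):
        for key, (value, expiry) in list(self.l1.items()):
            if now > expiry:          # demote instead of discarding
                del self.l1[key]
                self.l2[key] = (value, now + self.l2_ttl)
        for key, (value, expiry) in list(self.l2.items()):
            if now > expiry:
                del self.l2[key]
```

Of course, for a real LLM serving stack the L2 tier would have to live somewhere cheaper than GPU memory (host RAM, disk), which is exactly the trade-off the replies below poke at.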

Living where? If it's in the GPU, then it's still taking up precious space that could be used for serving other sessions. If it's not in the GPU, then it doesn't help.

I think the exceptionalism is the other way around. What makes anyone think they understand what makes for intelligence when we barely understand our own neurology?

I'm reminded of a book on my bookshelf (which I still haven't read, story of my life...), by the recently deceased ethologist Frans de Waal, titled 'Are We Smart Enough to Know How Smart Animals Are?'. Of course, Betteridge's law applies to its title.

In my opinion, the vast multitude of different animal intelligences is a clear hint that language does not an intelligence make. We're animals, and our intelligences did not come from language; language allowed us to supercharge it. We can and do think and make decisions without using language, and the idea that a statistical model based solely on our language can be intelligent does not follow.


Hey, I also read that book, and came to basically the opposite conclusion!

The point of the book is that we've been very bad at testing animal intelligence because of a vast stack of human biases, including things like language and the geometry of our hands.

Animals with different geometries and no language are still intelligent, but we need to test them in ways which recognize their capabilities. Intelligence is general: it's adaptivity within one's set of constraints.

De Waal also points out that there was massive shifting of the definitions of language and intelligence as we became more aware of what animals are capable of.

From this angle, I would say that LLMs are intelligent: they do adapt to their inputs extremely readily, though they have a particular set of constraints (no physical body (usually), for starters). They are, like chimpanzees, smarter and more capable than humans in some ways, and much dumber in others.

Finally, the 'statistical learners can't be intelligent' line of argument is extremely short-sighted. Our brains are bags of electrified meat. Evolution somehow figured out a way to make meat think. No individual neuron is intelligent, yet the collection of cells is. We learn by processing experiences with hormonal signals because those hormonal signals are what the meat is capable of working with. LLMs, by contrast, learn by processing examples with backprop. If anything, the intelligence of meat is more surprising.


The meaning of tokens loses touch with language in the deeper layers of a large language model's neural net.

Language is just the input/output modality.


I'll admit I am not an expert in the field, but the fact that "chain-of-thought" techniques work by having the model extend its own context window with more language hints, to me, that what we consider an "intelligent" response is ultimately contingent on language processing.

In any case though, if language is just the input/output modality, where is the intelligence when language is not involved? Is the "intelligence" of the ChatGPT/Claude/Gemini models dependent on the human-curated linguistic dataset they have been trained on, or is it prior to that? If a SOTA LLM were trained on the same dataset as them but was not in any way put through RLHF to respond to human prompts, would it be intelligent? What would be the expression of that intelligence?


I also achieve better performance on cognitive tasks when I use language to first describe the problem I'm trying to solve. In fact, it usually helps quite a bit (see: rubber-duck debugging)

I'm not sure the word "intelligence" really fits what these models are doing. I do however think it's safe to say that they are performing cognition - even if it's 'simply' cognition over their provided context and even if it's entirely limited by their training set. We still have a machine that can perform automated cognition over an increasingly wide distribution of data.


Explain the emergent capabilities of AI then.

Such as?

It is, but NSA reports to the director of national intelligence, not the defense secretary, so it’s unclear (to me at least) that SecDef’s opinion of Anthropic counts for anything here

I guess the DOD is large enough that it has multiple parallel cabinet-level positions

https://en.wikipedia.org/wiki/National_Security_Agency


It’s not as clear as that. The NSA director is also, traditionally, dual-hatted as the Commander of CYBERCOM and thus a flag officer reporting ultimately to the SecDef. The DNI is responsible for coordinating/funding national intelligence activities, but ultimately a lot of day-to-day operational decision making tends to flow through the Pentagon. They would definitely need to abide by DoD policy.

> They would definitely need to abide by DoD policy

The policy in question is a statement by SecDef being reviewed by courts. I think it’s fair to ask whether DNI is actually constrained by that, or if it’s a judgement call.


The reason is that electrons (like all quantum mechanical objects) are wavelike. In an isolated hydrogen atom, the electron is in a spherically symmetric environment, so the solutions to the wave equation have to be spherical standing waves, which are the spherical harmonics. An integer number of wavelengths has to fit around the full 2pi of the angular coordinate, or else the wave would destructively interfere with itself. (Technically each solution is a product of a spherical harmonic function and a radial function that describes how fast the electron wave decays vs distance from the nucleus)

What’s interesting is if the environment is not spherically symmetric (consider an electron in a molecule) the solutions to the wave equation (the electronic wave functions) are no longer spherical harmonics, even though we like to approximate them with combinations of spherical harmonic basis functions centered on each nucleus. It’s kind of like standing waves on a circular drum head (hydrogen atom) vs standing waves on an irregular shaped drum head

Of course the nucleus also has a wave nature and in reality this interacts with the electrons, but in chemistry and materials we mostly ignore this and approximate the nucleus as a static point charge from the electrons' perspective, because the electrons are so much lighter and faster
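The "integer number of wavelengths around 2pi" condition is easy to check numerically: the angular factor e^(i*m*phi) of a hydrogenic wavefunction is only single-valued (it returns to its starting value after a full 2pi turn) when m is an integer. A minimal stdlib-only sketch (function names are my own):

```python
import cmath
import math

def angular_factor(m, phi):
    """Angular part e^(i*m*phi) of a hydrogenic wavefunction."""
    return cmath.exp(1j * m * phi)

def single_valued(m, tol=1e-12):
    """A physical standing wave must match itself after a full 2*pi turn."""
    return abs(angular_factor(m, 0.0) - angular_factor(m, 2 * math.pi)) < tol

# Integer m: the wave closes on itself. Non-integer m: it comes back
# out of phase and would destructively interfere with itself.
print([single_valued(m) for m in (0, 1, 2, -3)])   # [True, True, True, True]
print(single_valued(1.5))                          # False
```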


Ah amazing - thank you for the response! I have a couple of related questions - is it that the non 2 pi frequencies exist, but they destructively interfere so we can't see them? My understanding is that the radial function for the electron is zero at the nucleus - there is no possibility of it being found there - but why is that the case?

Admittedly my understanding of QM is a bit vibey but I’ll try to answer

In an atom, angular wavefunctions whose wavelengths aren’t integer divisions of 2pi can’t exist because of the boundary conditions on the wave equation. A free electron can have any wavelength, but once you put it in a box (confine it to the potential around a proton in a hydrogen atom) the non-integer wavelengths aren’t allowed

I think it’s instructive to think about what the wavefunction represents. Its square is the electron probability density (technically the wavefunction is complex valued, so it’s the wavefunction times its complex conjugate). If you have a non-integer multiple wavelength, then the wavefunction goes out of phase with itself after one full turn, and when you integrate over the angular domain the contributions cancel.

This also answers your second question. The radial solution to the wave equation for hydrogen gives you the (associated) Laguerre polynomials. They don’t all go to zero at the nucleus, though; actually the first one has its maximum at zero because it scales like exp(-r) (see fig 4.10.2 on chem.libretexts linked below). But when you do a volume integral to calculate the electron probability, the probability near the nucleus is low because the integration volume is small even though the wavefunction is large

https://en.wikipedia.org/wiki/Laguerre_polynomials

https://chem.libretexts.org/Courses/University_of_California...
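That last point is easy to verify numerically. A sketch in atomic units (Bohr radius a0 = 1), using the standard 1s radial function R(r) = 2*exp(-r) and the radial probability density P(r) = r^2 * R(r)^2; the function names are mine:

```python
import math

def R_1s(r):
    """Hydrogen 1s radial wavefunction in atomic units: R(r) = 2*exp(-r)."""
    return 2.0 * math.exp(-r)

def radial_prob_density(r):
    """P(r) = r^2 * R(r)^2: the r^2 volume factor kills the density at r=0."""
    return r * r * R_1s(r) ** 2

rs = [i * 0.01 for i in range(501)]   # grid from 0 to 5 Bohr radii

# The wavefunction itself is largest at the nucleus...
assert max(rs, key=R_1s) == 0.0
# ...but the probability of finding the electron there is zero, and the
# radial probability peaks at exactly one Bohr radius.
peak = max(rs, key=radial_prob_density)
print(radial_prob_density(0.0), round(peak, 2))   # 0.0 1.0
```

Analytically, dP/dr = 8r*exp(-2r)*(1 - r), so the peak at r = a0 is exact, not a grid artifact.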


While the treatment for methanol poisoning does include ethanol, I don’t think your dosage suggestion is right. Your body still has to process all the methanol; the ethanol just slows the reaction down by competing for the same enzyme. If you suspect methanol poisoning you need the hospital: they will administer the ethanol intravenously and, I think, do dialysis to remove the methanol and the formic acid it metabolizes into (this is one of the toxins in ant venom)

https://doi.org/10.1053/j.ajkd.2016.02.058


There are groups that are actively working on automating conventional labs like this. Most of the efforts I know about use non-humanoid mobile robots or even just a six-axis arm on a rail and some lab space reconfiguration


This issue of accessibility is widely acknowledged in the academic literature, but it doesn’t mean that only large companies are doing good research.

Personally I think this resource mismatch can help drive creative choice of research problems that don’t require massive resources. To misquote Feynman, there’s plenty of room at the bottom


I like this analogy of always choosing “I’m feeling lucky” on Google, I feel like it clarifies a boundary between information retrieval and evaluation that gets blurred by language model summarizations. I’ve been frustrated with the LLM summary at the top of the Google search results for scientific topics because often the sources linked to don’t actually contain the information the summary is citing them for. Then I have a side quest of finding the right backing literature or deciding the summary was just wrong in the first place


There is https://orcid.org which is a persistent identifier for a researcher. It would be interesting if sending email to a researcher's ORCID handle resolved to their current institutional email address, I guess?

My usual workflow is to find the person on Google Scholar, find their uni/lab homepage, and hope they published their email there.


It’s all foreign guest researchers by the end of September, and high-risk countries by the end of March. Your first quote doesn’t imply the NIST sources for this article lack firsthand knowledge that this is coming; it just appears that lab management is avoiding putting things in writing


> Researchers from lower risk countries have been told they could lose access beginning in either September or December if at that point they have been at the lab more than 2 years or, under a waiver, 3 years.

The word "could" seems to conflict with "It's all foreign guest researchers by end of September."

If you think that's what he meant, then it's clear that Bob has made things incredibly ambiguous, since we disagree. Do you think he might have written the article, and especially the headline, in such a way as to make it more clickable?

I do.


I don’t know why the author of the article wrote “could”, but I personally work closely with some non-high-risk-country NIST foreign guest researchers. It’s been filtered down verbally through the management chain that the end of this September is the re-review deadline, and it’s not been stated as a hypothetical.

