His claim is that exp-minus-log cannot compute the root of an arbitrary quintic. If you consider the root of an arbitrary quintic "elementary", then exp-minus-log can't represent all elementary functions.
I think it really comes down to what set of functions you are calling "elementary".
The author discusses this in his third paragraph, and states explicitly in his fourth that he considers the result faulty for its unrealistically narrow definition of elementarity.
(I'm not a mathematician, so don't expect me to have an opinion as far as that goes. But the author also writes well in English, and that language we do share.)
> In layman’s terms, I do not consider the “Exp-Minus-Log” function to be the continuous analog of the Boolean NAND gate or the universal quantum CCNOT/CSWAP gates.
But is there actually a combination of NANDs that finds the roots of an arbitrary quintic? I always thought the answer was no, but admittedly this is above my math level.
Combinations of the NAND gate can express any Boolean function. The Toffoli (CCNOT) and Fredkin (CSWAP) gates can each express any reversible Boolean function, which is important in quantum computing, where all gates must be unitary (and therefore reversible). The posited analogy is that exp-minus-log would be the "universal operator" for continuous functions.
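To make NAND universality concrete, here's a small sketch (function names are my own, just for illustration) that builds NOT, AND, OR, and XOR out of nothing but NAND and checks every truth-table row:

```python
# Illustrative sketch: deriving the other Boolean gates from NAND alone.
def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1

def not_(a):
    return nand(a, a)              # NAND(a, a) = NOT a

def and_(a, b):
    return not_(nand(a, b))        # invert NAND to get AND

def or_(a, b):
    return nand(not_(a), not_(b))  # De Morgan: a OR b = NAND(NOT a, NOT b)

def xor_(a, b):
    t = nand(a, b)                 # the classic 4-NAND XOR construction
    return nand(nand(a, t), nand(b, t))

# Every row of every truth table checks out:
for a in (0, 1):
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
        assert xor_(a, b) == (a ^ b)
```

Since any Boolean function has a truth table, and any truth table can be written as AND/OR/NOT over its inputs, this is enough to get universality.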
Yes, NAND gates can implement root-finding algorithms for arbitrary polynomials. For example, a variant of Newton's method can be used (there are also better algorithms for circuits specifically).
This can be done in polynomial time as well.
This is fairly obvious if you consider that your computer can do the same thing, and it's just a fancy circuit.
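A sketch of that point: even though x^5 - x - 1 is a standard example of a quintic whose real root has no radical expression, Newton's method still converges to it numerically in a handful of iterations (the tolerance, starting point, and iteration cap below are arbitrary choices for illustration):

```python
# Newton's method: iterate x <- x - f(x)/f'(x) until f(x) is near zero.
def newton(f, df, x0, tol=1e-12, max_iter=100):
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x -= fx / df(x)
    return x

f = lambda x: x**5 - x - 1      # unsolvable in radicals (Galois group S5)
df = lambda x: 5 * x**4 - 1     # its derivative

root = newton(f, df, 1.5)
assert abs(f(root)) < 1e-9      # a genuine root, found without radicals
```

The same arithmetic a CPU performs here can of course be lowered to a (large) fixed-depth NAND circuit per iteration.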
According to the article, "This work effectively rules out explanations of the Hubble tension that rely on a single overlooked error in local distance measurements." So any systematic errors would need to affect multiple measurement types.
Normally the reason to use a conservative GC is that it makes the VM's implementation (and potentially FFI) simpler, not that it's faster. Also for fun, of course. (There are reasons a fully conservative design might be faster, like not having to have a ton of scanning functions when you have a ton of disparate types, but in general my perception is that production GCs are normally moving collectors to at least some degree, which a fully conservative one can't be.)
Yeah. I implemented the conservative collector because it simplified development of the language's primitive functions. It allowed me to put lisp values on the C stack without worrying about them being garbage collected.
The current version of the collector also compacts the heap and moves values around. All conservatively discovered values get pinned.
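The scan-then-pin split can be illustrated with a toy model (everything here — the addresses, the dict-as-heap, the word list — is made up for illustration; a real collector scans raw machine words, not Python objects):

```python
# Toy model of conservative root scanning with pinning.
heap = {0x1000: "cons", 0x1040: "string", 0x1080: "vector"}  # address -> object

# Raw words found on the C stack: some are real heap pointers,
# others are plain integers that just happen to be on the stack.
stack_words = [0x1040, 0xDEADBEEF, 42, 0x1080]

# Conservatism: anything that *could* be a heap pointer must be treated
# as a live root, and since we can't rewrite the stack word, it gets
# pinned. The compactor is free to move everything else.
pinned = {w for w in stack_words if w in heap}
movable = set(heap) - pinned

assert pinned == {0x1040, 0x1080}
assert movable == {0x1000}
```

The cost of conservatism shows up exactly here: a stray integer that collides with a heap address would also pin an object, fragmenting the compacted heap slightly.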
One recent interesting partly-conservative, moving design I remember reading about is Whippet[1] by Andy Wingo (Guile maintainer, SpiderMonkey contributor). He’s been partly forced into it, though, by the fact that Guile’s existing external API (unlike, say, Lua’s) exposes the fact that it uses a non-moving collector (by allowing the user to keep hold of raw pointers to Scheme objects), so I’m not sure if this should serve as an inspiration or a cautionary tale.
I really like his blog! I emailed him when I published my delimited continuations article because it addresses the overlapping native/lisp stacks problem he wrote about. Sadly I don't think he's seen it.
> One way is to inform the garbage collector of the locations of all roots [...] implicitly, in the form of a side table generated by the compiler associating code locations with root locations.
Getting GCC to do things for you is fraught. Probably still possible; just... fraught.
I believe Clang was intended to be able to do this[1] and I remember seeing that stuff even back when it was a particularly spunky research project. The facility doesn’t seem to have really gone anywhere even if it technically works; I wonder why.
In general, the problem with stack maps of any kind, even the very minimal ones you need for stack unwinding, is that they're liable to be large, slow to process, or both. Now that I'm thinking about it, I wonder if you could map the stack at runtime using those same unwinding tables. A C++ compiler does have to know where it put things so it can call their destructors, and perhaps you could make your GC root a word-sized but otherwise opaque, uncopyable thingy so the compiler can't put it in more than one place at once. (I can't be the first to have this idea, even with how stupid it sounds and how utterly miserable Itanium ABI unwinding is.)
I haven't measured the performance. I would like to. I'm especially interested in comparing it with the current version of the collector. It is now capable of heap compaction which will be the subject of a future article. I'm also interested in knowing how big a problem the conservative part of the collector is. The C stack depth has been minimized significantly since I got rid of the recursive evaluator. I don't think it's a significant problem but I could be wrong.
I need to implement the compiler's profiling functions in lone. Documentation on the matter seems scarce; I'm not even sure those instrumentation APIs are stable. I got away with not implementing ASan and the stack protector since flags were added that make the compiler emit trapping instructions instead of function calls. Profiling is a lot more complex.
The issue with regex for parsing is that it can't handle balanced parentheses: https://en.wikipedia.org/wiki/Regular_expression. More generally, regexes can't handle nested structure. Context-free grammars are the most natural extension that can: they add a substitution operator to regex that makes it powerful enough to recognize nested structure. So regex would be reinvented if history were rerun, but so would context-free grammars. Part of the complexity in parsing is attaching semantic meaning to the parse; regex mostly avoids this by not caring how a string matches, just whether it matches or not.
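The limitation is easy to demonstrate: any single regex only matches parentheses up to some fixed nesting depth, while a single counter (the simplest form of pushdown state) handles arbitrary depth. A sketch, with a hand-written depth-2 regex chosen purely for illustration:

```python
import re

# This regex handles nesting up to depth 2 and nothing deeper; no finite
# regex handles all depths.
depth2 = re.compile(r'(\((\(\))*\))*')
assert depth2.fullmatch('(()())')            # depth 2: matches
assert depth2.fullmatch('((()))') is None    # depth 3: cannot match

# One counter recognizes balanced parentheses of any depth:
def balanced(s: str) -> bool:
    depth = 0
    for ch in s:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:        # closing paren with no open partner
                return False
        else:
            return False
    return depth == 0            # every open paren was closed

assert balanced('((()))') and balanced('(()())')
assert not balanced('(()') and not balanced('())')
```

The counter is exactly the "memory growing with input" that regular languages rule out, which is why balanced parentheses sit strictly outside them.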
Now, I do agree that LR grammars are messy. Nowadays, they have mostly fallen from favor. Instead, people use simpler parsers that work for the restricted grammars actual programming languages have.
IIRC there is some research into formalizing the type of unambiguous grammar that always uses () or [] as nesting elements, but can use Regex for lexing.
I understand what a CFG is and why the Dyck language (matched parentheses) is not a regular language. My point was that CFGs/CFLs are less motivated by a reasonable and uniquely characterising constraint, such as making memory usage independent of the size of an input string, than regex is.
Then again, you are right that CFGs are very natural. And they do admit a few easy O(n^3) parsing algorithms, like Earley and CYK.
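For reference, CYK fits in a few lines; here is a sketch for the balanced-parentheses language (the grammar encoding and rule names are my own, not from any particular textbook). The triple loop over span, start position, and split point is where the O(n^3) comes from:

```python
# Grammar in Chomsky normal form for non-empty balanced parentheses:
#   S -> L R | L T | S S     T -> S R     L -> '('     R -> ')'
RULES = [
    ('S', ('L', 'R')), ('S', ('L', 'T')), ('S', ('S', 'S')),
    ('T', ('S', 'R')),
    ('L', ('(',)), ('R', (')',)),
]

def cyk(tokens, rules=RULES, start='S'):
    n = len(tokens)
    # table[i][span] = nonterminals deriving tokens[i:i+span]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, tok in enumerate(tokens):            # length-1 spans: terminals
        for lhs, rhs in rules:
            if rhs == (tok,):
                table[i][1].add(lhs)
    for span in range(2, n + 1):                # longer spans, bottom up
        for i in range(n - span + 1):
            for split in range(1, span):        # O(n^3) overall
                for lhs, rhs in rules:
                    if (len(rhs) == 2
                            and rhs[0] in table[i][split]
                            and rhs[1] in table[i + split][span - split]):
                        table[i][span].add(lhs)
    return n > 0 and start in table[0][n]

assert cyk('(())') and cyk('()()')
assert not cyk('(()')
```

Earley avoids the CNF conversion and is faster on unambiguous grammars, but CYK is the easier one to write down.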
I think your last sentence relates to Visibly Pushdown Grammars. See also Operator Precedence Grammars.
Rust's discussion boards have an idea called "keyword generics" for expressing some of these concepts. The idea is that a function can be generic over const, async, or some other keyworded effect. I like this description: it shows the benefits without too much theory.
Yours might go a little less into the details, but it's really clear, and I like the diagrams and the explanation around glitch hazards. Please do follow up on your tangents if you have time.
I have commented once or twice on articles being AI-generated. I don't post such comments when I think the writer used AI to clean up some text; I add them when there are paragraphs of meaningless or incorrect content.
Formats, name collisions, or back-button breakage are tangential to the content of an article. Being AI-generated isn't. And it does add to the overall HN conversation by making it easier to focus on meaningful content rather than AI-generated text.
Basically, if the writer didn't do a good job of checking and understanding the content, we shouldn't bother to either.
The fascinating paradox: there are clearly "tells" (slop-smells, like code-smells?) of LLM-generated text. We're all developing heuristics rapidly, which probably pass a Pepsi challenge 95+% of the time.
And yet: LLMs write entirely from human input. Presumably there exists a great quantity of median, representative text, some lowest common denominator, produced by humans who write in exactly the style these heuristics flag.
(In particular: why are LLMs so fond of em-dashes, when I'm not sure I've ever seen them used in the wilds of the internet?)
An LLM was used to help polish the text, and that section probably got over-polished. The ideas and code are authentic. The article is intentionally detailed for those who want the full reasoning, but you can always skip straight to the source code.
Out of curiosity what particular part of the original text needed to be polished and why couldn’t the writer accomplish said polish without a language model?
When writing the articles in that series, the focus was on getting the technical ideas and details right, not on spelling, grammar, and text flow, which is where LLMs excel. That specific section, "Why this maps to Genetic Algorithms?", makes the point that the fit of genetic algorithms to state-space exploration is not a coincidence: the evolutionary process itself is a state-space exploration algorithm that allows a given species to advance further down life's timeline. But thanks for the question. I do agree that as LLM-generated text becomes ubiquitous, we all appreciate our own human writing style more, even in highly technical text where the focus is on the ideas rather than the presentation.