How could they not? When I penciled this out ~18 years ago, I included the amortized cost of all the interviews it took to hire a given engineer as well. It's not rocket surgery, as they say.
The caveman skill reminds me of the clipped writing style used in telegrams, and your post further reminded me of the "standard" books of telegram abbreviations. Take a look at [0]; could we train models to use this kind of code and then decode it in the browser? These are "rich" tokens (they succinctly carry a lot of information).
I would point out that the default BPE tokenization vocabulary used by many models (cl100k_base) is already a pretty powerful shorthand. It has a lot of short tokens, sure. But then:
Token ID 73700 is the literal entire (space-prefixed) word " strawberry". (Which neatly explains the "strawberry problem.")
Token ID 27128 is " cryptocurrency". (And 41698 is " disappointment".)
Token ID 44078 is " UnsupportedOperationException"!
Token ID 58040 is 128 spaces in a row (and is the longest token in the vocabulary.)
You'd be surprised how well this vocabulary can compress English prose — especially prose interspersed with code!
For a while I missed the ability, which one uses all the time in Stable Diffusion prompts, to weight different parts of the prompt using parentheses and floats. The more I thought about how it would work in an LLM, though, the more I realized it's just reinventing code syntax, and you could just put a code snippet in the LLM prompt.
When you're killing (C-u, C-k, C-w, etc) + yanking (C-y), you can also use yank-pop (bound to M-y in bash and zsh by default) to replace the thing you just yanked with the thing you had killed before it.
$ asdf<C-w>
$ # now kill ring is ["asdf"]
$ qwerty<C-a><C-k>
$ # now kill ring is ["qwerty", "asdf"]
$ <C-y> # "yank", pastes the thing at the top of the kill ring
$ qwerty<M-y> # "yank-pop", replaces the thing just yanked with the next
# thing on the ring, and rotates the ring until the next yank
$ asdf
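For anyone who finds the ring easier to follow as code, here is a rough Python sketch of the behavior traced above. It is a toy model, not readline's or zsh's actual implementation: the `KillRing` class and its method names are made up for illustration, and it assumes the cursor is always at the end of the line.

```python
from collections import deque


class KillRing:
    """Toy model of a shell kill ring (hypothetical, for illustration only)."""

    def __init__(self):
        self.ring = deque()    # index 0 is the most recently killed text
        self.line = ""         # current command line, cursor assumed at the end
        self.last_yank = None  # text inserted by the most recent yank, if any

    def kill(self, text):
        """Push killed text onto the top of the ring (C-w, C-k, C-u, ...)."""
        self.ring.appendleft(text)
        self.last_yank = None

    def yank(self):
        """Paste the entry at the top of the ring (C-y)."""
        if not self.ring:
            return
        self.last_yank = self.ring[0]
        self.line += self.last_yank

    def yank_pop(self):
        """Replace the text just yanked with the next ring entry (M-y)."""
        if self.last_yank is None:
            raise RuntimeError("yank-pop only makes sense right after a yank")
        # Remove the previously yanked text from the end of the line.
        self.line = self.line[: len(self.line) - len(self.last_yank)]
        # Rotate so the next-older entry becomes the top of the ring.
        self.ring.rotate(-1)
        self.last_yank = self.ring[0]
        self.line += self.last_yank


ring = KillRing()
ring.kill("asdf")    # $ asdf<C-w>
ring.kill("qwerty")  # $ qwerty<C-a><C-k>
ring.yank()          # <C-y>
line_after_yank = ring.line
ring.yank_pop()      # <M-y>
line_after_pop = ring.line
```

The `rotate(-1)` call is what makes a second M-y cycle back around to the newer entry instead of running off the end of the ring.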
If you have enough discipline to make sure you only create threads after all the forking is done, then sure. But having such discipline is harder than just forbidding fork or forbidding threads in your program. It turns a careful analysis of timing and causality into just banning a few functions.
And what do you do with that information? Refuse to fork after you detect more than one thread running? I haven’t seen any code that gracefully handles the unable-to-fork scenario. When people write fork-based code, especially in Python, they always expect forking to succeed.
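To make the "refuse to fork" option concrete, a minimal sketch of what such a guard might look like in Python is below. The function name `fork_if_single_threaded` is hypothetical, and the check is only best-effort: `threading.active_count()` sees only threads that Python's `threading` module knows about, so threads started by C extensions are invisible to it, and another thread could still start between the check and the fork.

```python
import os
import threading


def fork_if_single_threaded():
    """Call os.fork() only if no other Python-level threads are alive.

    Hypothetical helper for illustration. Raises RuntimeError instead of
    forking when more than one thread is running.
    """
    running = threading.active_count()  # includes the main thread
    if running > 1:
        raise RuntimeError(f"refusing to fork: {running} threads are running")
    return os.fork()  # POSIX only
```

Which is exactly the problem: every caller now has to do something sensible with that `RuntimeError`, and fork-based code is rarely written to expect one.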
But not the reverse, if it's a bare fork and the code isn't strictly free of mutexes and shared resources (which is hard). And there are few or no warning lights to indicate that this is a terrible idea that fails in really unpredictable and hard-to-debug ways.