I created a prize for short fiction with a $3500 prize pool, judged entirely by AI.
In short, each submission first passes through four small models that score it against 21 literary criteria. If the models agree, or if the first pass scores particularly well, the work is promoted to the second stage, where seven sub-agents that each specialize in a group of those criteria give two passes on the work. If the same scoring bar is met, all of the previous results are added to the context of larger, thinking models, which give a final score.
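A rough sketch of that staged flow, purely for illustration: the judge functions, thresholds, and promotion rules here are all hypothetical stand-ins (the real system calls LLMs and uses its own criteria and cutoffs).

```python
from statistics import mean

# Hypothetical: 21 literary criteria, names invented for the sketch.
CRITERIA = [f"criterion_{i}" for i in range(21)]

def first_pass(text, judges, promote_threshold=8.0, agreement_spread=1.0):
    """Stage 1: four small-model judges each score all criteria.

    Promote if the judges agree closely (small spread between their mean
    scores) or if any judge scores the work particularly well.
    """
    means = [mean(judge(text).values()) for judge in judges]
    agree = max(means) - min(means) <= agreement_spread
    strong = max(means) >= promote_threshold
    return (agree or strong), means

def second_pass(text, subagents, promote_threshold=8.0):
    """Stage 2: seven specialist sub-agents give two passes each."""
    results = [mean(agent(text).values()) for agent in subagents for _ in range(2)]
    return mean(results) >= promote_threshold, results

def judge_submission(text, judges, subagents, final_model):
    ok1, stage1 = first_pass(text, judges)
    if not ok1:
        return None  # eliminated at stage 1
    ok2, stage2 = second_pass(text, subagents)
    if not ok2:
        return None  # eliminated at stage 2
    # All prior results go into the final thinking model's context.
    return final_model(text, context={"stage1": stage1, "stage2": stage2})
```

A judge here is just any callable returning a criterion-to-score mapping, so the pipeline can be exercised with stubs before wiring in real model calls.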
Currently, submissions are limited to short (2000 to 20,000 words) literary (not genre) fiction.
Once the submission period ends, and prizes are awarded, I’ll be opening it up for anyone to use.
I was expecting a bridge to some form of knowledge management, given the brief reference to personal wikis and databases, plus the fact that HN loves to debate the various methods of the above. But I was happily surprised to see nothing of the sort. The author's resignation that his practical research methodologies are no doubt outdated and inefficient was a breath of fresh air.
I often find myself spending far too much time fearing that the methods I've chosen in any kind of research are faulty, which turns out to be a much greater time sink than just absorbing the material at hand.
An analogy: time spent planning what to do with your friends and family would be better used just being with them. Likewise, getting closer to a historical subject, whether by immersing yourself in all the relevant material or by literally imagining yourself alongside them, will return more valuable results long-term than running a scientific experiment on them.
I think what ChatGPT failed to interpret was the final diagram of the double-slit experiment specifically, rather than the meme as a whole. I wonder what the output would be if the input were just that diagram.
That said, I agree with the overall sentiment: IMHO, LLMs won't ever be able to parse culturally dense, community-specific information like memes, because the goalposts are always moving. That goes for humor in general, I suppose. I think that's a good thing for humans.
Totally agreed. I'm biased because I learn languages to mostly read texts in original languages. Narratives with well-crafted prose can have hundreds of years of linguistic and stylistic histories, as well as contemporary vernacular, which can tell more about a language than just understanding the plot of the story. I don't think that can ever be fully replaced by an LLM.
There are thousands of short stories at every level of language understanding for nearly every language in existence. I would be more interested in using AI for the languages that don't have these. (say, endangered/extinct languages, oral languages, etc)
>Narratives with well-crafted prose can have hundreds of years of linguistic and stylistic histories, as well as contemporary vernacular, which can tell more about a language than just understanding the plot of the story.
This seems like a weird hangup for what is essentially a substitute for graded readers for language learners. You're not getting any of the things you mentioned by going the "human" route.
No one is saying to read these stories instead of full-blown novels. There's no complexity difference between full-blown novels and most native short stories anyway, just a difference in length, so that's not really an option.
If you could read at that level, you wouldn't be using this or the non-LLM alternative anyway.
Just because a children's story, for example, is "simple" doesn't mean it isn't inflected by human complexities.
When you're learning a language, your brain is going through a unique process of both attention to small detail and rote memorization. If you see a pattern often enough at an early stage of language learning, you'll most likely carry that with you at later stages. Even if you don't notice it at first.
Would you trust an AI to present you with accurate language patterns--speech, vernacular, etc?
>Just because a children's story, for example, is "simple" doesn't mean it isn't inflected by human complexities.
Sure. And again, this is why lots of people say recommending a children's book or show to beginner learners is a bad idea. Graded readers are an entirely different thing from children's fiction.
>Would you trust an AI to present you with accurate language patterns--speech, vernacular, etc?
Language patterns in text? Yes. There's nothing special about it. For all of GPT-4's shortcomings, "wrong" language patterns in English isn't one of them. How it writes is usually just the default hammered in by RLHF, and it can easily diverge when instructed. So if a native speaker of some other language gives the A-OK on a piece of text, then that's it.
I really enjoyed this write-up. Reminds me of projects in NYC like Urban Archive [0], at a microscopic level.
I really wish this kind of snooping was incentivized more. It’s a great way to engage your community. Something I think many younger people, and especially those working remotely, can miss. (Myself included). Asking specifically “why” something exists is a great heuristic.
Side note: if you're ever traveling in Europe, try to find a Henry Holt Walks Series book for whatever city you're in. They're older (usually late 90s to early 00s), but they go into meticulous historical and narrative detail on overlooked sections/buildings/plaques/etc. of otherwise very touristy cities. It will undoubtedly send you down a deep and labyrinthine rabbit hole.
The lower bound of ~45k might be attributed to the German salary cutoff of the EU Blue Card (long term visa for skilled workers) at €45,552 [1]. When I was offered my first dev job (and first job in Germany), it was exactly this number.
Unfortunately, these too-low lower bounds negatively impact the job market for everyone. Companies have no incentive to raise their pay when they know they'll always find someone willing to take the minimum for a couple of years just to get into the country.
IMHO, the minimum salary required for a blue card/work visa, in any country and not just Germany, should be slightly above the median wage for locals in similar positions; the point of skilled immigration should be to uplift the market. Otherwise it's just another wage-suppression scheme with extra steps, one that benefits employers at the expense of a workforce that now has less bargaining power.
Who cares what software companies in Germany want? They don't want to compete for talent and valuable projects, so talent will want to leave, and such a company effectively wants to cease to exist. If you're in Germany, you leave for Switzerland, Luxembourg, or the Netherlands for a certain improvement, for Silicon Valley for the pay gap, or for Czechia, Estonia, or Poland for a startup. Germany is an old folks' home going broke, not a country to start a software career in.
Around my area (Central California), H-1Bs are generally hired at a higher rate only because the position couldn't be filled by a 'local', especially for public positions. The FAANGs managed to abuse the system enough that the rules have changed to something more like what you envision, at least from the hiring practices I'm party to.
For a business in Australia to sponsor a visa for an international hire, they have to prove the candidate is more qualified than any local applicant and pay them more. I'm not sure that's always followed, but it certainly made it very hard for me to get a job.
Now I'm a permanent resident after being with my Australian partner for over a decade, self employed and running my own company.
Spotify's Soundtrap claims to have live collaboration; I haven't tested it, though, so I don't know whether it's actually realtime or just a step in that direction.