Hacker News | vorticalbox's comments

I've not had that issue with Codex. I was testing a public API I work on for issues; Codex was happy to attempt to break it, but it did refuse to create a script that would automate the issue it found.

Yeah, I've seen this; I stopped it and asked it about it.

Sometimes they notice bugs or issues and just completely ignore them.


This can result in some funny interactions. I don't know if Claude will say anything, but I've had some models act "surprised" when I commented on something in their thinking, or even deny saying anything about it until I insisted that I can see their reasoning output.

Supposedly (https://www.reddit.com/r/ClaudeAI/comments/1seune4/claude_ch...) they can't even see their own reasoning afterwards.

It depends on the version. For the more recent Claudes they've been keeping it.

I sometimes play about with local models via Ollama/ComfyUI, and more recently ACE-Step to generate music.

That is short bursts of heat, 5-10 minutes during a render; I would not be happy with that for multiple hours a day. I am sure it would have a negative effect on battery health.


Most agents/chats have access to web search. I'm not overly surprised that it can do it, but it is very nice when it actually works.

You can get abliterated versions that have no (or very limited) refusals.

I tend to use Huihuiai versions.


I'm using the Cursor CLI. When I used just its built-in agent it scored 10/16 tokens, but I also have my own custom CLI tool that does tasks for my job; when I used that, it scored 15/16. It missed the token on the Content Negotiation test.

This is how I use Cursor 99% of the time. The other 1% is in Zed.


What happened around January this year (2026) that caused such a climb in usage?


Openclaw


I like Ollama, mostly because the CLI is pretty nice. Its desktop app makes some odd choices, though: if a model supports tools, the UI should give me the "search" option, but it only shows up for cloud models.

I have run LM Studio for a while, but I don't really use local models that much other than to mess about.


You can also use OpenWebUI locally which should give you a nice friendly UX once you set it up.
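For anyone who hasn't set it up before, a minimal sketch of the Docker quick-start from the Open WebUI docs (assuming Docker is installed and Ollama is listening on its default port 11434 on the host):

```shell
# Run Open WebUI in a container; --add-host lets the container
# reach an Ollama instance running on the host machine.
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
# Then open http://localhost:3000 in a browser.
```

The port mapping and volume path follow the project's README; adjust them if your setup differs.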


"OpenAI’s GPT-5" is ambiguous. Does that mean GPT-5, 5.1, 5.2, 5.3, or 5.4? Does it include the full model, or the nano/mini variants?


GPT-5 is not ambiguous; it's the official name of the model that was released in August last year.

> All evaluations were done in March - August 2025.


While true, all the others got precise identifiers; for OpenAI it makes it hard to reproduce because I have no idea "which" GPT-5 was used.


It was called just GPT-5 at that point in time.


In that case, what tokenizer version? What was the temperature set to? Top-k? Top-p? FP32? FP16? Quantized? Hopper? Blackwell?

