Hopefully we can blend those two options together so it’s not a choice.
Personally I find being able to lean on our heavily documented standards in /review gives me back time to dive into what I want to craft next.
Same with scheduling repetitive tasks that an agent can do well for me once it has good instructions. I am freed up to actively focus on something else, because I like it and want it to be great.
Now, stress about OKRs, and OKRs in general… that’s a different issue.
At this point you have to use more than one to get a complete picture, which I’m doing now. Mainly because:
1) some are not always up to date (started on helicone but felt a lag on price updates)
2) they don’t return every model / provider I want (https://ai-gateway.vercel.sh/v1/models has rich data but is a subset, so I combine with helicone)
I always hope for the best when someone has a new list because of this. I want a de facto source!
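A rough sketch of how I combine lists, in case it helps anyone. It assumes each source has already been fetched and normalized into the same shape; `ModelPrice`, its field names, and `merge_prices` are all made up for illustration and are not the actual response format of Helicone or the Vercel endpoint. The rule is simple: take the union of models, and when two sources disagree, keep the entry with the newer update date.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ModelPrice:
    # Hypothetical normalized record; real sources use their own field names.
    model_id: str                     # normalized key, e.g. "provider/model"
    input_per_mtok: float             # USD per million input tokens
    output_per_mtok: float            # USD per million output tokens
    source: str                       # which list this entry came from
    updated_at: Optional[str] = None  # ISO 8601 date (YYYY-MM-DD), if known


def _is_fresher(candidate: ModelPrice, current: ModelPrice) -> bool:
    # A dated entry beats an undated one; otherwise the later date wins.
    # Dates in the same ISO 8601 format compare correctly as strings.
    if candidate.updated_at is None:
        return False
    if current.updated_at is None:
        return True
    return candidate.updated_at > current.updated_at


def merge_prices(*sources: Iterable[ModelPrice]) -> Dict[str, ModelPrice]:
    """Union of all models; on conflict keep the fresher entry.

    On a tie (same date, or both undated) the earlier source wins,
    so pass the source you trust most first.
    """
    merged: Dict[str, ModelPrice] = {}
    for entries in sources:
        for entry in entries:
            current = merged.get(entry.model_id)
            if current is None or _is_fresher(entry, current):
                merged[entry.model_id] = entry
    return merged
```

Keeping `source` on every record means you can still see which list a price came from when one of them lags.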
Indeed it’s both. Once humans start governing their AI. Once the blockchain community can check the GOV contract for validity. Founder of One becomes a reality. Contracts are institutional memory!
Let the LLM write the software. That’s ephemeral and evolves with time. Humans should govern the entire system to resilience. That is fixed with time. Thou shalt not kill has staying power. Weapons, poison, and other methods of death evolve. More governance deals with them over time.
your LLM prompt is, by definition, underspecified. the code is what describes the actual behaviour of the system, and there are known and understood ways to make that behaviour more robust, correct and resilient, that are independent of the domain the code is modelling, but consistent across different code bases. that's why I say writing code is its own domain.
as an analogy, an art museum couldn't paint their own paintings to hang up (or at least they would not be very good) but neither would monet or picasso have done a particularly good job at designing a space to let millions of people a year view their pictures. both skills are necessary to the overall product.
I have 2 (CC and Codex) running within most coding sessions, but I can have up to 5 if I'm trying to test out new models or tools.
For complex features and architecture shifts I like to send proposals back and forth between agents to see if their research and opinions shift anything.
Claude has a better realtime feel when I am in implementation mode and Codex is where I send long running research tasks or feature updates I want to review when I get up in the morning.
I'd like to test out the git worktrees method but will probably pick something outside of core product to test it (like building a set of examples)