davebren 50 days ago, on: Exploiting the most prominent AI agent benchmarks

I get it, but why would anyone trust what these companies say about their model performance anyway? Everyone can see for themselves how well the models complete whatever tasks they're interested in.