energy123 on July 11, 2024 | on: Vision language models are blind

It's possible they have a software layer that does that. But I was assuming they don't, because the open source multimodal models don't.