Can we start getting a similar flood of tools to generate the embeddings now? That’s my bottleneck. Searching them works well on numerous databases that support arrays/vectors.
I work at FeatureBase and I'm storing vectors from the Instructor Large library/model into our solution. Getting good results, which I should probably quantify at some point. One thing that FeatureBase does well is allow filtering of the vector space via SQL.
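For anyone wondering what filtering the vector space with SQL looks like in principle, here's a rough sketch. It is not FeatureBase syntax. It uses Python's built-in sqlite3 as a stand-in, stores vectors as JSON text, and does the ranking in plain Python. The table name, columns, and toy 3-dimensional vectors are all made up for illustration.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical schema: one metadata column (lang) plus the vector as JSON text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, lang TEXT, vec TEXT)")
rows = [
    (1, "en", [1.0, 0.0, 0.0]),
    (2, "en", [0.6, 0.8, 0.0]),
    (3, "de", [1.0, 0.0, 0.0]),
]
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(doc_id, lang, json.dumps(vec)) for doc_id, lang, vec in rows],
)


def filtered_search(query_vec, lang, k=2):
    """Narrow the candidates with a SQL WHERE clause, then rank by cosine similarity."""
    candidates = conn.execute(
        "SELECT id, vec FROM docs WHERE lang = ?", (lang,)
    ).fetchall()
    scored = [(cosine(query_vec, json.loads(vec)), doc_id) for doc_id, vec in candidates]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


results = filtered_search([1.0, 0.0, 0.0], "en")
```

The point is just that the WHERE clause shrinks the candidate set before any similarity math happens. A database with native vector support would do the ranking step inside the engine instead of in Python.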
I would say that most people seem to prefer an engine that embeds and stores things as a service, but using Instructor is only a few lines of code and runs locally.
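To back up the "few lines of code" claim, here's roughly what local embedding with Instructor looks like. This is a sketch, assuming the InstructorEmbedding package is installed and that the Hugging Face model ID is hkunlp/instructor-large. The instruction string and the helper names (build_pairs, embed_locally) are my own picks for illustration, so adjust them for your data.

```python
def build_pairs(instruction, texts):
    """Instructor takes [instruction, text] pairs rather than bare strings."""
    return [[instruction, text] for text in texts]


def embed_locally(
    texts,
    instruction="Represent the document for retrieval:",  # assumed instruction, tune for your task
    model_name="hkunlp/instructor-large",  # assumed model ID
):
    """Embed texts on the local machine; the first call downloads the model weights."""
    # Imported lazily so this file still loads without the package installed.
    from InstructorEmbedding import INSTRUCTOR  # pip install InstructorEmbedding

    model = INSTRUCTOR(model_name)
    return model.encode(build_pairs(instruction, texts))
```

Calling embed_locally(["some document text"]) should give back one vector per input, which you can then store in whatever database you're using.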
Just to quickly add to ukuina's comment, marqo.ai does embedding generation and vector search end to end, so you can put in documents and the embeddings are automatically generated.
Lots of tools can do this, and they've been around for a long time. For example, txtai has been able to generate embeddings with sentence-transformers since 2020.