A few things. I'm price sensitive, so pricing was huge for me. The worst company also had the worst prices. I tried to ask them questions about how their backend works and they refused to answer. I spoke with the CEO and he said he couldn't reveal their "secret sauce". I said, "if your secret sauce is what infra providers you use and not your proprietary code, then you don't HAVE secret sauce and you're just reselling [Cloud Provider's Product]." Turns out that's exactly what they were doing. They were using Google Cloud for recording capture, and AWS for speech to text and then summary generation. I told them we would never be working with them.
For me the big things are price, ease of use, and data protection policies. I need to know the data never leaves the US, and I need to know what processors will touch it. Then if it meets those needs we'll do clinical demos and tests to get provider feedback. That's where we learn if it is clinically accurate. About half of them suck in the accuracy department.
What stands out to me the most is that the best companies have tended to be the small guys who have a strong grasp on the entire stack and have somewhat simple apps. They focus on the tech, and have a minimal UI that just focuses on the main tasks, and they don't spend engineering time on fancy bells and whistles. If you see a simple UI, that's a good sign to me. Once you hit the big guys the quality goes down. Dragon Medical One is great for straight speech to text, but Dragon with Copilot for medical is really bad.
Yeah, I think they fine-tuned their model to be better at the pattern where you output citations that reference exact strings from the input. Previously that's been a prompting trick, e.g. here: https://mattyyeung.github.io/deterministic-quoting
Makes sense. I wonder if it affects the model output performance (sans quotes), as I could imagine that splitting up the model output to add the quotes could cause it to lose attention on what it was saying.
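For anyone curious, the quoting pattern from that link boils down to: have the model wrap verbatim source spans in markers, then check each span against the document yourself instead of trusting the model. Here's a rough sketch of the verification half (the marker names and function are my own, not from the linked post):

```python
# Hypothetical sketch of the verification side of "deterministic quoting":
# the model wraps verbatim source text in [Q]...[/Q] markers, and we check
# each quoted span against the original document rather than trusting it.

def verify_quotes(source: str, answer: str,
                  open_tag: str = "[Q]", close_tag: str = "[/Q]") -> list[str]:
    """Return the quoted spans in `answer` that do NOT appear verbatim
    in `source` (an empty list means every quote checks out)."""
    bad = []
    pos = 0
    while True:
        start = answer.find(open_tag, pos)
        if start == -1:
            break
        end = answer.find(close_tag, start)
        if end == -1:
            break
        quote = answer[start + len(open_tag):end]
        if quote not in source:
            bad.append(quote)
        pos = end + len(close_tag)
    return bad

source = "The patient reported mild chest pain after exercise."
answer = "Symptoms: [Q]mild chest pain after exercise[/Q] and [Q]shortness of breath[/Q]."
print(verify_quotes(source, answer))  # only the fabricated quote is flagged
```

The point is that the verification step is deterministic even though the generation isn't, so hallucinated quotes can be caught (or rendered differently) before a user sees them.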
So after using Mem0 a bit for a hackathon project, I have sort of two thoughts: 1. Memory is extremely useful and almost a requirement when it comes to building next level agents and Mem0 is probably the best designed/easiest way to get there.
2. I think the interface between structured and unstructured memory still needs some thinking.
What I mean by that is when I look at the memory feature of OpenAI it's obviously completely unstructured, free form text, and that makes sense when it's a general use product.
At the same time, for more vertical-specific use cases, up until now there have generally been very specific things we want to remember about our customers (for example, for advertising: age range, location, etc.). However, as the use of LLMs in chatbots increases, we may also want to remember less structured details.
So the killer app here would be something that can remember and synthesize both structured and unstructured information about the user in a way that's natural for a developer.
I think the graph integration is a step in this direction but still more on the unstructured side for now. Look forward to seeing how it develops.
Thanks yding! Definitely agree with the feedback here. We have seen similar things when talking to developers where they want:
- Control over what to remember/forget
- Ability to set how detailed memories should be (some want more detailed vs less detailed)
- Different structure of the memories based on the use case
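Those three knobs could plausibly surface as a single policy object the developer passes in. A hypothetical sketch (again, not Mem0's real API, all names are made up):

```python
# Hypothetical config sketch (not Mem0's real API) illustrating the three
# knobs above: remember/forget control, detail level, and per-use-case structure.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class MemoryPolicy:
    detail: str = "summary"                      # "summary" vs "verbatim" memories
    schema: Optional[dict[str, type]] = None     # optional structure per use case
    should_remember: Callable[[str], bool] = lambda text: True  # forget filter

policy = MemoryPolicy(
    detail="verbatim",
    schema={"age_range": str, "location": str},
    should_remember=lambda text: "password" not in text.lower(),
)
print(policy.should_remember("User lives in Austin"))   # kept
print(policy.should_remember("My password is hunter2")) # filtered out
```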