Hacker News

How much work would it be to use the C++ ONNX run-time with this instead of Python? Is it a Claudeable amount of work?

The iOS version is Swift-based.



Shouldn't be hard. What backend/hardware are you interested in running this with? I'll add an example of using the C++ ONNX model. BTW, check out the roadmap: our inference engine will be out in 1-2 weeks and is expected to be faster than ONNX.
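For reference, a minimal C++ ONNX Runtime sketch might look like the following. This is a guess at the shape of the integration, not the project's actual API: the model filename (`kitten.onnx`), the input/output tensor names (`input_ids`, `waveform`), and the token encoding are all assumptions — check the exported model's real signature (e.g. with Netron) before copying this.

```cpp
// Hedged sketch of TTS inference via the ONNX Runtime C++ API.
// Model path and tensor names below are hypothetical placeholders.
#include <onnxruntime_cxx_api.h>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "tts");
    Ort::SessionOptions opts;
    opts.SetIntraOpNumThreads(1);  // single background CPU thread, per the use case above
    Ort::Session session(env, "kitten.onnx", opts);  // path is an assumption

    // Hypothetical input: token IDs produced by the text frontend.
    std::vector<int64_t> tokens = {1, 42, 7, 99, 2};
    std::vector<int64_t> shape = {1, static_cast<int64_t>(tokens.size())};

    Ort::MemoryInfo mem =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input = Ort::Value::CreateTensor<int64_t>(
        mem, tokens.data(), tokens.size(), shape.data(), shape.size());

    const char* input_names[]  = {"input_ids"};  // name is an assumption
    const char* output_names[] = {"waveform"};   // name is an assumption
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input, 1, output_names, 1);

    // The output tensor would hold float PCM samples to hand to an audio sink.
    float* samples = outputs[0].GetTensorMutableData<float>();
    size_t n = outputs[0].GetTensorTypeAndShapeInfo().GetElementCount();
    std::cout << "generated " << n << " samples\n";
    return 0;
}
```

Link against `libonnxruntime` and include its headers; the same session setup works on desktop CPUs with no GPU backend configured.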


I want to run it in a website with Wasm and have the browser do the audio playback.


I've been playing with running small models in browser tabs for some time, and finally decided to open-source some of it.

Added kitten (nano only, for now, will move on to mini) to my "web tts thing": https://github.com/idle-intelligence/tts-web

demo: https://idle-intelligence.github.io/tts-web/web/


Desktop CPUs running inference on a single background thread would be the ideal case for what I'm considering.



