Hacker News

How much work would it be to use the C++ ONNX run-time with this instead of Python? Is it a Claudeable amount of work?

The iOS version is Swift-based.



Shouldn't be hard. What backend/hardware are you interested in running this with? I'll add an example of using the C++ ONNX model. BTW, check out the roadmap: our inference engine will be out in 1-2 weeks and is expected to be faster than ONNX.
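For reference, a minimal C++ ONNX Runtime sketch might look like the following. This is a guess at the shape of the integration, not the project's actual API: the model filename (`kitten.onnx`), the input/output tensor names (`input_ids`, `waveform`), and the token encoding are all assumptions — check the exported model's real signature (e.g. with Netron) before copying this.

```cpp
// Hedged sketch of TTS inference via the ONNX Runtime C++ API.
// Model path and tensor names below are hypothetical placeholders.
#include <onnxruntime_cxx_api.h>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "tts");
    Ort::SessionOptions opts;
    opts.SetIntraOpNumThreads(1);  // single background CPU thread, per the use case above
    Ort::Session session(env, "kitten.onnx", opts);  // path is an assumption

    // Hypothetical input: token IDs produced by the text frontend.
    std::vector<int64_t> tokens = {1, 42, 7, 99, 2};
    std::vector<int64_t> shape = {1, static_cast<int64_t>(tokens.size())};

    Ort::MemoryInfo mem =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input = Ort::Value::CreateTensor<int64_t>(
        mem, tokens.data(), tokens.size(), shape.data(), shape.size());

    const char* input_names[]  = {"input_ids"};  // name is an assumption
    const char* output_names[] = {"waveform"};   // name is an assumption
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input, 1, output_names, 1);

    // The output tensor would hold float PCM samples to hand to an audio sink.
    float* samples = outputs[0].GetTensorMutableData<float>();
    size_t n = outputs[0].GetTensorTypeAndShapeInfo().GetElementCount();
    std::cout << "generated " << n << " samples\n";
    return 0;
}
```

Link against `libonnxruntime` and include its headers; the same session setup works on desktop CPUs with no GPU backend configured.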


I want to run it in a website with Wasm and have the browser do the audio playback.


I've been playing with running small models in browser tabs for some time, and finally decided to open-source some of it.

Added kitten (nano only, for now, will move on to mini) to my "web tts thing": https://github.com/idle-intelligence/tts-web

demo: https://idle-intelligence.github.io/tts-web/web/


Desktop CPUs running inference on a single background thread would be the ideal case for what I'm considering.



