Hacker News | nikodunk's comments

Having read the above article, I just gave llama.cpp a shot. It is as easy as the author says now, though definitely not documented quite as well. My quickstart:

brew install llama.cpp

llama-server -hf ggml-org/gemma-4-E4B-it-GGUF --port 8000

Go to localhost:8000 for the Web UI. On Linux it accelerates correctly on my AMD GPU, which Ollama failed to do, though of course everyone's mileage seems to vary on this.


Was hoping it'd be that easy :) But I probably need to look into it some more.

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'gemma4'
llama_model_load_from_file_impl: failed to load model

Edit: @below, I used `nix-shell -p llama-cpp`, so not brew related. Could indeed be an older version! I'll check.


As has been discussed in a few recent threads on HN, whenever a new model is released, running it successfully may require changes in the inference backends, such as llama.cpp.

There are two main reasons. One is the tokenizer: new tokenizer definitions may be mishandled by older tokenizer parsers.

The second is that each model may implement tool invocations differently, e.g. by using different delimiter tokens and different text layouts to describe the parameters of a tool invocation.
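To illustrate that second point, here is a toy sketch. The delimiter tokens below are made up, not taken from any real model's chat template; the point is just that a parser hard-coded for one model's tool-call format fails on another's:

```python
# Toy illustration - both delimiter formats below are hypothetical.
call = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
model_a_output = f"<tool_call>{call}</tool_call>"
model_b_output = f"[TOOL_REQUEST]{call}[/TOOL_REQUEST]"

def parse_tool_call(text: str) -> str:
    # A parser written for model A's delimiters only.
    start = text.index("<tool_call>") + len("<tool_call>")
    end = text.index("</tool_call>")
    return text[start:end]

parse_tool_call(model_a_output)    # works, returns the JSON payload
# parse_tool_call(model_b_output)  # raises ValueError: substring not found
```

This is why a backend release often has to ship alongside (or after) a new model's template files.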

Therefore running the Gemma-4 models hit various problems during the first days after their release, especially with the dense 31B model.

Solving these problems required both a new version of llama.cpp (likewise for other inference backends) and updates to the models' chat template and tokenizer configuration files.

So anyone who wants to use Gemma-4 should update to the latest version of llama.cpp and to the latest models from Huggingface, since the latest updates landed only a couple of days ago.
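Concretely, something like this (a sketch: the formula name matches the brew install command above, and the cache path is an assumption based on llama.cpp's default download location):

```shell
# Update the inference backend
brew upgrade llama.cpp

# Clear cached GGUF/template files so the next -hf run
# re-downloads the updated files from Huggingface
rm -rf ~/.cache/llama.cpp
```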


I just hit that error a few minutes ago. I build my llama.cpp from source because I use CUDA on Linux. So I made the mistake of trying to run Gemma4 on an older version I had, and I got the same error. It's possible brew installs an older version which doesn't support Gemma4 yet.

Ah it was indeed just that!

I'm now on:

$ llama --version
version: 8770 (82764d8) built with GNU 15.2.0 for Linux x86_64

(From Nix unstable)

And this works as advertised, nice chat interface, but no openai API I guess, so no opencode...
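(For what it's worth, llama-server does also serve an OpenAI-compatible API under /v1 alongside the Web UI, so opencode-style tools should be able to point at it. A sketch, assuming the --port 8000 from the command above:)

```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}]}'
```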



Good stuff, thanx!

And that's exactly why llama.cpp is not usable by casual users. They follow the "move fast and break things" model. With ollama, you just have to make sure you're getting/building the latest version.

It's not possible to run the latest model architectures without 'moving fast'. The only thing broken here is that they are trying to use an old version with a new model.

And Ollama suffered the same fate when people wanted to try new models.

What fate?

The impedance mismatch between when models are released and when Ollama and other servers are capable of running them.

I'm a bit unsure what that has to do with someone running an outdated version of the program while trying to use a model that is supported in the latest release.

It's a new-ish project FYI. But to answer your questions:

- Apps: It's Linux (like desktop or server), but "image-based", so you install apps in containers like iOS or Android do (and therefore OS updates basically never break). https://flathub.org is generally the main app store for containerized Linux phone apps.

- Screenshots: It'll look the same as other Linux-on-phones, so like https://en.wikipedia.org/wiki/PostmarketOS for instance. It's just built differently.


Nitpick regarding apps: https://flathub.org/en/apps/collection/mobile/1 is the better link IMHO, even if not all apps in it do actually perform great on mobile [0], and some apps that work well on Mobile are not part of the collection due to lacking some bits in app metadata [1]. Help with sorting this out is very much welcome :-)

[0]: https://framagit.org/linuxphoneapps/linuxphoneapps.frama.io/...

[1]: https://framagit.org/linuxphoneapps/linuxphoneapps.frama.io/...


I'm sure PRs would be welcomed if you have those devices to test on.


Updating without worries has made it much more daily-drivable for me on a OnePlus 6 (i.e. it has rollbacks and image-based updates), despite being so new. It's fun that image-based OSes - which were arguably popularized by phones - are now coming back to phones on the Linux side too.


This is based on bootc (bootable containers), so note that the OS build is described in a normal Dockerfile: https://github.com/pocketblue/pocketblue/blob/main/Container... which is then run by the GitHub Action (or locally).

Very similar to how Universal Blue, Bazzite, Bluefin etc. build at https://github.com/ublue-os/bazzite (see their Containerfile), but for mobile.
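A minimal bootc Containerfile is just a normal container build on a bootable base image. A sketch (the base image tag and package are assumptions, not taken from the pocketblue repo):

```dockerfile
# Sketch of a minimal bootc build
FROM quay.io/fedora/fedora-bootc:42

# Layer OS changes exactly like an application container build
RUN dnf -y install htop && dnf clean all
```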

Has a similar mission to https://postmarketos.org, but with a different build system AFAICT


  > Dockerfile
nitpick: Containerfile. I mention it because people still think container==docker. I am sure the Fedora people focus on podman, as part of the Red Hat ecosystem. For a better dev experience they offer podman-bootc¹, which you will miss when using Docker. Personally I am convinced that we should steer people to podman instead of Docker.

1. https://docs.fedoraproject.org/en-US/bootc/getting-started/


Red Hat obviously wants to change people's vocabulary but "Dockerfile" is basically an industry-standard generic term by this point.


That is true, the same for "to google" if people mean "to search". It does bury the generality of the concept though. Like I said, a nitpick.


I think it's also worth noting that the Dockerfile format is still driven by Docker, and there have been zero extensions to the format by Podman folks, so Containerfile==Dockerfile.


Are we really bringing OCI to freaking OS builds? Nothing about OCI is pleasant. A list of tarballs is the most backwards boot format I can think of. Terrible for reproducibility. Terrible for security.

Boot images should be dm-verity-protected EROFS images. We should not be building new things on OCI. It's really mind-blowing to me that this is the direction people who are supposed to be best-in-class OS builders are moving in.

They took the CoreOS dream and threw everything in the trash


How is OCI terrible for reproducibility and security? OCI images are certainly more reproducible than what we had before; I haven't heard "works on my machine" in a long time. If you're talking about reproducible builds, there aren't any hard issues directly caused by OCI images either, except setting the clock correctly.

> Boot images should be Dm-verity protected EROFS images

Maybe I'm misunderstanding you - I gather that you think the boot images are distributed as OCI images? That's not the case, bootc is more about building the image, updating it and the overall structure. Booting an image built with bootc does not involve any container infrastructure (unless you start services that depend on containers, I guess - but that's deep in userspace). There's technically nothing preventing this from using verified read-only images.


> I gather that you think the boot images are distributed as OCI image

Yes? That's literally the sales pitch on the website. Am I missing something?

This quote from https://bootc-dev.github.io/ tells me that bootc uses OCI as a delivery format for bootable images:

> Transactional, in-place operating system updates using OCI/Docker container images.
>
> Motivation: The original Docker container model of using "layers" to model applications has been extremely successful. This project aims to apply the same technique for bootable host systems - using standard OCI/Docker containers as a transport and delivery format for base operating system updates.


For the record, bootc supports and has workflows for verity images.


  > Dm-verity protected EROFS images
First time I hear about it. Playing the devil's advocate: how does it improve over checksums + tarballs?


checksums + tarballs don't help with runtime integrity verification. You'll need additional technologies for that like dm-verity or fs-verity; see composefs.
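For context, the dm-verity flow looks roughly like this (a sketch using cryptsetup's veritysetup; the file names are placeholders):

```shell
# Build a Merkle hash tree over a read-only image; prints the root hash
veritysetup format rootfs.img rootfs.hash

# Later, verify the image against that root hash before trusting it
veritysetup verify rootfs.img rootfs.hash <root-hash-from-format>
```

Checksums only catch corruption at download time; dm-verity lets the kernel reject tampered blocks at read time.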


If you're thinking about actually using it, please note it only supports Red Hat-family distros.

https://github.com/bootc-dev/bootc/issues/865


I really hope _this_ quote is not fabricated - because what a fantastic quote!!



Fascinating repo, thank you for sharing!


Hot take from an AI skeptic: between this, Nano Banana and generative AI integrated into Gmail for repetitive emails, I’m starting to actually use Google’s AI for tasks I hate most.

Google appears to have their AI product game together!


Agreed! It's my default recommendation now for a "just works" Linux system.

It's also really great for development btw - been doing all of my development on it with Homebrew and Flatpaks for over a year now.

