I don't know how Ultrathink works, and I have no "real world" experience with Kamal, but I find it intriguing to see someone consider 11 deployments in 2 hours to be "fast".
Instead of handicapping yourself, fix your deployment pipeline; 10-minute deploys are not OK for an online store.
I would have liked to read about the "high availability" that's mentioned a couple of times in the article; the WAL Configuration section is not enough, and replication is expensive-ish.
"how does a smart car compare to a ford f150? its different in its intent and intended audience.
Ollama is someone who goes to walmart and buys a $100 huffy mountain bike because they heard bikes are cool. Torchchat is someone who built a mountain bike out of high quality components chosen for a specific task/outcome with the understanding of how each component in the platform functions and interacts with the others to achieve an end goal."
https://www.reddit.com/r/LocalLLaMA/comments/1eh6xmq/comment...
A longer answer, with some more details:
If you don't care about which quant you're using and want easy integration with desktop/laptop-based projects, use Ollama.
If you want to run on mobile, integrate into your own apps or projects natively, don't want to use GGUF, want to do quantization, or want to extend your PyTorch-based solution, use torchchat.
Right now Ollama (based on llama.cpp) is a faster way to get performance on a laptop or desktop, and a number of projects are pre-integrated with Ollama thanks to the OpenAI spec. It's also more mature, with more fit and polish.
That said, the commands that make everything easy use 4-bit quant models, and you have to do extra work to go find a GGUF model with a higher (or lower) bit quant and load it into Ollama.
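For what it's worth, loading your own GGUF quant into Ollama looks roughly like this (a sketch; the model filename and tag are hypothetical, and this assumes the Ollama daemon is running):

```shell
# Point a Modelfile at a GGUF you downloaded yourself,
# e.g. a Q8_0 quant pulled from Hugging Face (filename hypothetical):
echo 'FROM ./llama-3-8b.Q8_0.gguf' > Modelfile

# Register it with Ollama under a local name, then run it:
ollama create llama3-q8 -f Modelfile
ollama run llama3-q8
```

The catch from the parent comment still applies: Ollama imports the file into its own blob store, so the GGUF now exists on disk twice.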
Also worth noting: Ollama "containerizes" the models on disk, so you can't share them with other projects without going through Ollama, which is a hard pass for some users and use cases, since duplicating model files on disk isn't great.
https://www.reddit.com/r/LocalLLaMA/comments/1eh6xmq/comment...
You can also run any LLM privately with Ollama, LM Studio, and/or LLamaSharp on Windows, Mac, and iPhone; all are open source, customizable, user friendly, and frequently maintained.
Probably if you need any esoteric features that PyTorch supports. FlashAttention-2, for example, was supported much earlier in PyTorch than in llama.cpp, so if FlashAttention-3 follows the same path, it will probably make more sense to use this when targeting Nvidia GPUs.
It would appear that FlashAttention-3 already exists for PyTorch, based on this joint blog post between Nvidia, Together.ai, and Princeton about enabling it for PyTorch: https://pytorch.org/blog/flashattention-3/
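As a side note (not from the blog post), PyTorch already exposes a fused attention entry point that dispatches to the fastest available backend, including FlashAttention kernels on supported CUDA GPUs. A minimal sketch, assuming PyTorch 2.x; on CPU it falls back to a math implementation, so it runs anywhere:

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) tensors with toy sizes
q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# scaled_dot_product_attention picks a fused kernel when one is
# available for the device/dtype; the call itself is backend-agnostic.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```

The point is that code written against this API should pick up FlashAttention-3 kernels transparently as PyTorch enables them, with no call-site changes.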
Ollama currently has only one "supported backend," which is llama.cpp. It enables downloading and running models on CPU, and it might have a more mature server.
How well does it run on AMD GPUs these days compared to Nvidia or Apple silicon?
I've been considering buying one of those powerful Ryzen mini PCs to use as an LLM server on my LAN, but I've read that the AMD backend (ROCm, IIRC) is kind of buggy.
What package managers are people using in other languages to make sure that software "always works the same as when you first wrote it, without asterisks"? I'd like to understand how they solve the "package no longer exists in a central registry" problem.
This is not as much about a package manager as it is about conventions and necessary dependencies.
Standards ensure that, going forward, language semantics and syntax don't change. Having minimal dependencies ensures program longevity. A package manager cannot solve these problems, no matter how good it is at its job.
Python doesn't have a standard, and it's heavily reliant on dependencies, which are plentiful and similarly unregulated. A program written in C that uses only functionality described in some POSIX standard will endure decades unmodified. Even a Python hello-world program went stale some time ago, even though it's just one line.
> I'd like to understand how they solve the "package no longer exists in a central registry" problem.
This is, of course, an infrastructure/maintenance issue as much as a package-manager design issue. But in Nix's case, the public "binary cache" (for Nixpkgs/NixOS) includes not only final build outputs but also the source tarballs that go into them. Since Nix disallows network access at build time, all dependencies are represented this way, whether jar files, source tarballs, or whatever else: Nix itself must be the one to fetch your dependencies. Consequently, everything you fetch from the Internet for your build is a kind of intermediary Nix build that can be cached using the usual Nix tools. The Nix community's public cache has a policy of retaining copies of upstream sources forever (there has recently been talk of limiting storage of final built packages to a retention period of only two years, but sources will continue to be retained indefinitely; so far the cache reaches back to its inception around a decade ago).
Taken together, these things mean that when a project disappears entirely from GitHub or Maven Central or whatever, people building against old versions of it with Nix/Nixpkgs don't even notice. Nix just fetches those upstream sources from the public cache without even reaching out to that central repository from which those sources have been removed.
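Concretely, this works because every source fetch in Nix is a fixed-output derivation pinned by hash, so the bytes can come from any cache that has them rather than the original host. A minimal sketch (URL and hash are placeholders, not a real package):

```nix
# Fixed-output fetch: Nix verifies the hash of the result, so the
# tarball can be substituted from a binary cache even if
# example.org has long since deleted it.
fetchurl {
  url = "https://example.org/releases/foo-1.0.tar.gz";
  sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}
```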
For private use cases where your project and its dependencies won't be mirrored to the public cache of Nixpkgs builds, you can achieve the same effect by running your own cache or paying a hosted service to do that.
For builds outside the Nix universe, you can make special arrangements for each type of package your various builds fetch, and mirroring those repos. Then configure your builds to pull from your mirrors instead of the main/public ones.
Dotnet uses NuGet[1]. Packages in the system are immutable and never change. They can be unlisted but never deleted (except in limited and extreme cases, like malware), which means that even if a package maintainer stops publishing new versions to the repository, existing packages will remain publicly available for as long as Microsoft continues to exist.
Often a package says it works with OS version X but not X+1. That claim may be true or false, but what you described doesn't solve either version of that problem.
I think you're referring to vendoring dependencies? In Python/pip, for example, you can download the source for a package and point to the folder directly as a dependency instead of a version or a git URL. Most package managers/languages support some version of that. I suppose if you wanted to vendor all dependencies by default, keep them updated, etc., it would take a little more scripting or some extra tools.
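A rough sketch of that workflow with pip (requires network access for the first step; `vendor/` is an arbitrary directory name):

```shell
# Download sdists/wheels for every dependency into a local directory:
pip download -r requirements.txt -d vendor/

# Later, install strictly from the vendored copies, never touching
# PyPI, even if a package has since vanished from the registry:
pip install --no-index --find-links=vendor/ -r requirements.txt
```

Checking `vendor/` into your repo (or an artifact store) is the scripting step that makes this survive registry deletions.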
Using the GitLab calculator: if you are a developer in Medellín, you are paid 3x less than if you are living in the Bay Area. If you are a customer in Medellín, you pay the same as a customer in the Bay Area. In this case, "cost of living"[0] only applies to employees and not customers, which in my opinion says a lot about how much the company values its staff.
I appreciate the openness from GitLab on this matter though, they're pretty up-front about it so if you don't like it don't apply.
[0] When the phrase "cost of living" is mentioned but it's not reflected in the price for final users, you know you're about to get fucked.
It seems to me there is a lot of Vim vocabulary that is very different from what is common on the desktop today. I don't know what "yank" means, and I don't think I have a desktop app that has a yank function.
On most Unix systems you should be able to go to the beginning of a line, press Ctrl-K to kill (cut) the rest of the line, and then Ctrl-Y to yank it back. At the very least, your browser and terminal should understand those two commands.