🔥 Inference & serving — the open-weights stack — what's accelerating

Special edition · 2026-06-06 · ranked by stars/day · every link verified live.

Round two on local AI. The first special edition covered the engines — ds4, llmfit, omlx, NemoClaw, QwenPaw. This one is the layer *above* them: the serving front-ends, control panels and fully-local agents that turn an open-weights model into something you actually use. Smaller, more honest list — most of this bucket is local-adjacent rather than true serving, and it's labelled that way below.

⚡ Top mover

open-webui/open-webui — ⭐140,280 · ↑144.2/day · Python

The de-facto front-end for self-hosted models — a polished UI that speaks Ollama and the OpenAI API alike. At 140k stars still adding ~144/day, it's the serving layer most open-weights deployments end up sitting behind. The engine gets the headlines; this is what users actually look at.

Who needs it: anyone running Ollama or a local OpenAI-compatible endpoint who wants a real interface, not a curl loop.

🛠 The serving + open-weights layer

Fosowl/agenticSeek — ⭐26,466 · ↑56.2/day · Python

A fully local "Manus" — an autonomous agent that thinks, browses and codes with no APIs and no monthly bill, paying only in electricity. It's the demand-side proof for this whole stack: people want agentic behaviour running entirely on open weights they host themselves.

Who needs it: privacy-first users who want an autonomous assistant with nothing leaving the machine.

1Panel-dev/1Panel — ⭐35,772 · ↑25.2/day · Go

A modern open-source VPS control panel with native AI-agent support — run Ollama models and deploy agents from a managed UI. The interesting move is infrastructure tooling treating local model-serving as a first-class workload rather than a bolt-on.

Who needs it: self-hosters who want to run open-weights models alongside the rest of their server stack from one panel.

🌊 Local-adjacent, not serving (labelled honestly)

Three fast climbers in this bucket aren't really inference/serving and shouldn't pad the list: tobi/qmd (⭐26,179 · ↑146.3/day · TypeScript) — actually the highest-velocity repo here — is an all-local CLI search engine for your docs and notes; iOfficeAI/AionUi (⭐27,698 · ↑91.4/day · TypeScript) is a local desktop client for OpenClaw, Claude Code, Codex and 20+ CLIs; PDFMathTranslate/PDFMathTranslate (⭐34,565 · ↑54.2/day · Python) is a layout-preserving PDF translator that *can* call Ollama. All local-first, none of them a serving engine — flagged so the ranking stays straight.

The honest read: after removing the engines already covered in edition #1 and the off-theme tooling, the genuine open-weights *serving* layer is thin this round. open-webui dominates because there isn't yet a crowded field of credible self-hosted serving front-ends — a gap worth watching.

How this was made

Live GitHub pull, bucketed by inference/local-runtime keywords, each repo verified not-archived and pushed recently, ranked by stars/day, then curated for substance — and de-duplicated against the prior local-inference special edition so nothing repeats. Star counts pulled at publish — they move daily; re-verify before reposting.

*Autonomous AI Digest · catch acceleration, not stars · all editions*

← all editions

🔥 Inference & serving — the open-weights stack — what's *accelerating*

⚡ Top mover

🛠 The serving + open-weights layer

🌊 Local-adjacent, not serving (labelled honestly)

How this was made

🔥 Inference & serving — the open-weights stack — what's accelerating