Special edition · 2026-06-05 · ranked by stars/day · every link verified live.
Run-it-yourself is having a moment. The repos below are the fastest-climbing projects for getting models off the cloud and onto your own hardware: credible engines and tooling, not chat wrappers.
antirez/ds4 — ⭐13,061 · ↑435/day · C
A DeepSeek-4-Flash local inference engine for Metal and CUDA, from Salvatore Sanfilippo (creator of Redis). Pedigree is the signal: antirez ships famously clean, dependency-light C. The most credible new local-inference engine of the moment.
Who needs it: anyone running models locally on Apple Silicon or NVIDIA who wants a lean, readable engine.
AlexsJones/llmfit — ⭐27,489 · ↑250/day · Rust
"Hundreds of models & providers. One command to find what runs on *your* hardware." Answers the single most annoying local-AI question — *will this model even fit my GPU/RAM?* — instantly. Hardware-aware, Rust-fast.
Who needs it: anyone choosing a local model and tired of trial-and-error OOMs.
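For a feel of the arithmetic behind the "will it fit?" question, here is a rough back-of-envelope sketch. It is not llmfit's method: the function names and the 1.2x overhead factor are assumptions made for illustration, and it counts model weights only, ignoring KV cache growth at long context.

```python
def estimate_weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billions: float, bits_per_weight: float,
         available_gib: float, overhead: float = 1.2) -> bool:
    """Rule-of-thumb check: weights plus an assumed overhead margin.

    The default overhead of 1.2 is a guess covering activations and
    runtime buffers; real headroom depends on context length and engine.
    """
    needed = estimate_weights_gib(params_billions, bits_per_weight) * overhead
    return needed <= available_gib


if __name__ == "__main__":
    # A 7B model at 4-bit quantization against 8 GiB of memory.
    print(fits(7, 4, available_gib=8))
    # A 70B model at 16-bit against a 24 GiB GPU.
    print(fits(70, 16, available_gib=24))
```

Treat the result as a first filter only. A model that passes this check can still run out of memory once the context window fills up.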
jundot/omlx — ⭐16,055 · ↑143/day · Python
An LLM inference *server* with continuous batching and SSD caching, tuned for Apple Silicon. Production-shaped serving (throughput, caching) rather than a single-user chat loop — the difference between a demo and something you'd put behind an app.
Who needs it: Mac developers serving models to real traffic.
NVIDIA/NemoClaw — ⭐20,990 · ↑256/day · TypeScript
NVIDIA's own answer to running agents (Hermes, OpenClaw) *securely* inside a managed-inference sandbox. The signal matters more than the repo: when NVIDIA ships tooling specifically to contain autonomous agents, the industry is conceding that agent security and blast radius are first-class problems, not afterthoughts.
Who needs it: anyone running autonomous agents who's worried about what a hijacked agent could reach.
agentscope-ai/QwenPaw — ⭐17,281 · ↑169/day · Python
A self-hostable personal AI assistant (Qwen-based) you deploy on your own machine or cloud. Owned, not rented.
Who needs it: people who want a private assistant without sending everything to a vendor.
DeepSeek-native local tooling is a mini-wave — ds4 (inference engine) and DeepSeek-Reasonix (terminal agent) are both fast-climbing and both built *around* DeepSeek rather than OpenAI/Anthropic. Worth watching as a sign the open-weights stack is maturing its own ecosystem.
Method: a live GitHub pull, bucketed by inference/local-runtime keywords. Each repo was verified as not archived and pushed within the last 45 days, then ranked by stars/day and curated for substance. Star counts were pulled at publish time and move daily, so re-verify before reposting.
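The filter-and-rank step can be sketched in a few lines. This is a minimal illustration, not the digest's actual pipeline: the record fields, the sample repos, and above all the stars/day formula (total stars divided by repo age in days) are assumptions. The real metric could just as well be a recent-window gain.

```python
from datetime import datetime, timedelta, timezone

MAX_PUSH_AGE = timedelta(days=45)  # freshness window stated in the method note


def stars_per_day(repo: dict, now: datetime) -> float:
    # Assumption: velocity = total stars / age in days (floored at one day).
    age_days = max((now - repo["created_at"]).total_seconds() / 86400, 1.0)
    return repo["stars"] / age_days


def rank_repos(repos: list, now: datetime) -> list:
    """Drop archived or stale repos, then sort by stars/day, fastest first."""
    fresh = [
        r for r in repos
        if not r["archived"] and now - r["pushed_at"] <= MAX_PUSH_AGE
    ]
    return sorted(fresh, key=lambda r: stars_per_day(r, now), reverse=True)


# Hypothetical sample records, not real repositories.
NOW = datetime(2026, 6, 5, tzinfo=timezone.utc)
SAMPLE = [
    {"name": "example/steady", "stars": 20000, "archived": False,
     "created_at": NOW - timedelta(days=100), "pushed_at": NOW - timedelta(days=5)},
    {"name": "example/fast", "stars": 3000, "archived": False,
     "created_at": NOW - timedelta(days=10), "pushed_at": NOW - timedelta(days=1)},
    {"name": "example/stale", "stars": 50000, "archived": False,
     "created_at": NOW - timedelta(days=100), "pushed_at": NOW - timedelta(days=90)},
    {"name": "example/archived", "stars": 9000, "archived": True,
     "created_at": NOW - timedelta(days=30), "pushed_at": NOW - timedelta(days=2)},
]

if __name__ == "__main__":
    for repo in rank_repos(SAMPLE, NOW):
        print(repo["name"], round(stars_per_day(repo, NOW)))
```

The final "curated for substance" pass is editorial judgment and is deliberately left out of the sketch.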
*Autonomous AI Digest · catch acceleration, not stars*