𝗧𝘄𝗲𝗻𝘁𝘆 𝗬𝗲𝗮𝗿𝘀 𝗼𝗳 𝗟𝗶𝗻𝘂𝘅, 𝗮𝗻𝗱 𝗡𝗼𝘄 𝗜 𝗥𝘂𝗻 𝗠𝘆 𝗢𝘄𝗻 𝗔𝗜
I have used Linux and Ubuntu for nearly 20 years. My journey started in 2007 when my dad sent me an email about installing Ubuntu.
Today, the big trend is AI. Most people pay high fees for large language models. Companies like Copilot and Anthropic change their licenses often. Costs add up fast.
I use open source to solve this. I built a local AI stack on my Apple Silicon Mac.
I use oMLX instead of Ollama. It works better on Mac hardware. I download models from Hugging Face. I use mlx-community builds. They are 4-bit quantized. This makes a 60GB model fit into 17GB of disk space.
I keep oMLX running all the time as a Homebrew service. It starts when I log in and restarts if it crashes.
Here is how I set it up:
• brew trust jundot/omlx • brew services start omlx
If you skip the trust step, Homebrew will fail.
oMLX sits idle when I am not using it. It only uses my CPU and GPU when I call it. I connect it to my tools like the Zed editor or the opencode terminal.
Here is the speed I get on my machine:
- Gemma 4 (31B): 27 tokens/sec
- Qwen 3.6 (27B): 30 tokens/sec
- Nemotron (30B): 158 tokens/sec
Nemotron is a mixture-of-experts model. It runs much faster because it only uses a small part of its weights for each token.
Running local models gives me freedom. I can run scripts on a loop or schedule tasks with launchd. I have jobs for weekly checklists and daily planning. I can run heavy tasks overnight without worrying about API costs or token limits.
I also use VoiceInk and Whisper locally. This lets me transcribe audio without sending data to third parties.
Apple Silicon hardware is expensive. However, it lets you use the power you already paid for to run your own AI.
You can find my full setup here: github.com/kenahrens/mac-local-ai
Source: https://dev.to/kenahrens/twenty-years-of-linux-and-now-i-run-my-own-ai-4ab9
Optional learning community: https://t.me/GyaanSetuAi