๐—ฅ๐˜‚๐—ป๐—ป๐—ถ๐—ป๐—ด ๐—Ÿ๐—ฎ๐—ป๐—ด๐˜‚๐—ฎ๐—ด๐—ฒ ๐— ๐—ผ๐—ฑ๐—ฒ๐—น๐˜€ ๐—ถ๐—ป ๐˜๐—ต๐—ฒ ๐—•๐—ฟ๐—ผ๐˜„๐˜€๐—ฒ๐—ฟ

Cloud LLMs are expensive. Every prompt also sends your data to someone else's server. Local models avoid both problems.

SmolLM2 runs in your browser. It uses your GPU. If the GPU is not available, it falls back to the CPU.
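The GPU-then-CPU fallback can be sketched in a few lines. This is a minimal sketch, not the app's actual code: it assumes the GPU path is WebGPU (probed through the standard `navigator.gpu.requestAdapter()` call) and that the CPU path is WebAssembly. The names `Backend`, `GpuLike`, `pickBackend`, and `detectBackend` are made up for illustration.

```typescript
// Hypothetical names for illustration: Backend, GpuLike, pickBackend, detectBackend.
// Assumption: "GPU" means WebGPU and "CPU" means a WebAssembly runtime.
type Backend = "webgpu" | "wasm";

// The only part of navigator.gpu this sketch relies on.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Pure decision step: use the GPU only if an adapter was actually obtained.
function pickBackend(adapter: unknown): Backend {
  return adapter ? "webgpu" : "wasm";
}

// Probe the browser. A missing WebGPU API, a null adapter, or a thrown
// error all lead to the CPU path.
async function detectBackend(gpu: GpuLike | undefined): Promise<Backend> {
  if (!gpu) return "wasm";
  try {
    return pickBackend(await gpu.requestAdapter());
  } catch {
    return "wasm";
  }
}
```

In a page you would call `detectBackend((navigator as any).gpu)` once at startup and pass the result to whatever runtime loads the model. Keeping the decision in a small pure function makes the fallback easy to test without a real GPU.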

Large models need huge resources. Models at GPT-4 scale need far more memory than any consumer machine has. A browser tab can use only a few gigabytes. This is why most AI lives in the cloud.
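A rough back-of-the-envelope estimate shows the gap. Weight memory is roughly parameter count times bits per weight, divided by 8. The sketch below is illustrative only: `weightBytes` is a made-up helper, the 8-bit and 16-bit precisions are assumptions, and the 70-billion-parameter model is a hypothetical stand-in for "large" (GPT-4's size is not public). SmolLM2's smallest variant has about 135 million parameters.

```typescript
// Hypothetical helper: rough weight-memory estimate in bytes.
// Ignores activations, KV cache, and runtime overhead.
function weightBytes(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8;
}

const MB = 1e6;
const GB = 1e9;

// SmolLM2's smallest variant (~135M parameters), assuming 8-bit weights.
const smallModel = weightBytes(135e6, 8);

// Hypothetical 70-billion-parameter model, assuming 16-bit weights.
const largeModel = weightBytes(70e9, 16);

console.log(`${smallModel / MB} MB`); // prints "135 MB"
console.log(`${largeModel / GB} GB`); // prints "140 GB"
```

Under these assumptions the small model fits comfortably in a browser tab, while the large one is about a thousand times bigger and does not.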

I built a web app with SmolLM2. The model is 140MB. It works on desktops. Mobile browsers struggle with the setup step.

I also tried pre-compiled models. They work better on mobile because they skip the hard setup step. The trade-off is a larger download.

A small, well-tuned model often beats a large, poorly tuned one.

Output quality is still poor. It is not yet as good as what big LLMs produce. But the technology is improving.

Source: https://dev.to/garciadiazjaime/running-language-models-directly-in-the-browser-2c3e