Running Language Models in the Browser
Cloud LLMs cost a lot, and you give your data away for free. Running models locally solves both problems.
SmolLM2 runs in your browser. It uses your GPU. If the GPU fails, it uses the CPU.
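The GPU-then-CPU fallback could be sketched roughly like this. It is a minimal illustration, not the app's actual code: `pickBackend`, `NavigatorLike`, and the `"webgpu"` / `"wasm"` labels are hypothetical names chosen for the example, and the sketch assumes the GPU path goes through the browser's WebGPU API (`navigator.gpu.requestAdapter()`).

```typescript
// Minimal shape of the parts of `navigator` this sketch touches, declared
// locally so the block type-checks without extra WebGPU type packages.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Hypothetical labels: "webgpu" means run on the GPU, "wasm" means run on the CPU.
type Backend = "webgpu" | "wasm";

// Hypothetical helper: choose the GPU when an adapter is available,
// otherwise fall back to the CPU.
async function pickBackend(nav: NavigatorLike): Promise<Backend> {
  // Browsers without WebGPU do not expose `navigator.gpu` at all.
  if (!nav.gpu) return "wasm";
  try {
    // requestAdapter() resolves to null when no suitable GPU is found.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Treat any error during GPU setup as "GPU unavailable".
    return "wasm";
  }
}
```

In a page, this would be called with the real `navigator` object, and the result passed to whatever inference library loads the model.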
Large models need huge resources. GPT-4's size is not public, but models at that scale are estimated to need terabytes of memory. Your browser only has a few gigabytes. This is why most AI lives in the cloud.
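A back-of-the-envelope calculation shows the gap. The memory needed just to hold a model's weights is roughly the parameter count times the bytes per weight. The sketch below uses illustrative numbers only: the 135M figure matches the smallest published SmolLM2 variant, while the precisions and the 70B example are assumptions for the sake of the arithmetic, and `estimateWeightBytes` and `formatBytes` are hypothetical helper names. It ignores activations and runtime overhead.

```typescript
// Rough size of the weights alone: parameters * bits per weight / 8 bits per byte.
function estimateWeightBytes(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8;
}

// Format a byte count using decimal units (1 MB = 1e6 bytes, 1 GB = 1e9 bytes).
function formatBytes(bytes: number): string {
  const GB = 1e9;
  const MB = 1e6;
  return bytes >= GB
    ? `${(bytes / GB).toFixed(1)} GB`
    : `${(bytes / MB).toFixed(0)} MB`;
}

// Illustrative cases (assumed precisions, not measurements).
console.log(formatBytes(estimateWeightBytes(135e6, 8)));  // 135 MB
console.log(formatBytes(estimateWeightBytes(1.7e9, 16))); // 3.4 GB
console.log(formatBytes(estimateWeightBytes(70e9, 16)));  // 140.0 GB
```

Under these assumptions, a model in the 100M-parameter range fits comfortably in a browser tab, while one with tens of billions of parameters does not.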
I built a web app with SmolLM2. It is 140MB. It works on desktops. Mobile browsers struggle with the setup.
I also tried pre-compiled models. They work better on mobile because they skip the hard setup step. The trade-off is a bigger download.
A small, well-tuned model often beats a large, poor one.
Current outputs are poor. They fall well short of big LLMs. But the technology is improving.
Source: https://dev.to/garciadiazjaime/running-language-models-directly-in-the-browser-2c3e