The Benefits Of Combining Local and Hosted LLM
I am working on a large project called gas-fakes. It allows local execution, continuous integration, and containerization of native Apps Script code.
- We are liberating Apps Script code
- We are making it more accessible
The project has 4,399 methods and 10,500 parity tests. I now use AI to help with the coding work.
- I use a local model to do the heavy work
- I use a hosted model for planning and decision-making
You can use the same technique to save time and money.
- The local model handles the token-heavy work, which reduces hosted token costs
- The hosted model is reserved for high-level planning
Here's how it works:
- The hosted model determines what needs to be done
- The local model executes the task
- The hosted model integrates the result into the final response
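The three steps above can be sketched as a small loop. This is a minimal illustration, not the gas-fakes implementation: the `run_hybrid` function, the `PLAN:` and `INTEGRATE:` prompt prefixes, and the two stub models are all hypothetical, and each model is treated as a plain function from prompt text to response text.

```python
from typing import Callable, List

# Assumption: a "model" is any function that maps a prompt to a response.
Model = Callable[[str], str]


def run_hybrid(request: str, hosted: Model, local: Model) -> str:
    """Plan with the hosted model, execute with the local one, then integrate."""
    # Step 1: the hosted model determines what needs to be done.
    plan = hosted(f"PLAN: {request}")
    tasks: List[str] = [line.strip() for line in plan.splitlines() if line.strip()]

    # Step 2: the local model executes each task (the token-heavy part).
    results = [local(task) for task in tasks]

    # Step 3: the hosted model integrates the results into the final response.
    return hosted("INTEGRATE: " + "\n".join(results))


# Hypothetical stand-ins so the sketch runs without any real model.
def fake_hosted(prompt: str) -> str:
    if prompt.startswith("PLAN:"):
        return "write function\nwrite test"
    body = prompt[len("INTEGRATE: "):]
    return "final: " + body.replace("\n", "; ")


def fake_local(task: str) -> str:
    return f"done({task})"


if __name__ == "__main__":
    print(run_hybrid("add a method", fake_hosted, fake_local))
```

In a real setup the two stubs would be replaced by calls to the hosted provider and to the local orchestrator. Only the short planning and integration prompts reach the hosted model, which is where the token savings come from.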
To get started, you need to set up a local model orchestrator.
- You can use oMLX on a Mac
- You can use other orchestrators on other devices
You can control the system behavior via environment variables.
- You can verify that everything is working
- You can check the oMLX dashboard
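As a rough sketch of environment-variable control, the snippet below reads a few settings and falls back to defaults. The variable names (`HYBRID_USE_LOCAL`, `HYBRID_LOCAL_URL`, `HYBRID_LOCAL_MODEL`) and the default values are hypothetical placeholders, not the names used by gas-fakes or oMLX. Check the source article for the real ones.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Values treated as "enabled" for boolean switches.
TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class HybridConfig:
    use_local: bool       # route execution work to the local model?
    local_base_url: str   # where the local orchestrator listens (placeholder)
    local_model: str      # model name served locally (placeholder)


def load_config(env: Mapping[str, str] = os.environ) -> HybridConfig:
    """Build the config from environment variables (hypothetical names)."""
    use_local = env.get("HYBRID_USE_LOCAL", "1").strip().lower() in TRUTHY
    return HybridConfig(
        use_local=use_local,
        local_base_url=env.get("HYBRID_LOCAL_URL", "http://localhost:8000/v1"),
        local_model=env.get("HYBRID_LOCAL_MODEL", "local-model"),
    )


if __name__ == "__main__":
    # Print the effective settings as a quick check that they were picked up.
    print(load_config())
```

Passing the environment in as a mapping keeps the function easy to test, and printing the resolved config is one simple way to verify the setup before checking the dashboard.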
Source: https://dev.to/brucemcpherson/combining-local-and-hosted-llm-to-minimize-token-cost-o69
Optional learning community: https://t.me/GyaanSetuAi