স্বল্পমূল্যে অত্যাধুনিক মানের কোডিং

Translated for your language. Read the original.

AI-assisted draft.

GyaanSetu Editorial৬ দিন আগে2min read

Frontier-Quality Coding at Cheap-Tier Cost

You can get frontier-quality coding scores at a fraction of the cost.

We built a system that uses a cheap local model for most tasks. It only sends hard problems to a frontier model. This method works because of the structure, not just the model size.

How the architecture works:

Two channels: A capability channel (the cheap local model) and a structure channel (verification gates).
Verification: Guards decide if an answer is trustworthy.
Escalation: If guards fail, the system moves the request to a frontier model.
Cache: A cache layer prevents re-solving exact repeats.

The results from our HumanEval+ tests:

Full cascade score: 94.5% plus correctness.
Local model solo score: 84.8% plus correctness.
The structure channel adds roughly 10 points of accuracy.

We tested the importance of the structure through an ablation study:

Full system: 100% correct.
Removed verification: 75% correct.
Removed guards: 50% correct.

Correctness drops by half when you remove the guards. This proves the structure carries the reliability.

The cost benefits:

Blended cost: $0.00201 per request.
Frontier cost: $0.017 per request.
Our system is about 8x cheaper than using a frontier model for every request.
91% of requests are served by the local model.

A note on long context:

Our compaction layer uses 165 tokens compared to 28,000 tokens for raw context. This is a massive increase in efficiency. We hit an infrastructure limit at 208k tokens, but this is a setting, not a model failure.

What we have not proven yet:

We do not have official long-horizon benchmark numbers. We have built the runners for RULER and SWE-bench, but we have not run them in a clean sandbox. We are not claiming official results for long-horizon performance yet.

Summary of our claim:

Our system matches frontier coding scores while using cheap local models. This reduces costs by 8x. The reliability comes from our structure channel.

Source: https://dev.to/tom_jones_230c4659491adcd/frontier-quality-coding-at-cheap-tier-cost-what-we-built-and-how-we-measured-it-3g2j

Optional learning community: https://t.me/GyaanSetuAi

স্বল্পমূল্যে অত্যাধুনিক মানের কোডিং

Continue reading

এজেন্ট আর্কিটেকচার হলো একটি কম্পিউট অ্যালোকেশন প্রবলেম

𝗧𝗶𝗲𝗿𝗲𝗱 𝗔𝗜 𝗖𝗼𝗱𝗲 𝗥𝗲𝘃𝗶𝗲𝘄: 𝗔 𝗙𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸 𝗳𝗼𝗿 𝗔𝗜 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗲𝗱 𝗣𝗥𝘀

ভেরিফিকেশন খরচই হলো এআই কোডিংয়ের আসল খরচ

কম খরচের এআই কোডিং মডেলের জন্য একটি যাচাইকরণ সিঁড়ি