Stop Guessing LLM Sampling Parameters
You pick temperature 0.7 because a blog says so. Your bot starts talking nonsense. You try top-p 0.9. Then top-k 50. You are guessing.
Most teams lack a test set, so they cannot tell whether a change helped. They fall back on defaults tuned for general chat. Your use case is probably different.
Learn these four knobs:
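- Temperature: Controls randomness by rescaling the token probabilities before sampling. Use 0 for facts. Use 0.7 for chat. Use 1.0 for ideas.
- Top-p: Keeps the most likely tokens until their cumulative probability reaches the limit. It adapts to model confidence: few tokens survive when the model is sure, many when it is not.
- Top-k: Keeps a fixed number of the most likely tokens, no matter how confident the model is. It is too rigid for most work.
- Min-p: Keeps tokens whose probability is at least a set fraction of the top token's probability. It is a great safety net for production.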
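The three filters are easier to compare on a toy distribution. Below is a minimal sketch in plain Python, not any library's implementation. The function names (softmax, top_k_keep, top_p_keep, min_p_keep) and the two example distributions are made up for illustration, and real inference engines may differ in details such as tie-breaking or the order in which filters are applied.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities. Temperature 0 is treated as greedy (all mass on the argmax)."""
    if temperature <= 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_keep(probs, k):
    """Indices of the k most likely tokens, regardless of how the mass is spread."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return sorted(order[:k])

def top_p_keep(probs, p):
    """Indices of the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return sorted(kept)

def min_p_keep(probs, min_p):
    """Indices of tokens with probability at least min_p times the top token's probability."""
    threshold = min_p * max(probs)
    return [i for i, q in enumerate(probs) if q >= threshold]

# Hypothetical next-token distributions: one confident, one uncertain.
confident = [0.92, 0.04, 0.03, 0.01]
flat = [0.30, 0.25, 0.25, 0.20]

print(top_k_keep(confident, 2), top_k_keep(flat, 2))        # [0, 1] [0, 1]
print(top_p_keep(confident, 0.9), top_p_keep(flat, 0.9))    # [0] [0, 1, 2, 3]
print(min_p_keep(confident, 0.05), min_p_keep(flat, 0.05))  # [0] [0, 1, 2, 3]
```

Top-k returns two tokens in both cases. Top-p and min-p shrink to one token when the model is confident and keep all four when it is not.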
Production Recipes:
- Chat: temperature 0.7, top-p 0.9, min-p 0.05.
- Code: temperature 0.1, top-p 0.95, min-p 0.01.
- Labels: temperature 0.
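One way to keep these recipes out of scattered call sites is a small lookup table. This is only a sketch: the RECIPES name, the sampling_params helper, and the key names (temperature, top_p, min_p) are assumptions for illustration. Not every inference API exposes min-p, so check what your provider actually accepts.

```python
# Recipe values copied from the list above; key names are illustrative, not tied to any API.
RECIPES = {
    "chat": {"temperature": 0.7, "top_p": 0.9, "min_p": 0.05},
    "code": {"temperature": 0.1, "top_p": 0.95, "min_p": 0.01},
    "labels": {"temperature": 0.0},
}

def sampling_params(task, **overrides):
    """Return a copy of the recipe for a task, with optional per-call overrides."""
    if task not in RECIPES:
        raise KeyError(f"no recipe for task {task!r}; known tasks: {sorted(RECIPES)}")
    params = dict(RECIPES[task])  # copy so callers cannot mutate the shared table
    params.update(overrides)
    return params

print(sampling_params("code"))
print(sampling_params("chat", temperature=0.5))
```

Treat the table as a starting point and change a value only after a measured comparison.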
Avoid these mistakes:
- Stop using temperature above 1.5. You get noise, not creativity.
- Stop using top-k as your only tool.
- Stop tuning without a metric. Measure first. Tune second.
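"Measure first" can start very small. The sketch below scores several parameter sets against a labeled test set using exact-match accuracy. Everything here is hypothetical: fake_generate is a deterministic stand-in for your real model call, and the test set, config names, and metric are placeholders you would replace with your own.

```python
def accuracy(generate, test_set, params, n_samples=1):
    """Fraction of exact-match answers over the test set, sampling each prompt n_samples times."""
    hits, total = 0, 0
    for prompt, expected in test_set:
        for _ in range(n_samples):
            answer = generate(prompt, **params)
            hits += int(answer.strip() == expected)
            total += 1
    return hits / total

def compare(generate, test_set, configs, n_samples=1):
    """Score every named parameter set with the same metric and the same data."""
    return {name: accuracy(generate, test_set, params, n_samples)
            for name, params in configs.items()}

# Stand-in for a real model call (assumption): answers correctly only at low temperature.
_ANSWERS = {"Is 7 prime?": "yes", "Is 8 prime?": "no"}

def fake_generate(prompt, temperature=1.0, **_ignored):
    return _ANSWERS[prompt] if temperature <= 0.5 else "unsure"

test_set = list(_ANSWERS.items())
configs = {
    "greedy": {"temperature": 0.0},
    "hot": {"temperature": 1.2, "top_p": 0.9},
}
print(compare(fake_generate, test_set, configs))  # {'greedy': 1.0, 'hot': 0.0}
```

With a real model, raise n_samples for any config with temperature above 0, because a single sample per prompt makes the score noisy.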
If greedy decoding (temperature 0) fails, the problem is your prompt or the model. Sampling parameters will not fix either one.
Source: https://dev.to/tech_nuggets/sampling-strategies-compared-temperature-top-p-top-k-min-p-and-what-actually-works-in-2o16