𝗦𝘁𝗼𝗽 𝗟𝗟𝗠 𝗖𝗼𝘀𝘁 𝗦𝗽𝗶𝗸𝗲𝘀 𝗕𝗲𝗳𝗼𝗿𝗲 𝗕𝗶𝗹𝗹𝗶𝗻𝗴
You use OTel and OpenInference. You see token counts. You do not see which team is spending the money.
Use these three attributes.
- team.id: Tag spans at the gateway. This shows cost by team.
- feature.id: Tag the feature. This shows which feature spikes.
- llm.model: Separate cheap models from expensive ones. (OpenInference records this as llm.model_name.)
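A minimal sketch of the tagging step, assuming a Python gateway. The helper names (cost_attributes, tag_span) are made up for illustration. The span only needs a set_attributes method, which OpenTelemetry's Python Span provides.

```python
from typing import Dict


def cost_attributes(team_id: str, feature_id: str, model: str) -> Dict[str, str]:
    """Build the three cost attributes named in the list above."""
    return {
        "team.id": team_id,
        "feature.id": feature_id,
        "llm.model": model,
    }


def tag_span(span, team_id: str, feature_id: str, model: str) -> None:
    """Attach the cost attributes to a span.

    `span` is any object with a set_attributes(dict) method,
    for example an OpenTelemetry span.
    """
    span.set_attributes(cost_attributes(team_id, feature_id, model))
```

In a real gateway you would likely pass the active span, for example the result of opentelemetry.trace.get_current_span(). Doing it in one place at the gateway means individual services do not have to remember the tags.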
Run a daily query in Grafana. Look at the 95th percentile of output tokens. Group by team, feature, and model.
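To show what that query computes, here is a rough Python sketch over exported span rows. It is not a Grafana query. The row shape and the "output_tokens" key are assumptions for illustration, and the percentile uses the simple nearest-rank method.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

GroupKey = Tuple[str, str, str]  # (team.id, feature.id, llm.model)


def p95(values: List[int]) -> int:
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def p95_output_tokens(rows: Iterable[dict]) -> Dict[GroupKey, int]:
    """Group rows by team, feature, and model, then take p95 of output tokens."""
    groups: Dict[GroupKey, List[int]] = defaultdict(list)
    for row in rows:
        key = (row["team.id"], row["feature.id"], row["llm.model"])
        groups[key].append(row["output_tokens"])  # hypothetical field name
    return {key: p95(tokens) for key, tokens in groups.items()}
```

The p95 matters more than the mean here because a few very long completions can hide inside a normal-looking average.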
Set an alert for a 2x jump over the 7-day average. This caught a retry loop last quarter. The main dashboard missed it. Total spend stayed under budget, but one team's spend had doubled.
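The alert rule can be sketched like this. It assumes "2x jump" means today's value is at least twice the mean of the previous seven daily values. The function name and the daily-value input are illustrative, not from the source.

```python
from typing import Sequence


def is_spike(history: Sequence[float], today: float, factor: float = 2.0) -> bool:
    """Return True when today's value is at least `factor` times
    the average of the last 7 values in `history`."""
    window = list(history)[-7:]  # trailing 7 days, or fewer if history is short
    if not window:
        return False  # no baseline yet, so do not alert
    baseline = sum(window) / len(window)
    return baseline > 0 and today >= factor * baseline
```

Run the check per team, feature, and model group. A check on the total alone is what let the retry loop slip past the main dashboard.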
Skip user.id for privacy. Skip request.id to keep data small: it is unique per request, so it adds cardinality without helping you group costs.
Source: https://dev.to/jasmine_park_dev/span-attributes-that-catch-llm-cost-regressions-before-billing-does-472n