𝗦𝘁𝗼𝗽 𝗟𝗟𝗠 𝗖𝗼𝘀𝘁 𝗦𝗽𝗶𝗸𝗲𝘀 𝗕𝗲𝗳𝗼𝗿𝗲 𝗕𝗶𝗹𝗹𝗶𝗻𝗴

You use OTel and OpenInference. You see token counts. You do not see which team is spending the money.

Use these three attributes.

  • team.id: Tag spans at the gateway. This shows cost by team.
  • feature.id: Tag the feature. This shows which feature spikes.
  • llm.model: Separate cheap models from expensive ones. (OpenInference itself records the model name as llm.model_name.)
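
A minimal sketch of the tagging step, assuming a Python gateway. The only OpenTelemetry call it relies on is the span's set_attribute(key, value) method. The helper names (cost_attributes, tag_span), the RecordingSpan stand-in, and the sample values are hypothetical and only there so the snippet runs without a tracing backend.

```python
from typing import Dict, Protocol


class SpanLike(Protocol):
    """Anything exposing set_attribute(key, value), as OpenTelemetry spans do."""

    def set_attribute(self, key: str, value: str) -> None: ...


def cost_attributes(team_id: str, feature_id: str, model: str) -> Dict[str, str]:
    """Build the three cost attributes; reject empty values so no span goes untagged."""
    attrs = {"team.id": team_id, "feature.id": feature_id, "llm.model": model}
    missing = [key for key, value in attrs.items() if not value]
    if missing:
        raise ValueError(f"missing cost attributes: {missing}")
    return attrs


def tag_span(span: SpanLike, team_id: str, feature_id: str, model: str) -> None:
    """Copy the cost attributes onto a span, once, at the gateway."""
    for key, value in cost_attributes(team_id, feature_id, model).items():
        span.set_attribute(key, value)


class RecordingSpan:
    """Hypothetical stand-in for a real span, used only for this demo."""

    def __init__(self) -> None:
        self.attributes: Dict[str, str] = {}

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value


# Sample values are made up for illustration.
span = RecordingSpan()
tag_span(span, team_id="search", feature_id="autocomplete", model="small-model")
print(span.attributes)
```

Tagging in one place at the gateway means every downstream span query can group on the same three keys.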

Run a daily query in Grafana. Look at the 95th percentile of output tokens. Group by team, feature, and model.
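
The same aggregation can be sketched in plain Python to show what the query computes. This is not the Grafana query itself. It assumes spans were exported as dicts, that output tokens live under OpenInference's llm.token_count.completion attribute, and that a nearest-rank percentile is acceptable. The function names and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping, Tuple

GroupKey = Tuple[str, str, str]  # (team.id, feature.id, llm.model)


def p95(values: List[int]) -> int:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(95 * n / 100)
    return ordered[rank - 1]


def p95_output_tokens(spans: Iterable[Mapping[str, Any]]) -> Dict[GroupKey, int]:
    """Group spans by team, feature, and model, then take p95 of output tokens."""
    groups: Dict[GroupKey, List[int]] = defaultdict(list)
    for span in spans:
        key = (span["team.id"], span["feature.id"], span["llm.model"])
        groups[key].append(int(span["llm.token_count.completion"]))
    return {key: p95(tokens) for key, tokens in groups.items()}


# Made-up sample: 20 spans for one group, 3 spans for another.
sample = [
    {"team.id": "search", "feature.id": "autocomplete", "llm.model": "small-model",
     "llm.token_count.completion": tokens}
    for tokens in range(10, 210, 10)
] + [
    {"team.id": "chat", "feature.id": "summary", "llm.model": "large-model",
     "llm.token_count.completion": tokens}
    for tokens in (50, 60, 70)
]

result = p95_output_tokens(sample)
for key, value in sorted(result.items()):
    print(key, value)
```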

Set an alert for a 2x jump over the 7-day average. This caught a retry loop last quarter. The main dashboard missed it: total spend stayed under budget while one team spent double.
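
A small sketch of the alert rule, assuming it compares today's value for each group with the mean of that group's previous seven days. The is_spike helper, the team names, and all numbers are hypothetical. They are chosen to show how one team can double while the total barely moves.

```python
from statistics import mean
from typing import Dict, List, Sequence


def is_spike(last_7_days: Sequence[float], today: float, factor: float = 2.0) -> bool:
    """True when today's value is at least `factor` times the 7-day average."""
    if len(last_7_days) != 7:
        raise ValueError("expected exactly 7 daily values")
    baseline = mean(last_7_days)
    return baseline > 0 and today >= factor * baseline


# Made-up daily output-token totals per team.
history: Dict[str, List[float]] = {
    "team-a": [100] * 7,   # small team, steady usage
    "team-b": [1000] * 7,  # large team, steady usage
}
today: Dict[str, float] = {"team-a": 210, "team-b": 900}

# Per-team check: team-a has more than doubled.
alerts = [team for team in sorted(history) if is_spike(history[team], today[team])]

# Aggregate check: the total looks flat, so a total-only dashboard stays quiet.
total_history = [sum(day) for day in zip(*history.values())]
total_spike = is_spike(total_history, sum(today.values()))

print("per-team alerts:", alerts)
print("total spike:", total_spike)
```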

Skip user.id for privacy. Skip request.id to keep data small.

Source: https://dev.to/jasmine_park_dev/span-attributes-that-catch-llm-cost-regressions-before-billing-does-472n
Optional learning community: https://t.me/GyaanSetuAi