Stop LLM Cost Spikes Before Billing

AI-assisted draft.

You trace LLM calls with OpenTelemetry (OTel) and OpenInference. You see token counts. You do not see which team is spending the money.

Use these three attributes.

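team.id: Tag every span at the gateway. This shows cost by team.
feature.id: Tag the feature that made the call. This shows which feature spikes.
llm.model: Record the model name. This separates cheap models from expensive ones. (OpenInference's own conventions name this attribute llm.model_name.)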
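
The three attributes above can be built in one place at the gateway. This is a minimal sketch, not the author's implementation: the function name `cost_attributes` and the sample team, feature, and model values are hypothetical.

```python
from typing import Dict


def cost_attributes(team_id: str, feature_id: str, model: str) -> Dict[str, str]:
    """Build the three cost attributes for one LLM call.

    Raises ValueError if any value is empty, so untagged calls fail loudly
    instead of landing in an "unknown" bucket.
    """
    values = {"team.id": team_id, "feature.id": feature_id, "llm.model": model}
    for key, value in values.items():
        if not value:
            raise ValueError(f"missing value for {key}")
    return values


# Hypothetical sample values, for illustration only.
attrs = cost_attributes("search-platform", "query-rewrite", "small-model-v1")

# With the OpenTelemetry Python API, a dict like this can be applied to a
# span with span.set_attributes(attrs).
print(attrs)
```

Building the dict in a single function keeps the attribute names consistent across services, which matters once the query groups by them.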

Run a daily query in Grafana. Look at the 95th percentile of output tokens. Group by team, feature, and model.
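
The same aggregation can be sketched outside Grafana. This is an illustration under assumptions: spans are plain dicts, output tokens live under OpenInference's `llm.token_count.completion` key, and the percentile uses the nearest-rank method, which may differ slightly from what your query engine computes. The sample data is made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

GroupKey = Tuple[str, str, str]  # (team.id, feature.id, llm.model)

# Assumption: output tokens are stored under OpenInference's completion-count key.
OUTPUT_TOKENS_KEY = "llm.token_count.completion"


def nearest_rank_percentile(values: List[int], pct: int) -> int:
    """Smallest value with at least pct percent of the data at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def p95_output_tokens(spans: Iterable[dict]) -> Dict[GroupKey, int]:
    """Group spans by team, feature, and model, then take p95 of output tokens."""
    groups: Dict[GroupKey, List[int]] = defaultdict(list)
    for span in spans:
        key = (span["team.id"], span["feature.id"], span["llm.model"])
        groups[key].append(span[OUTPUT_TOKENS_KEY])
    return {key: nearest_rank_percentile(vals, 95) for key, vals in groups.items()}


# Made-up sample: 20 calls from one group, 1 call from another.
sample = [
    {"team.id": "search", "feature.id": "rewrite", "llm.model": "small",
     OUTPUT_TOKENS_KEY: n * 10}
    for n in range(1, 21)
] + [
    {"team.id": "support", "feature.id": "summary", "llm.model": "large",
     OUTPUT_TOKENS_KEY: 900},
]

result = p95_output_tokens(sample)
for key, p95 in sorted(result.items()):
    print(key, p95)
```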

Set an alert that fires when a group's daily value jumps to 2x its 7-day average. This caught a retry loop last quarter. The main dashboard missed it because total spend stayed under budget, even though one team spent double.
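
The alert rule can be sketched as a small check. This assumes one daily value per group (for example, the p95 of output tokens) and compares today against the mean of the previous seven days. The function names, the 2.0 default, and the sample numbers are illustrative.

```python
from typing import List, Optional


def spike_ratio(history: List[float], today: float) -> Optional[float]:
    """Ratio of today's value to the mean of the previous 7 daily values.

    Returns None when there are fewer than 7 days of history or the
    baseline is not positive, so the caller can skip the alert.
    """
    window = history[-7:]
    if len(window) < 7:
        return None
    baseline = sum(window) / len(window)
    if baseline <= 0:
        return None
    return today / baseline


def should_alert(history: List[float], today: float, threshold: float = 2.0) -> bool:
    """True when today's value is at least `threshold` times the 7-day average."""
    ratio = spike_ratio(history, today)
    return ratio is not None and ratio >= threshold


# Made-up daily p95 values for one (team, feature, model) group.
history = [400, 420, 390, 410, 405, 395, 380]  # mean is 400

print(should_alert(history, 850))  # True: 850 / 400 = 2.125
print(should_alert(history, 600))  # False: 600 / 400 = 1.5
```

Run the check per group, not on the total. A total can stay flat while one group doubles.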

Skip user.id for privacy. Skip request.id: it is unique per request, so it adds volume without helping any cost grouping.
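
One way to enforce this is to strip the two keys before attributes reach the span. A minimal sketch, assuming attributes are a plain dict. The names `DROPPED_KEYS` and `strip_dropped` are hypothetical.

```python
from typing import Any, Dict

# The two attributes the post says to skip.
DROPPED_KEYS = frozenset({"user.id", "request.id"})


def strip_dropped(attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the attributes without user.id and request.id."""
    return {k: v for k, v in attributes.items() if k not in DROPPED_KEYS}


# Made-up incoming attributes, for illustration only.
incoming = {
    "team.id": "search",
    "feature.id": "rewrite",
    "llm.model": "small",
    "user.id": "u-123",
    "request.id": "r-456",
}

cleaned = strip_dropped(incoming)
print(sorted(cleaned))
```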

Source: https://dev.to/jasmine_park_dev/span-attributes-that-catch-llm-cost-regressions-before-billing-does-472n

Optional learning community: https://t.me/GyaanSetuAi
