My AI Integration Cost Too Much Until I Changed My Approach
I built a tool to summarize long articles. I used GPT-4 to make it work. The summaries were great. The bill was $1,200 in one month.
That cost was too high for my project. I had two choices: fix the cost or stop the feature.
I tried switching to GPT-3.5. The price went down. The quality went down too. The summaries became vague and missed important facts.
I tried reducing the input text size. I used local libraries to pick important sentences first. It helped, but the cost stayed high. Small models still made mistakes on long text.
Then I found a better way. I stopped using one big model for everything. I started using a two-step pipeline.
Step 1: Extractive phase. Use a cheap, fast tool to pick the top 5 to 10 sentences from the article. For a long article, this removes about 90% of the text.
Step 2: Abstractive phase. Send only those few sentences to a small, cheap API. Ask it to combine those sentences into a clean summary.
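A minimal sketch of the two steps in Python, to make the idea concrete. It is not the exact tool used here: the word-frequency scoring, the tiny stopword list, and the names split_sentences, top_sentences, and build_prompt are all assumptions chosen for illustration. The actual API call is left out because it depends on the provider.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real one would be longer).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "this", "for", "on", "with", "as", "was", "are"}


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def content_words(text: str) -> list[str]:
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_sentences(text: str, k: int = 5) -> list[str]:
    """Step 1 (extractive): keep the k highest-scoring sentences, in original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        tokens = content_words(sentence)
        # Average word frequency, so long sentences do not win automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]


def build_prompt(sentences: list[str]) -> str:
    """Step 2 (abstractive) input: only the selected sentences go to the small model."""
    bullets = "\n".join(f"- {s}" for s in sentences)
    return "Combine these sentences into a short, clear summary:\n" + bullets
```

In use, the text sent to the small model would be build_prompt(top_sentences(article, k=5)) instead of the full article, which is where the smaller API calls come from.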
This change cut my costs by 80%. The quality stayed close to GPT-4. The API calls became much smaller and cheaper.
I also learned two important lessons:
Use caching. If many users read the same article, do not run the process twice. Save the result.
Use layers. You do not always need a heavy model. Break complex tasks into small, cheap steps.
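The caching lesson can be sketched like this. It is a minimal in-memory version under stated assumptions: SummaryCache and fake_pipeline are hypothetical names, the cache key is a SHA-256 hash of the article text, and a real deployment would more likely store results in a database or a shared cache so they survive restarts.

```python
import hashlib
from typing import Callable


class SummaryCache:
    """In-memory cache keyed by a hash of the article text (illustrative only)."""

    def __init__(self, summarize: Callable[[str], str]) -> None:
        self._summarize = summarize
        self._store: dict[str, str] = {}
        self.misses = 0  # counts how many times the pipeline actually ran

    def get(self, article: str) -> str:
        key = hashlib.sha256(article.encode("utf-8")).hexdigest()
        if key not in self._store:
            # First request for this article: run the pipeline and save the result.
            self.misses += 1
            self._store[key] = self._summarize(article)
        return self._store[key]


def fake_pipeline(article: str) -> str:
    """Hypothetical stand-in for the real two-step pipeline."""
    return article[:40]


cache = SummaryCache(fake_pipeline)
cache.get("Same article text")
cache.get("Same article text")  # served from the cache, no second pipeline run
```

Hashing the text instead of the URL means two users who paste the same article under different links still share one cached summary.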
How do you balance AI quality and cost in your projects? Do you use different models for different steps?
Optional learning community: https://t.me/GyaanSetuAi