Building a ChatGPT Task Scheduler in Go
I built a task scheduler with an MCP interface. The goal was to handle 10K jobs per second with reliable execution.
Most people ask AI to build a feature and then watch it edit files immediately. This leads to a mess. I used a staged workflow called QRSPI to prevent this.
The QRSPI Workflow:
- Question: Ask neutral research questions.
- Research: Get objective answers from the actual code.
- Design: Decide where we are going and why.
- Structure: Create vertical slices and test points.
- Plan: Write a tactical, file-by-file document.
- Worktree: Use an isolated git worktree.
- Implement: Execute phase by phase.
- PR: Write a description based on the design.
This approach changed how I built the system.
Design Decisions:
- Research first: I mapped the terrain before designing. I found that the existing app ran only one goroutine and had no precedent for a background loop. Every decision I made followed from these facts.
- Define non-goals: I wrote down what I was NOT building. This stopped the project from growing too large.
- Decouple with queues: I used a producer/consumer split. A watcher finds due work and pushes it to Redis. Workers pull from the queue and execute. This allows them to scale independently.
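- Use Redis Streams: I used XADD to enqueue and XREADGROUP to consume, so each message goes to only one worker in the consumer group. This provides at-least-once delivery, which means a message can arrive more than once. I paired this with a job ID as an idempotency key so a redelivered job does not run twice.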
- Database bucketing: I used an hourly time_bucket column in Postgres. This keeps each scan confined to the current bucket, so it stays fast instead of searching a massive table.
- Clean shutdowns: I used context and WaitGroups. This ensures no goroutines leak and no work stops mid-flight during a shutdown.
Key Lessons:
- Research the existing code before you design.
- Decouple producers from consumers with a queue.
- Use at-least-once delivery plus an idempotency key.
- Stub your side effects. I used a StubExecutor to test the retry and DLQ logic before writing real handlers.
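To illustrate the stubbing idea, here is a minimal sketch of a stub executor driving a retry loop that falls through to a dead-letter decision. Only the StubExecutor name comes from the post. The Executor interface, the FailFirst field, and RunWithRetry are assumptions made for the example, and backoff between attempts is left out to keep it short.

```go
package main

import (
	"errors"
	"fmt"
)

// Executor is a hypothetical interface for running one job's side effect.
type Executor interface {
	Execute(jobID string) error
}

// StubExecutor fails the first FailFirst calls for each job, then succeeds.
// It performs no real side effect, so retry logic can be tested in isolation.
type StubExecutor struct {
	FailFirst int
	calls     map[string]int
}

func (s *StubExecutor) Execute(jobID string) error {
	if s.calls == nil {
		s.calls = make(map[string]int)
	}
	s.calls[jobID]++
	if s.calls[jobID] <= s.FailFirst {
		return errors.New("stubbed failure")
	}
	return nil
}

// RunWithRetry tries a job up to maxAttempts times. It returns the number
// of attempts used and whether the job should be sent to the DLQ.
func RunWithRetry(ex Executor, jobID string, maxAttempts int) (attempts int, deadLetter bool) {
	for attempts = 1; attempts <= maxAttempts; attempts++ {
		if err := ex.Execute(jobID); err == nil {
			return attempts, false
		}
	}
	return maxAttempts, true
}

func main() {
	// Fails twice, succeeds on the third attempt.
	n, dead := RunWithRetry(&StubExecutor{FailFirst: 2}, "job-1", 3)
	fmt.Println(n, dead) // prints "3 false"

	// Never succeeds within the budget, so it is dead-lettered.
	n, dead = RunWithRetry(&StubExecutor{FailFirst: 10}, "job-2", 3)
	fmt.Println(n, dead) // prints "3 true"
}
```

Because the stub is deterministic, the retry count and the DLQ decision can be asserted exactly, which is hard to do against a real handler that calls external services.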
This is Part 1 of a two-part series. Part 1 covers the prototype. Part 2 will cover production hardening like multi-tenancy, DST-safe recurrence, and LLM reliability.
Repo: github.com/linkc0829/go-chatgpt-tasks