My CI/CD Pipeline Passed for 3 Months — Then I Read the Logs
Green checkmarks feel good. Every pull request passed. Every deploy worked.
Then a user reported a broken feature. It had been broken for weeks.
I opened the pipeline logs. Our passing builds lied to us.
Our process looked perfect:
- Linting
- Unit tests
- Integration tests
- Build
- Deploy
Every step had a 100% success rate for months.
The export button did nothing when clicked. No error appeared. I traced it back to a change from 11 weeks ago. The pipeline passed. The code review was approved. The feature was broken from the start.
The deeper problem was not the feature code. It was our test code.
Our integration tests mocked everything, including the entire export service. The test checked the mock instead of the real code, and the mock always returned a success status.
We fell into the over-mocking trap:
- Unit tests: Mocked everything to isolate the unit. This is okay.
- Integration tests: Mocked everything for speed. This is a mistake.
- E2E tests: Did not cover this specific flow.
Our integration tests were just expensive unit tests. They verified that our mocks worked. They did not verify that our code worked.
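To make the trap concrete, here is a minimal sketch of how a fully mocked test can stay green while the real code is broken. `ExportService`, `handle_export_click`, and the two runner functions are hypothetical names invented for illustration; they are not our actual code.

```python
from unittest import mock


class ExportService:
    """Hypothetical stand-in for a real export service."""

    def export(self, rows):
        # Deliberate bug for illustration: nothing is ever produced.
        return None


def handle_export_click(service, rows):
    """Hypothetical click handler that delegates to the export service."""
    result = service.export(rows)
    return {"status": "ok" if result else "failed"}


def run_with_mock():
    # The whole service is replaced, so the buggy export() never runs.
    fake = mock.Mock(spec=ExportService)
    fake.export.return_value = b"id,name\n1,Ada\n"
    return handle_export_click(fake, [{"id": 1, "name": "Ada"}])


def run_with_real_service():
    # Same handler, but wired to the real (broken) implementation.
    return handle_export_click(ExportService(), [{"id": 1, "name": "Ada"}])


mocked_outcome = run_with_mock()
real_outcome = run_with_real_service()
```

The mocked run reports success because it only ever observes the value the test itself configured. The run against the real class is the only one that can notice the bug.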
I made three changes to fix this:
Limit mocks to unit tests. Integration tests must hit real databases, APIs, and file systems. If a test is slow, do not hide it with a mock. Use that slowness as a signal to optimize.
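As a rough sketch of what "real" can look like without much setup, the test below uses an in-memory SQLite database and a real temporary directory from the Python standard library. The `orders` table and the `export_orders` function are assumptions made up for this example.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_orders(conn, out_path):
    """Hypothetical export: write every row of `orders` to a CSV file."""
    rows = conn.execute("SELECT id, total FROM orders ORDER BY id").fetchall()
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "total"])
        writer.writerows(rows)
    return len(rows)


def run_export_integration_test():
    # A real SQL engine and a real file on disk: nothing is mocked.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    with tempfile.TemporaryDirectory() as tmp:
        out_path = Path(tmp) / "orders.csv"
        written = export_orders(conn, out_path)
        # Read the file back so the test checks what actually landed on disk.
        with open(out_path, newline="") as handle:
            contents = list(csv.reader(handle))
    conn.close()
    return written, contents


written, contents = run_export_integration_test()
assert written == 2
assert contents == [["id", "total"], ["1", "9.5"], ["2", "20.0"]]
```

An in-memory database is still a shortcut compared with the production engine, but the query, the file write, and the file format are all exercised for real.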
Add contract tests. These ensure your mocks match real service behavior. If a mock returns data that a real service would not, the contract test fails.
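One possible shape for such a check, sketched under assumptions: a single contract function is run against both the real service and the test double. The class names and the `{"status", "bytes"}` response shape are hypothetical and chosen only for illustration.

```python
import json


class RealExportService:
    """Hypothetical real service; the response shape is an assumption."""

    def export(self, rows):
        body = json.dumps(rows)
        return {"status": "success", "bytes": len(body)}


class FakeExportService:
    """Hypothetical test double that has drifted from real behavior."""

    def export(self, rows):
        return {"status": "success", "bytes": 0}


def contract_violations(service):
    """Return a list of ways `service` breaks the export contract."""
    response = service.export([{"id": 1}])
    if set(response) != {"status", "bytes"}:
        return [f"unexpected keys: {sorted(response)}"]
    problems = []
    if response["status"] not in {"success", "failed"}:
        problems.append(f"unknown status: {response['status']}")
    if response["status"] == "success" and response["bytes"] <= 0:
        problems.append("success reported for an empty export")
    return problems


real_problems = contract_violations(RealExportService())
fake_problems = contract_violations(FakeExportService())
```

Here the real service satisfies the contract, while the fake is flagged for claiming success with zero bytes, which is a response the real service would not produce for non-empty input.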
Track real coverage. We stopped looking at simple pass rates. We looked at what the tests actually exercised. Our coverage numbers dropped from 94% to 67%. This was the most honest metric we had.
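A toy illustration of the difference between "the test passed" and "the test exercised the code": the sketch below uses `sys.settrace` to record which watched functions actually ran. In a real project a coverage tool such as coverage.py would do this job; the function names here are hypothetical.

```python
import sys
from unittest import mock


def build_csv(rows):
    """Hypothetical production function the tests should exercise."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def export(rows, builder=build_csv):
    """Hypothetical entry point; `builder` is injectable for tests."""
    return builder(rows)


def functions_executed(test, watched):
    """Run `test` and return which of the `watched` function names ran."""
    seen = set()

    def tracer(frame, event, arg):
        if event == "call" and frame.f_code.co_name in watched:
            seen.add(frame.f_code.co_name)
        return None  # per-line tracing is not needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        test()
    finally:
        sys.settrace(previous)
    return seen


def mocked_test():
    # Passes, but build_csv is swapped out and never runs.
    assert export([[1, 2]], builder=mock.Mock(return_value="ok")) == "ok"


def real_test():
    assert export([[1, 2]]) == "1,2"


watched = {"export", "build_csv"}
mocked_seen = functions_executed(mocked_test, watched)
real_seen = functions_executed(real_test, watched)
```

Both tests pass, yet only the unmocked one ever enters `build_csv`. Counting what ran, rather than what passed, is what pulled our number down.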
A green pipeline does not mean your code works. It means your tests passed. These are different things.
The most dangerous bugs are the ones your pipeline says are fine.
Ask yourself these questions:
- Are my tests catching bugs or just confirming mocks?
- Do my integration tests actually integrate?
- If I remove all mocks, how many tests still pass?
- Am I measuring coverage or confidence?
A pipeline that never fails is not reliable. It is untested.
Source: https://dev.to/kollittle/my-cicd-pipeline-passed-for-3-months-then-i-read-the-logs-4mbj
