𝗙𝗿𝗼𝗺 𝗖𝗵𝗮𝗼𝘀 𝘁𝗼 𝗖𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝘆: 𝗗𝗼𝗰𝗸𝗲𝗿 𝗳𝗼𝗿 𝗔𝗜 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄𝘀
You trained the model. The notebook runs. The demo works.
Then you send it to a teammate. Forty minutes later, you get a message. They have a CUDA error. Torch won't import. They have the wrong Python version.
You say the words every engineer dreads: "It works on my machine."
"It works on my machine" is a confession. It means your code depends on things on your laptop that you did not document. You missed a Python version, a system library, or a CUDA toolkit.
Docker stops this.
A typical web app has few dependencies. AI projects have many layers:
• Python packages: PyTorch, Transformers, and NumPy.
• System libraries: libgl1 or ffmpeg.
• The CUDA stack: drivers and toolkits.
• Model weights: large files not in your repo.
• Python version: 3.10 on your laptop, 3.12 on the server.
A requirements.txt file captures only the first layer. Docker captures nearly all of them. The main exception is the NVIDIA driver, which stays on the host machine.
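To see how many of these layers your own machine is quietly supplying, you can print them. This is a minimal sketch using only the standard library. The function name environment_report is hypothetical, and the package names are just the examples from the list above.

```python
import platform
from importlib import metadata

# Example package names from the list above; swap in your project's own.
PACKAGES = ["torch", "transformers", "numpy"]


def environment_report(packages=PACKAGES):
    """Return a dict with the Python version, OS, and package versions."""
    report = {
        "python": platform.python_version(),
        "os": f"{platform.system()} {platform.release()}",
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run it on your laptop and on your teammate's machine. Any line that differs is a dependency you did not document.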
A Docker image is a snapshot of a computer. It includes the OS, Python, packages, and your code. A container is a running copy of that snapshot.
A container shares your machine's kernel instead of booting its own operating system, so it starts in seconds. One image runs the same way on your laptop and on a cloud GPU server.
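One thing does differ between a laptop and a GPU server: whether the container can see a GPU. That depends on the host having an NVIDIA driver and the NVIDIA Container Toolkit, and on starting the container with GPU access (for example, docker run --gpus all). Here is a rough sketch of a check you could run inside the container. It assumes the nvidia-smi tool is present when a GPU is exposed, and the function name gpu_visible is hypothetical.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU it can see.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU visible:", gpu_visible())
```

If this prints False on the server, the image is usually fine. The container was most likely started without GPU access.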
Use a Dockerfile to write your recipe. Here is a template for PyTorch:
FROM python:3.11-slim

# System libraries needed by some Python packages
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        libgl1 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install dependencies before copying code so this layer stays cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "predict.py"]
Follow these rules to avoid mistakes:
• Pin your versions. Use torch==2.3.1 instead of just torch. This ensures reproducibility.
• Copy requirements first. This uses Docker caching. It makes builds fast by skipping re-installs if your code changes but your packages do not.
• Use volumes for weights. Do not put 5GB models inside the image. Use a volume to link your local folder to the container.
• One process per container. Do not put your API and database in one container. Use Docker Compose to link them.
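The first rule is easy to check automatically. This is a minimal sketch, assuming a plain requirements.txt with one package per line. The function name unpinned_requirements is hypothetical, and the check skips pip options such as -r and does not handle URL-based requirements.

```python
import re

# Matches an exact pin such as "torch==2.3.1", optionally with extras.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def unpinned_requirements(text):
    """Return the requirement lines that lack an exact '==' pin."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = "torch==2.3.1\ntransformers\nnumpy>=1.26\n"
print(unpinned_requirements(sample))  # ['transformers', 'numpy>=1.26']
```

Run a check like this before you build, and loose version ranges never reach the image.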
With Docker, your teammate only needs two commands:
git clone your-repo && cd your-repo
docker compose up
No more debugging environment errors. It works on every machine that runs Docker.
Source: https://dev.to/sachinsingh2156/from-chaos-to-consistency-docker-for-modern-ai-workflows-2gb7