๐—ฆ๐—ฒ๐—น๐—ณ ๐—›๐—ผ๐˜€๐˜๐—ฒ๐—ฑ ๐—Ÿ๐—Ÿ๐—  ๐—–๐—ผ๐—ป๐˜€๐—ถ๐˜€๐˜๐—ฒ๐—ป๐—ฐ๐˜† ๐—œ๐˜€๐˜€๐˜‚๐—ฒ๐˜€

Sequential tests worked. You got the same answer every time.

Parallel tests failed. The model disagreed with itself. It returned the right answer 87% of the time.
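One way to put a number on this is to send the same prompt many times and count how often the expected answer comes back, first with one worker and then with several. The sketch below assumes a hypothetical `ask(prompt)` callable that wraps your inference server's client. The stub here only stands in for it so the snippet runs on its own.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def match_rate(ask, prompt, expected, runs=100, workers=1):
    """Send `prompt` `runs` times using `workers` threads.

    Returns (share of answers equal to `expected`, Counter of all answers).
    `ask` is a hypothetical callable: prompt string in, answer string out.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        answers = list(pool.map(lambda _: ask(prompt), range(runs)))
    counts = Counter(answers)
    return counts[expected] / runs, counts


def fake_ask(prompt):
    # Stand-in for a real client call; always returns the same answer.
    return "4"


# workers=1 mimics the sequential test, workers=5 mimics the parallel one.
sequential_rate, _ = match_rate(fake_ask, "What is 2 + 2?", "4", runs=20, workers=1)
parallel_rate, parallel_counts = match_rate(fake_ask, "What is 2 + 2?", "4", runs=20, workers=5)
print(sequential_rate, parallel_rate, parallel_counts)
```

With a real client in place of `fake_ask`, a gap between the two rates is the symptom described above. The comparison uses strict string equality, so you may want to normalize answers first.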

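Batching causes this. Parallel requests arrive at once and get grouped into one batch. Floating-point results differ between a batch of one and a batch of five, because the same sums are accumulated in a different order. When two tokens score almost the same, that tiny difference picks a different token. Temperature 0.0 does not stop this drift.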
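The mechanism is that floating-point addition is not associative. Here is a minimal sketch with made-up values, exaggerated so the effect is visible in float64. Real logit drift is far smaller, but it is enough to flip a near-tie. The token names and the competing score of 0.5 are assumptions for illustration only.

```python
def sum_in_order(terms):
    """Accumulate left to right, as one fixed reduction order would."""
    total = 0.0
    for term in terms:
        total += term
    return total


def greedy_pick(logits):
    """Temperature 0.0 decoding: take the highest-scoring token."""
    return max(logits, key=logits.get)


# The same three terms, summed in two different orders (hypothetical values).
logit_order_a = sum_in_order([1e16, 1.0, -1e16])  # the 1.0 is absorbed by 1e16
logit_order_b = sum_in_order([1e16, -1e16, 1.0])  # the large terms cancel first

# The competing token "no" keeps a fixed score of 0.5 in both cases.
pick_a = greedy_pick({"yes": logit_order_a, "no": 0.5})
pick_b = greedy_pick({"yes": logit_order_b, "no": 0.5})

print(logit_order_a, logit_order_b)
print(pick_a, pick_b)
```

The two orders give 0.0 and 1.0 for the same inputs, so greedy decoding picks "no" in one case and "yes" in the other. No sampling is involved.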

Batch composition is decided at the GPU scheduling layer. You do not control it. Your agents run in parallel, so their requests get batched together. This breaks your determinism guarantee.

Follow these steps:

Source: https://dev.to/codelev/self-hosted-llm-same-prompt-temperature-zero-6-different-answers-ae6