“primary reason nearly all LLM inference endpoints are nondeterministic is that the load (and thus batch-size) nondeterministically varies!”

Posted on September 11, 2025 by jgordon

https://simonwillison.net/2025/Sep/11/defeating-nondeterminism/#atom-everything

In other words, the main cause is not floating-point arithmetic itself.
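A toy sketch of why batch size could matter at all: floating-point addition is not associative, so a sum split into different-sized chunks can round differently. The helpers `naive_sum` and `chunked_sum` are hypothetical names made up for this illustration, and the chunk size only stands in for a reduction strategy that changes with batch size. Real inference kernels are far more involved, so treat this as an analogy rather than a description of any actual endpoint.

```python
def naive_sum(values):
    """Left-to-right float sum (a plain loop, so no compensated summation is applied)."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunk_size):
    """Sum each chunk separately, then sum the partial results.

    chunk_size is a stand-in (an assumption of this sketch) for a
    reduction order that depends on how much work is batched together.
    """
    partials = [
        naive_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum(partials)


# Same inputs, same arithmetic, different grouping.
values = [1.0, 1e16, -1e16, 1.0]

one_chunk = chunked_sum(values, 4)   # ((1.0 + 1e16) + -1e16) + 1.0
two_chunks = chunked_sum(values, 2)  # (1.0 + 1e16) + (-1e16 + 1.0)

print(one_chunk)   # 1.0
print(two_chunks)  # 0.0
```

Each call is fully deterministic on its own: the same chunk size always gives the same answer. The results only differ when the grouping changes, which matches the quoted point that varying load, rather than floating-point arithmetic as such, is what makes the output vary.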