vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels
- When
- Where: Global (internet)
- Category: cyber_advisory · pip
## Summary

All temperature validation gates rely on comparison operators (`<`, `>`). As written, these guards evaluate to `False` for `NaN` and for positive `Infinity` under Python's IEEE 754 float semantics, so no error is raised. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker.

Note: `-Infinity` is correctly caught.

## Root Cause

`sampling_params.py:384`:

```python
if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False
```

`sampling_params.py:462`:

```python
if self.temperature < 0.0:  # NaN → False; +Inf → False
    raise VLLMValidationError(...)
```

No `math.isnan()` or `math.isinf()` check exists anywhere in `sampling_params.py`.

Python semantics (verified): `float('nan') < 0.0` → `False`, `float('inf') < 0.0` → `False`.

## Impact

A NaN/Inf softmax input crashes the inference worker during GPU kernel execution, degrading service for all concurrent users.

## Remediation

Add a `math.isfinite(self.temperature)` check in `_verify_args()`. Reject non-finite float values with a 400 error.

## Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116
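
The comparison behavior and the proposed remediation can be illustrated with a minimal standalone sketch. This is not vLLM's code: `ValidationError`, `comparison_only_check`, `finite_check`, and `is_rejected` are hypothetical names introduced here for illustration, and the merged fix may be structured differently. Only `math.isfinite()` and the `< 0.0` guard come from the advisory.

```python
import math


class ValidationError(ValueError):
    """Hypothetical stand-in for vLLM's VLLMValidationError."""


def comparison_only_check(temperature: float) -> None:
    """Mirrors the guard pattern quoted above: only a `< 0.0` comparison."""
    if temperature < 0.0:  # False for NaN and for +Inf, so neither is rejected
        raise ValidationError(f"temperature must be non-negative, got {temperature}")


def finite_check(temperature: float) -> None:
    """Sketch of the remediation: reject non-finite values before comparing."""
    if not math.isfinite(temperature):  # True for NaN, +Inf, and -Inf
        raise ValidationError(f"temperature must be finite, got {temperature}")
    if temperature < 0.0:
        raise ValidationError(f"temperature must be non-negative, got {temperature}")


def is_rejected(check, temperature: float) -> bool:
    """Return True if the given check raises ValidationError for this value."""
    try:
        check(temperature)
    except ValidationError:
        return True
    return False


if __name__ == "__main__":
    for value in (float("nan"), float("inf"), float("-inf"), 0.7):
        print(
            f"{value!r}: comparison-only rejected={is_rejected(comparison_only_check, value)}, "
            f"isfinite rejected={is_rejected(finite_check, value)}"
        )
```

In this sketch the comparison-only guard rejects `-Infinity` but lets `NaN` and `+Infinity` through, while the `math.isfinite()` variant rejects all three. Mapping the raised error to an HTTP 400 response is assumed to happen in the serving layer and is not shown.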
Sources
- GitHub Advisory Database · first seen 2026-06-17 14:02 UTC