Update README.md (#12877)

Guancheng Fu 2025-02-24 09:59:17 +08:00 committed by GitHub
parent 10400abfb7
commit 02ec313eab


@@ -241,3 +241,10 @@ llm = LLM(model="DeepSeek-R1-Distill-Qwen-7B", # Unquantized model path on disk
Once execution finishes, the low-bit model has been saved at `/llm/fp8-model-path`.
Later we can pass the option `--low-bit-model-path /llm/fp8-model-path` to use the low-bit model.
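As a reference, here is a minimal sketch of what a serving launch with the saved low-bit model might look like. The entrypoint module, model name, and port below are assumptions for illustration; only the `--low-bit-model-path` option and the `/llm/fp8-model-path` location come from this README.

```bash
# Sketch only: the entrypoint module, model name, and port are assumptions;
# --low-bit-model-path /llm/fp8-model-path is taken from the text above.
python -m ipex_llm.vllm.xpu.entrypoints.openai.api_server \
  --model DeepSeek-R1-Distill-Qwen-7B \
  --served-model-name DeepSeek-R1-Distill-Qwen-7B \
  --low-bit-model-path /llm/fp8-model-path \
  --port 8000
```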
### 5. Known issues
#### Runtime memory
If runtime memory is a concern, you can set `--swap-space 0.5` to reduce memory consumption during execution. The default value for `--swap-space` is 4, which means the system reserves 4 GiB of memory as swap space for use when GPU memory is insufficient.
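For example, a hedged sketch of adding this option to a server launch (the entrypoint module and other flags are assumptions, as above; only `--swap-space 0.5` comes from this section):

```bash
# Sketch only: entrypoint module and other flags are assumptions;
# --swap-space 0.5 reserves 0.5 GiB of swap space instead of the default 4.
python -m ipex_llm.vllm.xpu.entrypoints.openai.api_server \
  --model DeepSeek-R1-Distill-Qwen-7B \
  --swap-space 0.5 \
  --port 8000
```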