
fix: security hardening — unsafe deserialization, RCE, input validation#263

Open
devatsecure wants to merge 2 commits into microsoft:main from devatsecure:fix/security-hardening

Conversation


@devatsecure devatsecure commented Mar 28, 2026

Summary

Security hardening addressing 6 vulnerabilities found during automated security review using multiple AI models and static analysis.

Changes (6 fixes across 4 files)

1. torch.load → weights_only=True (CWE-502 — Medium)

| File | Line |
| --- | --- |
| demo/web/app.py | 161 |
| demo/realtime_model_inference_from_file.py | 225 |
| vibevoice/scripts/convert_nnscaler_checkpoint_to_transformers.py | 32 |

With weights_only=False, torch.load can execute arbitrary code via pickle deserialization when given a malicious .pt file. Changed to weights_only=True so only tensor data is deserialized.

Note: If voice preset .pt files contain non-tensor objects, this may need adjustment. Please verify with your voice preset files.
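A minimal sketch of the hardened load (the wrapper name `load_voice_preset` is hypothetical; the PR applies the change inline at the lines listed above):

```python
import torch

def load_voice_preset(path: str):
    """Load a .pt voice preset without executing arbitrary pickle code."""
    # weights_only=True restricts unpickling to tensors and primitive
    # containers, so a crafted .pt file cannot run code on load. Presets
    # that store arbitrary Python objects will raise here instead.
    return torch.load(path, map_location="cpu", weights_only=True)
```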

2. --trust-remote-code made opt-in (CWE-94 — Medium)

  • Removed from default vLLM command in _build_vllm_cmd()
  • Added --trust-remote-code CLI flag (default: False)

The default model (microsoft/VibeVoice-ASR) is from a trusted source, but hardcoding --trust-remote-code is a risk if users point --model at untrusted repos.

3. shell=True removed (CWE-78 — Low)

  • run_command() now always uses shell=False
  • All current callers pass command lists, so shell=True was unnecessary
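The shape of the hardened helper, sketched (the real `run_command` may handle logging and errors differently):

```python
import subprocess
import sys

def run_command(cmd: list) -> int:
    """Run a command given as an argv list; shell metacharacters are inert."""
    # With shell=False each list element is passed to the process as a
    # literal argument, so input like "; rm -rf ~" is just text, never
    # interpreted by a shell.
    return subprocess.run(cmd, shell=False).returncode
```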

4. WebSocket text size limit (CWE-770 — Low)

  • Added 10K character limit on /stream endpoint text parameter
  • Returns WebSocket close code 1008 for oversized inputs
  • Prevents excessive GPU inference from large text inputs
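The guard can be factored as a pure check (function name hypothetical); in the actual `/stream` handler the rejection would be `await websocket.close(code=1008)`:

```python
MAX_TEXT_CHARS = 10_000      # limit added on the /stream endpoint
WS_POLICY_VIOLATION = 1008   # RFC 6455 "policy violation" close code

def oversized_close_code(text: str):
    """Return 1008 if the text exceeds the limit, else None (accept)."""
    if len(text) > MAX_TEXT_CHARS:
        return WS_POLICY_VIOLATION
    return None
```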

5. vLLM server API key authentication (CWE-306 — Medium)

  • Added --api-key CLI flag and VLLM_API_KEY env var support
  • Propagated through _build_vllm_cmd, start_vllm_server, start_dp_server
  • Without auth, anyone on the network can access the inference API
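The precedence between the CLI flag and the env var can be sketched like this (helper name hypothetical; the resolved key would then be appended to the vLLM command as `--api-key <value>`):

```python
import os

def resolve_api_key(cli_value=None):
    """--api-key on the CLI wins; otherwise fall back to VLLM_API_KEY."""
    # Returns None when neither is set, i.e. the server stays unauthenticated
    # only if the operator provided no key at all.
    return cli_value or os.environ.get("VLLM_API_KEY")
```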

6. Config path traversal guard (CWE-22 — Low)

  • Sanitized init_config_name in checkpoint conversion script using Path.name + validation
  • Prevents ../ sequences from escaping the configs/ directory
  • Config name originates from untrusted checkpoint data

Test Plan

  • Verify voice preset loading works with weights_only=True
  • Verify server starts correctly without --trust-remote-code (may need the flag for custom model architectures)
  • Verify run_command() callers work without shell=True
  • Verify WebSocket connections with text > 10K chars are rejected
  • Verify --api-key / VLLM_API_KEY enables token-based auth on vLLM endpoints
  • Verify checkpoint conversion works with path sanitization
  • Verify normal TTS generation still works end-to-end

References

Found by Argus Security — AI-powered 6-phase security pipeline.

…on, input validation

1. torch.load(weights_only=False) → weights_only=True (CWE-502)
   - demo/web/app.py: voice preset loading
   - demo/realtime_model_inference_from_file.py: voice sample loading
   - vibevoice/scripts/convert_nnscaler_checkpoint_to_transformers.py: checkpoint loading
   Prevents arbitrary code execution via malicious .pt/.pth files.

2. --trust-remote-code made opt-in instead of default (CWE-94)
   - vllm_plugin/scripts/start_server.py: removed from default vLLM command
   - Added --trust-remote-code CLI flag (default: False)
   Prevents automatic execution of remote Python code from model repositories.

3. subprocess shell=True removed (CWE-78)
   - vllm_plugin/scripts/start_server.py: run_command() now always uses shell=False
   Eliminates command injection vector.

4. WebSocket text size limit added (CWE-770)
   - demo/web/app.py: 10K char limit on /stream endpoint text parameter
   Prevents denial of service via excessive GPU inference from oversized inputs.

Found by: Argus Security (https://github.com/devatsecure/Argus-Security)
@devatsecure
Author

@microsoft-github-policy-service agree

5. Missing vLLM server authentication (CWE-306)
   - Added --api-key CLI flag and VLLM_API_KEY env var support
   - Propagated through _build_vllm_cmd, start_vllm_server, start_dp_server
   Without auth, anyone on the network can access the inference API.

6. Config path traversal in checkpoint conversion (CWE-22)
   - Sanitized init_config_name using Path.name + validation
   - Prevents "../" sequences from escaping the configs directory
   Config name comes from untrusted checkpoint data.

Found by: Argus Security (https://github.com/devatsecure/Argus-Security)
