
fix: harden input validation, file safety, and error handling#227

Open
ankit1999 wants to merge 2 commits into jamiepine:main from ankit1999:fix/code-review-hardening

Conversation


@ankit1999 ankit1999 commented Mar 2, 2026

Summary

Addresses code review findings across frontend and backend — input validation, file integrity, error handling, and code deduplication.

Changes

Frontend

  • ModelManagement.tsx — Validate model_name format before removal; disable Remove button during pending mutation
  • client.ts — URL-encode modelId in removeCustomModel path
  • useGenerationForm.ts — Constrain modelSize with regex (built-in sizes + custom slugs)
  • useModelStatus.ts (new) — Shared hook extracted from GenerationForm / FloatingGenerateBox

Backend

  • custom_models.py — Atomic writes (temp file → fsync → os.replace), threading lock, corrupt config backup, strict owner/repo regex for hf_repo_id
  • models.py — Regex pattern on model_size Field
  • config.py — Platform-appropriate default data dir for PyInstaller --onefile bundles (via platformdirs)
  • main.py — Re-raise HTTPException before broad except Exception so 202 download-in-progress responses aren't swallowed as 500s
  • custom_models.json — Removed pre-seeded entry for clean installs
  • requirements.txt — Added platformdirs>=4.0.0
  • backend/README.md — Documented --data-dir CLI flag
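The atomic-write pattern described for custom_models.py (temp file → fsync → os.replace, behind a lock) can be sketched as follows. This is an illustrative, self-contained version; the function name and structure are assumptions, not the PR's exact code:

```python
import json
import os
import tempfile
import threading

_write_lock = threading.Lock()  # serialize writers within this process

def save_json_atomic(path: str, data: dict) -> None:
    """Write JSON so readers never see a partially written file."""
    with _write_lock:
        # Stage in the same directory: os.replace is only atomic when
        # source and destination live on the same filesystem.
        dir_name = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as f:
                json.dump(data, f, indent=2)
                f.flush()
                os.fsync(f.fileno())  # push bytes to disk before renaming
            os.replace(tmp_path, path)  # atomic swap on POSIX and Windows
        finally:
            if os.path.exists(tmp_path):  # only left behind on failure
                os.unlink(tmp_path)
```

Staging in the destination's own directory matters: a temp file in /tmp could sit on a different filesystem, where a rename is no longer atomic.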

Skipped

  • main.py probe_model_cache_status refactoring (pure refactoring, deferred to reduce regression risk)

Verification

  • ✅ TypeScript tsc --noEmit — zero errors
  • ✅ Python py_compile — all modified files pass

Summary by CodeRabbit

Release Notes

  • New Features

    • Support for adding and managing custom HuggingFace text-to-speech models.
    • Grouped model selection interface displaying built-in and custom models separately.
    • Model management panel to register, remove, and manage custom TTS models.
    • Configurable data directory for backend storage via --data-dir option.
  • Documentation

    • Added backend documentation for custom data directory configuration.
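The platform-appropriate default data directory for frozen builds might be resolved along these lines. This is a hedged sketch: the app name, fallback, and branching are assumptions, though the sys.frozen check is the standard way to detect a PyInstaller bundle and platformdirs.user_data_dir is the library's documented entry point:

```python
import sys
from pathlib import Path

def default_data_dir(app_name: str = "voicebox") -> Path:
    """Pick a writable default data directory.

    A PyInstaller --onefile binary unpacks itself into a temporary
    directory at launch, so data stored next to the code would be
    lost; frozen builds therefore use a per-user platform data dir.
    """
    if getattr(sys, "frozen", False):  # set by PyInstaller bundles
        try:
            from platformdirs import user_data_dir
            return Path(user_data_dir(app_name))
        except ImportError:
            # Conservative fallback if platformdirs is unavailable.
            return Path.home() / f".{app_name}"
    # Running from source: keep data in the working tree.
    return Path.cwd() / "data"
```

A --data-dir CLI flag would simply override whatever this function returns.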

- New custom_models.py module for CRUD management of user-defined HF TTS models
- New /custom-models API endpoints (list, add, get, delete)
- Updated MLX and PyTorch backends to resolve custom model paths (custom:slug format)
- Added Custom Models section to ModelManagement UI with add/remove dialogs
- Updated GenerationForm and FloatingGenerateBox with grouped model selectors
- Added CustomModelCreate/Response types and API client methods
- Added instruct field to GenerationRequest type
- Graceful actool fallback in build.rs for non-Xcode environments
- Added custom_models hidden import for PyInstaller bundling

Author: AJ - Kamyab (Ankit Jain)
- ModelManagement.tsx: validate model_name format before removal, guard duplicate clicks
- client.ts: URL-encode modelId in removeCustomModel path
- useGenerationForm.ts: constrain modelSize with regex pattern
- useModelStatus.ts: extract shared hook from GenerationForm/FloatingGenerateBox
- custom_models.py: atomic writes, threading lock, corrupt config backup, strict hf_repo_id regex
- models.py: add regex pattern to model_size Field
- config.py: platform-appropriate default data dir for PyInstaller bundles
- main.py: re-raise HTTPException before broad except to preserve 202 responses
- custom_models.json: remove pre-seeded entry for clean installs
- requirements.txt: add platformdirs dependency
- backend/README.md: document --data-dir CLI flag
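The main.py exception-ordering fix follows the shape below. A stand-in HTTPException class is defined so the sketch is self-contained (the real code uses fastapi.HTTPException), and the handler body is invented for illustration:

```python
class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException so this sketch is self-contained."""
    def __init__(self, status_code: int, detail: str = ""):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail

def delete_model(model_name: str) -> dict:
    try:
        if model_name == "still-downloading":
            # A deliberate, meaningful status for the client.
            raise HTTPException(status_code=202, detail="download in progress")
        return {"deleted": model_name}
    except HTTPException:
        # Re-raise FIRST: without this clause, the broad handler below
        # would repackage the intentional 202 as a generic 500.
        raise
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
```

Python checks except clauses in order, so the narrow HTTPException clause must precede the broad Exception one for intentional status codes to survive.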

coderabbitai bot commented Mar 2, 2026

📝 Walkthrough

This PR adds comprehensive custom HuggingFace TTS model support to the application. Changes include a new backend module for persistent custom model storage, API endpoints for CRUD operations, frontend UI components for model selection and management, and integration throughout the generation workflow to handle both built-in and custom models during voice synthesis.

Changes

• Frontend Model Selection UI (app/src/components/Generation/FloatingGenerateBox.tsx, app/src/components/Generation/GenerationForm.tsx)
  Introduced useModelStatus hook to fetch built-in and custom models; replaced static model options with grouped SelectGroup/SelectLabel UI displaying built-in models or fallback options, with a separate custom models section; updated labels from "Model Size" to "Model" and descriptive text.

• Model Management Interface (app/src/components/ServerSettings/ModelManagement.tsx)
  Added dialog-driven "Add Custom Model" UI for registering HuggingFace repos; implemented CustomModelItem component to display metadata and actions (download, delete cache, remove); extended the model list to render a custom models section with appropriate state guards; added mutation handlers for adding/removing custom models.

• API Client & Types (app/src/lib/api/client.ts, app/src/lib/api/types.ts)
  Added public API methods (listCustomModels, addCustomModel, removeCustomModel); updated deleteModel to use the DELETE method with an encoded URL; introduced new types (CustomModelCreate, CustomModelResponse, CustomModelListResponse); broadened GenerationRequest.model_size from enum to string to support custom identifiers; added is_custom field to ModelStatus.

• Form & Hook Logic (app/src/lib/hooks/useGenerationForm.ts, app/src/lib/hooks/useModelStatus.ts (new))
  Relaxed modelSize validation to accept built-in sizes ("1.7B", "0.6B") or custom identifiers ("custom:<slug>"); added a pre-flight model status check during generation submission; introduced new useModelStatus hook that fetches and groups models into builtInModels and customModels arrays with a 10-second refetch interval.

• Backend Custom Model Management (backend/custom_models.py (new))
  Created new module with thread-safe persistent JSON storage for custom models; implemented list_custom_models, get_custom_model, add_custom_model, remove_custom_model, and get_hf_repo_id_for_custom_model functions; includes HuggingFace repo ID validation, slug generation, atomic file operations, and corrupt-file backup handling.

• Backend Model Resolution & Integration (backend/backends/mlx_backend.py, backend/backends/pytorch_backend.py, backend/main.py)
  Added custom model detection in MLX and PyTorch backends to handle "custom:" prefixed model_size values by resolving them to HuggingFace repo IDs; extended main.py with custom model endpoints (GET/POST/GET/DELETE /custom-models); updated model status checking and download triggering to enumerate and support custom models; enhanced delete_model to unload and clean cache for custom models.

• Backend API & Types (backend/models.py)
  Updated the GenerationRequest.model_size field pattern to accept built-in or custom identifiers with descriptive text; added is_custom boolean flag to ModelStatus; introduced CustomModelCreate, CustomModelResponse, and CustomModelListResponse classes for API contracts.

• Configuration & Packaging (backend/config.py, backend/requirements.txt, backend/build_binary.py, backend/voicebox-server.spec)
  Added platformdirs dependency; implemented data directory management functions (set_data_dir, get_data_dir, get_db_path, get_profiles_dir, etc.) with platform-specific fallback for PyInstaller bundles; added --data-dir override support; included backend.custom_models in hidden imports and refactored the PyInstaller spec to use collect_all for mlx modules.

• Documentation & Data Files (backend/README.md, data/custom_models.json)
  Added documentation for the --data-dir option with examples of custom data directory usage; created empty custom_models.json data file as a configuration template.

• Build System (tauri/src-tauri/build.rs)
  Replaced hard panic flows on actool failure with cargo:warning messages and graceful continuation, allowing builds to proceed without fatal errors.
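The strict owner/repo validation and slug generation described for backend/custom_models.py could look roughly like this; the exact regex and slug rules here are assumptions, not copied from the PR:

```python
import re

# Assumed pattern: "owner/repo", each part starting with an alphanumeric
# character, then letters, digits, '.', '_' or '-'. The PR's exact regex
# may differ.
_HF_REPO_RE = re.compile(r"[A-Za-z0-9][\w.\-]*/[A-Za-z0-9][\w.\-]*")

def validate_hf_repo_id(repo_id: str) -> str:
    """Reject anything that is not a plain owner/repo identifier."""
    if not _HF_REPO_RE.fullmatch(repo_id):
        raise ValueError(f"invalid HuggingFace repo id: {repo_id!r}")
    return repo_id

def make_custom_id(repo_id: str) -> str:
    """Derive a 'custom:<slug>' identifier from an owner/repo id."""
    slug = re.sub(r"[^a-z0-9]+", "-", repo_id.lower()).strip("-")
    return f"custom:{slug}"
```

Requiring exactly one slash and a conservative character set also blocks path-traversal strings like "../../etc/passwd" from ever reaching the filesystem layer.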

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Frontend
    participant API_Client
    participant Backend_API
    participant Custom_Models_Storage
    participant Model_Resolver

    User->>Frontend: Navigate to Model Management
    Frontend->>API_Client: listCustomModels()
    API_Client->>Backend_API: GET /custom-models
    Backend_API->>Custom_Models_Storage: list_custom_models()
    Custom_Models_Storage-->>Backend_API: Custom models list
    Backend_API-->>API_Client: CustomModelListResponse
    API_Client-->>Frontend: Render custom models UI

    User->>Frontend: Add custom model (repo ID, name)
    Frontend->>API_Client: addCustomModel(hf_repo_id, display_name)
    API_Client->>Backend_API: POST /custom-models
    Backend_API->>Custom_Models_Storage: add_custom_model(hf_repo_id, display_name)
    Custom_Models_Storage->>Custom_Models_Storage: Validate & generate slug
    Custom_Models_Storage->>Custom_Models_Storage: Save to custom_models.json
    Custom_Models_Storage-->>Backend_API: CustomModelResponse
    Backend_API-->>API_Client: Success
    API_Client-->>Frontend: Update UI, show new model

    User->>Frontend: Generate audio with custom model
    Frontend->>API_Client: submitGenerationForm(model: "custom:slug")
    API_Client->>Backend_API: POST /generate
    Backend_API->>Model_Resolver: Resolve "custom:slug" to HF repo
    Custom_Models_Storage-->>Model_Resolver: HF repo ID
    Model_Resolver->>Model_Resolver: Download/load model from HuggingFace
    Model_Resolver->>Backend_API: Model ready
    Backend_API->>Backend_API: Generate audio
    Backend_API-->>API_Client: Audio result
    API_Client-->>Frontend: Display result
    Frontend-->>User: Play/download audio

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 Hooray for custom models, now we hop and play,
HuggingFace repos bundled in a special way,
Built-ins grouped with customs in SelectGroups so fine,
Persistent JSON storage keeps the configs in line,
From frontend to backend, the whole system aligned! 🎵

🚥 Pre-merge checks | ✅ 3 passed

  • Description Check — ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed. The title "fix: harden input validation, file safety, and error handling" accurately describes the main hardening improvements across the PR: input validation constraints, file integrity measures, and error handling fixes.
  • Docstring Coverage — ✅ Passed. Docstring coverage is 81.58%, above the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/src/lib/hooks/useGenerationForm.ts (1)

120-132: ⚠️ Potential issue | 🟠 Major

Handle download-pending responses before using result as GenerationResponse.

This block assumes result.duration exists. When backend returns 202 download-pending payload, this can crash the success path.

✅ Safer flow: early-return on undownloaded model
       try {
         const modelStatus = await apiClient.getModelStatus();
         const model = modelStatus.models.find((m) => m.model_name === modelName);

         if (model) {
           displayName = model.display_name;
           if (!model.downloaded) {
             // Not yet downloaded — enable progress tracking UI
             setDownloadingModelName(modelName);
             setDownloadingDisplayName(displayName);
+            if (!model.downloading) {
+              await apiClient.triggerModelDownload(modelName);
+            }
+            toast({
+              title: 'Model download in progress',
+              description: `${displayName} is downloading. Try generating again when ready.`,
+            });
+            return;
           }
         }
       } catch (error) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/lib/hooks/useGenerationForm.ts` around lines 120 - 132, The success
handler assumes result is a fully available GenerationResponse and accesses
result.duration, which will crash if the backend returned a 202 download-pending
payload; update the flow in useGenerationForm where generation.mutateAsync is
awaited (the block that constructs toast with result.duration) to first detect
the download-pending response (e.g., check a status field or absence of
duration/download URL on the returned object) and early-return or show a
"download pending" toast instead of accessing result.duration; ensure subsequent
logic only runs when result.duration exists (guard on result.duration or
result.status === 'ready') so you never dereference duration on a pending
response.
🧹 Nitpick comments (5)
backend/config.py (1)

50-57: Consider ensuring the data directory exists in get_data_dir().

Unlike set_data_dir() and the subdirectory getters (e.g., get_profiles_dir()), get_data_dir() does not call mkdir(). If called before set_data_dir() and the directory doesn't exist, callers may encounter errors when writing files.

🔧 Proposed fix
 def get_data_dir() -> Path:
     """
     Get the data directory path.
 
     Returns:
         Path to the data directory
     """
+    _data_dir.mkdir(parents=True, exist_ok=True)
     return _data_dir
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/config.py` around lines 50 - 57, get_data_dir() should ensure the
underlying directory exists like set_data_dir() and get_profiles_dir() do:
before returning the module-level _data_dir Path, call
_data_dir.mkdir(parents=True, exist_ok=True) (or equivalent) to create the
directory if missing so callers won’t fail when writing files; update the
get_data_dir() function to perform this mkdir step and then return _data_dir.
backend/backends/mlx_backend.py (1)

40-65: Consider extracting shared custom model resolution logic.

The custom model resolution code (lines 40-49) is nearly identical to pytorch_backend.py (lines 65-73). While this duplication is minimal and acceptable, you could extract it to a shared utility if this pattern expands to more backends.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/mlx_backend.py` around lines 40 - 65, The duplicate
custom-model resolution block (checking model_size.startswith("custom:"),
extracting custom_id, calling get_hf_repo_id_for_custom_model and raising
ValueError) should be extracted into a shared helper so both mlx_backend and
pytorch_backend reuse it; create a utility function (e.g.,
resolve_custom_model_hf_repo or get_hf_repo_id_from_model_size) that accepts
model_size, imports/calls get_hf_repo_id_for_custom_model, returns the
hf_repo_id or raises the ValueError, and replace the inline code in mlx_backend
(the model_size.startswith("custom:") branch) and the corresponding code in
pytorch_backend with calls to this new helper; keep existing behavior/logging
(print) and preserve mlx_model_map/other logic.
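A minimal sketch of the shared helper this comment proposes. The lookup is passed in as a callable so the snippet is self-contained; in the real codebase it would be custom_models.get_hf_repo_id_for_custom_model:

```python
def resolve_model_size(model_size: str, lookup) -> str:
    """Map a model_size value to something a backend can load.

    Built-in sizes (e.g. "1.7B") pass through unchanged; a
    "custom:<slug>" value is resolved to an HF repo id via `lookup`,
    a callable returning the repo id, or None if the slug is unknown.
    """
    if not model_size.startswith("custom:"):
        return model_size
    slug = model_size[len("custom:"):]
    hf_repo_id = lookup(slug)
    if hf_repo_id is None:
        raise ValueError(f"unknown custom model: {slug!r}")
    return hf_repo_id
```

Both mlx_backend and pytorch_backend could then call this one function instead of duplicating the prefix check and error handling.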
backend/custom_models.py (1)

47-62: Use logger.exception to include traceback in error logs.

Within exception handlers, logger.exception automatically includes the traceback, which is more useful for debugging than logger.error.

♻️ Proposed fix
     except json.JSONDecodeError as exc:
         # Back up the corrupt file so we don't lose data
         backup = path.with_suffix(
             f".json.corrupt.{datetime.utcnow().strftime('%Y%m%dT%H%M%S')}"
         )
         try:
             path.rename(backup)
-            logger.error(
+            logger.exception(
                 "Corrupt custom_models.json backed up to %s: %s", backup, exc
             )
         except OSError as rename_err:
-            logger.error(
+            logger.exception(
                 "Failed to back up corrupt config %s: %s (original error: %s)",
                 path, rename_err, exc,
             )
         raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/custom_models.py` around lines 47 - 62, Replace the logger.error
calls inside the JSON decode and nested OSError handlers with logger.exception
so the traceback is recorded: in the except json.JSONDecodeError as exc block,
change the logger.error("Corrupt custom_models.json backed up to %s: %s",
backup, exc) to logger.exception(...) and in the inner except OSError as
rename_err block change logger.error("Failed to back up corrupt config %s: %s
(original error: %s)", path, rename_err, exc) to logger.exception(...)
(preserving the descriptive messages and context) so the full tracebacks for the
JSONDecodeError and the rename error are included.
backend/main.py (1)

1546-1552: Don’t silently swallow cache-probe exceptions in model status paths.

Current except Exception: pass behavior hides actionable failures. Log at debug level so support can diagnose cache-state inconsistencies.

🧭 Minimal observability patch
+import logging
+logger = logging.getLogger(__name__)
@@
-                        except Exception:
-                            pass
+                        except Exception:
+                            logger.debug("Custom model cache probe failed (incomplete check): %s", hf_repo_id, exc_info=True)
@@
-                            except Exception:
-                                pass
+                            except Exception:
+                                logger.debug("Custom model size computation failed: %s", hf_repo_id, exc_info=True)
@@
-                                except Exception:
-                                    pass
-                except Exception:
-                    pass
+                                except Exception:
+                                    logger.debug("Custom model fallback size computation failed: %s", hf_repo_id, exc_info=True)
+                except Exception:
+                    logger.debug("Custom model fallback cache probe failed: %s", hf_repo_id, exc_info=True)
@@
-            except Exception:
-                pass
+            except Exception:
+                logger.debug("Custom model loaded-state check failed: %s", model_name, exc_info=True)
@@
-        except Exception:
+        except Exception:
+            logger.debug("Custom model status assembly failed: %s", model_name, exc_info=True)
             is_downloading = model_name in active_download_names

Also applies to: 1557-1560, 1584-1593, 1598-1601, 1618-1628

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 1546 - 1552, Replace silent except blocks that
currently do "except Exception: pass" in the model status cache probes (the
blocks that compute cache_dir_path, blobs_dir and has_incomplete using
hf_constants.HF_HUB_CACHE and hf_repo_id) with exception handlers that call the
module logger at debug level including a descriptive message and the exception
details (e.g., "failed probing HF cache for repo {hf_repo_id},
cache_dir={cache_dir_path}") so failures aren't swallowed; make the same change
for the other similar blocks flagged (the ones around lines computing
blobs_dir/has_incomplete in the same function) to ensure consistent debug-level
observability while preserving existing control flow.
app/src/components/Generation/GenerationForm.tsx (1)

146-174: Extract shared model-option rendering/mapping to avoid drift.

The built-in/custom select rendering and the model_name → sizeValue mapping are duplicated in FloatingGenerateBox.tsx; the fallback text has already started diverging. Consider a shared helper/component for this block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/Generation/GenerationForm.tsx` around lines 146 - 174, The
built-in/custom model option rendering and the model_name→sizeValue mapping in
GenerationForm (see builtInModels, customModels, SelectGroup/SelectItem usage
and the sizeValue = model.model_name.replace('qwen-tts-', '')) are duplicated in
FloatingGenerateBox; extract a shared helper or small presentational component
(e.g., renderModelOptions or ModelSelectGroup component and a mapModelNameToSize
utility) that returns the same SelectGroup/SelectItem structure and performs the
model_name→sizeValue transformation, then replace the duplicated loops in both
GenerationForm.tsx and FloatingGenerateBox.tsx to call the new shared
function/component so fallback text and mapping remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/Generation/GenerationForm.tsx`:
- Line 37: The hook useModelStatus currently classifies builtInModels using
model_name.startsWith('qwen-tts'), which incorrectly flags repos like
"qwen-tts/my-voice" as built-in; update the predicate in useModelStatus (the
filter that builds builtInModels/customModels) to only treat true built-ins such
as the core qwen-tts identifier (e.g., model_name === 'qwen-tts' or match a
pattern that disallows a following '/'), for example replace
startsWith('qwen-tts') with a stricter check (exact equality or a regex like
/^qwen-tts($|[:@])/) so repo-qualified names with a slash go to customModels.

In `@app/src/components/ServerSettings/ModelManagement.tsx`:
- Around line 617-623: The Remove button currently disables only when
model.loaded or isUnregistering is true; update the disable condition for the
Button (the component with onClick={onRemove}) to also check the showDownloading
flag so removal is prevented while a download is active. Locate the Button that
uses onRemove and the disabled prop, and add showDownloading (or its local/state
variable) into the combined disabled expression alongside model.loaded and
isUnregistering.

In `@backend/custom_models.py`:
- Line 10: Remove the unused fcntl import from the top of custom_models.py
(delete the `import fcntl` statement), verify there are no remaining references
to fcntl in the file (the module already uses threading.Lock for
synchronization), and run tests/linting to confirm no regressions.

In `@backend/voicebox-server.spec`:
- Around line 16-19: The spec unconditionally calls collect_all('mlx') and
collect_all('mlx_audio') which fails when MLX isn't installed; wrap those
collect_all calls in a runtime presence check (e.g., try/except ImportError or
importlib.util.find_spec) before invoking collect_all so that if 'mlx' or
'mlx_audio' is missing you skip updating tmp_ret/datas/binaries/hiddenimports.
Specifically, guard the collect_all('mlx') and collect_all('mlx_audio') blocks
(the places that assign tmp_ret and then do datas += tmp_ret[0]; binaries +=
tmp_ret[1]; hiddenimports += tmp_ret[2]) so they only run when the package is
importable.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38bf96f and a7e698a.

⛔ Files ignored due to path filters (2)
  • bun.lock is excluded by !**/*.lock
  • tauri/src-tauri/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • app/src/components/Generation/FloatingGenerateBox.tsx
  • app/src/components/Generation/GenerationForm.tsx
  • app/src/components/ServerSettings/ModelManagement.tsx
  • app/src/lib/api/client.ts
  • app/src/lib/api/types.ts
  • app/src/lib/hooks/useGenerationForm.ts
  • app/src/lib/hooks/useModelStatus.ts
  • backend/README.md
  • backend/backends/mlx_backend.py
  • backend/backends/pytorch_backend.py
  • backend/build_binary.py
  • backend/config.py
  • backend/custom_models.py
  • backend/main.py
  • backend/models.py
  • backend/requirements.txt
  • backend/voicebox-server.spec
  • data/custom_models.json
  • tauri/src-tauri/build.rs

const { form, handleSubmit, isPending } = useGenerationForm();

// Use shared hook for model status fetching and grouping
const { builtInModels, customModels } = useModelStatus();

⚠️ Potential issue | 🟡 Minor

Prevent built-in/custom option overlap from hook classification.

builtInModels from useModelStatus is based on model_name.startsWith('qwen-tts'). A custom repo like qwen-tts/my-voice can appear in both groups and be misinterpreted in built-in mapping.

🔧 Proposed fix (in app/src/lib/hooks/useModelStatus.ts)
-const builtInModels =
-  modelStatus?.models.filter((m) => m.model_name.startsWith('qwen-tts')) || [];
+const builtInModels =
+  modelStatus?.models.filter((m) => !m.is_custom && m.model_name.startsWith('qwen-tts-')) || [];

Comment on lines +617 to +623
<Button
size="sm"
onClick={onRemove}
variant="ghost"
title="Remove custom model from list"
disabled={model.loaded || isUnregistering}
>

⚠️ Potential issue | 🟡 Minor

Disable “remove custom model” while download is active.

Unregistering during showDownloading can create inconsistent UX/state around in-flight downloads.

🛠️ Small UI guard improvement
         <Button
           size="sm"
           onClick={onRemove}
           variant="ghost"
           title="Remove custom model from list"
-          disabled={model.loaded || isUnregistering}
+          disabled={model.loaded || isUnregistering || showDownloading}
         >

@author AJ - Kamyab (Ankit Jain)
"""

import fcntl

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg 'fcntl' backend/custom_models.py

Repository: jamiepine/voicebox

🏁 Script executed:

rg 'fcntl[\.(\s]' backend/custom_models.py

🏁 Script executed:

head -20 backend/custom_models.py


Remove unused fcntl import on line 10.

The fcntl module is imported but never used in this file. Since fcntl is Unix-only, this import will cause an ImportError on Windows. The module already uses threading.Lock for synchronization, making fcntl unnecessary.

Proposed fix
-import fcntl
 import json
 import logging
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/custom_models.py` at line 10, remove the unused fcntl import from the top of the file (delete the `import fcntl` statement), verify there are no remaining references to fcntl in the file (the module already uses threading.Lock for synchronization), and run tests/linting to confirm no regressions.
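For context, the cross-platform pattern this file already follows (per the PR summary: temp file → fsync → os.replace, guarded by a threading lock) needs no fcntl at all. A minimal sketch under those assumptions — the function name and config shape are illustrative, not the actual custom_models.py API:

```python
import json
import os
import tempfile
import threading

_lock = threading.Lock()


def save_config_atomic(path: str, data: dict) -> None:
    """Write JSON atomically: temp file in the same dir -> fsync -> os.replace."""
    with _lock:
        # Temp file must live on the same filesystem as `path` for replace to be atomic.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as f:
                json.dump(data, f, indent=2)
                f.flush()
                os.fsync(f.fileno())  # force bytes to disk before the swap
            os.replace(tmp_path, path)  # atomic on both POSIX and Windows
        except BaseException:
            os.unlink(tmp_path)  # never leave a stray temp file behind
            raise
```

`os.replace` (unlike `os.rename`) overwrites an existing destination on Windows too, which is why a threading.Lock plus this swap covers both platforms without Unix-only file locking.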

Comment on lines +16 to +19
tmp_ret = collect_all('mlx')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('mlx_audio')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | head -20

Repository: jamiepine/voicebox

🏁 Script executed:

find . -name "*.spec" -o -name "*requirements*.txt" -o -name "pyproject.toml" -o -name "setup.cfg" | grep -v ".git" | head -20

🏁 Script executed:

ls -la backend/ 2>/dev/null || echo "backend dir not found"

🏁 Script executed:

# Check if the spec file exists and examine it
if [ -f "backend/voicebox-server.spec" ]; then
  echo "=== backend/voicebox-server.spec ==="
  head -30 backend/voicebox-server.spec
fi

🏁 Script executed:

echo "=== backend/requirements.txt ===" && cat backend/requirements.txt && echo && \
echo "=== backend/requirements-mlx.txt ===" && cat backend/requirements-mlx.txt

🏁 Script executed:

echo "=== Build workflows ===" && \
for f in .github/workflows/*.yml; do
  echo "--- $f ---"
  cat "$f" | head -50
done

🏁 Script executed:

echo "=== Check Makefile for build commands ===" && \
cat Makefile 2>/dev/null | head -80

🏁 Script executed:

# Verify PyInstaller collect_all behavior and check if error handling exists in build_binary.py
echo "=== backend/build_binary.py ===" && \
head -50 backend/build_binary.py && \
echo "..." && \
tail -30 backend/build_binary.py

🏁 Script executed:

# Check if there's any error handling around collect_all in the repo
rg -n "collect_all|try.*collect" backend/


Guard MLX collect_all calls for environments where MLX is absent.

The spec file unconditionally calls collect_all('mlx') and collect_all('mlx_audio') (lines 16–19), but MLX is declared as Apple Silicon-only in requirements-mlx.txt and conditionally installed only on aarch64-apple-darwin. Windows and Intel Mac CI builds skip MLX dependencies, causing PyInstaller to fail when the spec file is executed on these platforms. Add package-presence guards before collecting.

Suggested fix
+from importlib.util import find_spec
+
-tmp_ret = collect_all('mlx')
-datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
-tmp_ret = collect_all('mlx_audio')
-datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
+if find_spec("mlx") is not None:
+    _d, _b, _h = collect_all("mlx")
+    datas += _d
+    binaries += _b
+    hiddenimports += _h
+
+if find_spec("mlx_audio") is not None:
+    _d, _b, _h = collect_all("mlx_audio")
+    datas += _d
+    binaries += _b
+    hiddenimports += _h
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/voicebox-server.spec` around lines 16-19, the spec unconditionally calls collect_all('mlx') and collect_all('mlx_audio'), which fails when MLX isn't installed; wrap those collect_all calls in a presence check (e.g., importlib.util.find_spec or try/except ImportError) before invoking collect_all, so that if 'mlx' or 'mlx_audio' is missing you skip updating tmp_ret/datas/binaries/hiddenimports. Specifically, guard the blocks that assign tmp_ret and then do datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2] so they only run when the package is importable.
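The presence check can be exercised outside PyInstaller: `importlib.util.find_spec` returns None for a missing top-level package without importing it, which is why it is safe to call in a spec file on platforms where MLX was never installed. A small sketch (the helper name is illustrative):

```python
from importlib.util import find_spec


def package_present(pkg: str) -> bool:
    """True if `pkg` is importable, so collect_all(pkg) is safe to call."""
    return find_spec(pkg) is not None


# A stdlib module is always importable; a made-up name is not.
print(package_present("json"))               # True
print(package_present("mlx_not_installed"))  # False
```

In the spec file, each `collect_all(...)` block would then run only when `package_present(...)` is true, leaving `datas`/`binaries`/`hiddenimports` untouched on Windows and Intel Mac builds.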
