Complete reference for all PixelProbe configuration options, environment variables, and performance tuning.
- Environment Variables
- Docker Compose Configuration
- Database Configuration
- Performance Tuning
- Scanning Configuration
- Celery Configuration
- Data Retention Configuration
- Security Configuration
- Resource Recommendations
All configuration is done via environment variables, either in a `.env` file or directly in `docker-compose.yml`.
| Variable | Description | Example |
|---|---|---|
| `SECRET_KEY` | Flask session secret key (64 chars) | Generate with: `python -c "import secrets; print(secrets.token_hex(32))"` |
| `POSTGRES_PASSWORD` | PostgreSQL database password | `your-secure-password` |
| `MEDIA_PATH` | Host path to media files (Docker only) | `/mnt/media` |
| Variable | Default | Description |
|---|---|---|
| `POSTGRES_HOST` | `localhost` | PostgreSQL server hostname |
| `POSTGRES_PORT` | `5432` | PostgreSQL server port |
| `POSTGRES_DB` | `pixelprobe` | Database name |
| `POSTGRES_USER` | `pixelprobe` | Database username |
| `POSTGRES_PASSWORD` | (required) | Database password |
| `DATABASE_ECHO` | `false` | Enable SQL query logging (debug) |
| Variable | Default | Description |
|---|---|---|
| `FLASK_ENV` | `production` | Flask environment (`production`, `development`, `testing`) |
| `SCAN_PATHS` | `/media` | Comma-separated paths to scan inside the container |
| `TZ` | `UTC` | Timezone for timestamps (e.g., `America/New_York`) |
| `PORT` | `5000` | Web interface port |
| Variable | Default | Description | Recommendations |
|---|---|---|---|
| `MAX_WORKERS` | `10` | Parallel file scanning workers per task | 10-24 for most systems |
| `BATCH_SIZE` | `100` | Files per batch during discovery | 50-200 based on file sizes |
| `MAX_OUTPUT_SIZE` | `10000` | Max output characters before rotation | 10000-50000 |
| `OUTPUT_ROTATION_ENABLED` | `true` | Enable output truncation | `true` for large scans |
| `FREEZE_DETECTION_ENABLED` | `true` | Enable video freeze detection (freezedetect + blackdetect) | `false` to skip and reduce scan time |
Performance Notes:
- `MAX_WORKERS` controls parallelism within each scan task
- Each worker creates 1 database connection
- Total connections = 60 (main app pool) + `MAX_WORKERS`
- Keep under PostgreSQL `max_connections` (default: 100)
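As a quick sanity check, the connection budget above can be expressed as a small helper (the pool sizes mirror the `config.py` defaults shown later in this document; the function itself is illustrative, not part of PixelProbe):

```python
def total_connections(max_workers: int, pool_size: int = 20, max_overflow: int = 40) -> int:
    """Estimate peak PostgreSQL connections: app pool + overflow + one per scan worker."""
    return pool_size + max_overflow + max_workers

# With the default MAX_WORKERS=10, peak usage stays well under max_connections=100.
print(total_connections(10))  # 70
```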
| Variable | Default | Description |
|---|---|---|
| `CELERY_BROKER_URL` | `redis://localhost:6379/0` | Redis URL for task queue |
| `CELERY_RESULT_BACKEND` | `redis://localhost:6379/0` | Redis URL for results |
| `CELERY_CONCURRENCY` | `4` | Number of concurrent Celery tasks |
| `CELERY_LOG_LEVEL` | `INFO` | Celery log level (`DEBUG`, `INFO`, `WARNING`, `ERROR`) |
Celery Notes:
- `CELERY_CONCURRENCY` controls how many scan tasks run simultaneously
- Independent from `MAX_WORKERS` (which controls parallelism within each task)
- Recommended: 4-8 for most systems
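Because each concurrent task runs its own pool of scan workers, the worst-case number of simultaneous file probes is the product of the two settings. A sketch of that relationship (illustrative, not PixelProbe code):

```python
def max_parallel_probes(celery_concurrency: int, max_workers: int) -> int:
    """Worst case: each concurrent Celery task runs its own pool of scan workers."""
    return celery_concurrency * max_workers

# Defaults: 4 concurrent tasks x 10 workers each = up to 40 files probed at once.
print(max_parallel_probes(4, 10))  # 40
```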
| Variable | Default | Description |
|---|---|---|
| `REDIS_MAX_MEMORY` | `2gb` | Maximum Redis memory for task queue |
Redis Notes:
- For large libraries (1M+ files), increase to `4gb`
- Redis stores the task queue and results temporarily
- Uses the `noeviction` policy to prevent task loss
| Variable | Default | Description |
|---|---|---|
| `EXCLUDED_PATHS` | (empty) | Comma-separated paths to exclude from scanning |
| `EXCLUDED_EXTENSIONS` | `.txt,.log,.md` | Comma-separated file extensions to exclude |
| `PERIODIC_SCAN_SCHEDULE` | (empty) | Automated scan schedule (cron or interval format) |
| `CLEANUP_SCHEDULE` | (empty) | Automated cleanup schedule (cron or interval format) |
Schedule Format Examples:

```bash
# Cron format (standard cron syntax)
PERIODIC_SCAN_SCHEDULE=cron:0 2 * * *    # Daily at 2 AM
CLEANUP_SCHEDULE=cron:0 3 * * 0          # Weekly on Sunday at 3 AM

# Interval format
PERIODIC_SCAN_SCHEDULE=interval:hours:6  # Every 6 hours
CLEANUP_SCHEDULE=interval:days:7         # Every 7 days
```

| Variable | Default | Description |
|---|---|---|
| `SCAN_OUTPUT_RETENTION_DAYS` | `30` | Days before archiving scan outputs (currently disabled) |
| `REPORT_RETENTION_DAYS` | `90` | Days before deleting old reports |
| `SCAN_STATE_RETENTION_DAYS` | `7` | Days before deleting completed scan states |
| `LOG_RETENTION_DAYS` | `30` | Days before deleting old log entries (configurable via UI) |
Data Retention Notes:
- Automated cleanup runs daily via Celery Beat
- `SCAN_OUTPUT_RETENTION_DAYS` is currently not used (scan results are kept forever); it remains configurable via environment variable for future flexibility
- The `LOG_RETENTION_DAYS` default is stored in the `app_configs` database table and can be changed via the UI (System > View Logs) or API (`PUT /api/logs/retention`)
| Variable | Default | Description |
|---|---|---|
| `ENABLE_MONITORING` | `false` | Enable Prometheus metrics endpoint |
| `METRICS_PORT` | `9090` | Metrics endpoint port |
Minimal docker-compose.yml for production:

```yaml
version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    container_name: pixelprobe-postgres
    environment:
      POSTGRES_DB: pixelprobe
      POSTGRES_USER: pixelprobe
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U pixelprobe"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    container_name: pixelprobe-redis
    command: redis-server --maxmemory ${REDIS_MAX_MEMORY:-2gb} --maxmemory-policy noeviction
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  pixelprobe:
    image: ttlequals0/pixelprobe:latest
    container_name: pixelprobe-app
    environment:
      SECRET_KEY: ${SECRET_KEY}
      POSTGRES_HOST: postgres
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      CELERY_BROKER_URL: redis://redis:6379/0
      CELERY_RESULT_BACKEND: redis://redis:6379/0
      SCAN_PATHS: ${SCAN_PATHS:-/media}
      MAX_WORKERS: ${MAX_WORKERS:-10}
      TZ: ${TZ:-UTC}
    volumes:
      - ${MEDIA_PATH}:/media:ro
    ports:
      - "${PORT:-5000}:5000"
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  celery-worker:
    image: ttlequals0/pixelprobe:latest
    container_name: pixelprobe-celery-worker
    command: python celery_worker.py
    environment:
      CELERY_BROKER_URL: redis://redis:6379/0
      CELERY_RESULT_BACKEND: redis://redis:6379/0
      POSTGRES_HOST: postgres
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      SECRET_KEY: ${SECRET_KEY}
      MAX_WORKERS: ${MAX_WORKERS:-10}
      CELERY_CONCURRENCY: ${CELERY_CONCURRENCY:-4}
    volumes:
      - ${MEDIA_PATH}:/media:ro
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

volumes:
  postgres_data:
```

To scan multiple directories:
Method 1: Multiple Volume Mounts

```yaml
pixelprobe:
  environment:
    SCAN_PATHS: /movies,/tv-shows,/photos
  volumes:
    - /mnt/movies:/movies:ro
    - /mnt/tv-shows:/tv-shows:ro
    - /mnt/photos:/photos:ro

celery-worker:
  environment:
    SCAN_PATHS: /movies,/tv-shows,/photos
  volumes:
    - /mnt/movies:/movies:ro
    - /mnt/tv-shows:/tv-shows:ro
    - /mnt/photos:/photos:ro
```

Method 2: Single Parent Volume
```yaml
pixelprobe:
  environment:
    SCAN_PATHS: /media/movies,/media/tv,/media/photos
  volumes:
    - /mnt/all-media:/media:ro
```

Both pixelprobe and celery-worker MUST run as the same user to access media files:
```yaml
pixelprobe:
  user: "${PUID:-1000}:${PGID:-1000}"
  volumes:
    - ${MEDIA_PATH}:/media:ro

celery-worker:
  user: "${PUID:-1000}:${PGID:-1000}"  # MUST match pixelprobe
  volumes:
    - ${MEDIA_PATH}:/media:ro
```

Find your UID/GID:
```bash
id -u  # Shows UID (typically 1000)
id -g  # Shows GID (typically 1000)
```

Configured in config.py:
```python
SQLALCHEMY_ENGINE_OPTIONS = {
    'pool_size': 20,        # Base connection pool size
    'pool_pre_ping': True,  # Test connections before use
    'pool_recycle': 3600,   # Recycle connections after 1 hour
    'max_overflow': 40,     # Additional connections when pool exhausted
    'pool_timeout': 30,     # Timeout waiting for a connection
}
```

Total Connections: 20 (base) + 40 (overflow) + MAX_WORKERS = 60 + MAX_WORKERS
PostgreSQL max_connections:
- Default: 100 connections
- Recommended: 150+ for production
- Set in PostgreSQL:

```sql
ALTER SYSTEM SET max_connections = 150;
```
For PostgreSQL optimization:

```sql
-- Increase shared buffers (25% of RAM)
ALTER SYSTEM SET shared_buffers = '2GB';

-- Increase work memory for sorts
ALTER SYSTEM SET work_mem = '16MB';

-- Enable parallel queries
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;

-- Reload the configuration (note: shared_buffers and max_connections
-- only take effect after a full PostgreSQL restart)
SELECT pg_reload_conf();
```

```bash
# Example: small library (< 10K files)
MAX_WORKERS=4
CELERY_CONCURRENCY=2
BATCH_SIZE=50
REDIS_MAX_MEMORY=512mb

# Example: medium library (10K-100K files)
MAX_WORKERS=10
CELERY_CONCURRENCY=4
BATCH_SIZE=100
REDIS_MAX_MEMORY=1gb

# Example: large library (100K-1M files)
MAX_WORKERS=16
CELERY_CONCURRENCY=6
BATCH_SIZE=200
REDIS_MAX_MEMORY=2gb

# Example: very large library (1M+ files)
MAX_WORKERS=24
CELERY_CONCURRENCY=8
BATCH_SIZE=200
REDIS_MAX_MEMORY=4gb
```

Docker resource limits for large libraries:
```yaml
celery-worker:
  deploy:
    resources:
      limits:
        cpus: '8'    # Limit CPU cores
        memory: 8G   # Limit RAM
      reservations:
        cpus: '4'    # Guaranteed CPU cores
        memory: 4G   # Guaranteed RAM
```

For best performance:
- Database Storage: SSD strongly recommended
- Media Storage: Can be HDD, but SSD improves scan speed
- Temp Files: Use tmpfs for temporary files (optional)
```yaml
pixelprobe:
  volumes:
    - /mnt/ssd/postgres_data:/var/lib/postgresql/data  # SSD for database
    - /mnt/hdd/media:/media:ro                         # HDD OK for media
  tmpfs:
    - /tmp:size=1G  # tmpfs for temp files
```

Exclude specific paths or file types from scanning:
Via Environment Variables:

```bash
EXCLUDED_PATHS=/media/temp,/media/incomplete,/media/.cache
EXCLUDED_EXTENSIONS=.tmp,.partial,.!qB,.part,.crdownload
```

Via Web Interface:
- Navigate to Tools > Exclusions
- Add paths or extensions
- Click Save
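For illustration, the exclusion semantics work roughly like the sketch below (`is_excluded` is a hypothetical helper, not PixelProbe's actual implementation, which may match differently):

```python
import os

def is_excluded(path, excluded_paths, excluded_exts):
    """Return True if a file falls under an excluded path or has an excluded extension."""
    for prefix in excluded_paths:
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return True
    ext = os.path.splitext(path)[1].lower()
    return ext in {e.lower() for e in excluded_exts}

print(is_excluded("/media/temp/movie.mkv", ["/media/temp"], [".tmp"]))  # True (excluded path)
print(is_excluded("/media/movies/a.tmp", ["/media/temp"], [".tmp"]))    # True (excluded extension)
print(is_excluded("/media/movies/a.mkv", ["/media/temp"], [".tmp"]))    # False
```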
Schedule automated scans:
Via Environment Variables:
```bash
# Daily full scan at 2 AM
PERIODIC_SCAN_SCHEDULE=cron:0 2 * * *

# Weekly cleanup on Sunday at 3 AM
CLEANUP_SCHEDULE=cron:0 3 * * 0
```

Via Web Interface:
- Navigate to Tools > Schedules
- Click "Create Schedule"
- Configure schedule type, frequency, and scan type
- Click Save
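A schedule string in the `cron:`/`interval:` format shown above splits into its components along these lines (an illustrative parser, not PixelProbe's own code):

```python
def parse_schedule(value):
    """Split 'cron:0 2 * * *' or 'interval:hours:6' into a structured form."""
    kind, _, rest = value.partition(":")
    if kind == "cron":
        minute, hour, dom, month, dow = rest.split()
        return {"type": "cron", "minute": minute, "hour": hour,
                "day_of_month": dom, "month": month, "day_of_week": dow}
    if kind == "interval":
        unit, count = rest.split(":")
        return {"type": "interval", "unit": unit, "every": int(count)}
    raise ValueError(f"unrecognized schedule: {value!r}")

print(parse_schedule("cron:0 2 * * *"))    # daily at 2 AM
print(parse_schedule("interval:hours:6"))  # every 6 hours
```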
Number of concurrent scan tasks:
```bash
# Low concurrency (memory-constrained systems)
CELERY_CONCURRENCY=2

# Medium concurrency (typical systems)
CELERY_CONCURRENCY=4

# High concurrency (powerful systems)
CELERY_CONCURRENCY=8
```

Celery queues and priorities are automatically configured:
- Default Queue: Normal scans (priority 5)
- Integrity Queue: Integrity checks (priority 7)
- Cleanup Queue: Database cleanup (priority 6)
- Retention Queue: Data retention (priority 9)
Celery Beat runs scheduled tasks daily:
```python
# Runs at 2 AM daily
'data-retention-cleanup': {
    'task': 'pixelprobe.tasks.run_retention_cleanup',
    'schedule': crontab(hour=2, minute=0),
    'options': {'queue': 'retention', 'priority': 9}
}
```

Configure how long data is retained:
```bash
# Archive scan outputs after 30 days (currently disabled)
SCAN_OUTPUT_RETENTION_DAYS=30

# Delete old reports after 90 days
REPORT_RETENTION_DAYS=90

# Delete completed scan states after 7 days
SCAN_STATE_RETENTION_DAYS=7
```

Run data retention manually:
```bash
# Docker
docker exec pixelprobe-app python tools/data_retention.py

# Manual installation
python tools/data_retention.py
```

Preview what would be cleaned:
```bash
python tools/data_retention.py --dry-run
```

PixelProbe includes SSRF protection that blocks outbound requests to private/reserved IP ranges. If you use internal services for healthchecks, notifications (ntfy, webhooks), or similar integrations that resolve to private IPs, you can allowlist them:
| Variable | Default | Description |
|---|---|---|
| `TRUSTED_INTERNAL_HOSTS` | (empty) | Comma-separated hostnames and/or CIDR ranges that bypass SSRF private-IP blocking |
Examples:

```bash
# Single hostname
TRUSTED_INTERNAL_HOSTS=healthcheck.internal.local

# Hostname + subnet
TRUSTED_INTERNAL_HOSTS=healthcheck.internal.local,192.168.5.0/24

# Multiple entries
TRUSTED_INTERNAL_HOSTS=healthcheck.internal.local,ntfy.internal.local,10.0.0.0/8
```

Notes:
- Hostname matching is case-insensitive
- CIDR ranges apply to resolved IPs regardless of hostname
- A bare IP (e.g., `10.0.0.5`) is treated as a `/32` single-host range
- Must be set in both `pixelprobe` and `celery-worker` containers (or via a shared `.env`)
- Public IPs are always allowed; this setting only affects private/reserved ranges
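The matching rules above can be sketched with the standard-library `ipaddress` module (`is_trusted` is an illustrative helper, not PixelProbe's actual function):

```python
import ipaddress

def is_trusted(host, resolved_ip, trusted):
    """Check a hostname and its resolved IP against a TRUSTED_INTERNAL_HOSTS-style string."""
    for entry in filter(None, (e.strip() for e in trusted.split(","))):
        try:
            # CIDR range, or a bare IP (which ip_network treats as a /32)
            net = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Not an IP/CIDR: compare as a hostname, case-insensitively
            if host.lower() == entry.lower():
                return True
        else:
            if ipaddress.ip_address(resolved_ip) in net:
                return True
    return False

print(is_trusted("ntfy.internal.local", "192.168.5.10",
                 "healthcheck.internal.local,192.168.5.0/24"))  # True (CIDR match)
print(is_trusted("NTFY.internal.local", "10.1.2.3",
                 "ntfy.internal.local"))                        # True (hostname match)
```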
Generate a secure secret key:
```bash
python3 -c "import secrets; print(secrets.token_hex(32))"
```

Copy the output to `SECRET_KEY` in `.env`.
Session settings are configured in Flask:
```python
# Session timeout (default: 30 days)
PERMANENT_SESSION_LIFETIME = timedelta(days=30)

# Session cookie settings
SESSION_COOKIE_SECURE = True     # HTTPS only (production)
SESSION_COOKIE_HTTPONLY = True   # Prevent JavaScript access
SESSION_COOKIE_SAMESITE = 'Lax'  # CSRF protection
```

Users can generate API tokens via:
- Web UI: Account > API Tokens
- API: `POST /api/auth/tokens`
Tokens support optional expiration dates.
| Library Size | Minimum CPUs | Recommended CPUs |
|---|---|---|
| < 10K files | 2 cores | 4 cores |
| 10K-100K files | 4 cores | 8 cores |
| 100K-1M files | 8 cores | 16 cores |
| 1M+ files | 16 cores | 32 cores |
| Library Size | Minimum RAM | Recommended RAM |
|---|---|---|
| < 10K files | 2 GB | 4 GB |
| 10K-100K files | 4 GB | 8 GB |
| 100K-1M files | 8 GB | 16 GB |
| 1M+ files | 16 GB | 32 GB |
- Database: 100 MB per 10,000 files (estimated)
- Logs: 1-10 GB (depending on retention)
- Reports: 100 MB per 1,000 reports
- Temp Files: 1-2 GB during scans
- Bandwidth: Minimal (local file access)
- Latency: Low latency to database required
- Ports: 5000 (web), 5432 (postgres), 6379 (redis)
- Start Conservative: Begin with default settings and increase gradually
- Monitor Resources: Use `docker stats` to monitor CPU/memory usage
- Test Changes: Test configuration changes on a subset of files first
- Document Settings: Keep notes on what works for your environment
- Regular Backups: Backup database and configuration regularly
- Security First: Use strong passwords and keep SECRET_KEY secure
- Update Regularly: Pull latest images for bug fixes and improvements
```bash
# Example: home media server
MAX_WORKERS=8
CELERY_CONCURRENCY=3
BATCH_SIZE=100
REDIS_MAX_MEMORY=1gb
POSTGRES_PASSWORD=strong-password-here
SCAN_PATHS=/movies,/tv

# Example: large archive
MAX_WORKERS=20
CELERY_CONCURRENCY=6
BATCH_SIZE=200
REDIS_MAX_MEMORY=4gb
POSTGRES_PASSWORD=very-strong-password
SCAN_PATHS=/archive/video,/archive/images
OUTPUT_ROTATION_ENABLED=true
MAX_OUTPUT_SIZE=50000

# Example: enterprise deployment
MAX_WORKERS=24
CELERY_CONCURRENCY=8
BATCH_SIZE=200
REDIS_MAX_MEMORY=8gb
POSTGRES_PASSWORD=enterprise-strength-password
SCAN_PATHS=/storage/media1,/storage/media2,/storage/media3
ENABLE_MONITORING=true
```

See TROUBLESHOOTING.md for solutions to common configuration issues.