A comprehensive Rust-based infrastructure management system for 20+ blockchain nodes with advanced health monitoring, automated maintenance, network-wide snapshot management, state sync orchestration, and ETL service monitoring through a centralized web interface.
✅ 95 tests passing | 🚀 Production-ready | 🐳 Docker support
- Multi-Service Health Monitoring: Real-time monitoring of blockchain nodes, Hermes relayers, and ETL services with RPC-based status checks
- Automated Pruning: Seamless integration with the cosmos-pruner tool for efficient blockchain data management
- Network-Based Snapshot System: Create, restore, and manage network-wide LZ4-compressed blockchain snapshots with cross-node recovery and validator state preservation
- State Sync Orchestration: Automated state sync execution for rapid node synchronization from trusted snapshots
- ETL Service Monitoring: HTTP health checks for custom ETL services with configurable endpoints and timeout handling
- Log Monitoring: Pattern-based log monitoring with configurable alerts and context extraction
- Hermes Management: Smart relayer restarts with RPC-based dependency validation
- Web Interface: RESTful API with comprehensive endpoints for all operations
- Centralized Alert System: Progressive rate-limited webhook notifications with recovery detection
- Configuration Hot-Reload: Multi-server support with automatic configuration reloading
- Smart Path Derivation: Automatic path configuration based on server-level defaults
- Parallel Operations: Execute maintenance across multiple servers simultaneously
- Dependency Validation: Hermes restarts only when dependent nodes are healthy and synced
- Scheduled Maintenance: Cron-based automation with timezone awareness for pruning, snapshots, and service restarts
- Real-time Monitoring: Continuous health checks with SQLite persistence and historical tracking
- Batch Operations: Execute operations across multiple nodes efficiently with built-in safety checks
- Maintenance Tracking: Track operation status with duration estimates, stuck operation detection, and automatic cleanup
- Cross-Node Recovery: Network-based snapshots allow any node on the same network to restore from shared snapshots
- Validator State Preservation: Auto-restore preserves current validator signing state to prevent double-signing
- Scheduled Snapshots: Automatic network snapshot creation with configurable retention policies
- Emergency Cleanup: Force cleanup of stuck operations and maintenance windows
- Progressive Alerting: Rate-limited alerts (0, 6, 12, 24, 48 hours) prevent notification spam
- Auto-Restore Detection: Automatic restoration from snapshots when corruption patterns detected in logs
- Automated State Sync: Orchestrated state sync execution with RPC parameter fetching
- Fail-Fast Design: Immediate failure on any error with detailed logging
- WASM Cache Management: Smart cleanup of WASM cache during state sync
- Config Management: Automatic state sync enablement/disablement in config.toml
- Timeout Handling: Configurable sync timeout with automatic failure detection
- Multi-Chain Support: Automatic daemon binary detection for different Cosmos chains
- Network-Based Naming: Snapshots are named by network (e.g., pirin-1_20250101_120000), enabling cross-node recovery
- Validator Safety: Current validator state is preserved during restore to prevent consensus violations
- LZ4 Compression: Fast background compression with good ratios
- Automatic Backups: Scheduled network snapshot creation
- Retention Management: Configurable cleanup of old network snapshots
- Cross-Node Recovery: Any node on the same network can restore from the same snapshot
- Long Operation Support: 24-hour timeout for large snapshots
- Process Monitoring: Detect stuck pruning processes and silent failures
- Log Pattern Detection: Monitor logs for specific error patterns with context extraction
- Rate-Limited Alerting: Prevent alarm spam with progressive alert scheduling (3 checks, then 6h, 6h, 12h, 24h intervals)
- Maintenance Windows: Visual indication when nodes are undergoing maintenance
- Health Recovery Notifications: Automatic notifications when nodes recover from failures
- Auto-Restore Triggers: Automatic restoration from snapshots when corruption patterns detected
- ETL Health Checks: HTTP-based health monitoring for custom ETL services
- Catching Up Detection: Clear distinction between "Synced" and "Catching Up" states
- Centralized AlertService: Single webhook configuration for all alerts across the system
- Progressive Rate Limiting: Smart alert scheduling to prevent notification fatigue
- Recovery Detection: Automatic recovery notifications when services return to healthy state
- Alert Types: Node health, auto-restore, snapshot operations, Hermes restarts, log patterns, maintenance
- Severity Levels: Critical, Warning, Info, Recovery
- Webhook Testing: Startup webhook connectivity validation
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Web Interface │ │ Health Monitor │ │ Maintenance │
│ (Axum + API) │ │ (RPC Polling) │ │ Scheduler │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────────┼───────────────────────┘
│
┌───────────────────────────────────────────────────────────┐
│ Core Engine │
│ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ ┌──────┐│
│ │Config Mgmt │ │ Database │ │HTTP Agent│ │Alert ││
│ │(Hot Reload) │ │ (SQLite) │ │Manager │ │System││
│ │ │ │ │ │ │ │ ││
│ └─────────────┘ └─────────────┘ └──────────┘ └──────┘│
│ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │
│ │Maintenance │ │ Network │ │ State │ │
│ │Tracker │ │ Snapshot │ │ Sync │ │
│ │ │ │ Manager │ │ Manager │ │
│ └─────────────┘ └─────────────┘ └──────────┘ │
└───────────────────────────────────────────────────────────┘
│
┌───────────────────────────────────────────────────────────┐
│ Blockchain Infrastructure │
│ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │
│ │ Cosmos │ │ Hermes │ │ ETL │ │
│ │ Nodes │ │ Relayers │ │ Services │ │
│ └─────────────┘ └─────────────┘ └──────────┘ │
└───────────────────────────────────────────────────────────┘
│
┌───────────────────────────────────────────────────────────┐
│ HTTP Agents (per server - port 8745) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │Pruning │ │Snapshots │ │Restore │ │State Sync│ │
│ │ │ │ │ │ │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │Systemctl │ │Commands │ │Logs │ │
│ │ │ │ │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└───────────────────────────────────────────────────────────┘
- Rust 1.70+
- cosmos-pruner tool installed on target servers
- HTTP agent access to all blockchain servers
- SQLite3
- lz4 compression tool installed on target servers (for snapshots)
# Clone repository
git clone https://github.com/nolus-protocol/nodes-manager.git
cd nodes-manager
# Build release version
cargo build --release
# Create required directories
mkdir -p data static config
# Set up configuration files
# Create your configuration files based on the examples below
# Ensure proper API key setup for HTTP agents
export AGENT_API_KEY="your-secure-api-key-here"
On each blockchain server, install and run the agent:
# Copy agent binary to server
scp target/release/agent user@server:/usr/local/bin/
# Set up systemd service (optional)
# Create /etc/systemd/system/blockchain-agent.service
# Set API key and start
export AGENT_API_KEY="your-secure-api-key-here"
/usr/local/bin/agent
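A minimal systemd unit for the agent might look like the sketch below; the unit name, service user, and environment-file location are assumptions, so adapt them to your environment:
# Hypothetical unit - adjust user, paths, and key handling to your setup
sudo tee /etc/systemd/system/blockchain-agent.service > /dev/null <<'EOF'
[Unit]
Description=Nodes Manager HTTP Agent
After=network-online.target
[Service]
# Keep the real key in a root-only environment file (AGENT_API_KEY=...)
EnvironmentFile=/etc/blockchain-agent.env
ExecStart=/usr/local/bin/agent
Restart=on-failure
RestartSec=5
User=blockchain
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now blockchain-agent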
Create config/main.toml with:
host = "0.0.0.0"
port = 8095
check_interval_seconds = 90
rpc_timeout_seconds = 10
# Alert webhook (required for alerts)
alarm_webhook_url = "https://your-webhook-endpoint.com/webhook/node-alarm"
# Hermes configuration
hermes_min_uptime_minutes = 5
# Auto-restore trigger words (optional)
auto_restore_trigger_words = [
"AppHash",
"wrong Block.Header.AppHash",
"database corruption",
"state sync failed",
"panic:"
]
# Log monitoring context (lines before/after match)
log_monitoring_context_lines = 2
Create files like config/enterprise.toml with server-level defaults:
[server]
host = "192.168.11.206"
agent_port = 8745
api_key = "your-secure-api-key-here"
request_timeout_seconds = 300
max_concurrent_requests = 5
# Smart defaults - automatically derive paths for all nodes on this server
[defaults]
base_deploy_path = "/opt/deploy" # Auto-derives pruning_deploy_path
base_log_path = "/var/log" # Auto-derives log_path
base_backup_path = "/backup/snapshots" # Auto-derives snapshot_backup_path
# Minimal node configuration - paths are auto-derived!
[nodes.enterprise-osmosis]
rpc_url = "http://192.168.11.206:26657"
network = "osmosis-1"
server_host = "enterprise"
enabled = true
# Pruning configuration
pruning_enabled = true
service_name = "osmosis" # MANDATORY - systemd service name, used for path auto-derivation
pruning_schedule = "0 0 6 * * 2" # Tuesdays at 6AM UTC
pruning_keep_blocks = 8000
pruning_keep_versions = 8000
# pruning_deploy_path auto-derived: /opt/deploy/osmosis/data
# Snapshot configuration
snapshots_enabled = true
# snapshot_backup_path auto-derived: /backup/snapshots
# snapshot_deploy_path auto-derived: /opt/deploy/osmosis
auto_restore_enabled = true
# Scheduled snapshots (optional)
snapshot_schedule = "0 0 2 * * 0" # Sundays at 2AM UTC
snapshot_retention_count = 7
# Log configuration
# log_path auto-derived: /var/log/osmosis
truncate_logs_enabled = false
# Per-node log monitoring (optional)
log_monitoring_enabled = true
log_monitoring_patterns = [
"Possibly no price is available!",
"failed to lock fees to pay for"
]
# State sync configuration (optional)
[nodes.enterprise-neutron]
rpc_url = "http://192.168.11.206:26957"
network = "neutron-1"
server_host = "enterprise"
enabled = true
pruning_enabled = true
snapshots_enabled = true
# State sync settings
state_sync_enabled = true
state_sync_schedule = "0 0 3 * * 6" # Saturdays at 3AM UTC
state_sync_rpc_sources = [
"https://rpc1.neutron.quokkastake.io:443",
"https://rpc2.neutron.quokkastake.io:443"
]
state_sync_trust_height_offset = 2000
state_sync_max_sync_timeout_seconds = 600
# Hermes relayer configuration
[hermes.relay-enterprise]
server_host = "enterprise"
service_name = "hermes"
log_path = "/var/log/hermes"
restart_schedule = "0 0 16 * * 2" # Tuesdays at 4PM UTC
dependent_nodes = ["enterprise-osmosis", "enterprise-neutron"]
truncate_logs_enabled = false
# ETL service monitoring
[etl.osmosis-etl]
server_host = "enterprise"
host = "192.168.11.206"
port = 8080
endpoint = "/health"
enabled = true
timeout_seconds = 10
description = "Osmosis ETL indexer service"
# Start the manager
./target/release/manager
# Or with custom config location
CONFIG_DIR=./custom-config ./target/release/manager
The manager will:
- ✅ Load configuration from config/*.toml files
- ✅ Initialize SQLite database
- ✅ Test alert webhook connectivity
- ✅ Clean up stuck operations from previous runs
- ✅ Start health monitoring for nodes, Hermes, and ETL services
- ✅ Start scheduled maintenance tasks
- ✅ Launch web API on configured port (default 8095)
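Once the manager is running, a quick smoke test against the health API (sketch below, assuming the default host and port from config/main.toml) confirms the service is reachable:
# Verify the API is reachable on the default port
curl -s http://localhost:8095/api/health/nodes
# Include disabled nodes as well
curl -s "http://localhost:8095/api/health/nodes?include_disabled=true"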
# Get all blockchain nodes health
GET /api/health/nodes
GET /api/health/nodes?include_disabled=true
# Get specific node health
GET /api/health/nodes/{node_name}
# Get all Hermes instances health
GET /api/health/hermes
# Get specific Hermes health
GET /api/health/hermes/{hermes_name}
# Get all ETL services health
GET /api/health/etl
GET /api/health/etl?include_disabled=true
# Get specific ETL service health
GET /api/health/etl/{service_name}
# Force refresh ETL health
POST /api/health/etl/refresh
# Get all node configurations
GET /api/config/nodes
# Get all Hermes configurations
GET /api/config/hermes
# Get all ETL configurations
GET /api/config/etl
# Restart node (returns immediately)
POST /api/maintenance/nodes/{node_name}/restart
# Prune node (returns immediately)
POST /api/maintenance/nodes/{node_name}/prune
# Restart Hermes (returns immediately)
POST /api/maintenance/hermes/{hermes_name}/restart
# Create network snapshot (returns immediately)
POST /api/snapshots/{node_name}/create
# List all network snapshots
GET /api/snapshots/{node_name}/list
# Get snapshot statistics
GET /api/snapshots/{node_name}/stats
# Delete specific snapshot (returns immediately)
DELETE /api/snapshots/{node_name}/{filename}
# Cleanup old snapshots (returns immediately)
POST /api/snapshots/{node_name}/cleanup?retention_count=5
# Restore from latest snapshot (returns immediately)
POST /api/snapshots/{node_name}/restore
# Check auto-restore triggers
GET /api/snapshots/{node_name}/check-triggers
# Get auto-restore status
GET /api/snapshots/{node_name}/auto-restore-status
# Get all active operations
GET /api/operations/active
# Check specific target status
GET /api/operations/{target_name}/status
# Cancel operation
POST /api/operations/{target_name}/cancel
# Emergency cleanup (remove stuck operations)
POST /api/operations/emergency-cleanup?max_hours=12
# Get scheduled maintenance operations
GET /api/maintenance/schedule
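A typical asynchronous workflow combines these endpoints: kick off an operation, then poll its status. The node name below is illustrative:
# Trigger a prune (returns immediately) and poll until it completes
curl -X POST http://localhost:8095/api/maintenance/nodes/enterprise-osmosis/prune
curl http://localhost:8095/api/operations/enterprise-osmosis/status
# Or list everything currently in flight
curl http://localhost:8095/api/operations/active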
The system uses the cosmos-pruner tool with extended timeouts for large datasets:
cosmos-pruner prune /opt/deploy/osmosis/data --blocks=8000 --versions=8000
Process:
- Start maintenance tracking (5-hour timeout)
- Stop blockchain service via HTTP agent
- Optional: Truncate logs if enabled
- Execute cosmos-pruner with configured parameters
- Start blockchain service via HTTP agent
- Verify service health
- Send completion notification via AlertService
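For reference, the automated flow is roughly equivalent to the manual sequence below; the service name, data path, and RPC address are placeholders taken from the example configuration:
# Rough manual equivalent of the automated pruning flow (illustrative values)
sudo systemctl stop osmosis
cosmos-pruner prune /opt/deploy/osmosis/data --blocks=8000 --versions=8000
sudo systemctl start osmosis
# Confirm the node is answering RPC again before treating the run as successful
curl -s http://192.168.11.206:26657/status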
Features:
- Network-Based Naming: Snapshots named by network (e.g., pirin-1_20250101_120000) for cross-node compatibility
- Cross-Node Recovery: Any node on the same network can restore from shared network snapshots
- Validator State Preservation: Current validator signing state is backed up and restored to prevent double-signing
- LZ4 Compression: Fast background compression with good ratios
- Automatic Backups: Scheduled network snapshot creation
- Retention Management: Configurable cleanup of old network snapshots
- Long Operation Support: 24-hour timeout for large snapshots
Network Snapshot Process:
- Start maintenance tracking (24-hour timeout)
- Stop blockchain service via HTTP agent
- Create network-named directory: {network}_{timestamp}
- Copy data and wasm directories (excluding validator state)
- Remove any validator state files from snapshot
- Start blockchain service via HTTP agent
- Apply network retention policy if configured
- Background LZ4 compression
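Conceptually, this step performs something like the sketch below; the service name and paths are placeholders, and the exact layout the manager produces may differ:
# Conceptual network snapshot sketch (placeholder service name and paths)
SNAP=/backup/snapshots/pirin-1_$(date -u +%Y%m%d_%H%M%S)
sudo systemctl stop nolus
mkdir -p "$SNAP"
cp -a /opt/deploy/nolus/data /opt/deploy/nolus/wasm "$SNAP"/
# Shared snapshots must never contain validator signing state
rm -f "$SNAP/data/priv_validator_state.json"
sudo systemctl start nolus
# Compress in the background with lz4
tar -cf - -C /backup/snapshots "$(basename "$SNAP")" | lz4 > "$SNAP.tar.lz4" &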
Cross-Node Restore Process:
- Start maintenance tracking (24-hour timeout)
- Stop blockchain service via HTTP agent
- Backup current validator state (critical for validator safety)
- Delete existing data and wasm directories
- Copy data and wasm from network snapshot
- Restore backed up validator state (prevents double-signing)
- Set proper permissions
- Start blockchain service via HTTP agent
- Verify service health
- Send completion notification
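The validator-safety step is the important part: the node's own priv_validator_state.json is saved before the data directory is replaced and put back afterwards. A conceptual sketch, with placeholder names and paths:
# Conceptual cross-node restore sketch (placeholder names and paths)
DEPLOY=/opt/deploy/nolus
SNAP=/backup/snapshots/pirin-1_20250101_120000
sudo systemctl stop nolus
# Preserve this node's own signing state before touching data/
cp "$DEPLOY/data/priv_validator_state.json" /tmp/priv_validator_state.json
rm -rf "$DEPLOY/data" "$DEPLOY/wasm"
cp -a "$SNAP/data" "$SNAP/wasm" "$DEPLOY"/
# Restore the node's own validator state - never the one from another node
cp /tmp/priv_validator_state.json "$DEPLOY/data/priv_validator_state.json"
sudo chown -R nolus:nolus "$DEPLOY"
sudo systemctl start nolus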
Auto-Restore System:
- Monitors node logs for trigger words (configurable patterns)
- Automatically restores from latest network snapshot when corruption detected
- Preserves current validator state during auto-restore
- Prevents infinite loops with 2-hour cooldown between attempts
- Sends critical alerts if auto-restore fails
Features:
- Automated RPC Fetching: Automatically fetches trust height and hash from configured RPC sources
- Fail-Fast Design: Immediate failure on any error with comprehensive logging
- Multi-Chain Support: Automatic daemon binary detection (nolusd, osmosisd, neutrond, etc.)
- WASM Cache Management: Smart cleanup of WASM cache during state sync
- Config Management: Automatic state sync parameter injection and cleanup
- Timeout Handling: Configurable sync timeout with status monitoring
State Sync Process:
- Fetch state sync parameters from RPC sources
- Stop blockchain service
- Truncate logs (if configured)
- Update config.toml with state sync parameters
- Execute unsafe-reset-all to wipe chain state
- Clean WASM cache (preserve blobs, delete cache only)
- Start blockchain service
- Wait for state sync completion with timeout
- Disable state sync in config
- Restart service with clean config
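Fetching the trust parameters manually from a CometBFT/Tendermint RPC endpoint looks roughly like this; the RPC URL and offset mirror the example configuration above, and jq is assumed to be available:
# Derive state sync trust parameters from an RPC source (illustrative URL)
RPC=https://rpc1.neutron.quokkastake.io:443
LATEST=$(curl -s "$RPC/block" | jq -r .result.block.header.height)
TRUST_HEIGHT=$((LATEST - 2000))
TRUST_HASH=$(curl -s "$RPC/block?height=$TRUST_HEIGHT" | jq -r .result.block_id.hash)
echo "trust_height=$TRUST_HEIGHT trust_hash=$TRUST_HASH"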
Features:
- HTTP Health Checks: Configurable endpoints for ETL service health validation
- Response Time Tracking: Monitor response times and status codes
- Integration with AlertService: Same progressive alerting as blockchain nodes
- Flexible Configuration: Per-service timeout, endpoint, and description
Configuration:
[etl.my-etl-service]
server_host = "enterprise"
host = "192.168.11.206"
port = 8080
endpoint = "/health"
enabled = true
timeout_seconds = 10
description = "Custom ETL service"
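The check the manager performs can be reproduced by hand with curl, using the host, port, endpoint, and timeout from the configuration above:
# Manually exercise the ETL health check (values from the example configuration)
curl -s -m 10 -o /dev/null -w "%{http_code} %{time_total}s\n" http://192.168.11.206:8080/health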
Progressive Rate Limiting:
- First Alert: After 3 consecutive unhealthy checks
- Second Alert: 6 hours after first alert
- Third Alert: 6 hours after second alert (12 hours total)
- Fourth Alert: 12 hours after third alert (24 hours total)
- Fifth+ Alerts: Every 24 hours thereafter
Alert Types:
- NodeHealth: Health state changes for blockchain nodes and ETL services
- AutoRestore: Automatic restoration attempts and results
- Snapshot: Snapshot creation and restoration operations
- Hermes: Hermes relayer restart operations
- LogPattern: Log pattern detection alerts
- Maintenance: Maintenance operation notifications
Severity Levels:
- Critical: Node failures, auto-restore failures
- Warning: Degraded performance, approaching limits
- Info: Maintenance operations, scheduled tasks
- Recovery: Service recovery from failure state
Webhook Payload:
{
"timestamp": "2025-01-15T10:30:00Z",
"alert_type": "NodeHealth",
"severity": "Critical",
"node_name": "enterprise-osmosis",
"message": "Node health check failed: RPC timeout",
"server_host": "enterprise",
"details": {
"block_height": 12345678,
"error": "connection timeout"
}
}
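To verify that your receiving endpoint handles this format, the sample payload can be posted directly with curl; the webhook URL below is the placeholder from the example configuration:
# Post the sample payload to your webhook endpoint (placeholder URL)
curl -X POST https://your-webhook-endpoint.com/webhook/node-alarm \
  -H "Content-Type: application/json" \
  -d '{"timestamp":"2025-01-15T10:30:00Z","alert_type":"NodeHealth","severity":"Critical","node_name":"enterprise-osmosis","message":"Node health check failed: RPC timeout","server_host":"enterprise","details":{"block_height":12345678,"error":"connection timeout"}}'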
Hermes relayers restart only when ALL dependent nodes are:
- Healthy (RPC status check passes)
- Synced (not catching up)
- Recent (health data less than 5 minutes old)
- Minimum Uptime (configurable minimum uptime before restart)
Real-time Status:
- Track all operations with start time and duration estimates
- Visual indication in web interface when nodes are in maintenance
- Automatic cleanup of expired maintenance windows (48-hour maximum)
- Detection of stuck operations with process monitoring
- Startup cleanup of stuck operations from previous runs
Emergency Features:
- Force kill stuck pruning processes
- Emergency clear all maintenance windows
- Manual maintenance window cleanup per node
- Overdue operation detection (3x estimated duration)
Operation Timeouts:
- Pruning: 5 hours
- Snapshot creation: 24 hours
- Snapshot restore: 24 hours
- State sync: 24 hours
- Node restart: 30 minutes
- Hermes restart: 15 minutes
Direct Communication Model:
- Each operation uses HTTP POST to dedicated agents
- Automatic operation cleanup after completion
- No persistent connection pooling (prevents conflicts)
- Configurable timeouts per server
- Parallel execution across different servers
- Sequential execution on same server for safety
- Job-based async execution with status polling
Server-Level Defaults: Instead of repeating paths for every node, define them once at the server level:
[defaults]
base_deploy_path = "/opt/deploy"
base_log_path = "/var/log"
base_backup_path = "/backup/snapshots"
Automatic Derivation:
If you specify service_name (MANDATORY) and base paths in [defaults], paths will be auto-derived:
- pruning_deploy_path → {base_deploy_path}/{service_name}/data
- snapshot_deploy_path → {base_deploy_path}/{service_name}
- log_path → {base_log_path}/{service_name}
- snapshot_backup_path → {base_backup_path}
Example:
[defaults]
base_deploy_path = "/opt/deploy"
base_log_path = "/var/log"
base_backup_path = "/backup/snapshots"
[nodes.osmosis-1]
service_name = "osmosis" # MANDATORY
# ... rest of config
Derived paths:
- Deploy path: /opt/deploy/osmosis/data
- Snapshot path: /opt/deploy/osmosis
- Log path: /var/log/osmosis
- Backup path: /backup/snapshots
Important: All cron schedules run in the timezone where the Manager is deployed.
Time Conversion Example:
- Manager timezone: UTC+3 (EEST)
- Local time: 10:00 AM EEST
- Config schedule: "0 0 7 * * 2" (7:00 AM UTC)
- Result: Runs at 10:00 AM local time
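To check what local time a UTC cron expression maps to on the manager host, a quick look at the host timezone helps (timedatectl assumes a systemd-based host):
# Compare local time with UTC on the manager host
timedatectl | grep "Time zone"
date && date -u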
# Create snapshot on node1 - creates network-based snapshot
curl -X POST http://localhost:8095/api/snapshots/pirin-node-1/create
# Result: Creates snapshot named "pirin-1_20250115_120000"
# This snapshot can be used by ANY node on pirin-1 network
# Node7 can restore from snapshot created by Node1
curl -X POST http://localhost:8095/api/snapshots/pirin-node-7/restore
# Result: Restores from latest pirin-1 network snapshot
# Preserves pirin-node-7's current validator state
# During restore, the system:
# 1. Backs up current validator state from pirin-node-7
# 2. Restores blockchain data from network snapshot
# 3. Restores pirin-node-7's validator state (not from snapshot)
# 4. Node resumes with correct signing state - no double-signing risk
# Overall system status with network snapshot info
curl http://localhost:8095/api/health/nodes
# Active operations
curl http://localhost:8095/api/operations/active
# Check specific node status
curl http://localhost:8095/api/operations/pirin-node-1/status
# Network snapshot statistics
curl http://localhost:8095/api/snapshots/pirin-node-1/stats
The SQLite database (data/nodes.db) contains:
Tables:
- health_records: Node health history with block height and sync status
- maintenance_operations: All maintenance operations with status and timing
Startup Cleanup:
- Automatically cleans up stuck operations (> 1 hour in running/started state)
- Marks them as 'failed' with cleanup message
- Logs cleaned operations for audit trail
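The database can also be inspected directly with the sqlite3 CLI; column names are not documented here, so start from the schema itself:
# Inspect the schema and the most recent maintenance operations
sqlite3 data/nodes.db ".schema"
sqlite3 data/nodes.db "SELECT * FROM maintenance_operations ORDER BY rowid DESC LIMIT 10;"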
- Health checks run every 90 seconds (configurable)
- ETL health checks run every 90 seconds (same as nodes)
- Failed operations are logged with detailed error messages
- HTTP agent connection failures automatically trigger retries
- Maintenance windows automatically expire after 48 hours
- Network snapshot operations support up to 24-hour timeouts
- Auto-restore attempts have 2-hour cooldown periods
- Cross-node recovery capability eliminates single points of failure
- API key authentication for all HTTP agent communications
- Config files contain sensitive information (use appropriate permissions)
- Use firewalls to restrict API access
- Monitor HTTP agent connection limits per server
- Regular security updates for all dependencies
- Network snapshot backup paths should be secured
- Validator state files are never included in shared snapshots
- Alert webhook URLs may contain sensitive endpoints
- Database: Regular backups of data/nodes.db
- Configuration: Backup config/*.toml files separately
- Network Snapshots: Configure separate backup storage for network snapshots
- Validator States: Individual validator states are preserved per node
- Log Rotation: Set up log rotation for maintenance logs
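A minimal backup sketch covering the database and configuration might look like this; the destination path is a placeholder:
# Back up the SQLite database and configuration (placeholder destination)
BACKUP_DIR=/backup/manager/$(date -u +%Y%m%d)
mkdir -p "$BACKUP_DIR"
sqlite3 data/nodes.db ".backup '$BACKUP_DIR/nodes.db'"
cp config/*.toml "$BACKUP_DIR"/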
- Database: ~10-50MB for typical deployments
- Logs: Variable based on retention policies
- Network Snapshots: Can be very large (GBs to TBs depending on blockchain data)
- Per-network Storage: One snapshot location serves all nodes on same network
- LZ4 Compression: Faster than gzip, good balance of speed/compression
- Network Snapshot Retention: Configure appropriate retention counts to manage disk usage
- Alert Rate Limiting: Progressive scheduling prevents webhook overload
- Maintenance Windows: Plan maintenance schedules to avoid conflicts
- Cross-Node Recovery: Reduces individual node storage requirements
- ETL Monitoring: Minimal overhead with configurable timeout
- Health checks: 20+ nodes in <5 seconds (parallel execution)
- ETL checks: Multiple services in <2 seconds (parallel HTTP requests)
- HTTP agent operations: Direct communication per operation
- Database: SQLite with indexed queries for fast access
- Memory usage: ~50-150MB typical operation (includes network snapshot management)
- Pruning operations: 10-300 minutes depending on node size (5-hour timeout)
- Network snapshot creation: 30-1440 minutes depending on data size (24-hour timeout)
- Cross-node restore: 30-60 minutes depending on snapshot size with validator state preservation
- State sync: 5-20 minutes depending on network and RPC performance
- Alert processing: <100ms per alert with webhook delivery
- LZ4 compression: 50-200 MB/s typical compression speed
[server]
host = "192.168.1.100"
agent_port = 8745
api_key = "secure-key-here"
[defaults]
base_deploy_path = "/opt/deploy"
base_log_path = "/var/log"
base_backup_path = "/backup/snapshots"
[nodes.mainnet-pirin]
rpc_url = "http://192.168.1.100:26657"
network = "pirin-1"
server_host = "mainnet-server"
enabled = true
# Pruning
pruning_enabled = true
pruning_schedule = "0 0 6 * * 2"
pruning_keep_blocks = 8000
pruning_keep_versions = 8000
# Snapshots
snapshots_enabled = true
auto_restore_enabled = true
snapshot_schedule = "0 0 2 * * 0"
snapshot_retention_count = 4
# State sync
state_sync_enabled = true
state_sync_schedule = "0 0 3 * * 6"
state_sync_rpc_sources = [
"https://rpc1.example.com",
"https://rpc2.example.com"
]
# Logs
log_monitoring_enabled = true
log_monitoring_patterns = [
"panic:",
"database corruption"
]
# All nodes on pirin-1 network share the same snapshot location
[nodes.pirin-node-1]
network = "pirin-1"
snapshots_enabled = true # Can create snapshots
[nodes.pirin-node-2]
network = "pirin-1"
snapshots_enabled = false # Only restores, doesn't create
[nodes.pirin-node-7]
network = "pirin-1"
auto_restore_enabled = true # Can auto-restore from network snapshots
[etl.osmosis-indexer]
server_host = "mainnet-server"
host = "192.168.1.100"
port = 8080
endpoint = "/health"
enabled = true
timeout_seconds = 10
description = "Osmosis blockchain indexer"
[etl.neutron-api]
server_host = "mainnet-server"
host = "192.168.1.100"
port = 3000
endpoint = "/api/health"
enabled = true
timeout_seconds = 5
description = "Neutron API service"
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT License - see LICENSE file for details.
For issues, questions, or contributions:
- GitHub Issues: Create an issue
- cosmos-pruner - Blockchain state pruning tool
- Hermes - IBC relayer
- Cosmos SDK - Blockchain application framework
- LZ4 - Fast compression algorithm