Watchdog — JVM-level Auto-Restart & Health Supervisor
A lightweight, production-hardened daemon that supervises SCADA‑LTS JVM processes, performs health checks, and executes controlled restarts or failover actions when required.
Key capabilities
- Periodic JVM responsiveness checks (HTTP, JMX, custom probes).
- Latency & thread‑stall detection with configurable thresholds.
- Graceful restart sequence with pre/post hooks and safe rollback.
- Integration with HA/failover orchestrators and load‑balancers.
- Alerting via Prometheus/Grafana, syslog, and an SMS gateway.
Quick facts
Package: Watchdog vX.Y
Compatibility: SCADA‑LTS approved binaries only
Download: Get package
How it works
The Watchdog runs as an OS service and executes a configurable set of probes (HTTP health endpoints, JMX checks, thread dump heuristics). When a probe fails or a latency rule is breached, the daemon works through the following escalating recovery sequence:
- Attempt soft recovery steps (clear caches, trigger GC, restart a worker process).
- If unresolved, trigger a coordinated JVM restart with graceful shutdown hooks.
- If the restart fails, mark the instance unhealthy and inform the load‑balancer / HA orchestrator.
- Execute post‑restart validation and, on failure, roll back to the previous known-good binary if one is available.
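The escalation order above can be illustrated with a short sketch. This is not the Watchdog's actual implementation: the `Actions` callables, the `Outcome` names, and the choice to re-validate after soft recovery and to mark the instance unhealthy when a rollback does not help are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    RECOVERED_SOFT = "recovered_soft"
    RESTARTED = "restarted"
    ROLLED_BACK = "rolled_back"
    UNHEALTHY = "unhealthy"


@dataclass
class Actions:
    """Hypothetical hooks; each bool-returning callable reports success."""
    soft_recover: Callable[[], bool]    # e.g. clear caches, trigger GC
    restart_jvm: Callable[[], bool]     # coordinated restart with shutdown hooks
    validate: Callable[[], bool]        # re-run the health probes
    mark_unhealthy: Callable[[], None]  # notify load balancer / HA orchestrator
    rollback: Optional[Callable[[], bool]] = None  # None: no known-good binary


def escalate(actions: Actions) -> Outcome:
    # Step 1: soft recovery, accepted only if the probes pass afterwards.
    if actions.soft_recover() and actions.validate():
        return Outcome.RECOVERED_SOFT

    # Step 2: coordinated JVM restart.
    if not actions.restart_jvm():
        # Step 3: the restart itself failed.
        actions.mark_unhealthy()
        return Outcome.UNHEALTHY

    # Step 4: post-restart validation, with rollback on failure if possible.
    if actions.validate():
        return Outcome.RESTARTED
    if actions.rollback is not None and actions.rollback() and actions.validate():
        return Outcome.ROLLED_BACK

    # Assumption: an instance that cannot be validated is taken out of rotation.
    actions.mark_unhealthy()
    return Outcome.UNHEALTHY
```

Passing the steps in as callables keeps the ordering logic separate from the site-specific scripts and hooks, so the sequence can be exercised in a test without touching a real JVM.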
Configuration highlights
- Probe types: HTTP, TCP, JMX, script hook, custom plugin.
- Threshold profiles: staging / production / critical (separate cooldown times).
- Retry and backoff policies configurable per probe.
- Pre/post hooks: run maintenance scripts, notify teams, trigger snapshots.
- Secure credentials store for JMX and API probes (integrates with secret managers).
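As a rough illustration of per-probe retry and backoff settings, the sketch below models a probe definition with an exponential backoff policy. The class names, field names, and default values are hypothetical and do not reflect the Watchdog's actual configuration schema.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass(frozen=True)
class BackoffPolicy:
    """Hypothetical retry policy: exponential backoff with a ceiling."""
    max_attempts: int = 3
    base_delay_s: float = 1.0
    factor: float = 2.0
    max_delay_s: float = 30.0

    def delays(self) -> Iterator[float]:
        """Yield the wait before each retry (nothing before the first attempt)."""
        delay = self.base_delay_s
        for _ in range(self.max_attempts - 1):
            yield min(delay, self.max_delay_s)
            delay *= self.factor


@dataclass(frozen=True)
class ProbeConfig:
    """Hypothetical probe definition; kind is one of the listed probe types."""
    name: str
    kind: str          # "http", "tcp", "jmx", "script" or "plugin"
    timeout_s: float
    backoff: BackoffPolicy = field(default_factory=BackoffPolicy)


def run_with_retries(check: Callable[[], bool],
                     policy: BackoffPolicy,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run a probe check, retrying with backoff until it passes or attempts run out."""
    if check():
        return True
    for delay in policy.delays():
        sleep(delay)
        if check():
            return True
    return False
```

A probe would be declared as, for example, `ProbeConfig("ui-health", "http", timeout_s=2.0)`, and a threshold profile could then swap in a stricter or more lenient `BackoffPolicy`. The `sleep` parameter is injectable so the retry schedule can be checked without real waiting.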
Alerts & Integrations
Watchdog sends metrics and events to Prometheus, pushes alerts to Alertmanager, supports syslog/rsyslog, and can call our SMS gateway for critical notifications. Webhook integrations for ticketing (Jira, ServiceNow) are available.
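For readers unfamiliar with how health data reaches Prometheus, the sketch below renders a few gauges and a counter in the Prometheus text exposition format using only the standard library. The metric names (`watchdog_jvm_healthy`, `watchdog_probe_latency_seconds`, `watchdog_restarts_total`) and the `instance` label value are invented for illustration and are not the names the Watchdog actually exports.

```python
def escape_label(value: str) -> str:
    """Escape a label value as required by the text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(instance: str, healthy: bool,
                   latency_s: float, restarts: int) -> str:
    """Render hypothetical Watchdog metrics as Prometheus exposition text."""
    labels = f'instance="{escape_label(instance)}"'
    lines = [
        "# HELP watchdog_jvm_healthy 1 if the last probe cycle passed, else 0.",
        "# TYPE watchdog_jvm_healthy gauge",
        f"watchdog_jvm_healthy{{{labels}}} {int(healthy)}",
        "# HELP watchdog_probe_latency_seconds Latency of the last probe.",
        "# TYPE watchdog_probe_latency_seconds gauge",
        f"watchdog_probe_latency_seconds{{{labels}}} {latency_s}",
        "# HELP watchdog_restarts_total JVM restarts triggered so far.",
        "# TYPE watchdog_restarts_total counter",
        f"watchdog_restarts_total{{{labels}}} {restarts}",
    ]
    return "\n".join(lines) + "\n"
```

Alert rules evaluated by Prometheus and routed through Alertmanager could then fire on conditions such as the health gauge dropping to 0 or the restart counter increasing too quickly.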
Deployment & Support
We provide packaged installers for supported Linux distributions, containerized images, and configuration templates. Installation, tuning and runbooks are part of our service plans.
Abil’I.T. — Watchdog
Contact: info@abilit.eu
