Monitoring as Code: Learn more about the Heartbeat Monitor Construct.
Monitor your scheduled jobs, background tasks, and automated processes with Checkly’s heartbeat monitoring. Unlike traditional active monitoring, heartbeat monitors work passively—they listen for regular “pings” from your tasks to ensure they’re running as expected.

What are Heartbeat Monitors?

Heartbeat monitors are passive monitoring checks that wait for your automated tasks to report their successful completion. When your scheduled job, backup script, or cron job finishes successfully, it sends a simple HTTP request (a “ping”) to Checkly to confirm it ran. If Checkly doesn’t receive a ping within the expected timeframe, it triggers alerts to notify you that something may have gone wrong. Heartbeat monitors are perfect for:
  • Backup jobs and data exports
  • ETL processes and data imports
  • Scheduled maintenance scripts
  • Newsletter and email campaigns
  • Database cleanup tasks
  • File processing workflows

How Heartbeat Monitoring Works

The heartbeat monitoring process is straightforward. Once created, your heartbeat monitor provides a unique ping URL. Your tasks should make an HTTP GET or POST request to this URL when they complete successfully.
  1. Create a monitor - Set up a heartbeat monitor with your expected ping frequency
  2. Get your ping URL - Checkly provides a unique URL for your task to ping
  3. Add the ping - Include a simple HTTP request in your task’s success path
  4. Monitor results - Checkly tracks pings and alerts you when they’re missing
Always add the heartbeat ping at the very end of your task, after all critical operations have completed successfully. This ensures you only get “success” pings when your job actually finished.
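As a minimal sketch of this ordering, the helper below runs a task and sends the ping only if the task returns without raising. The names `run_with_heartbeat`, `task`, and `ping` are illustrative assumptions, not part of any Checkly API; `ping` stands for whatever function sends the HTTP request to your ping URL.

```python
def run_with_heartbeat(task, ping):
    """Run `task`; call `ping` only if the task returned without raising."""
    result = task()  # Any exception propagates and skips the ping below.
    ping()           # Last step: reached only after the task has completed.
    return result


# Demonstration with stand-in callables instead of a real job and HTTP call.
sent = []
run_with_heartbeat(task=lambda: "backup done", ping=lambda: sent.append("ping"))
print(sent)  # ['ping']
```

Because the ping is the final statement, a job that fails partway through never reports success, and the missing ping is what triggers the alert.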

Grace Period

The grace period provides extra time before alerting, compensating for natural variance in job execution times. For example:
  • Daily backup at 2 AM with 30-minute grace → Alert if no ping by 2:30 AM
  • Weekly report on Fridays with 4-hour grace → Alert if no ping by end of Friday
  • Hourly sync job with 5-minute grace → Alert if ping is more than 5 minutes late
Choose grace periods based on:
  • Normal variance in your job execution time
  • Acceptable delay before you need to know about failures
  • Time needed for any retries or recovery processes
The heartbeat timer works predictably:
  • First ping starts the timer - When you send the first ping, monitoring begins
  • Each ping resets the timer - Every successful ping resets the countdown
  • Alerts also reset the timer - After an alert fires, the timer restarts
  • Deactivation resets everything - Pausing and resuming a monitor restarts timing
This means if your job is supposed to run every 6 hours but runs late at hour 7, the next ping will be expected at hour 13 (7 + 6), not hour 12.
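The drift in that example can be checked with a few lines of arithmetic. This is only an illustration of the rule as described here; the function name `expected_ping_hours` is made up for the example, and the grace period is left out for simplicity.

```python
def expected_ping_hours(ping_hours, period_hours):
    """For each received ping, return the hour by which the next ping is expected."""
    return [ping + period_hours for ping in ping_hours]


# A job meant to run every 6 hours pings on time at hour 0, then late at hour 7.
deadlines = expected_ping_hours([0, 7], period_hours=6)
print(deadlines)  # [6, 13]
```

The second deadline is 13 rather than 12 because the countdown restarts from the moment each ping arrives, not from the original schedule.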
Consider Your Job Variance
  • How much does your job’s runtime vary?
  • Account for network delays and system load
  • Include time for any retry logic
Balance Speed vs. False Alerts
  • Too short: False alerts during normal delays
  • Too long: Slower failure detection
  • Start conservative, adjust based on experience
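One possible way to turn these considerations into a starting value is sketched below. The heuristic (worst observed delay, multiplied by a safety factor, rounded up to a step) is an assumption made for illustration and not a Checkly recommendation; `suggest_grace_minutes`, `safety_factor`, and `round_to` are hypothetical names.

```python
import math


def suggest_grace_minutes(observed_delays, safety_factor=1.5, round_to=5):
    """Suggest a grace period from past delays (minutes late versus schedule)."""
    if not observed_delays:
        raise ValueError("need at least one observed delay")
    worst = max(observed_delays)
    padded = worst * safety_factor  # Headroom for retries and system load.
    # Round up to the next multiple of `round_to`, never below one step.
    return max(round_to, math.ceil(padded / round_to) * round_to)


# Recent runs finished 2, 4, 9 and 3 minutes later than scheduled.
print(suggest_grace_minutes([2, 4, 9, 3]))  # 15
```

Treat the result as a first guess: shorten it if failures are detected too slowly, and lengthen it if normal delays cause false alerts.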

Metrics

Heartbeat monitors provide different metrics and insights than other types of checks and monitors:
  • Ping History: Timeline of when pings were received
  • Missed Pings: Gaps where expected pings didn’t arrive
  • Alert Timeline: When alerts were triggered and resolved
  • Source Tracking: Which systems or processes sent pings
Remember: Heartbeat monitors detect when jobs fail to complete, but they can’t tell you why a job failed. Combine heartbeat monitoring with application logging and error tracking for complete observability.

Manual pings

You can manually send pings via the Checkly UI on the check overview page. Use this to start the check timer when a check is first created or to silence alarms. “Ping now” is also available in the quick menu in your list of Heartbeat monitors.

How does the timer work?

The check timer starts when it receives its first ping and will reset after each ping or triggered alert. If you have a check that expects a ping every 60 minutes starting at 09:30, and it receives a ping at 10:00, it will reset the timer to expect a ping before 11:00. If the check does not receive a ping before 11:00 plus the configured grace period, it will trigger any configured alerts.
Every ping or triggered alert will reset the timer of the next expected heartbeat ping.
When a check is deactivated and activated again, the timer will start when the check is saved. This is also the case when changing the period of a check.
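The rules above can be modelled with a small class. This is a toy model of the described behaviour, not Checkly's implementation: the class name `HeartbeatTimer`, the 5-minute grace period, and the choice to restart the timer at the moment an alert fires are all assumptions made for the sketch.

```python
from datetime import datetime, timedelta


class HeartbeatTimer:
    """Toy model of the timer rules described above (not Checkly's code)."""

    def __init__(self, period, grace):
        self.period = period
        self.grace = grace
        self.timer_start = None  # No timer until the first ping arrives.

    def ping(self, now):
        """Every ping starts or restarts the countdown."""
        self.timer_start = now

    def alert_deadline(self):
        """Moment after which a missing ping triggers an alert."""
        if self.timer_start is None:
            return None
        return self.timer_start + self.period + self.grace

    def check(self, now):
        """Return True if an alert fires at `now`; an alert also resets the timer."""
        deadline = self.alert_deadline()
        if deadline is None or now <= deadline:
            return False
        self.timer_start = now  # Assumption: the timer restarts at alert time.
        return True


timer = HeartbeatTimer(period=timedelta(minutes=60), grace=timedelta(minutes=5))
timer.ping(datetime(2024, 1, 1, 10, 0))
print(timer.alert_deadline())                    # 2024-01-01 11:05:00
print(timer.check(datetime(2024, 1, 1, 11, 0)))  # False
print(timer.check(datetime(2024, 1, 1, 11, 6)))  # True
print(timer.alert_deadline())                    # 2024-01-01 12:11:00
```

With a ping at 10:00 and a 60-minute period, the next ping is expected before 11:00, and the alert fires only once the grace period has also elapsed. After the alert at 11:06, the model starts a fresh countdown from that moment.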

Best Practices

Always include timeout and retry options:
# Good: With timeout and retries
curl -m 5 --retry 3 https://ping.checklyhq.com/your-id

# Bad: No timeout or retry protection
curl https://ping.checklyhq.com/your-id
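If the ping is sent from Python rather than curl, similar protection can be sketched with the standard library. The function name `send_ping`, the one-second pause between attempts, and the injectable `opener` and `sleep` parameters (there to make the function testable) are assumptions of this example. Note that `timeout` here applies to each socket operation, so it only approximates curl's `-m 5` limit on total time.

```python
import time
import urllib.request


def send_ping(url, timeout=5, retries=3,
              opener=urllib.request.urlopen, sleep=time.sleep):
    """Send a heartbeat ping; return True on success, False if all attempts fail."""
    for attempt in range(1 + retries):  # One initial try plus `retries` retries.
        try:
            with opener(url, timeout=timeout) as response:
                if 200 <= response.status < 300:
                    return True
        except OSError:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            pass
        if attempt < retries:
            sleep(1)  # Brief pause before the next attempt.
    return False


# Example call (replace your-id with the ID from your own ping URL):
# send_ping("https://ping.checklyhq.com/your-id")
```

Returning a boolean instead of raising keeps a failed ping from crashing a job that has otherwise completed; the missing heartbeat will still surface as an alert.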
Position pings correctly in your code:
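# Good: Ping only after success
import requests

ping_url = "https://ping.checklyhq.com/your-id"

# run_backup, upload_to_s3 and log_error stand in for your own functions
try:
    run_backup()
    upload_to_s3()
    # Only ping if everything succeeded
    requests.get(ping_url, timeout=5)
except Exception as e:
    # Don't ping on failure - let heartbeat alert
    log_error(e)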
Use source headers for tracking:
curl -H "Origin: backup-server-01" https://ping.checklyhq.com/your-id
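The same header can be set from Python. The sketch below only builds the request, mirroring the curl command above with the standard library; `build_ping_request` is a hypothetical helper name, and nothing is sent over the network until the request is passed to `urllib.request.urlopen`.

```python
import urllib.request


def build_ping_request(url, source):
    """Build a GET request that identifies its sender through the Origin header."""
    return urllib.request.Request(url, headers={"Origin": source}, method="GET")


request = build_ping_request("https://ping.checklyhq.com/your-id", "backup-server-01")
print(request.get_header("Origin"))  # backup-server-01
```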

Troubleshooting

  • Issue: Timer doesn’t start after first ping
  • Solution: Check that the monitor is activated and the ping URL is correct
  • Issue: Getting alerts even when job runs successfully
  • Solution: Increase grace period or check if ping is being sent correctly
  • Issue: Job runs but no ping is received
  • Solution: Verify network connectivity and ping URL accessibility
  • Issue: Timer doesn’t reset after successful ping
  • Solution: Check ping method (GET/POST) and ensure no network timeouts