Heartbeat Monitors

Monitoring as Code: Learn more about the Heartbeat Monitor Construct.

Monitor your scheduled jobs, background tasks, and automated processes with Checkly’s heartbeat monitoring. Unlike traditional active monitoring, heartbeat monitors work passively—they listen for regular “pings” from your tasks to ensure they’re running as expected.

What are Heartbeat Monitors?

Heartbeat monitors are passive monitoring checks that wait for your automated tasks to report their successful completion. When your scheduled job, backup script, or cron job finishes successfully, it sends a simple HTTP request (a “ping”) to Checkly to confirm it ran. If Checkly doesn’t receive a ping within the expected timeframe, it triggers alerts to notify you that something may have gone wrong.

Heartbeat monitors are perfect for:

Backup jobs and data exports
ETL processes and data imports
Scheduled maintenance scripts
Newsletter and email campaigns
Database cleanup tasks
File processing workflows

How Heartbeat Monitoring Works

The heartbeat monitoring process is straightforward. Once created, your heartbeat monitor provides a unique ping URL. Your tasks should make an HTTP GET or POST request to this URL when they complete successfully.

Create a monitor - Set up a heartbeat monitor with your expected ping frequency
Get your ping URL - Checkly provides a unique URL for your task to ping
Add the ping - Include a simple HTTP request in your task’s success path
Monitor results - Checkly tracks pings and alerts you when they’re missing

Always add the heartbeat ping at the very end of your task, after all critical operations have completed successfully. This ensures you only get “success” pings when your job actually finished.
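The "ping at the very end" rule can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official Checkly client: `send_ping`, `run_job`, the retry count, and the pause between attempts are hypothetical names and values, and `PING_URL` is a placeholder for the ping URL of your own monitor.

```python
import time
import urllib.request
from typing import Callable

# Placeholder: replace with the ping URL shown for your monitor.
PING_URL = "https://ping.checklyhq.com/your-id"


def send_ping(url: str, timeout: float = 5.0, retries: int = 3, pause: float = 1.0) -> bool:
    """Send a GET ping; return True as soon as one attempt succeeds."""
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return True
        except OSError:
            # URLError, HTTPError, and socket timeouts are all OSError subclasses.
            if attempt < retries:
                time.sleep(pause)
    return False


def run_job(task: Callable[[], None], ping: Callable[[], bool]) -> bool:
    """Run the task, then ping only if it finished without raising."""
    try:
        task()
    except Exception:
        return False  # no ping is sent, so the missing heartbeat raises the alert
    return ping()
```

A job script might call `run_job(do_backup, lambda: send_ping(PING_URL))`, where `do_backup` stands for your own task. Keeping the ping outside the task function makes it harder to send a "success" ping by accident from a partially completed run.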

Grace Period

The grace period provides extra time before alerting, compensating for natural variance in job execution times. Common grace period examples:

Daily backup at 2 AM with 30-minute grace → Alert if no ping by 2:30 AM
Weekly report on Fridays with 4-hour grace → Alert if no ping by end of Friday
Hourly sync job with 5-minute grace → Alert if ping is more than 5 minutes late
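The arithmetic behind these examples is simply the expected ping time plus the grace period. The sketch below illustrates it with hypothetical helper names (`alert_deadline`, `is_overdue`) and an arbitrary date; it is only a model of the rule described here, not Checkly's implementation.

```python
from datetime import datetime, timedelta


def alert_deadline(expected_at: datetime, grace: timedelta) -> datetime:
    """Latest moment a ping can arrive before an alert would fire."""
    return expected_at + grace


def is_overdue(now: datetime, expected_at: datetime, grace: timedelta) -> bool:
    """True once the grace period has fully elapsed without a ping."""
    return now > alert_deadline(expected_at, grace)


# Daily backup expected at 2:00 AM with a 30-minute grace period.
backup_expected = datetime(2024, 1, 1, 2, 0)
print(alert_deadline(backup_expected, timedelta(minutes=30)))  # 2024-01-01 02:30:00
```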

Choose grace periods based on:

Normal variance in your job execution time
Acceptable delay before you need to know about failures
Time needed for any retries or recovery processes

Timer Behavior

The heartbeat timer works predictably:

First ping starts the timer - When you send the first ping, monitoring begins
Each ping resets the timer - Every successful ping resets the countdown
Alerts also reset the timer - After an alert fires, the timer restarts
Deactivation resets everything - Pausing and resuming a monitor restarts timing

This means if your job is supposed to run every 6 hours but runs late at hour 7, the next ping will be expected at hour 13 (7 + 6), not hour 12.
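The reset rules above can be modelled with a few lines of Python. This is a simplified sketch, not Checkly's implementation: the `HeartbeatTimer` class is hypothetical, times are plain hour counts, and it assumes the countdown restarts at the moment an alert fires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatTimer:
    """Simplified model of the timer rules; all times are in hours."""

    period: float
    grace: float = 0.0
    last_reset: Optional[float] = None  # None until the first ping arrives

    def ping(self, now: float) -> None:
        """The first ping starts the timer; every later ping resets it."""
        self.last_reset = now

    def next_expected(self) -> Optional[float]:
        """Time by which the next ping is expected, or None if not started."""
        if self.last_reset is None:
            return None
        return self.last_reset + self.period

    def check(self, now: float) -> bool:
        """Return True if an alert fires at `now`; an alert restarts the timer."""
        expected = self.next_expected()
        if expected is None or now <= expected + self.grace:
            return False
        self.last_reset = now  # assumption: the countdown restarts at alert time
        return True


timer = HeartbeatTimer(period=6)
timer.ping(0)  # first ping starts the timer; next ping expected at hour 6
timer.ping(7)  # the job ran late, at hour 7
print(timer.next_expected())  # 13
```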


How to Choose the Right Grace Period

Consider Your Job Variance

How much does your job’s runtime vary?
Account for network delays and system load
Include time for any retry logic

Balance Speed vs. False Alerts

Too short: False alerts during normal delays
Too long: Slower failure detection
Start conservative, adjust based on experience

Metrics

Heartbeat monitors provide different metrics and insights than other types of checks and monitors:

Ping History: Timeline of when pings were received
Missed Pings: Gaps where expected pings didn’t arrive
Alert Timeline: When alerts were triggered and resolved
Source Tracking: Which systems or processes sent pings
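To show what a "missed ping" gap means in practice, the sketch below scans a list of ping timestamps for gaps longer than the period plus the grace period. It assumes you have collected the timestamps yourself (for example from your own job logs); `find_missed_windows` is a hypothetical helper and does not call any Checkly API.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_missed_windows(
    pings: List[datetime], period: timedelta, grace: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (previous_ping, next_ping) pairs whose gap exceeds period + grace."""
    ordered = sorted(pings)
    limit = period + grace
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hourly job with a 5-minute grace period; the 11:00 and 12:00 pings never arrived.
history = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 13, 0),
]
gaps = find_missed_windows(history, timedelta(hours=1), timedelta(minutes=5))
for start, end in gaps:
    print(f"No ping between {start:%H:%M} and {end:%H:%M}")
```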

Remember: Heartbeat monitors detect when jobs fail to complete, but they can’t tell you why a job failed. Combine heartbeat monitoring with application logging and error tracking for complete observability.

Manual pings

You can send pings manually from the Checkly UI. Use this to start the check timer when a check is first created, or to silence alerts.

Manually send a ping via the Checkly UI on the check overview page

“Ping now” is also available in the quick menu in your list of Heartbeat monitors.

Manually send a ping via the Checkly UI in the quick menu

How does the timer work?

The check timer starts when it receives its first ping and resets after each ping or triggered alert. For example, a check that expects a ping every 60 minutes and starts at 09:30 expects its next ping before 10:30. If it receives a ping at 10:00, the timer resets and the next ping is expected before 11:00. If the check does not receive a ping before 11:00 plus the configured grace period, it triggers any configured alerts.

Every ping or triggered alert will reset the timer of the next expected heartbeat ping.


When a check is deactivated and activated again, the timer will start when the check is saved. This is also the case when changing the period of a check.

Best Practices

Always include timeout and retry options:

# Good: With timeout and retries
curl -m 5 --retry 3 https://ping.checklyhq.com/your-id

# Bad: No timeout or retry protection
curl https://ping.checklyhq.com/your-id

Position pings correctly in your code:

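# Good: Ping only after success
import requests

# Replace with your monitor's ping URL
ping_url = "https://ping.checklyhq.com/your-id"

try:
    # run_backup, upload_to_s3, and log_error stand in for your own functions
    run_backup()
    upload_to_s3()
    # Only ping if everything succeeded
    requests.get(ping_url, timeout=5)
except Exception as e:
    # Don't ping on failure - let heartbeat alert
    log_error(e)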

Use source headers for tracking:

curl -H "Origin: backup-server-01" https://ping.checklyhq.com/your-id

Troubleshooting

Monitor Not Starting

Issue: Timer doesn’t start after first ping
Solution: Check that the monitor is activated and the ping URL is correct

False Alerts

Issue: Getting alerts even when job runs successfully
Solution: Increase grace period or check if ping is being sent correctly

Missing Pings

Issue: Job runs but no ping is received
Solution: Verify network connectivity and ping URL accessibility

Timer Reset Issues

Issue: Timer doesn’t reset after successful ping
Solution: Check ping method (GET/POST) and ensure no network timeouts
