Queuing and Retries for Workflows

What It Is

Queuing and retrying webhook alerts is a built-in reliability feature of the Next Identity platform’s workflow notification system. When a webhook notification cannot be delivered within a defined time window, it is automatically queued for retry based on a configurable schedule. Notifications that ultimately fail after all retry attempts are moved to a dead letter queue for further inspection and manual handling.

Why It Matters

Webhooks are essential for real-time communication between systems: triggering follow-up actions, notifying third-party services, or syncing identity events. However, temporary network issues, service outages, or high latency can leave messages undelivered. Queuing and retrying provides resilience: a message is not lost when the first attempt fails, and delivery is attempted multiple times before the notification is marked as failed. This protects system integrity and supports dependable automation.

How It Works

When a workflow notification is triggered, the system attempts to send the webhook alert immediately. If the webhook is not successfully delivered within 5 seconds, it is placed in a retry queue. The default retry behavior is as follows:

  • Retry attempts: 5 total

  • Retry intervals: 30, 60, 120, 300, and 900 seconds

If the notification still fails after all retries, it is moved to a dead letter queue, where it is stored for visibility and follow-up.
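
The default behavior described above can be sketched as a small simulation. This is only an illustration, not the platform's implementation: the names `Dispatcher`, `DeliveryResult`, `send`, and `dead_letter_queue` are hypothetical, and the sketch assumes that the five retries follow one initial attempt and that each interval is waited out in sequence after the previous failure.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List

# Defaults quoted in this article; the constant names are illustrative only.
DELIVERY_TIMEOUT_SECONDS = 5
RETRY_INTERVALS_SECONDS = [30, 60, 120, 300, 900]


@dataclass
class DeliveryResult:
    delivered: bool
    attempts: int        # initial attempt plus any retries that were made
    waited_seconds: int  # total scheduled wait between attempts


@dataclass
class Dispatcher:
    # Hypothetical sender: returns True only if the endpoint confirmed
    # delivery within DELIVERY_TIMEOUT_SECONDS.
    send: Callable[[dict], bool]
    # Injectable so the schedule can be exercised without real waiting.
    sleep: Callable[[float], None] = time.sleep
    dead_letter_queue: List[dict] = field(default_factory=list)

    def deliver(self, notification: dict) -> DeliveryResult:
        attempts = 1
        waited = 0
        # Immediate attempt when the workflow notification is triggered.
        if self.send(notification):
            return DeliveryResult(True, attempts, waited)
        # Assumed: one retry per configured interval, in order.
        for interval in RETRY_INTERVALS_SECONDS:
            self.sleep(interval)
            waited += interval
            attempts += 1
            if self.send(notification):
                return DeliveryResult(True, attempts, waited)
        # All retries exhausted: keep the notification for manual review.
        self.dead_letter_queue.append(notification)
        return DeliveryResult(False, attempts, waited)
```

Under these assumptions, a notification whose endpoint never responds is attempted six times in total (one initial attempt plus five retries) and spends 30 + 60 + 120 + 300 + 900 = 1,410 seconds, or 23.5 minutes, in scheduled waits before it reaches the dead letter queue.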

Key characteristics:

  • Notifications are retried automatically—no manual intervention required

  • The dead letter queue allows teams to review failures and take corrective action

Use Cases

  • Ensuring webhook delivery even during temporary network disruptions

  • Supporting event-driven workflows between systems that may not always be available

  • Detecting and troubleshooting persistent failures using the dead letter queue

  • Improving delivery reliability for mission-critical identity events (e.g., registration, password reset, account updates)

Best Practices

  • Configure webhook endpoints for high availability and low response latency

  • Test webhook endpoints with realistic load and error conditions to validate reliability
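
One common way to keep response latency low is to acknowledge the webhook as soon as it has been validated and do the slow work in the background. The sketch below uses only the Python standard library. The function names, the port, and the choice of a `202` response are assumptions for illustration; this article does not specify which status codes the platform counts as a successful delivery.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Events wait here until the background worker picks them up.
pending_events: queue.Queue = queue.Queue()
processed_events: list = []


def accept_event(raw_body: bytes) -> int:
    """Validate and enqueue an event; return the HTTP status to send back."""
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400
    if not isinstance(event, dict):
        return 400
    pending_events.put(event)
    return 202  # assumed to count as "delivered" on the sending side


def handle_event(event: dict) -> None:
    """Placeholder for slow business logic (database writes, API calls)."""
    processed_events.append(event)


def worker() -> None:
    """Drain the queue so the HTTP handler never waits on business logic."""
    while True:
        event = pending_events.get()
        try:
            handle_event(event)
        finally:
            pending_events.task_done()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = accept_event(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the background worker, then serve webhooks until interrupted."""
    threading.Thread(target=worker, daemon=True).start()
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver. Because the handler only parses and enqueues the payload, its response time stays well under the 5-second delivery window even when `handle_event` is slow. An in-memory queue is lost on restart, so a production endpoint would likely need durable storage instead.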
