Error Handling

Errors are inevitable in data pipelines, and OpenETL gives you the tools to handle them gracefully. This section covers what error handling means in OpenETL, how the Orchestrator applies it, and how to configure and debug it.

What is Error Handling?

Error handling in OpenETL lets pipelines recover from or respond to failures such as network issues, rate limits, or bad data. It keeps your ETL process robust through retries, logging, and controlled failure modes.

How Error Handling Works

The Orchestrator manages errors during pipeline execution:

Detects failures (e.g., adapter download throws an error).
Applies retry logic if configured (via error_handling).
Logs events (e.g., error, info) for tracking.
Either continues (if fail_on_error: false) or halts (if true), cleaning up adapters.

This keeps pipelines running or fails them predictably.
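
The four steps above can be sketched as a small retry loop. This is an illustration of the control flow only, not OpenETL's internal implementation: `runStep`, `StepOutcome`, `RetryConfig`, and `StepEvent` are hypothetical names, and the sketch assumes that `max_retries` counts attempts after the first one and that the interval between retries is fixed.

```typescript
// Hypothetical sketch of the detect / retry / log / continue-or-halt flow.
// Only the option names (max_retries, retry_interval, fail_on_error) come
// from the documentation; everything else is illustrative.
interface RetryConfig {
  max_retries: number;    // retry attempts after the first try (assumed)
  retry_interval: number; // delay between retries, in ms
  fail_on_error: boolean; // true: halt the pipeline, false: continue
}

interface StepEvent {
  type: 'info' | 'error';
  message: string;
}

type StepOutcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

async function runStep<T>(
  step: () => Promise<T>,
  config: RetryConfig,
  log: (event: StepEvent) => void,
): Promise<StepOutcome<T>> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= config.max_retries; attempt++) {
    try {
      const value = await step();
      log({ type: 'info', message: `succeeded on attempt ${attempt + 1}` });
      return { ok: true, value };
    } catch (error) {
      // Steps 1 and 3: detect the failure and log it.
      lastError = error;
      log({ type: 'error', message: `attempt ${attempt + 1} failed: ${String(error)}` });
      // Step 2: wait before the next retry, if any retries remain.
      if (attempt < config.max_retries) await sleep(config.retry_interval);
    }
  }
  // Step 4: retries exhausted. Halt by rethrowing, or continue with a failed outcome.
  if (config.fail_on_error) throw lastError;
  return { ok: false, error: lastError };
}
```

With `fail_on_error: false` the caller receives `{ ok: false, error }` and can move on to the next step; with `true` the error propagates and stops the run.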

Configuring Error Handling

Set error behavior in the pipeline's error_handling option:

Property	Description
`max_retries`	Number of retry attempts
`retry_interval`	Delay between retries (ms)
`fail_on_error`	Stop on error (`true`) or continue (`false`)

error_handling: {
  max_retries: 3,
  retry_interval: 1000,
  fail_on_error: false,
}

This retries a failed operation up to 3 times, waits 1000 ms (1 second) between attempts, and lets the pipeline proceed even if the error persists.
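
As a rough worked example of what these settings cost in time, the sketch below computes the maximum number of attempts and the longest total wait for one failing operation. It assumes that `max_retries` does not count the initial attempt and that the interval is fixed rather than backed off; neither is stated above. `worstCase` and `ErrorHandlingOptions` are hypothetical names, not part of OpenETL.

```typescript
// Hypothetical helper. Assumes a fixed interval and that the first attempt
// is not counted as a retry.
interface ErrorHandlingOptions {
  max_retries: number;
  retry_interval: number; // ms
  fail_on_error: boolean;
}

function worstCase(options: ErrorHandlingOptions): { attempts: number; waitMs: number } {
  return {
    attempts: 1 + options.max_retries,                    // first try plus retries
    waitMs: options.max_retries * options.retry_interval, // one wait per retry
  };
}

const budget = worstCase({ max_retries: 3, retry_interval: 1000, fail_on_error: false });
console.log(budget); // { attempts: 4, waitMs: 3000 }
```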

Common Error Scenarios

Here's how OpenETL handles typical issues:

Network Failure: Retries based on max_retries; logs error event.
Rate Limit (429): Respects rate_limiting.max_retries_on_rate_limit or falls back to error_handling.
Auth Error (401): Attempts token refresh (for OAuth2) then retries; fails if unrecoverable.
Bad Data: Continues or stops per fail_on_error, logging the issue.
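
The scenarios above amount to a small decision table. The sketch below encodes it as a pure function so the branches are easy to see. `Failure`, `Action`, `Limits`, and `decide` are hypothetical names for illustration, and the real Orchestrator may order or combine these checks differently. In particular, treating an unrecoverable auth error as a halt regardless of `fail_on_error` is an assumption.

```typescript
// Hypothetical decision table for the scenarios listed above.
type Failure =
  | { kind: 'network' }
  | { kind: 'rate_limit' }            // HTTP 429
  | { kind: 'auth'; oauth2: boolean } // HTTP 401
  | { kind: 'bad_data' };

type Action = 'retry' | 'refresh_token_then_retry' | 'continue' | 'halt';

interface Limits {
  max_retries: number;                // from error_handling
  fail_on_error: boolean;             // from error_handling
  max_retries_on_rate_limit?: number; // from rate_limiting, if set
}

function decide(failure: Failure, retriesSoFar: number, limits: Limits): Action {
  // What to do once no further retry is possible.
  const giveUp: Action = limits.fail_on_error ? 'halt' : 'continue';
  switch (failure.kind) {
    case 'network':
      return retriesSoFar < limits.max_retries ? 'retry' : giveUp;
    case 'rate_limit': {
      // Prefer the rate-limit budget; fall back to the general one.
      const budget = limits.max_retries_on_rate_limit ?? limits.max_retries;
      return retriesSoFar < budget ? 'retry' : giveUp;
    }
    case 'auth':
      // Assumption: one refresh attempt for OAuth2, otherwise unrecoverable.
      return failure.oauth2 && retriesSoFar === 0 ? 'refresh_token_then_retry' : 'halt';
    case 'bad_data':
      // Retrying the same bad record would not help.
      return giveUp;
  }
}
```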

Debugging Errors

Catch and fix issues with these steps:

Enable Logging: Track events to spot failures.

orchestrator.runPipeline({
  id: 'debug-pipeline',
  source: { /* ... */ },
  error_handling: { max_retries: 2, retry_interval: 500, fail_on_error: false },
  logging: event => {
    if (event.type === 'error') console.error(event.message);
    else console.log(event);
  },
});

Check Vault: Verify credential_id matches valid Vault entries.
Isolate Steps: Test source or target alone to pinpoint the issue.
Review Adapter: Ensure connect, download, or upload handle errors correctly.
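
For the logging step, it can help to collect events instead of only printing them, so failures can be inspected after the run. A minimal sketch follows. It assumes events carry the `type` and `message` fields used in the snippet above; `createEventCollector` and `PipelineEvent` are hypothetical names, not OpenETL exports.

```typescript
// Hypothetical helper: buffers events so failures can be reviewed after a run.
interface PipelineEvent {
  type: string;     // e.g. 'error' or 'info', as used in the snippet above
  message?: string;
}

function createEventCollector() {
  const events: PipelineEvent[] = [];
  return {
    // Intended to be passed as the pipeline's `logging` callback.
    logging: (event: PipelineEvent): void => {
      events.push(event);
    },
    // Only the events whose type is 'error'.
    errors: (): PipelineEvent[] => events.filter(e => e.type === 'error'),
    // Number of events seen per type.
    summary: (): Record<string, number> => {
      const counts: Record<string, number> = {};
      for (const e of events) counts[e.type] = (counts[e.type] ?? 0) + 1;
      return counts;
    },
  };
}
```

Pass `collector.logging` as the `logging` option, then call `collector.errors()` or `collector.summary()` once the run has finished.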

Example with retry and logging:

import Orchestrator from 'openetl';
import { hubspot } from '@openetl/hubspot';

const vault = { 'hs-auth': { type: 'oauth2', credentials: { /* ... */ } } };
const orchestrator = Orchestrator(vault, { hubspot });

orchestrator.runPipeline({
  id: 'hs-fetch',
  source: {
    adapter_id: 'hubspot',
    endpoint_id: 'contacts',
    credential_id: 'hs-auth',
    fields: ['firstname'],
  },
  error_handling: {
    max_retries: 2,
    retry_interval: 2000,
    fail_on_error: false,
  },
  logging: event => console.log(`${event.type}: ${event.message}`),
});

This retries twice on failure, waits 2 seconds between attempts, logs each event, and continues even if errors persist, which is convenient while debugging.

Next: Adapters!
