Error Handling
Errors are inevitable in data pipelines, and OpenETL equips you to handle them gracefully. This section covers what error handling entails, how it works, and how to manage it effectively.
What is Error Handling?
Error handling in OpenETL lets pipelines recover from or respond to failures such as network issues, rate limits, or bad data. It keeps your ETL process robust through retries, logging, and controlled failure modes.
How Error Handling Works
The Orchestrator manages errors during pipeline execution:
- Detects failures (e.g., an adapter's `download` throws an error).
- Applies retry logic if configured (via `error_handling`).
- Logs events (e.g., `error`, `info`) for tracking.
- Either continues (if `fail_on_error: false`) or halts (if `true`), cleaning up adapters.
This keeps pipelines running or fails them predictably.
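The flow above can be sketched as a small retry loop. This is a hypothetical illustration, not OpenETL's actual source: the function name `runWithErrorHandling`, the `step` callback (standing in for an adapter call such as `download`), and the injectable `sleep` are all assumptions. It also assumes that `max_retries` counts attempts made after the first one and that cleanup happens elsewhere.

```javascript
// Hypothetical sketch of a retry-then-continue-or-halt loop; not OpenETL's implementation.
// `step` is any async function standing in for an adapter call.
async function runWithErrorHandling(step, errorHandling, log, sleep = defaultSleep) {
  // Defaults here are assumptions made for the sketch.
  const { max_retries = 0, retry_interval = 0, fail_on_error = true } = errorHandling;

  for (let attempt = 0; attempt <= max_retries; attempt++) {
    try {
      const result = await step();
      log({ type: 'info', message: `succeeded on attempt ${attempt + 1}` });
      return { ok: true, result };
    } catch (err) {
      log({ type: 'error', message: `attempt ${attempt + 1} failed: ${err.message}` });
      // Wait before the next attempt, but not after the final one.
      if (attempt < max_retries) await sleep(retry_interval);
    }
  }

  // All attempts are exhausted: halt or carry on, depending on fail_on_error.
  if (fail_on_error) throw new Error('step failed after all retries');
  return { ok: false, result: undefined };
}

// Resolves after `ms` milliseconds.
function defaultSleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets the loop be exercised in tests without real delays.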
Configuring Error Handling
Set error behavior in the pipeline's `error_handling` option:
| Property | Description |
|---|---|
| `max_retries` | Number of retry attempts |
| `retry_interval` | Delay between retries (ms) |
| `fail_on_error` | Stop on error (`true`) or continue (`false`) |
    error_handling: {
      max_retries: 3,
      retry_interval: 1000,
      fail_on_error: false,
    }
This retries up to 3 times, waiting 1 second (1000 ms) between attempts, and lets the pipeline proceed if the error persists.
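As a quick sanity check on such a configuration, the sketch below validates the three properties and computes the longest time a step could spend waiting between retries. Both helpers (`checkErrorHandling`, `maxRetryWaitMs`) are hypothetical and not part of OpenETL. The validation rules are assumptions inferred from the table, and the arithmetic assumes a fixed interval with no backoff.

```javascript
// Hypothetical helpers for illustration; OpenETL does not ship these.

// Returns a list of human-readable problems (empty when the config looks sane).
function checkErrorHandling(cfg) {
  const problems = [];
  if (!Number.isInteger(cfg.max_retries) || cfg.max_retries < 0) {
    problems.push('max_retries must be a non-negative integer');
  }
  if (!Number.isFinite(cfg.retry_interval) || cfg.retry_interval < 0) {
    problems.push('retry_interval must be a non-negative number of milliseconds');
  }
  if (typeof cfg.fail_on_error !== 'boolean') {
    problems.push('fail_on_error must be true or false');
  }
  return problems;
}

// Upper bound on time spent waiting between retries, assuming a fixed interval.
function maxRetryWaitMs({ max_retries, retry_interval }) {
  return max_retries * retry_interval;
}

const errorHandling = { max_retries: 3, retry_interval: 1000, fail_on_error: false };
console.log(checkErrorHandling(errorHandling)); // []
console.log(maxRetryWaitMs(errorHandling));     // 3000
```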
Common Error Scenarios
Here's how OpenETL handles typical issues:
- Network Failure: Retries based on `max_retries`; logs an `error` event.
- Rate Limit (429): Respects `rate_limiting.max_retries_on_rate_limit` or falls back to `error_handling`.
- Auth Error (401): Attempts token refresh (for OAuth2) then retries; fails if unrecoverable.
- Bad Data: Continues or stops per `fail_on_error`, logging the issue.
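One way to picture these scenarios is as a decision table. The sketch below is hypothetical and OpenETL's internals may differ: `decideAction`, the `error.status` and `error.kind` fields, the `credentialType` argument, and the action names are illustrative assumptions. It also assumes `rate_limiting` sits next to `error_handling` on the pipeline object, and that a 401 with a non-OAuth2 credential is unrecoverable.

```javascript
// Hypothetical decision table mirroring the scenarios above; not OpenETL's implementation.
function decideAction(error, pipeline, credentialType) {
  const errorHandling = pipeline.error_handling ?? {};
  const rateLimiting = pipeline.rate_limiting ?? {};
  const generalRetries = errorHandling.max_retries ?? 0;

  if (error.status === 429) {
    // Prefer the rate-limit setting; fall back to the general retry budget.
    const retries = rateLimiting.max_retries_on_rate_limit ?? generalRetries;
    return { action: 'retry', retries };
  }
  if (error.status === 401) {
    // Assumption: only OAuth2 credentials can be refreshed.
    return credentialType === 'oauth2'
      ? { action: 'refresh_token_then_retry', retries: generalRetries }
      : { action: 'fail' };
  }
  if (error.kind === 'network') {
    return { action: 'retry', retries: generalRetries };
  }
  // Bad data and anything unrecognized: continue or stop per fail_on_error.
  return { action: errorHandling.fail_on_error ? 'stop' : 'continue' };
}

console.log(decideAction({ status: 429 }, { error_handling: { max_retries: 3 } }, 'oauth2'));
// { action: 'retry', retries: 3 }
```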
Debugging Errors
Catch and fix issues with these steps:
- Enable Logging: Track events to spot failures.
    orchestrator.runPipeline({
      id: 'debug-pipeline',
      source: { /* ... */ },
      error_handling: { max_retries: 2, retry_interval: 500, fail_on_error: false },
      logging: event => {
        if (event.type === 'error') console.error(event.message);
        else console.log(event);
      },
    });
- Check Vault: Verify `credential_id` matches valid Vault entries.
- Isolate Steps: Test `source` or `target` alone to pinpoint the issue.
- Review Adapter: Ensure `connect`, `download`, or `upload` handle errors correctly.
Example with retry and logging:
    import Orchestrator from 'openetl';
    import { hubspot } from '@openetl/hubspot';

    // The vault maps each credential_id to its stored credentials.
    const vault = { 'hs-auth': { type: 'oauth2', credentials: { /* ... */ } } };
    const orchestrator = Orchestrator(vault, { hubspot });

    orchestrator.runPipeline({
      id: 'hs-fetch',
      source: {
        adapter_id: 'hubspot',
        endpoint_id: 'contacts',
        credential_id: 'hs-auth', // must match a key in the vault
        fields: ['firstname'],
      },
      error_handling: {
        max_retries: 2,       // retry a failed step up to twice
        retry_interval: 2000, // wait 2000 ms between attempts
        fail_on_error: false, // keep going once retries are exhausted
      },
      // Print every event as "type: message".
      logging: event => console.log(`${event.type}: ${event.message}`),
    });
This retries twice on failure, waits 2 s between attempts, logs each event, and continues even if errors persist, which makes it a convenient setup for debugging.
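When console output gets noisy, it can help to collect events instead of printing them. The sketch below builds a `logging` callback that tallies events by type and keeps the error messages. `createEventCollector` is a hypothetical helper, and it relies only on the `type` and `message` fields used in the examples above.

```javascript
// Hypothetical helper: builds a logging callback that summarizes pipeline events.
function createEventCollector() {
  const counts = {}; // event type -> number of events seen
  const errors = []; // messages of all 'error' events, in order
  const logging = event => {
    counts[event.type] = (counts[event.type] ?? 0) + 1;
    if (event.type === 'error') errors.push(event.message);
  };
  return { logging, counts, errors };
}

// Simulated events, standing in for what a pipeline run would emit.
const collector = createEventCollector();
collector.logging({ type: 'info', message: 'starting' });
collector.logging({ type: 'error', message: 'timeout' });
collector.logging({ type: 'info', message: 'retrying' });
console.log(collector.counts); // { info: 2, error: 1 }
console.log(collector.errors); // [ 'timeout' ]
```

In a real run, `collector.logging` would be passed as the pipeline's `logging` option and the summary inspected once the run finishes.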