Introduction
OpenETL is an open-source ETL (Extract, Transform, Load) framework implemented in TypeScript. This document provides a technical overview of the framework's architecture and capabilities.
What is OpenETL?
OpenETL is a TypeScript framework for building data pipelines that extract data from source systems, apply transformations, and load results into target systems. The framework uses a modular adapter architecture to support multiple data sources and destinations.
Core Components
| Component | Description |
|---|---|
| Orchestrator | Pipeline execution engine that coordinates adapters and manages data flow |
| Adapters | Interface implementations for specific data sources and targets (APIs, databases) |
| Vault | Credential storage with support for API keys, OAuth2, and basic authentication |
| Connectors | Configuration objects that define how adapters connect to pipelines |
| Transformations | Data manipulation functions applied during pipeline execution |
Official Adapters
OpenETL includes official adapters for common data sources:
Database Adapters:
- PostgreSQL - Full SQL support with parameterized queries
- MySQL - Full SQL support with parameterized queries
- MongoDB - Query translation with projection support
HTTP/API Adapters:
- HubSpot - CRM data (contacts, companies, deals)
- Stripe - Payment data (customers, charges, subscriptions)
- Xero - Accounting data (contacts, invoices, items)
- Google Ads - Advertising data with GAQL support
Technical Characteristics
Type Safety
OpenETL provides TypeScript definitions for all of its interfaces, so pipeline configurations are checked at compile time:
import { Pipeline, Connector, Vault } from 'openetl';
const pipeline: Pipeline = {
  id: 'typed-pipeline',
  source: { /* TypeScript validates configuration */ },
};
Stateless Architecture
The framework maintains no internal state between pipeline executions. State management for incremental synchronization is handled by the consuming application.
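Because the framework keeps no state of its own, an application that wants incremental synchronization has to persist a cursor between runs. The following is a minimal sketch of that pattern, not OpenETL code: `SyncState`, `StateStore`, `InMemoryStateStore`, `SourceRecord`, `runIncremental`, and the `updated_at` cursor field are all hypothetical names chosen for illustration.

```typescript
// Hypothetical application-side state; OpenETL itself stores none of this.
interface SyncState {
  lastSyncedAt: string | null; // ISO 8601 UTC timestamp of the newest record seen
}

interface StateStore {
  load(pipelineId: string): SyncState;
  save(pipelineId: string, state: SyncState): void;
}

// In-memory store for illustration; a real application would use a file or a database.
class InMemoryStateStore implements StateStore {
  private states: Record<string, SyncState> = {};

  load(pipelineId: string): SyncState {
    return this.states[pipelineId] ?? { lastSyncedAt: null };
  }

  save(pipelineId: string, state: SyncState): void {
    this.states[pipelineId] = state;
  }
}

interface SourceRecord {
  id: number;
  updated_at: string; // assumed cursor field, same ISO 8601 UTC format on every record
}

// Stand-in for a pipeline run: keeps only records newer than the stored cursor,
// then advances the cursor to the newest record returned.
function runIncremental(
  pipelineId: string,
  store: StateStore,
  fetchAll: () => SourceRecord[],
): SourceRecord[] {
  const { lastSyncedAt } = store.load(pipelineId);
  // Same-format ISO 8601 UTC strings sort chronologically when compared as strings.
  const fresh = fetchAll().filter(
    (r) => lastSyncedAt === null || r.updated_at > lastSyncedAt,
  );
  if (fresh.length > 0) {
    const newest = fresh.reduce((a, b) => (a.updated_at > b.updated_at ? a : b));
    store.save(pipelineId, { lastSyncedAt: newest.updated_at });
  }
  return fresh;
}
```

In this sketch the first run returns every record and later runs return only records whose `updated_at` is newer than the saved cursor. In a real pipeline the filter would normally be pushed down to the source query rather than applied in memory.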
Modular Design
Each adapter is imported from its own package, so a bundle contains only the adapters a project actually uses:
// Only import what you need
import { postgresql } from '@openetl/postgresql';
import { hubspot } from '@openetl/hubspot';
const adapters = { postgresql, hubspot };
Security Features
- Credential Isolation - Credentials stored in Vault, never exposed to pipeline logic
- SQL Injection Protection - Database adapters use parameterized queries
- Operator Validation - Only whitelisted operators are accepted, which prevents injection through operator strings
- Identifier Escaping - Schema, table, and column names properly escaped
See Security for detailed security documentation.
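The three query-related protections listed above can be illustrated together. The sketch below is an assumption about how such a query builder might look, not the code the database adapters use: the `Filter` shape, the `ALLOWED_OPERATORS` list, and the `escapeIdentifier` and `buildSelect` helpers are hypothetical, and the quoting and `$1` placeholder style follow PostgreSQL conventions.

```typescript
// Hypothetical filter shape; OpenETL's real filter types may differ.
interface Filter {
  field: string;
  operator: string;
  value: string | number | boolean;
}

// Only operators on this list are ever interpolated into the SQL text.
const ALLOWED_OPERATORS = ["=", "!=", "<", "<=", ">", ">=", "LIKE"];

// PostgreSQL-style identifier quoting: wrap in double quotes and
// double any embedded double quote.
function escapeIdentifier(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Builds a SELECT with $1, $2, ... placeholders. Values are returned
// separately and never concatenated into the SQL text.
function buildSelect(
  schema: string,
  table: string,
  filters: Filter[],
): { text: string; values: Array<string | number | boolean> } {
  const values: Array<string | number | boolean> = [];
  const clauses = filters.map((f) => {
    const op = f.operator.toUpperCase();
    if (ALLOWED_OPERATORS.indexOf(op) === -1) {
      throw new Error(`Operator not allowed: ${f.operator}`);
    }
    values.push(f.value);
    return `${escapeIdentifier(f.field)} ${op} $${values.length}`;
  });
  const where = clauses.length > 0 ? ` WHERE ${clauses.join(" AND ")}` : "";
  const text =
    `SELECT * FROM ${escapeIdentifier(schema)}.${escapeIdentifier(table)}${where}`;
  return { text, values };
}
```

The returned `text` and `values` pair is the shape that parameterized-query APIs such as node-postgres's `client.query(text, values)` accept, so user-supplied values reach the database as bound parameters rather than as SQL.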
Error Handling
Retry logic is configurable and uses exponential backoff:
error_handling: {
  max_retries: 3,
  retry_interval: 1000, // Base delay in ms
  fail_on_error: true,
}
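To make the effect of these options concrete, here is a minimal sketch of a retry loop with exponential backoff. It is not the Orchestrator's implementation. The doubling schedule (base delay times 2 to the power of the retry index), the behaviour when `fail_on_error` is `false`, and the names `ErrorHandling`, `backoffDelay`, `sleep`, and `withRetry` are assumptions made for illustration.

```typescript
// Mirrors the three options shown above; the interface name is hypothetical.
interface ErrorHandling {
  max_retries: number;
  retry_interval: number; // base delay in ms
  fail_on_error: boolean;
}

// Assumed schedule: the base delay doubles on every retry (1000, 2000, 4000, ...).
function backoffDelay(retryIndex: number, baseMs: number): number {
  return baseMs * Math.pow(2, retryIndex);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task` once, then retries up to `max_retries` times, waiting longer
// before each retry. `wait` is injectable so the loop can be tested without real delays.
async function withRetry<T>(
  task: () => Promise<T>,
  options: ErrorHandling,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T | undefined> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.max_retries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < options.max_retries) {
        await wait(backoffDelay(attempt, options.retry_interval));
      }
    }
  }
  // Assumed semantics: rethrow when fail_on_error is true, otherwise give up quietly.
  if (options.fail_on_error) throw lastError;
  return undefined;
}
```

Under these assumptions, `max_retries: 3` with `retry_interval: 1000` allows up to four attempts in total, with waits of 1000 ms, 2000 ms, and 4000 ms between them.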
Use Cases
OpenETL is designed for:
- Data Synchronization - Sync CRM data to analytics databases
- Data Migration - Move data between database systems
- API Integration - Extract data from SaaS platforms
- ETL Pipelines - Build scheduled data processing workflows
- Data Warehousing - Load data into analytical data stores
Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│                         Application                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────┐    ┌──────────────┐    ┌─────────────────┐     │
│  │  Vault  │    │ Orchestrator │    │    Pipeline     │     │
│  │         │───▶│              │◀───│  Configuration  │     │
│  └─────────┘    └──────┬───────┘    └─────────────────┘     │
│                        │                                    │
│        ┌───────────────┼───────────────┐                    │
│        │               │               │                    │
│        ▼               ▼               ▼                    │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐              │
│   │ Adapter  │    │ Adapter  │    │ Adapter  │              │
│   │ (Source) │    │ (Target) │    │  (...)   │              │
│   └────┬─────┘    └────┬─────┘    └──────────┘              │
│        │               │                                    │
└────────┼───────────────┼────────────────────────────────────┘
         │               │
         ▼               ▼
    ┌─────────┐     ┌──────────┐
    │ HubSpot │     │PostgreSQL│
    │   API   │     │ Database │
    └─────────┘     └──────────┘
Next Steps
- Getting Started - Installation and first pipeline
- Adapters - Available adapters and configuration
- Pipeline Configuration - Pipeline options and callbacks
- Custom Adapters - Building custom adapters
- Security - SQL injection protection and credential management
- API Reference - Complete TypeScript API reference