Introduction

OpenETL is an open-source ETL (Extract, Transform, Load) framework implemented in TypeScript. This document provides a technical overview of the framework's architecture and capabilities.

What is OpenETL?

OpenETL builds data pipelines that extract data from source systems, apply transformations, and load the results into target systems. Sources and destinations are supported through a modular adapter architecture.

Core Components

Component        Description
Orchestrator     Pipeline execution engine that coordinates adapters and manages data flow
Adapters         Interface implementations for specific data sources (APIs, databases)
Vault            Credential storage with support for API keys, OAuth2, and basic authentication
Connectors       Configuration objects that define how adapters connect to pipelines
Transformations  Data manipulation functions applied during pipeline execution
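
To make the relationship between the Vault and Connectors more concrete, here is a minimal sketch. All type and function names (Credential, VaultSketch, ConnectorSketch, resolveCredential) are hypothetical and chosen for illustration; they are not OpenETL's actual definitions. The idea shown is that a connector names a credential by id instead of embedding the secret.

```typescript
// Hypothetical shapes for illustration only; not OpenETL's actual type definitions.
type Credential =
  | { type: 'api_key'; apiKey: string }
  | { type: 'oauth2'; clientId: string; clientSecret: string; refreshToken: string }
  | { type: 'basic'; username: string; password: string };

// The vault maps a credential id to its stored credential.
type VaultSketch = Record<string, Credential>;

// A connector references a credential by id rather than holding the secret itself.
interface ConnectorSketch {
  id: string;
  adapterId: string;
  credentialId: string;
}

// Resolve a connector's credential, failing loudly when the id is unknown.
function resolveCredential(vault: VaultSketch, connector: ConnectorSketch): Credential {
  const credential = vault[connector.credentialId];
  if (credential === undefined) {
    throw new Error(`No credential '${connector.credentialId}' for connector '${connector.id}'`);
  }
  return credential;
}
```

Modelling the three authentication kinds as a discriminated union lets the compiler check that each kind is handled wherever credentials are consumed.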

Official Adapters

OpenETL includes official adapters for common data sources:

Database Adapters:

  • PostgreSQL - Full SQL support with parameterized queries
  • MySQL - Full SQL support with parameterized queries
  • MongoDB - Query translation with projection support

HTTP/API Adapters:

  • HubSpot - CRM data (contacts, companies, deals)
  • Stripe - Payment data (customers, charges, subscriptions)
  • Xero - Accounting data (contacts, invoices, items)
  • Google Ads - Advertising data with GAQL support

Technical Characteristics

Type Safety

OpenETL provides comprehensive TypeScript definitions for all interfaces:

import { Pipeline, Connector, Vault } from 'openetl';

const pipeline: Pipeline = {
  id: 'typed-pipeline',
  source: { /* TypeScript validates configuration */ },
};

Stateless Architecture

The framework maintains no internal state between pipeline executions. State management for incremental synchronization is handled by the consuming application.
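
Because the framework keeps no state, the consuming application has to remember where the last run stopped. The following is a minimal sketch of one way to do that with a cursor store. CursorStore, InMemoryCursorStore, and syncIncrementally are hypothetical names, not part of OpenETL, and the sketch assumes records carry an ISO-8601 UTC updatedAt timestamp in a uniform format, so that string comparison matches chronological order.

```typescript
// Hypothetical sketch: none of these names come from OpenETL.
interface SourceRecord {
  id: number;
  updatedAt: string; // ISO-8601 UTC, same format for every record (assumption)
}

interface CursorStore {
  get(pipelineId: string): string | undefined;
  set(pipelineId: string, cursor: string): void;
}

// In-memory store for demonstration; a real application would persist this.
class InMemoryCursorStore implements CursorStore {
  private cursors = new Map<string, string>();
  get(pipelineId: string): string | undefined {
    return this.cursors.get(pipelineId);
  }
  set(pipelineId: string, cursor: string): void {
    this.cursors.set(pipelineId, cursor);
  }
}

// Return only records changed since the stored cursor, then advance the cursor.
function syncIncrementally(
  pipelineId: string,
  records: SourceRecord[],
  store: CursorStore,
): SourceRecord[] {
  const since = store.get(pipelineId);
  const changed = records.filter((r) => since === undefined || r.updatedAt > since);
  if (changed.length > 0) {
    const newest = changed.reduce((max, r) => (r.updatedAt > max ? r.updatedAt : max), '');
    store.set(pipelineId, newest);
  }
  return changed;
}
```

In practice the cursor would be written only after the load step succeeds, so that a failed run is retried from the same position.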

Modular Design

Each adapter ships as a separate package, so an application bundles only the adapters it imports:

// Only import what you need
import { postgresql } from '@openetl/postgresql';
import { hubspot } from '@openetl/hubspot';

const adapters = { postgresql, hubspot };

Security Features

  • Credential Isolation - Credentials stored in Vault, never exposed to pipeline logic
  • SQL Injection Protection - Database adapters use parameterized queries
  • Operator Validation - Whitelisted operators prevent injection attacks
  • Identifier Escaping - Schema, table, and column names properly escaped

See Security for detailed security documentation.
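
The three query-related protections listed above can be sketched together as follows. This is an illustration of the general techniques, not OpenETL's implementation: the operator list, the Filter shape, and the function names are assumptions, and the quoting and placeholder syntax follow PostgreSQL conventions.

```typescript
// Illustrative only; OpenETL's adapters may implement these checks differently.
const ALLOWED_OPERATORS = new Set(['=', '!=', '<', '<=', '>', '>=', 'LIKE']);

interface Filter {
  field: string;
  operator: string;
  value: unknown;
}

// PostgreSQL-style identifier quoting: wrap in double quotes and double any embedded quote.
function escapeIdentifier(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Build a parameterized SELECT: values never appear in the SQL text, only $n placeholders.
function buildSelect(table: string, filters: Filter[]): { text: string; values: unknown[] } {
  const values: unknown[] = [];
  const clauses = filters.map((f) => {
    const op = f.operator.toUpperCase();
    if (!ALLOWED_OPERATORS.has(op)) {
      throw new Error(`Operator not allowed: ${f.operator}`);
    }
    values.push(f.value);
    return `${escapeIdentifier(f.field)} ${op} $${values.length}`;
  });
  const where = clauses.length > 0 ? ` WHERE ${clauses.join(' AND ')}` : '';
  return { text: `SELECT * FROM ${escapeIdentifier(table)}${where}`, values };
}
```

User-supplied values travel separately from the SQL text, operators are checked against a fixed list, and identifiers are quoted, so none of the three inputs can alter the structure of the statement.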

Error Handling

Configurable retry logic with exponential backoff:

error_handling: {
  max_retries: 3,
  retry_interval: 1000,  // Base delay in ms
  fail_on_error: true,
}
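
As a rough illustration of how such a configuration could drive retries, the sketch below doubles the delay after each failed attempt. The exact formula OpenETL uses is not stated here, so base * 2^attempt is an assumption, as are the helper names backoffDelay and withRetry and the reading of fail_on_error as "rethrow the last error once retries are exhausted".

```typescript
interface ErrorHandling {
  max_retries: number;
  retry_interval: number; // Base delay in ms
  fail_on_error: boolean;
}

// Assumed formula: the delay doubles with each attempt (attempt 0 waits the base delay).
function backoffDelay(baseMs: number, attempt: number): number {
  return baseMs * 2 ** attempt;
}

// Run a task, retrying up to max_retries times with exponentially growing delays.
// The sleep function is injectable so the logic can be exercised without real waiting.
async function withRetry<T>(
  task: () => Promise<T>,
  cfg: ErrorHandling,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T | undefined> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= cfg.max_retries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < cfg.max_retries) {
        await sleep(backoffDelay(cfg.retry_interval, attempt));
      }
    }
  }
  // Assumed semantics: rethrow when fail_on_error is set, otherwise give up quietly.
  if (cfg.fail_on_error) throw lastError;
  return undefined;
}
```

With the settings shown above, this sketch would wait 1000 ms, 2000 ms, and 4000 ms before the first, second, and third retry.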

Use Cases

OpenETL is designed for:

  • Data Synchronization - Sync CRM data to analytics databases
  • Data Migration - Move data between database systems
  • API Integration - Extract data from SaaS platforms
  • ETL Pipelines - Build scheduled data processing workflows
  • Data Warehousing - Load data into analytical data stores

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                        Application                           │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   ┌─────────┐    ┌──────────────┐    ┌─────────────────┐   │
│   │  Vault  │    │ Orchestrator │    │    Pipeline     │   │
│   │         │───▶│              │◀───│  Configuration  │   │
│   └─────────┘    └──────┬───────┘    └─────────────────┘   │
│                         │                                    │
│         ┌───────────────┼───────────────┐                   │
│         │               │               │                   │
│         ▼               ▼               ▼                   │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐               │
│   │ Adapter  │   │ Adapter  │   │ Adapter  │               │
│   │ (Source) │   │ (Target) │   │  (...)   │               │
│   └────┬─────┘   └────┬─────┘   └──────────┘               │
│        │              │                                      │
└────────┼──────────────┼──────────────────────────────────────┘
         │              │
         ▼              ▼
    ┌─────────┐   ┌──────────┐
    │ HubSpot │   │PostgreSQL│
    │   API   │   │ Database │
    └─────────┘   └──────────┘
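
The data flow in the diagram can be sketched in a few lines. The interfaces below (SourceAdapter, TargetAdapter, PipelineSketch) and the runPipeline function are hypothetical simplifications for illustration and do not reflect OpenETL's real API; the vault is reduced to a plain map from credential id to secret.

```typescript
// Hypothetical interfaces for illustration; not OpenETL's real API.
type Row = Record<string, unknown>;

interface SourceAdapter {
  extract(credential: string): Promise<Row[]>;
}

interface TargetAdapter {
  load(rows: Row[], credential: string): Promise<number>; // resolves to rows written
}

interface PipelineSketch {
  id: string;
  source: { adapter: SourceAdapter; credentialId: string };
  target: { adapter: TargetAdapter; credentialId: string };
  transform?: (row: Row) => Row;
}

// The orchestrator role: resolve credentials, extract, transform, then load.
async function runPipeline(p: PipelineSketch, vault: Record<string, string>): Promise<number> {
  const lookup = (id: string): string => {
    const secret = vault[id];
    if (secret === undefined) throw new Error(`Unknown credential: ${id}`);
    return secret;
  };
  const extracted = await p.source.adapter.extract(lookup(p.source.credentialId));
  const transform = p.transform;
  const transformed = transform ? extracted.map((row) => transform(row)) : extracted;
  return p.target.adapter.load(transformed, lookup(p.target.credentialId));
}
```

Only the orchestrator function touches the vault in this sketch; the adapters receive a resolved credential, and the transformation sees rows only.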

Next Steps