Security
OpenETL implements multiple security layers to protect against common vulnerabilities. This document covers SQL injection protection, credential management, and secure development practices.
Overview
Security features in OpenETL include:
- SQL Injection Protection - Parameterized queries for all database operations
- Operator Validation - Whitelisted SQL operators prevent injection attacks
- Identifier Escaping - Schema, table, and column names properly escaped
- Credential Isolation - Vault-based credential management
- Input Validation - Type checking and sanitization
SQL Injection Protection
Parameterized Queries
All database adapters use parameterized queries to prevent SQL injection. Values are never interpolated directly into SQL strings.
Vulnerable approach (NOT used):
// DANGEROUS - Never do this
const query = `SELECT * FROM users WHERE status = '${userInput}'`;
OpenETL approach:
// Safe - Values passed as parameters
const query = {
text: 'SELECT * FROM "public"."users" WHERE "status" = $1 LIMIT $2 OFFSET $3',
values: [userInput, limit, offset]
};
await pool.query(query.text, query.values);
PostgreSQL Example
// User-provided filter
filters: [
{ field: 'status', operator: '=', value: "active'; DROP TABLE users;--" }
]
// Generated query (safe)
// SELECT "id", "name" FROM "public"."users" WHERE "status" = $1
// Parameters: ["active'; DROP TABLE users;--"]
The malicious input is treated as a literal string value, not as SQL code.
MySQL Example
MySQL adapter uses ? placeholders:
// Generated query
// SELECT `id`, `name` FROM `mydb`.`users` WHERE `status` = ? LIMIT ? OFFSET ?
// Parameters: ['active', 10, 0]
MongoDB Example
MongoDB adapter validates collection names and field names:
// Collection name validation
if (!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(collection)) {
throw new Error('Invalid collection name');
}
// Query built safely using MongoDB driver
const query = { status: userInput }; // Driver handles escaping
Operator Validation
Database adapters validate SQL operators against a whitelist:
const ALLOWED_OPERATORS = new Set([
'=', '!=', '<>', '<', '>', '<=', '>=',
'LIKE', 'ILIKE', 'NOT LIKE', 'NOT ILIKE',
'IN', 'NOT IN', 'IS NULL', 'IS NOT NULL',
'BETWEEN', 'NOT BETWEEN'
]);
function validateOperator(operator: string): string {
const normalized = operator.trim().toUpperCase();
if (!ALLOWED_OPERATORS.has(normalized)) {
throw new Error(`Invalid SQL operator: '${operator}'`);
}
return normalized;
}
Example attack prevented:
// Attempted injection via operator
filters: [
{ field: 'id', operator: '1; DROP TABLE users; --', value: '1' }
]
// Throws: Invalid SQL operator: '1; DROP TABLE users; --'
Identifier Escaping
Schema, table, and column names are escaped to prevent injection via identifiers:
PostgreSQL (Double Quotes)
function escapeIdentifier(identifier: string): string {
if (!identifier || typeof identifier !== 'string') {
throw new Error('Invalid identifier');
}
// Reject control characters
if (/[\x00-\x1f\x7f]/.test(identifier)) {
throw new Error('Invalid identifier: contains control characters');
}
// Escape double quotes by doubling them
return `"${identifier.replace(/"/g, '""')}"`;
}
Example:
config: { schema: 'my"schema', table: 'user data' }
// Escaped: FROM "my""schema"."user data"
MySQL (Backticks)
function escapeIdentifier(identifier: string): string {
if (!identifier || typeof identifier !== 'string') {
throw new Error('Invalid identifier');
}
// Escape backticks by doubling them
return `\`${identifier.replace(/`/g, '``')}\``;
}
Credential Management
Vault Isolation
Credentials are stored in the Vault and referenced by ID in pipelines:
// Vault stores sensitive credentials
const vault: Vault = {
'production-db': {
id: 'production-db',
type: 'basic',
credentials: {
host: 'db.example.com',
database: 'production',
username: process.env.DB_USER!,
password: process.env.DB_PASSWORD!,
},
},
};
// Pipeline references credentials by ID only
const pipeline: Pipeline = {
id: 'sync',
source: {
adapter_id: 'postgresql',
credential_id: 'production-db', // Reference, not credentials
// ...
},
};
Benefits
- Separation of Concerns - Credentials managed separately from pipeline logic
- No Credential Exposure - Pipeline configurations don't contain secrets
- Environment-Based - Use environment variables for actual credentials
- Audit Trail - Credential access tracked by ID
OAuth2 Token Refresh
OAuth2 credentials support automatic token refresh:
const vault: Vault = {
'hubspot': {
id: 'hubspot',
type: 'oauth2',
credentials: {
client_id: process.env.HUBSPOT_CLIENT_ID!,
client_secret: process.env.HUBSPOT_CLIENT_SECRET!,
access_token: currentAccessToken,
refresh_token: process.env.HUBSPOT_REFRESH_TOKEN!,
},
expires_at: '2024-01-15T10:00:00Z', // Token expiration
},
};
When the access token expires, adapters automatically refresh using the refresh token.
Input Validation
Filter Validation
The validatePipeline function validates filter configurations:
import { validatePipeline } from 'openetl';
const validation = validatePipeline(pipeline, adapters, vault);
// Checks performed:
// - Filter field is non-empty
// - Filter operator is non-empty
// - Credential references exist
// - Adapter references exist
Type Checking
OpenETL provides TypeScript definitions for type-safe configurations:
import { Pipeline, Connector, Filter } from 'openetl';
// TypeScript catches type errors at compile time
const filter: Filter = {
field: 'status',
operator: '=',
value: 'active', // Type checked
};
Custom Query Security
When using custom_query endpoints, SQL injection protection is bypassed:
// CAUTION: custom_query executes raw SQL
source: {
adapter_id: 'postgresql',
endpoint_id: 'custom_query',
config: {
custom_query: `
SELECT u.id, u.name
FROM users u
WHERE u.created_at > '2024-01-01'
`,
},
}
Guidelines for custom queries:
- Never interpolate user input into custom queries
- Use custom queries only for complex joins or aggregations
- Validate any dynamic parts before inclusion
- Consider using
table_querywith filters instead
Secure Adapter Development
When building custom adapters, follow these practices:
1. Use Parameterized Queries
// Database adapter implementation
async download({ limit, offset }) {
const query = {
text: `SELECT ${escapedFields} FROM ${escapedTable} LIMIT $1 OFFSET $2`,
values: [limit, offset]
};
return await pool.query(query.text, query.values);
}
2. Validate All Identifiers
function escapeIdentifier(identifier: string): string {
if (!identifier || typeof identifier !== 'string') {
throw new Error('Invalid identifier');
}
if (/[\x00-\x1f\x7f]/.test(identifier)) {
throw new Error('Contains control characters');
}
return `"${identifier.replace(/"/g, '""')}"`;
}
3. Whitelist Operators
const ALLOWED_OPERATORS = new Set(['=', '!=', '<', '>', '<=', '>=', 'LIKE', 'IN']);
function validateOperator(op: string): string {
const normalized = op.trim().toUpperCase();
if (!ALLOWED_OPERATORS.has(normalized)) {
throw new Error(`Invalid operator: ${op}`);
}
return normalized;
}
4. Never Log Credentials
// Bad - exposes credentials
console.log('Connecting with:', auth);
// Good - log without sensitive data
console.log('Connecting to:', auth.credentials.host);
5. Use AdapterError for Security Failures
import { AdapterError } from 'openetl';
// Authentication failure
throw new AdapterError(
'Authentication failed',
'AUTHENTICATION_FAILED',
false // Not retryable
);
Security Checklist
For Pipeline Authors
- [ ] Store credentials in environment variables
- [ ] Use Vault for credential management
- [ ] Avoid
custom_queryunless necessary - [ ] Validate pipeline configuration before execution
- [ ] Use
validatePipeline()for pre-flight checks
For Adapter Developers
- [ ] Use parameterized queries for all database operations
- [ ] Escape all identifiers (tables, columns, schemas)
- [ ] Validate operators against whitelist
- [ ] Never interpolate user input into SQL
- [ ] Never log credential values
- [ ] Use AdapterError for security-related failures
- [ ] Validate collection/table names in NoSQL adapters
Additional Resources
- Custom Adapters: Building secure custom adapters
- Error Handling: Security error handling
- Authentication: Credential management
- PostgreSQL Adapter: Database security implementation