All components that support configuration rows — typically data source and destination connectors — can optionally run their row jobs in parallel. The parallelism setting controls how many row jobs execute concurrently within a single configuration.
Understanding what this setting does — and what it doesn’t — helps you make better decisions about performance and cost.
Parallelism defines the maximum number of row jobs that may run at the same time within a configuration’s execution. For example, if your configuration has 10 rows and you set parallelism to 3, those rows are processed in batches of up to 3.
Parallelism is an upper limit, not a guarantee. The actual number of concurrently running jobs may be lower than your configured value. Jobs that cannot start immediately are placed into a waiting state — this is normal behavior, not an error.
This setting is optional. The default is Parallel jobs: Off, which means rows are processed one at a time.
Example: A configuration has five rows and parallelism set to 2. The rows are processed in three consecutive sets — (2 + 2 + 1) — with the jobs in each set running in parallel.
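The batching in this example can be sketched with a small helper (a hypothetical illustration of the scheduling math, not a Keboola API):

```python
def batch_rows(rows, parallelism):
    """Split row jobs into consecutive sets of at most `parallelism` jobs.

    The jobs in each set run in parallel; the sets run one after another.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return [rows[i:i + parallelism] for i in range(0, len(rows), parallelism)]

# Five rows with parallelism 2 yield three sets: 2 + 2 + 1.
sets = batch_rows(["row1", "row2", "row3", "row4", "row5"], 2)
print([len(s) for s in sets])  # [2, 2, 1]
```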
Even with a high parallelism setting, multiple system-level constraints determine how many jobs actually run simultaneously:
| Constraint | Effect |
|---|---|
| Storage job capacity | Storage jobs have their own parallel limit. Table import and export operations contribute to this count. |
| Resource locks | If multiple jobs write to the same table, only one proceeds at a time. Others wait until the lock is released. |
| Worker availability | Backend workers are a shared infrastructure resource. Under load, a job may briefly wait for a worker to become free. |
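The resource-lock constraint can be illustrated with a plain `threading.Lock`: jobs writing to the same table serialize, no matter how many are launched at once. This is a simplified sketch, and the table name is hypothetical:

```python
import threading

# One lock per destination table; jobs writing to the same table serialize.
table_locks = {"out.c-main.orders": threading.Lock()}
stats = {"current": 0, "peak": 0}
stats_lock = threading.Lock()

def write_to_table(table_name):
    with table_locks[table_name]:  # only one writer per table proceeds
        with stats_lock:
            stats["current"] += 1
            stats["peak"] = max(stats["peak"], stats["current"])
        # ... perform the actual table write here ...
        with stats_lock:
            stats["current"] -= 1

# Four row jobs all targeting the same table: writes happen one at a time.
threads = [threading.Thread(target=write_to_table, args=("out.c-main.orders",))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(stats["peak"])  # 1
```

Even though four threads start "in parallel," the per-table lock caps concurrent writers at one, which is exactly why a high parallelism setting may not speed up jobs that share a destination table.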
A practical way to reason about it:
Actual concurrency = min(component parallelism, storage capacity, resource availability)
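That rule of thumb is easy to express directly (the parameter names here are illustrative, not platform API fields):

```python
def actual_concurrency(component_parallelism, storage_capacity, resource_availability):
    """Effective concurrency is bounded by the tightest constraint."""
    return min(component_parallelism, storage_capacity, resource_availability)

# A parallelism of 10 cannot be realized if storage allows only 6 concurrent jobs.
print(actual_concurrency(10, 6, 8))  # 6
```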
This is not a flaw — it is how Keboola ensures stability and data consistency across concurrent workloads.
Every job passes through predictable states:
waiting → processing → success / error
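A minimal sketch of those transitions as a lookup table (state names taken from the diagram above; this is an illustration, not the platform's actual job API):

```python
# Allowed transitions between job states.
TRANSITIONS = {
    "waiting": {"processing"},
    "processing": {"success", "error"},
    "success": set(),  # terminal
    "error": set(),    # terminal
}

def advance(state, next_state):
    """Move a job to its next state, rejecting illegal transitions."""
    if next_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {next_state}")
    return next_state

state = "waiting"
state = advance(state, "processing")
state = advance(state, "success")
print(state)  # success
```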
How billing relates to job state:
Important — container runtime billing: Some components run inside a container that orchestrates multiple child jobs. In these cases, the parent container may continue running and accumulating runtime costs even while individual child jobs are in the waiting state. Setting very high parallelism in a container-based component does not pause the container while jobs queue — the container remains active throughout.
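As a back-of-the-envelope illustration of why queued child jobs still cost container runtime: the parent container bills for the whole wall-clock duration, including time children spend waiting. The formula below is a deliberate simplification that assumes equal-length child jobs:

```python
import math

def container_runtime_minutes(num_jobs, minutes_per_job, effective_parallelism):
    """Wall-clock runtime of the parent container: child jobs run in
    'waves' of at most `effective_parallelism`, and the container stays
    active (and billed) through every wave."""
    waves = math.ceil(num_jobs / effective_parallelism)
    return waves * minutes_per_job

# 100 one-minute child jobs with an effective concurrency of 10:
# the container runs (and bills) for about 10 minutes, regardless of
# how high the configured parallelism is set.
print(container_runtime_minutes(100, 1, 10))  # 10
```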
Consider a database extractor with 100 configuration rows and parallelism set to 10.
Result:
- At most 10 row jobs run at any moment; the remaining 90 enter the waiting state.
- As each running job completes, the next waiting job starts. The backlog clears gradually, which is expected behavior; the parallelism setting of 10 still takes effect as capacity opens up.
If you expected exactly 10 simultaneous extractions, you may see slower-than-anticipated progress during busy periods in your project. The solution is usually not a higher parallelism value but rather an awareness that system capacity is the bottleneck.
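The drain pattern in this scenario can be simulated with a simple loop (illustrative only; real job durations vary, so waves overlap in practice):

```python
def waves_to_finish(total_jobs, effective_parallelism):
    """Count how many 'waves' of jobs run before the backlog clears,
    assuming each wave finishes before the next waiting jobs start."""
    waves = 0
    waiting = total_jobs
    while waiting > 0:
        running = min(effective_parallelism, waiting)
        waiting -= running
        waves += 1
    return waves

# 100 rows clear in 10 waves at an effective concurrency of 10;
# if system load caps concurrency at 5, the same backlog takes 20 waves.
print(waves_to_finish(100, 10), waves_to_finish(100, 5))  # 10 20
```

Note that the second result depends on the system's effective capacity, not on the configured parallelism of 10, which is why raising the setting alone does not always shorten the run.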
Parallelism delivers real gains when:
- Row jobs are independent and write to different destination tables, so resource locks do not serialize them.
- Storage job capacity and backend workers have headroom.
- The configuration has many rows to process.
In these cases, higher parallelism reduces total execution time proportionally to available system capacity.
Parallelism has little or no effect when:
- Multiple row jobs write to the same table, so resource locks force them to run one at a time.
- Storage job capacity or worker availability is already the bottleneck.
- The configuration has only one or a few rows.