# Pipelines & Jobs

Create and manage Spark Declarative Pipelines and orchestration jobs.
## Example prompts

- "Create a streaming pipeline that ingests from my landing zone into a silver table"
- "Show me the last 5 runs for my nightly-etl job"
- "Check the event log for my pipeline to see why it failed"
## manage_job_runs

Description: Manage job runs: `run_now`, `get`, `get_output`, `cancel`, `list`, `wait`.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| action | str | Yes | — |
| job_id | int | No | — |
| run_id | int | No | — |
| idempotency_token | str | No | — |
| jar_params | List[str] | No | — |
| notebook_params | Dict[str, str] | No | — |
| python_params | List[str] | No | — |
| spark_submit_params | List[str] | No | — |
| python_named_params | Dict[str, str] | No | — |
| pipeline_params | Dict[str, Any] | No | — |
| sql_params | Dict[str, str] | No | — |
| dbt_commands | List[str] | No | — |
| queue | Dict[str, Any] | No | — |
| active_only | bool | No | — |
| completed_only | bool | No | — |
| limit | int | No | — |
| offset | int | No | — |
| start_time_from | int | No | — |
| start_time_to | int | No | — |
| timeout | int | No | — |
| poll_interval | int | No | — |
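How these parameters are passed depends on your MCP client. As an illustration only, a small Python helper (hypothetical, not part of the tool) can assemble the argument payload and catch misspelled parameter names before the call is made:

```python
def manage_job_runs_payload(action, **params):
    """Build an argument payload for the manage_job_runs tool.

    Only `action` is required; the remaining keys mirror the
    parameter table above (job_id, run_id, notebook_params, ...).
    This helper is a sketch, not part of the tool itself.
    """
    allowed = {
        "job_id", "run_id", "idempotency_token", "jar_params",
        "notebook_params", "python_params", "spark_submit_params",
        "python_named_params", "pipeline_params", "sql_params",
        "dbt_commands", "queue", "active_only", "completed_only",
        "limit", "offset", "start_time_from", "start_time_to",
        "timeout", "poll_interval",
    }
    unknown = set(params) - allowed
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Drop None values so the tool only sees what was actually set.
    return {"action": action, **{k: v for k, v in params.items() if v is not None}}

# Trigger a run of job 123, passing notebook widget values:
payload = manage_job_runs_payload(
    "run_now", job_id=123, notebook_params={"date": "2024-01-01"}
)
```

The same helper covers the listing actions, e.g. `manage_job_runs_payload("list", job_id=123, limit=5, completed_only=True)` for the "last 5 runs" prompt above.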
## manage_jobs

Description: Manage Databricks jobs: `create`, `get`, `list`, `find_by_name`, `update`, `delete`.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| action | str | Yes | — |
| job_id | int | No | — |
| name | str | No | — |
| tasks | List[Dict[str, Any]] | No | — |
| job_clusters | List[Dict[str, Any]] | No | — |
| environments | List[Dict[str, Any]] | No | — |
| tags | Dict[str, str] | No | — |
| timeout_seconds | int | No | — |
| max_concurrent_runs | int | No | — |
| email_notifications | Dict[str, Any] | No | — |
| webhook_notifications | Dict[str, Any] | No | — |
| notification_settings | Dict[str, Any] | No | — |
| schedule | Dict[str, Any] | No | — |
| queue | Dict[str, Any] | No | — |
| run_as | Dict[str, Any] | No | — |
| git_source | Dict[str, Any] | No | — |
| parameters | List[Dict[str, Any]] | No | — |
| health | Dict[str, Any] | No | — |
| deployment | Dict[str, Any] | No | — |
| limit | int | No | — |
| expand_tasks | bool | No | — |
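The dict-typed parameters (`tasks`, `schedule`, etc.) follow the Databricks Jobs API structures. As a sketch, assuming a hypothetical helper that assembles a `create` call with one notebook task and a nightly cron schedule:

```python
def manage_jobs_create(name, tasks, **options):
    """Assemble arguments for manage_jobs with action='create'.

    `tasks` uses the Databricks Jobs API task structure; optional
    keys mirror the parameter table above. Illustrative only.
    """
    payload = {"action": "create", "name": name, "tasks": tasks}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload

# One notebook task scheduled nightly at 02:00 UTC:
job = manage_jobs_create(
    "nightly-etl",
    tasks=[{
        "task_key": "ingest",
        "notebook_task": {"notebook_path": "/Repos/etl/ingest"},
    }],
    schedule={"quartz_cron_expression": "0 0 2 * * ?", "timezone_id": "UTC"},
    max_concurrent_runs=1,
)
```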
## manage_pipeline

Description: Manage Spark Declarative Pipelines: `create`, `update`, `get`, `delete`, `find`.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| action | str | Yes | — |
| name | Optional[str] | No | — |
| root_path | Optional[str] | No | — |
| catalog | Optional[str] | No | — |
| schema | Optional[str] | No | — |
| workspace_file_paths | Optional[List[str]] | No | — |
| extra_settings | Optional[Dict[str, Any]] | No | — |
| start_run | bool | No | — |
| wait_for_completion | bool | No | — |
| full_refresh | bool | No | — |
| timeout | int | No | — |
| pipeline_id | Optional[str] | No | — |
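For the "landing zone into a silver table" prompt above, a `create` call would combine a target catalog and schema with the pipeline's source files. A minimal sketch, assuming hypothetical names for the pipeline and workspace path:

```python
def manage_pipeline_create(name, catalog, schema, **options):
    """Assemble arguments for manage_pipeline with action='create'.

    start_run / wait_for_completion control whether the tool also
    triggers an initial update and blocks until it finishes.
    Illustrative only; names and paths below are assumptions.
    """
    payload = {
        "action": "create",
        "name": name,
        "catalog": catalog,
        "schema": schema,
    }
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload

pipeline = manage_pipeline_create(
    "landing-to-silver",
    catalog="main",
    schema="silver",
    workspace_file_paths=["/Workspace/pipelines/ingest.py"],
    start_run=True,
    wait_for_completion=True,
)
```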
## manage_pipeline_run

Description: Manage pipeline runs: start, monitor, stop, get events.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| action | str | Yes | — |
| pipeline_id | str | Yes | — |
| refresh_selection | Optional[List[str]] | No | — |
| full_refresh | bool | No | — |
| full_refresh_selection | Optional[List[str]] | No | — |
| validate_only | bool | No | — |
| wait | bool | No | — |
| timeout | int | No | — |
| update_id | Optional[str] | No | — |
| include_config | bool | No | — |
| full_error_details | bool | No | — |
| max_results | int | No | — |
| event_log_level | str | No | — |
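When `wait` is set, the tool blocks until the update finishes or `timeout` elapses. The equivalent client-side logic, sketched with a stubbed status function standing in for repeated `get`-style calls (the state names below are assumptions about what the backend reports):

```python
import time

def wait_for_update(get_update, timeout=600, poll_interval=5):
    """Poll a pipeline update until it reaches a terminal state.

    `get_update` stands in for fetching the update's current state
    string via manage_pipeline_run (hypothetical wiring). Raises
    TimeoutError if no terminal state is seen within `timeout` seconds.
    """
    terminal = {"COMPLETED", "FAILED", "CANCELED"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_update()
        if state in terminal:
            return state
        time.sleep(poll_interval)
    raise TimeoutError("pipeline update did not finish in time")

# Simulate an update that completes on the third poll:
states = iter(["WAITING_FOR_RESOURCES", "RUNNING", "COMPLETED"])
result = wait_for_update(lambda: next(states), timeout=30, poll_interval=0)
```

For the "why did it fail" prompt above, the events action combined with `full_error_details` and a restrictive `event_log_level` (e.g. errors only) narrows the log to the failure.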