feat: Lakeflow Jobs Plugin#265

Open
atilafassina wants to merge 2 commits into main from ekniazev/jobs-plugin-core

Conversation

Contributor

@atilafassina atilafassina commented Apr 10, 2026

Important

To maintain contribution history, this PR should be rebased when merged
(don't merge it if it has more than 2 commits).

Summary

Resource-scoped jobs plugin following the files plugin pattern. Jobs are configured as named resources discovered from environment variables at startup.

Design

  • Resource-scoped: Only configured jobs are accessible — this is not an open SDK wrapper
  • Env-var discovery: Jobs are discovered from DATABRICKS_JOB_<KEY> env vars (e.g. DATABRICKS_JOB_ETL=123)
  • Single-job shorthand: DATABRICKS_JOB_ID maps to the "default" key
  • Manifest declares job resources with CAN_MANAGE_RUN permission
  • Works with databricks apps init --features jobs
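The discovery rules in the list above can be sketched roughly as follows. This is only an illustration, not the plugin's actual `discoverJobs()`: the function name `discoverJobsFromEnv`, the lowercasing of the key suffix, and the positive-integer check are assumptions inferred from the `DATABRICKS_JOB_ETL=123` example.

```typescript
// Hypothetical sketch of env-var discovery; the real discoverJobs() may differ.
type Env = Record<string, string | undefined>;

const JOB_PREFIX = "DATABRICKS_JOB_";

function discoverJobsFromEnv(env: Env): Map<string, number> {
  const jobs = new Map<string, number>();

  for (const [name, value] of Object.entries(env)) {
    // Ignore unrelated variables and unset or empty values.
    if (!name.startsWith(JOB_PREFIX) || !value) continue;

    const suffix = name.slice(JOB_PREFIX.length);
    if (suffix === "") continue;

    // DATABRICKS_JOB_ID is the single-job shorthand and maps to "default".
    // Assumption: every other suffix is lowercased (ETL -> "etl").
    const key = suffix === "ID" ? "default" : suffix.toLowerCase();

    // Assumption: job IDs are positive integers.
    const jobId = Number(value);
    if (!Number.isInteger(jobId) || jobId <= 0) {
      throw new Error(`${name} must be a positive integer job ID, got "${value}"`);
    }

    jobs.set(key, jobId);
  }

  return jobs;
}
```

In an app, the argument would typically be `process.env`; taking the environment as a parameter keeps the sketch easy to test.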

API

// Trigger a configured job
const { run_id } = await appkit.jobs("etl").runNow();

// Trigger and wait for completion
const run = await appkit.jobs("etl").runNowAndWait();

// OBO access
await appkit.jobs("etl").asUser(req).runNow();

// List recent runs
const runs = await appkit.jobs("etl").listRuns({ limit: 10 });

// Single-job shorthand
await appkit.jobs("default").runNow();

Files changed

  • plugins/jobs/manifest.json: declares job resource with CAN_MANAGE_RUN permission
  • plugins/jobs/types.ts: JobAPI, JobHandle, JobsExport, IJobsConfig types
  • plugins/jobs/plugin.ts: JobsPlugin with discoverJobs(), getResourceRequirements(), resource-scoped createJobAPI()
  • plugins/jobs/index.ts: barrel exports
  • connectors/jobs/client.ts: listRuns now respects limit parameter
  • plugins/jobs/tests/plugin.test.ts: 32 tests covering discovery, resource requirements, exports, OBO, multi-job, and auto-fill

Documentation safety checklist

  • Examples use least-privilege permissions
  • Sensitive values are obfuscated
  • No insecure patterns introduced

Reopened from #223 — moved head branch from fork to upstream to fix CI secrets access.

demo

lakeflow-jobs.mp4

Jobs are configured as named resources (DATABRICKS_JOB_<KEY> env vars)
and discovered at startup, following the files plugin pattern.

API is scoped to configured jobs:
  appkit.jobs('etl').runNow()
  appkit.jobs('etl').runNowAndWait()
  appkit.jobs('etl').lastRun()
  appkit.jobs('etl').listRuns()
  appkit.jobs('etl').asUser(req).runNow()

Single-job shorthand via DATABRICKS_JOB_ID env var.
Supports OBO access via asUser(req).

Co-authored-by: Isaac
Signed-off-by: Evgenii Kniazev <evgenii.kniazev@databricks.com>
Copilot AI review requested due to automatic review settings April 10, 2026 20:40
Contributor

Copilot AI left a comment

Pull request overview

Adds a new resource-scoped “jobs” plugin to @databricks/appkit, following the existing “files” plugin pattern: jobs are discovered from environment variables at startup and exposed via a keyed accessor API, with HTTP routes for triggering and monitoring runs.

Changes:

  • Introduces plugins/jobs (manifest, defaults, params mapping, types, plugin implementation, and extensive tests).
  • Adds connectors/jobs with telemetry + cancellation support and updates exports/docs to surface the new plugin and types.
  • Updates templates and generated API docs/sidebars to include the new Jobs plugin.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| template/appkit.plugins.json | Adds “jobs” to plugin template metadata + resource description. |
| pnpm-lock.yaml | Locks new dependency (zod@4.3.6) and related lockfile updates. |
| packages/appkit/src/plugins/jobs/types.ts | Public types for Jobs plugin API/config (JobAPI/JobHandle/IJobsConfig). |
| packages/appkit/src/plugins/jobs/plugin.ts | Core JobsPlugin: env discovery, dynamic resource requirements, API methods, HTTP routes. |
| packages/appkit/src/plugins/jobs/params.ts | TaskType-based param mapping into SDK request fields. |
| packages/appkit/src/plugins/jobs/defaults.ts | Execution defaults for read/write/stream operations. |
| packages/appkit/src/plugins/jobs/manifest.json | Plugin manifest + config schema + baseline resource definition. |
| packages/appkit/src/plugins/jobs/index.ts | Barrel exports for Jobs plugin/types. |
| packages/appkit/src/plugins/jobs/tests/plugin.test.ts | Comprehensive unit tests for discovery, API, routes, and validation. |
| packages/appkit/src/plugins/index.ts | Exposes jobs from the plugins barrel. |
| packages/appkit/src/index.ts | Exposes jobs and related public types/configs from package root. |
| packages/appkit/src/connectors/jobs/client.ts | JobsConnector SDK wrapper with telemetry instrumentation + limit handling. |
| packages/appkit/src/connectors/jobs/types.ts | Connector config type (timeout/telemetry). |
| packages/appkit/src/connectors/jobs/index.ts | Connector barrel exports. |
| packages/appkit/src/connectors/index.ts | Exposes jobs connector from connectors barrel. |
| packages/appkit/package.json | Adds zod dependency for runtime param schemas + JSON schema generation. |
| docs/docs/api/appkit/typedoc-sidebar.ts | Adds new Jobs docs entries to sidebar. |
| docs/docs/api/appkit/*.md | Adds generated API docs pages for Jobs plugin types. |
| docs/docs/api/appkit/index.md | Adds Jobs-related types to API index list. |
| docs/docs/api/appkit/Interface.BasePluginConfig.md | Updates “Extended by” list to include IJobsConfig. |

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment thread packages/appkit/src/plugins/jobs/plugin.ts Outdated
Comment thread packages/appkit/src/plugins/jobs/plugin.ts Outdated
Comment thread packages/appkit/src/plugins/jobs/plugin.ts Outdated
Comment thread packages/appkit/src/plugins/jobs/plugin.ts
Comment thread packages/appkit/src/plugins/jobs/plugin.ts Outdated
Comment thread packages/appkit/src/plugins/jobs/types.ts
Comment thread packages/appkit/src/connectors/jobs/client.ts
@atilafassina changed the title from "feat: add resource-scoped jobs plugin for Databricks Lakeflow Jobs" to "feat: Lakeflow Jobs Plugin" on Apr 10, 2026
Member

pkosiec commented Apr 13, 2026

How it is different from @keugenek PR here: #221?

If you based on top of Evgenii's work, maybe it's worth to recognize his contribution? What I'm thinking is either:

  • have 2 commits on this PR: Evgenii's base one and yours (with all changes, PR review fixes etc.), and temporarily enable the "Rebase & Merge" option to have 2 commits on main
  • or, merge Evgenii's PR as is, right before this PR (so we'd need to wait for this PR to be approved first)

WDYT?

Contributor Author

atilafassina commented Apr 14, 2026

@pkosiec

> How it is different from @keugenek PR here: #221?
> If you based on top of Evgenii's work, maybe it's worth to recognize his contribution? What I'm thinking is either:
>
>   • have 2 commits on this PR: Evgenii's base one and yours (with all changes, PR review fixes etc.), and temporarily enable the "Rebase & Merge" option to have 2 commits on main
>   • or, merge Evgenii's PR as is, right before this PR (so we'd need to wait for this PR to be approved first)
>
> WDYT?

I got stuck trying to update his PR because it was opened from a fork; it's noted in the PR description:

Reopened from #223 — moved head branch from fork to upstream to fix CI secrets access.

So, I couldn't get a clean CI run to merge it, and if I merged to main, it would trigger a release.
You'll notice that the very first commit in the history is his; that's the work I iterated on top of.

It's also marked as a comment on the original PR

#223 (comment)

Member

pkosiec commented Apr 14, 2026

> So, I couldn't get a clean CI to merge it, and if I merged to main, it would trigger a release.

It wouldn't trigger a release currently (finalizing releases is manual) so there shouldn't be a problem with that 👍 but I'd wait until your PR (on top of this one) is ready to merge, and I'd merge both one after another. What do you think?

…for jobs plugin

Signed-off-by: Atila Fassina <atila@fassina.eu>
@atilafassina force-pushed the ekniazev/jobs-plugin-core branch from 493146a to 791b146 on April 14, 2026 15:17
@atilafassina requested a review from pkosiec on April 14, 2026 15:19
@atilafassina force-pushed the ekniazev/jobs-plugin-core branch 4 times, most recently from 040f7bd to d8133b9, on April 14, 2026 19:10
Collaborator

@MarioCadenas MarioCadenas left a comment

I added some comments, but I'm actually wondering if this should just be a "jobs" plugin or if we should extend this and make it support both jobs and pipelines tbh 😅

}

/**
* @internal
Collaborator

this shouldn't be internal no?

*/
export const jobs = toPlugin(JobsPlugin);

export { JobsPlugin };
Collaborator

why do we need to export this? tests?

method: "post",
path: "/:jobKey/run",
handler: async (req: express.Request, res: express.Response) => {
const { jobKey } = this._resolveJob(req, res);
Collaborator

can we maybe simplify this endpoint a little bit? it seems too long no?

return result.ok ? result : errorResult(result.status);
},

async *runAndWait(
Collaborator

can runAndWait be simplified? maybe we can extract some parts?

Comment on lines +146 to +151
const client = getWorkspaceClient();
if (!client) {
throw new InitializationError(
"Jobs plugin requires a configured workspace client",
);
}
Collaborator

do we need this check? I don't think it's needed, at least no other plugin is doing this check?

  "vite": "npm:rolldown-vite@7.1.14",
- "ws": "8.18.3"
+ "ws": "8.18.3",
+ "zod": "^4.3.6"
Collaborator

pin the version to 4.3.6 please 😄

@MarioCadenas
Collaborator

a few more things I noticed going through the core package:


1. listRuns cache key probably never hits cache

in plugin.ts, listRuns uses options ?? {} in the cache key:

self._readSettings(["jobs:listRuns", jobKey, options ?? {}])

when options is undefined (which is common), this creates a new {} every time — so if the cache compares by reference, every call is a miss and the 60s TTL does nothing. the other read methods get this right with primitives (["jobs:getRun", jobKey, runId]). can we normalize this to something like ["jobs:listRuns", jobKey, options?.limit ?? "default"]?
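A minimal sketch of the normalization suggested above, assuming `limit` is the only option that affects the result. `ListRunsOptions` and `listRunsCacheKey` are hypothetical names; the plugin's actual option type and cache helper are not shown in this thread.

```typescript
// Hypothetical option shape; only `limit` is assumed to influence the result.
interface ListRunsOptions {
  limit?: number;
}

type CacheKeyPart = string | number;

// Build the key from primitives only, so two calls with equivalent options
// produce equal keys whether the cache compares parts by value or by reference.
function listRunsCacheKey(
  jobKey: string,
  options?: ListRunsOptions,
): CacheKeyPart[] {
  return ["jobs:listRuns", jobKey, options?.limit ?? "default"];
}
```

With this shape, `listRunsCacheKey("etl")` and `listRunsCacheKey("etl", {})` yield the same parts, whereas `options ?? {}` would allocate a fresh object on each call.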


2. mapParams silently coerces non-primitives to garbage

in params.ts, the notebook/python_wheel/sql cases do String(v) on every value:

notebook_params: Object.fromEntries(
  Object.entries(params).map(([k, v]) => [k, String(v)]),
),

if someone passes an object or array through (e.g. with a loose zod schema like z.record(z.unknown())), String(v) produces "[object Object]" or "1,2,3" — the job would run with garbage params and no error. can we at least add a guard that rejects non-primitive values here, or coerce more defensively?
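One possible shape for such a guard, as a sketch rather than the actual `mapParams` code: `toStringParams` is a hypothetical helper, and rejecting `null`, `undefined`, and non-finite numbers is an assumption about what counts as an acceptable value.

```typescript
type Primitive = string | number | boolean;

// Assumption: only strings, finite numbers, and booleans are valid job params.
function isPrimitive(value: unknown): value is Primitive {
  return (
    typeof value === "string" ||
    typeof value === "boolean" ||
    (typeof value === "number" && Number.isFinite(value))
  );
}

// Hypothetical replacement for the bare String(v) mapping: fail loudly
// instead of sending "[object Object]" or "1,2,3" to the job.
function toStringParams(
  params: Record<string, unknown>,
): Record<string, string> {
  const out: Record<string, string> = {};

  for (const [key, value] of Object.entries(params)) {
    if (!isPrimitive(value)) {
      const kind =
        value === null ? "null" : Array.isArray(value) ? "array" : typeof value;
      throw new TypeError(
        `Job parameter "${key}" must be a string, finite number, or boolean, got ${kind}`,
      );
    }
    out[key] = String(value);
  }

  return out;
}
```

Throwing keeps the failure visible at the call site; a caller that genuinely needs structured values could serialize them explicitly before passing them in.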


3. polling in runAndWait — no backoff or jitter

I know I already asked if we can simplify runAndWait, but specifically the poll interval is fixed at 5s forever. with the default 10min timeout that's 120 API calls per waiter, and if you have multiple SSE clients polling the same run they all hit the Jobs API at the same cadence with no jitter. should we add some basic exponential backoff (e.g. 2s → 5s → 10s → cap at 15s) and a small random jitter so concurrent pollers don't align?
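The schedule proposed above (2s → 5s → 10s → cap at 15s) could be sketched like this. The name `nextPollDelayMs`, the ±20% jitter ratio, and the injectable `random` parameter are assumptions for illustration, not part of the plugin.

```typescript
// Stepped schedule from the comment above; the last entry acts as the cap.
const POLL_SCHEDULE_MS = [2000, 5000, 10000, 15000];

// Assumption: +/-20% jitter is enough to keep concurrent pollers from aligning.
const JITTER_RATIO = 0.2;

// Returns the delay before poll number `attempt` (0-based).
// `random` is injectable so the function can be tested deterministically.
function nextPollDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  const index = Math.min(Math.max(attempt, 0), POLL_SCHEDULE_MS.length - 1);
  const base = POLL_SCHEDULE_MS[index] ?? 15000;

  // random() is in [0, 1), so the jitter falls in [-20%, +20%) of the base.
  const jitter = (random() * 2 - 1) * JITTER_RATIO * base;
  return Math.round(base + jitter);
}
```

A polling loop would increment `attempt` after each status check and wait `nextPollDelayMs(attempt)` before the next one, stopping once the overall wait timeout has elapsed.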

Member

@pkosiec pkosiec left a comment

Just a few comments 👍 Nice work!

Member

Here's an agentic review:

Code Review: Jobs Plugin for Databricks Lakeflow Jobs

Scope: ekniazev/jobs-plugin-core branch, 2 commits, ~39 files changed
Intent: Add a new resource-scoped Jobs plugin for Databricks Lakeflow Jobs with connector, plugin, HTTP routes, SSE streaming, param validation, OBO support, frontend page, docs, and tests. Also fixes StreamManager to abort generators on client disconnect.

P1 -- High

| # | File | Issue | Confidence |
| --- | --- | --- | --- |
| 1 | docs/docs/plugins/jobs.md:41 | Docs/code mismatch: per-job field name. Docs table says timeout for per-job config, but the TypeScript field in types.ts:69 is waitTimeout. Users following the docs will set the wrong field. | 0.95 |
| 2 | jobs.route.tsx:95-100 | SSE stream parsing doesn't handle split chunks. decoder.decode(value, { stream: true }) can split a data: line across two chunks. The code splits on \n and checks line.startsWith("data: "), so a line split mid-chunk would be silently dropped or corrupted. Should buffer incomplete lines between reads. | 0.85 |
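
For item 2, buffering incomplete lines between reads might look like the sketch below. `SseLineBuffer` is a hypothetical helper, not code from `jobs.route.tsx`, and it only handles single-line `data: ` fields as described in the issue.

```typescript
// Hypothetical helper: accumulates decoded text and only emits complete lines.
class SseLineBuffer {
  private pending = "";

  // Accepts the next decoded chunk and returns the payloads of every
  // complete "data: " line seen so far; a trailing partial line is kept
  // in the buffer until a later chunk completes it.
  push(chunk: string): string[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");

    // The last element is "" when the chunk ended on a newline,
    // otherwise it is an incomplete line that must wait for more data.
    this.pending = lines.pop() ?? "";

    return lines
      .map((line) => (line.endsWith("\r") ? line.slice(0, -1) : line))
      .filter((line) => line.startsWith("data: "))
      .map((line) => line.slice("data: ".length));
  }
}
```

In the read loop, each result of `decoder.decode(value, { stream: true })` would be passed to `push()`, and only the returned payloads would be parsed.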

P2 -- Moderate

| # | File | Issue | Confidence |
| --- | --- | --- | --- |
| 3 | jobs.route.tsx:255-256 | React key collision on stream log. key={line} uses the line content as key. Duplicate SSE messages (heartbeats, repeated status like "RUNNING") will produce duplicate keys, causing React reconciliation issues and potential UI glitches. Use index-based key or prepend a counter. | 0.90 |
| 4 | plugin.ts:229-278 | Duplicated param validation block. runNow (lines 229-245) and runAndWait (lines 268-283) contain identical validation + mapping logic. If validation rules change, both must be updated. Extract a shared _validateAndMapParams(jobKey, params) method. | 0.85 |
| 5 | connectors/jobs/client.ts:176 | Semantic mismatch: ExecutionError.statementFailed. Used for generic Jobs API errors, but the name implies SQL statement failures. Other connectors (sql-warehouse) use it for actual SQL errors. While the error chain works technically, it's misleading for debugging/logging. | 0.75 |

P3 -- Low

| # | File | Issue | Confidence |
| --- | --- | --- | --- |
| 6 | plugin.ts:717 | Unnecessary wrapper function. ((jobKey: string) => resolveJob(jobKey)) as JobsExport can be simplified to resolveJob as JobsExport. | 0.90 |
| 7 | jobs.route.tsx:23-24 | TERMINATED mapped to green unconditionally. stateColor("TERMINATED") returns green, but TERMINATED only means the run ended -- it could have failed. The result_state column handles this separately, but the green lifecycle indicator is misleading when paired with a red result. | 0.70 |
| 8 | package.json (appkit) | Zod caret range ^4.3.6. For a published SDK, pinned versions are safer to avoid unexpected breaks from minor Zod releases. The lockfile shows two Zod versions (4.1.13 + 4.3.6) coexisting. | 0.65 |
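
For item 7, one way to avoid the misleading green is to derive the color from both states. This is a sketch under assumptions: `runStatusColor` and the color names are hypothetical, and the playground's real `stateColor` signature is not shown in this thread. The state names are the Jobs API `life_cycle_state` and `result_state` values.

```typescript
// Illustrative palette; the playground's real color tokens may differ.
type StatusColor = "green" | "red" | "yellow" | "gray";

// Lifecycle states in which the run has not finished yet.
const IN_PROGRESS = new Set(["QUEUED", "PENDING", "RUNNING", "TERMINATING"]);

function runStatusColor(
  lifeCycleState: string,
  resultState?: string,
): StatusColor {
  if (lifeCycleState === "TERMINATED") {
    // TERMINATED only means the run ended, so the result decides the color.
    if (resultState === undefined) return "gray";
    return resultState === "SUCCESS" ? "green" : "red";
  }
  if (lifeCycleState === "INTERNAL_ERROR") return "red";
  if (IN_PROGRESS.has(lifeCycleState)) return "yellow";

  // Anything else (for example SKIPPED) is shown as neutral.
  return "gray";
}
```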

Coverage

  • Tests: Comprehensive test suite (1712 lines) covering discovery, resource requirements, exports, parameter validation, interceptors, polling, error handling, OBO, route handlers, XSS sanitization.
  • StreamManager fix: Includes a targeted test for the new abort-on-disconnect behavior.
  • Testing gaps: No integration test for the full SSE streaming path (triggerRun -> poll -> client disconnect -> abort). Frontend component has no tests (acceptable for dev-playground).

Verdict: Ready with fixes. The docs/code mismatch (#1) and SSE parsing (#2) should be addressed before merge. Items #3-5 are recommended. Items #6-8 are at your discretion.

Member

The reason I asked for an e2e demo is to see what the whole flow looks like. I checked, and IMO we should adjust the description: shorten it and don't mention env vars:

Image

What do you think?

BTW, please try deploying such an app to see if the DAB config is correct etc. 👍

5 participants