16 changes: 13 additions & 3 deletions .github/workflows/release.yml
@@ -93,21 +93,28 @@ jobs:
raise SystemExit(f"unexpected import name: {src_py_lib.__name__}")
PY

- name: Build wheel
- name: Build distributions
id: build
run: |
dist_dir="build/release/dist"
rm -rf build/release
mkdir -p "${dist_dir}"
shopt -s nullglob

uv build --wheel --out-dir "${dist_dir}" --no-create-gitignore
uv build --wheel --sdist --out-dir "${dist_dir}" --no-create-gitignore
project_wheels=("${dist_dir}"/*.whl)
if [[ "${#project_wheels[@]}" -ne 1 ]]; then
echo "::error title=Unexpected wheel count::Expected one project wheel, found ${#project_wheels[@]}."
exit 1
fi
source_distributions=("${dist_dir}"/*.tar.gz)
if [[ "${#source_distributions[@]}" -ne 1 ]]; then
echo "::error title=Unexpected source distribution count::Expected one source distribution, found ${#source_distributions[@]}."
exit 1
fi
wheel_path="${project_wheels[0]}"
wheel_name="$(basename "${wheel_path}")"
source_distribution_path="${source_distributions[0]}"
checksum_path="${wheel_path}.sha256"

(
@@ -117,6 +124,7 @@ jobs:

echo "wheel_path=${wheel_path}" >> "${GITHUB_OUTPUT}"
echo "wheel_name=${wheel_name}" >> "${GITHUB_OUTPUT}"
echo "source_distribution_path=${source_distribution_path}" >> "${GITHUB_OUTPUT}"
echo "checksum_path=${checksum_path}" >> "${GITHUB_OUTPUT}"

- name: Smoke test installed wheel
@@ -178,7 +186,9 @@ jobs:
uses: actions/upload-artifact@v7
with:
name: pypi-distributions
path: ${{ steps.build.outputs.wheel_path }}
path: |
${{ steps.build.outputs.wheel_path }}
${{ steps.build.outputs.source_distribution_path }}

- name: Publish GitHub release assets
env:
134 changes: 37 additions & 97 deletions README.md
@@ -1,67 +1,41 @@
# src-py-lib

Reusable libraries for Sourcegraph-adjacent Python projects

This repo is the shared implementation layer for patterns which get
rebuilt in separate scripts: API clients, HTTP retries/timeouts, structured logging,
etc.
Reusable libraries for Sourcegraph projects

## Experimental - This is not a supported Sourcegraph product

This repo was created for Sourcegraph Implementation Engineering deployments,
and is not intended, designed, built, or supported for use in any other scenario.
Feel free to open issues or PRs, but responses are best effort.

## Install from another project
## Install

From PyPI:

```sh
uv add src-py-lib
```

From this repository:

```sh
uv add git+https://github.com/sourcegraph/src-py-lib.git
```

## What is included

- `src_py_lib.utils.logging` — centralized human stderr logs plus optional structured
JSONL events, run IDs, git commit metadata, context fields, event timing,
retention, startup metadata, and sanitized config snapshots.
- `src_py_lib.utils.config` — Pydantic-backed `Config` models loaded from code
defaults, `python-dotenv` `.env` parsing, shell environment, and CLI
overrides, with typed values, required checks, safe snapshots, and `op://...`
reference resolution.
- `src_py_lib.utils.http` — pooled `httpx` JSON HTTP client with a shared
30-second timeout, retry policy, `Retry-After` support, and contextual errors.
- `src_py_lib.utils.tsv` — padded TSV writer for human-readable tabular exports,
with newline/tab cleanup, URL preservation, and Unicode-aware column widths.
- `src_py_lib.clients.graphql` — shared GraphQL execution with automatic cursor
pagination, batched alias lookups, and schema introspection export.
- `src_py_lib.clients.sourcegraph` — Sourcegraph GraphQL client with token
validation, endpoint normalization, connection streaming, and shared config
fields for `SRC_ENDPOINT` (default: `https://sourcegraph.com`) and
`SRC_ACCESS_TOKEN`.
- `src_py_lib.clients.linear` — Linear GraphQL client with automatic cursor
handling, token validation, shared config fields, and injectable HTTP policy.
- `src_py_lib.clients.slack` — Slack Web API client with token validation,
cursor pagination, and method pacing. Consider `slack_sdk` if usage grows
beyond simple GET, pagination, and rate-limit handling.
- `src_py_lib.clients.github` — GitHub GraphQL client, PR URL parsing, and
batched PR lookups, with token validation. Defaults to `https://github.com`;
pass `github_url` for GitHub Enterprise Server. Keep lightweight for GraphQL;
GitHub SDKs help more for REST.
- `src_py_lib.clients.one_password` — tiny 1Password CLI wrapper for signing in,
validating authenticated `op` access, and resolving `op://...` references after config loading.
- `src_py_lib.clients.google_sheets` — Google Sheets API primitives with
spreadsheet access validation using gcloud Application Default Credentials or
a provided access token. Prefer Google's official libraries if Sheets usage
grows beyond small primitives, because auth, quota project, token refresh,
batching, and error shapes are subtle.

Prefer this library for shared logging, HTTP policy, and thin API wrappers.
Prefer vendor SDKs when they replace tricky auth, token refresh, retries,
pagination, quota behavior, or complex request models.

## Example

Define one project-specific `Config` model, then load it once at CLI startup.
For common CLI and client usage, import the curated root API:
## Use it for

- Typed config models loaded from defaults, `.env`, environment variables, and CLI flags
- Root logger setup with terminal output, optional JSONL events, timing, and run metadata
- A shared `httpx` JSON client with timeouts, retries, `Retry-After`, and contextual errors
- Small helpers for TSV output, JSON caches, GraphQL pagination, and batched GraphQL queries
- Thin clients for Sourcegraph, Linear, Slack, GitHub, Google Sheets, and 1Password

Prefer vendor SDKs when they handle complex auth, token refresh, quota behavior,
pagination, retries, or request models better than a thin wrapper

## Quick example

Define a project config, parse it once, and configure logging at startup:

```python
from pathlib import Path
@@ -78,56 +52,22 @@ class LinearExportConfig(src.LinearClientConfig):
help="Directory for generated files.",
)

config = src.parse_args(LinearExportConfig, description="Export Linear data.")
client = src.linear_client_from_config(config)
print(f"Writing files under {config.output_dir}")
```

Config precedence is: code defaults, `.env`, shell environment, then CLI
overrides. API client modules can provide shared Config base classes such as
`LinearClientConfig`, and `parse_args` resolves `op://...` references by
default. `config_field(default=...)` supports aliases, store-true /
store-false command flags, optional values, numeric bounds, and string patterns
for simple CLIs. Pass a custom `argparse.ArgumentParser` to `parse_args` only when you
need parsing beyond Config fields. Help text preserves description and
argument-help newlines, and reserves enough option-column width for long config
flags. Mark sensitive fields with `secret=True` so snapshots do not expose
resolved values.

## Logging example

Configure logging once at process startup. Prefer configuring the root logger
(`logger_name=""`, the default) so project modules and shared `src_py_lib` modules
such as `src_py_lib.utils.http` are captured by the same terminal and JSONL handlers.
Use `logging()` in CLIs to configure logging, add the command field to all
structured events, and emit standard run/startup/run-end metadata.
Use `debug()`, `info()`, `warning()`, `error()`, and `critical()` for one-off
structured events. Use `event()` blocks around timed work; they emit `trace`,
`span`, and nested `parent_span` fields. Use `start_level="debug"` to hide
noisy start events while keeping end timing visible, and
`omit_success_status=True` for very high-volume success events. Use `stage()`
for workflow context such as `stage="apply"`.
When the root logger is configured, noisy `httpx`/`httpcore` records are suppressed;
`HTTPClient` emits structured `http_request` events instead.
Run-end events include HTTP attempt/byte/status/retry counters. Set
`LoggingSettings.resource_sample_interval_seconds` to emit DEBUG
`resource_sample` events and include process resource totals on run end. Set
`SRC_LOG_LEVEL=INFO` for a run to omit DEBUG events from the log file.
`LoggingConfig` includes `--verbose/-v`, `--quiet/-q`, and `--silent/-s`
shortcuts (also available as `SRC_LOG_VERBOSE`, `SRC_LOG_QUIET`, and
`SRC_LOG_SILENT`). Use `logging_settings_from_config()` to build
`LoggingSettings` from those conventions.

```python
import src_py_lib as src
config = src.parse_args(LinearExportConfig, description="Export Linear data")

with src.logging({"src_token": "provided"}):
src.info("sync_started", repository_count=3)

client = src.SourcegraphClient("https://sourcegraph.example.com", "token")
data = client.graphql("query Viewer { currentUser { username } }")
with src.logging(config):
client = src.linear_client_from_config(config)
client.validate()
src.info("Starting export", output_dir=str(config.output_dir))
```

- Config precedence is code defaults, `.env`, shell environment, then CLI overrides
- `parse_args` resolves `op://...` references by default
- Mark sensitive fields with `secret=True` so config snapshots redact resolved values
- `src.logging()` configures the root logger by default so project code and `src_py_lib` internals
share the same handlers
- Use `event()` for timed work, `stage()` for workflow context, and
`info()` / `warning()` / `error()` for structured one-off events
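
The precedence rule in the first bullet can be illustrated with a small standard-library sketch. This is not how `src_py_lib` implements config loading; `resolve_config` and the sample keys are hypothetical names used only to show that later layers override earlier ones.

```python
from collections import ChainMap


def resolve_config(defaults, dotenv, environ, cli):
    """Merge config layers so that later layers override earlier ones.

    Hypothetical helper for illustration; not part of src_py_lib.
    """
    # ChainMap looks keys up left to right, so the highest-priority layer goes first.
    layers = [cli, environ, dotenv, defaults]
    # Drop unset (None) values so an omitted CLI flag does not mask a lower layer.
    cleaned = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in layers
    ]
    return dict(ChainMap(*cleaned))


resolved = resolve_config(
    defaults={"output_dir": "out", "log_level": "INFO"},
    dotenv={"output_dir": "from-dotenv"},
    environ={"log_level": "DEBUG"},
    cli={"output_dir": "from-cli", "log_level": None},
)
print(resolved)
```

Here `output_dir` comes from the CLI layer because it is set there, while `log_level` falls back to the shell environment because the CLI flag was left unset.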

## Development

```sh
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -11,7 +11,7 @@ dev = [
[project]
name = "src-py-lib"
version = "0.1.0"
description = "Reusable Python helpers for Sourcegraph projects"
description = "Reusable libraries for Sourcegraph projects"
readme = "README.md"
requires-python = ">=3.11"
license = "MIT"