Skip to content

HackYourFuture/data-mid-project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Week 7 Project: [Your Project Name]

What it does

Architecture

[Your API] ──► pipeline.py ──► Pydantic validation ──► Postgres INSERT
                                                     ──► Blob Storage (raw JSON)

Run locally

# 1. Populate .env from Azure Key Vault
cp .env.example .env
echo "POSTGRES_URL=$(az keyvault secret show --vault-name kv-hyf-data --name postgres-url --query value -o tsv)" >> .env
echo "AZURE_STORAGE_CONNECTION_STRING=$(az keyvault secret show --vault-name kv-hyf-data --name storage-connection-string --query value -o tsv)" >> .env

# 2. Install dependencies
uv sync

# 3. Run directly (without Docker)
uv run python -m src.pipeline

# 4. Or build and run with Docker
docker build -t my-pipeline .
docker run --env-file .env my-pipeline

Run tests

uv run pytest tests/ -v

Deploy to Azure

# Build for linux/amd64 (required by Azure Container Apps) and push to ACR
docker build --platform linux/amd64 -t hyfregistry.azurecr.io/my-pipeline:1.0 .
docker push hyfregistry.azurecr.io/my-pipeline:1.0

# Create Container App Job
az containerapp job create \
  --name my-pipeline-job \
  --resource-group rg-hyf-data \
  --environment env-hyf-data \
  --image hyfregistry.azurecr.io/my-pipeline:1.0 \
  --registry-server hyfregistry.azurecr.io \
  --trigger-type Manual \
  --replica-timeout 300 \
  --replica-retry-limit 0 \
  --env-vars \
    POSTGRES_URL="$(az keyvault secret show --vault-name kv-hyf-data --name postgres-url --query value -o tsv)" \
    AZURE_STORAGE_CONNECTION_STRING="$(az keyvault secret show --vault-name kv-hyf-data --name storage-connection-string --query value -o tsv)" \
    LOG_LEVEL=INFO

# Start the job
az containerapp job start --name my-pipeline-job --resource-group rg-hyf-data

Install psql

psql is the Postgres command-line client used to verify results. Install it once:

macOS

brew install libpq
echo 'export PATH="/opt/homebrew/opt/libpq/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Linux (Debian/Ubuntu)

sudo apt-get install -y postgresql-client

Windows Download and run the installer from postgresql.org/download/windows. The installer includes psql. After installing, open a new terminal and verify with psql --version.

Verify results

# Check job execution
az containerapp job execution list --name my-pipeline-job --resource-group rg-hyf-data --output table

# Check Postgres (set POSTGRES_URL first — see Run locally above)
psql "$POSTGRES_URL" -c "SELECT COUNT(*) FROM your_table_name;"  # replace with your table name

# Check Blob Storage
az storage blob list --account-name hyfstoragedev --container-name raw --prefix pipeline/ --output table

Clean up

az containerapp job delete --name my-pipeline-job --resource-group rg-hyf-data --yes

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors