145 changes: 145 additions & 0 deletions .claude/skills/port-pr/SKILL.md
@@ -0,0 +1,145 @@
---
name: port-pr
description: Port a machine (C#) PR into machine.py (Python). Given a GitHub issue number for a porting task in sillsdev/machine.py, finds the linked machine (C#) PR, ports its changes to the Python codebase, runs the local checks (black/flake8/isort/pyright/pytest), and opens a PR that closes the issue. Use when asked to port a PR/issue from machine (C#), complete a "porting" issue, or sync a machine change into machine.py.
---

# Port a machine (C#) PR into machine.py (Python)

`machine` (C#) and `machine.py` (Python) are direct, intentionally-synced ports of each
other. This skill ports a change that already landed in `machine` into `machine.py`,
driven by a "porting" issue in `sillsdev/machine.py`.

**Required argument:** the number of the porting issue in `sillsdev/machine.py`. A porting issue is assumed to always exist.

## Repos

- Python target repo: the current working directory (`sillsdev/machine.py`).
- C# source repo: the sibling clone at `../machine` (`sillsdev/machine`).
Use the local clone for reading surrounding context; use `gh ... --repo sillsdev/machine`
for authoritative PR data.

## Step 1 — Read the porting issue

```bash
gh issue view <ISSUE> --json title,body,labels
```

- The body looks like: `Port any relevant changes in https://github.com/sillsdev/machine/pull/<PR> from machine to machine.py.`
Extract `<PR>` — the machine (C#) PR number — from that URL.
- The issue title looks like `Port '<Title>'`. Keep `<Title>` for the branch and PR.
- If the body has no machine (C#) PR link, stop and ask the user for the source PR.
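
As a rough illustration of the extraction described in this step, the following Python sketch pulls the source PR number and `<Title>` out of the issue text. The function names `extract_source_pr` and `extract_title` are hypothetical, and the URL pattern is assumed from the example body above; real issues may be worded differently.

```python
import re
from typing import Optional

# Assumed URL shape, taken from the example issue body; not a guaranteed format.
_SOURCE_PR_PATTERN = re.compile(r"https://github\.com/sillsdev/machine/pull/(\d+)")


def extract_source_pr(issue_body: str) -> Optional[int]:
    """Return the machine (C#) PR number linked in the issue body, or None if absent."""
    match = _SOURCE_PR_PATTERN.search(issue_body)
    return int(match.group(1)) if match else None


def extract_title(issue_title: str) -> str:
    """Strip the "Port '...'" wrapper; fall back to the raw title if it is missing."""
    stripped = issue_title.strip()
    match = re.fullmatch(r"Port '(.+)'", stripped)
    return match.group(1) if match else stripped
```

A `None` result from `extract_source_pr` corresponds to the "stop and ask the user" case. Links to `sillsdev/machine.py` pull requests are deliberately not matched.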

## Step 2 — Understand the source change

```bash
gh pr view <PR> --repo sillsdev/machine --json title,body,files,commits
gh pr diff <PR> --repo sillsdev/machine
```

Read the full diff. For each changed C# file, open the corresponding file(s) in
`../machine` to understand the surrounding context, and identify the Python counterpart
(see mapping below). Read the existing Python code you're about to change so the port
matches local idiom.

Note: not every change ports. Skip C#-only concerns (`.csproj`/`.sln`/`Directory.*.props`,
`AssemblyInfo`, `omnisharp.json`, csharpier/editorconfig formatting, NuGet packaging).
The issue says "any *relevant* changes" — use judgment and call out anything you
intentionally skip.

## Step 3 — File & API mapping

| machine (C#) | machine.py (Python) |
|---|---|
| `src/SIL.Machine/<PascalArea>/<PascalCase>.cs` (or the matching `SIL.Machine.*` project) | `machine/<area>/<snake_case>.py` |
| `tests/SIL.Machine.Tests/<PascalArea>/<PascalCase>Tests.cs` (or the matching `*.Tests` project) | `tests/<area>/test_<snake_case>.py` |
| `PascalCase` methods / `camelCase` locals / `_camelCase` fields | `snake_case` functions/vars |
| `IReadOnlyList<T>` / `IDictionary<,>` / `ISet<T>` etc. | `Sequence[T]`/`list` / `Mapping`/`dict` / `Set`/`set` etc. |
| NUnit `Assert.That(...)` | pytest plain `assert` (check neighboring test files) |
| `AssemblyInfo`/`.csproj` `<Version>` | `pyproject.toml` `version` (poetry) |

The top-level Python areas are: `annotations`, `clusterers`, `corpora`, `jobs`,
`optimization`, `punctuation_analysis`, `scripture`, `sequence_alignment`, `statistics`,
`tokenization`, `translation`, `utils`. Some C# code lives in tool/plugin projects
(`SIL.Machine.Tool`, `SIL.Machine.Translation.Thot`, `SIL.Machine.Morphology.HermitCrab`,
etc.); the Python equivalent may live under `machine/jobs` or a different area, or may not
exist at all. Find the Python counterpart by searching for the type/method name (translated
to snake_case): `grep -ri "<type_or_method_name>" machine tests` before assuming a path.
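
The path rows of the mapping table can be sketched as a first-guess helper. This is only a starting point under the assumptions in the table: `pascal_to_snake` and `guess_python_path` are hypothetical names, the helper ignores the "may live elsewhere or not exist" cases, and the `grep` search above remains the way to confirm a path.

```python
import re
from pathlib import PurePosixPath


def pascal_to_snake(name: str) -> str:
    """Convert a PascalCase identifier to snake_case (acronym handling is approximate)."""
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def guess_python_path(csharp_path: str) -> str:
    """Guess the machine.py path for a machine (C#) source or test file."""
    path = PurePosixPath(csharp_path)
    area = pascal_to_snake(path.parent.name)
    stem = path.stem
    if path.parts[0] == "tests":
        # <PascalCase>Tests.cs maps to test_<snake_case>.py
        if stem.endswith("Tests"):
            stem = stem[: -len("Tests")]
        return f"tests/{area}/test_{pascal_to_snake(stem)}.py"
    return f"machine/{area}/{pascal_to_snake(stem)}.py"
```

For example, `guess_python_path("src/SIL.Machine/Corpora/UsfmParser.cs")` yields `machine/corpora/usfm_parser.py`; whether that file actually exists still has to be checked.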

Port the **behavior**, not the syntax. Match existing Python patterns in the neighboring
code (naming, type hints, dataclasses, generators vs. loops, async conventions). Port the
tests too.

## Step 4 — Branch & apply

Create a branch off `main` (do not commit to `main`):

```bash
git switch main && git pull && git switch -c port-<slug>
```

where `<slug>` is a short kebab-case form of `<Title>` (e.g. `update-library-version-to-1.8.11`,
giving the branch `port-update-library-version-to-1.8.11`).
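
A branch name such as `port-update-library-version-to-1.8.11` could be derived from `<Title>` along these lines. The helper name `branch_name` and the exact character rules are assumptions for illustration; dots are kept so that version numbers survive.

```python
import re


def branch_name(title: str) -> str:
    """Build a 'port-' branch name from the ported change's title."""
    # Collapse every run of characters other than a-z, 0-9, and '.' into one hyphen.
    slug = re.sub(r"[^a-z0-9.]+", "-", title.lower())
    # Trim leading/trailing separators; git refs may not end with a dot.
    return f"port-{slug.strip('-.')}"
```

Long titles may still need manual shortening, since the sketch does not truncate.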

Apply the ported changes with Edit/Write.

## Step 5 — Verify locally

Install, format, lint, type-check, and test (the same sequence that `local_check.sh` runs):

```bash
poetry install
poetry run black .
poetry run flake8 .
poetry run isort .
poetry run pyright
poetry run pytest
```

`black` and `isort` rewrite files in place; `flake8` and `pyright` are gates that must pass
clean. Fix any formatting, lint, type, or test failures before proceeding. Report the test
results plainly (pass/fail counts); don't claim success if anything failed.
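
One way to keep the reporting honest is to collect every exit code before summarizing. The sketch below is an assumption-laden illustration, not part of `local_check.sh`: `run_checks` and `failed` are hypothetical helpers, and the `run` parameter exists only so the logic can be exercised without invoking poetry.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# The same commands as listed above, in the same order.
CHECKS: List[List[str]] = [
    ["poetry", "install"],
    ["poetry", "run", "black", "."],
    ["poetry", "run", "flake8", "."],
    ["poetry", "run", "isort", "."],
    ["poetry", "run", "pyright"],
    ["poetry", "run", "pytest"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    run: Callable[[Sequence[str]], int] = _run,
) -> List[Tuple[str, int]]:
    """Run every check in order and return (command, exit code) pairs."""
    return [(" ".join(cmd), run(cmd)) for cmd in checks]


def failed(results: Sequence[Tuple[str, int]]) -> List[str]:
    """Return the commands that exited with a non-zero code."""
    return [cmd for cmd, code in results if code != 0]
```

An empty list from `failed(run_checks())` is the only state in which the checks can be reported as passing; pytest's own summary still supplies the pass/fail counts.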

## Step 6 — Commit, push, open PR (pause first)

Show the user a summary of the diff and the proposed PR title/body, and **confirm before
pushing**. Then:

```bash
git add -A
git commit # message: Port '<Title>' from machine PR #<PR>
git push -u origin port-<slug>
gh pr create --title "Port '<Title>' from machine" --body "<body>"
```

Commit message footer:

```
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
```

### PR body template

```markdown
Ports [machine PR #<PR>](https://github.com/sillsdev/machine/pull/<PR>) — <one-line summary of the change>.

## <Section per area changed>
<What changed and why, mirroring the source PR. Include short before/after or code snippets where helpful.>

## Tests
<Tests ported / added.>

Closes #<ISSUE>
```

PR body footer:

```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
```

## Notes

- Keep the two codebases as similar as is reasonable for a Python-vs-C# port.
- If the source PR spans multiple commits, the squashed PR diff is the source of truth, but
reading individual commits can clarify intent.
- If a change has no sensible Python counterpart, say so in the PR body rather than forcing it.
28 changes: 15 additions & 13 deletions machine/corpora/__init__.py
@@ -11,7 +11,7 @@
from .file_paratext_project_settings_parser import FileParatextProjectSettingsParser
from .file_paratext_project_terms_parser import FileParatextProjectTermsParser
from .file_paratext_project_text_updater import FileParatextProjectTextUpdater
from .file_paratext_project_versification_error_detector import FileParatextProjectVersificationErrorDetector
from .file_usfm_versification_analyzer import FileUsfmVersificationAnalyzer
from .flatten import flatten
from .memory_alignment_collection import MemoryAlignmentCollection
from .memory_stream_container import MemoryStreamContainer
@@ -28,7 +28,6 @@
from .paratext_project_settings_parser_base import ParatextProjectSettingsParserBase
from .paratext_project_terms_parser_base import KeyTerm, ParatextProjectTermsParserBase
from .paratext_project_text_updater_base import ParatextProjectTextUpdaterBase
from .paratext_project_versification_error_detector_base import ParatextProjectVersificationErrorDetectorBase
from .paratext_text_corpus import ParatextTextCorpus
from .place_markers_usfm_update_block_handler import PlaceMarkersAlignmentInfo, PlaceMarkersUsfmUpdateBlockHandler
from .scripture_element import ScriptureElement
@@ -78,10 +77,12 @@
from .usfm_update_block import UsfmUpdateBlock
from .usfm_update_block_element import UsfmUpdateBlockElement, UsfmUpdateBlockElementType
from .usfm_update_block_handler import UsfmUpdateBlockHandler
from .usfm_versification_error_detector import (
UsfmVersificationError,
UsfmVersificationErrorDetector,
UsfmVersificationErrorType,
from .usfm_versification_analyzer_base import UsfmVersificationAnalyzerBase
from .usfm_versification_analyzer_handler import (
UsfmVersificationAnalysis,
UsfmVersificationAnalyzerHandler,
UsfmVersificationDiagnostic,
UsfmVersificationDiagnosticType,
)
from .usx_file_alignment_collection import UsxFileAlignmentCollection
from .usx_file_alignment_corpus import UsxFileAlignmentCorpus
@@ -93,7 +94,7 @@
from .zip_paratext_project_settings_parser import ZipParatextProjectSettingsParser
from .zip_paratext_project_terms_parser import ZipParatextProjectTermsParser
from .zip_paratext_project_text_updater import ZipParatextProjectTextUpdater
from .zip_paratext_project_versification_detector import ZipParatextProjectVersificationErrorDetector
from .zip_usfm_versification_analyzer import ZipUsfmVersificationAnalyzer

__all__ = [
"AlignedWordPair",
@@ -114,7 +115,7 @@
"FileParatextProjectSettingsParser",
"FileParatextProjectTermsParser",
"FileParatextProjectTextUpdater",
"FileParatextProjectVersificationErrorDetector",
"FileUsfmVersificationAnalyzer",
"flatten",
"is_scripture",
"KeyTerm",
@@ -139,7 +140,6 @@
"ParatextProjectSettingsParserBase",
"ParatextProjectTermsParserBase",
"ParatextProjectTextUpdaterBase",
"ParatextProjectVersificationErrorDetectorBase",
"ParatextTextCorpus",
"parse_usfm",
"PlaceMarkersAlignmentInfo",
@@ -187,9 +187,11 @@
"UsfmUpdateBlockElement",
"UsfmUpdateBlockElementType",
"UsfmUpdateBlockHandler",
"UsfmVersificationError",
"UsfmVersificationErrorDetector",
"UsfmVersificationErrorType",
"UsfmVersificationAnalysis",
"UsfmVersificationAnalyzerBase",
"UsfmVersificationAnalyzerHandler",
"UsfmVersificationDiagnostic",
"UsfmVersificationDiagnosticType",
"UsxFileAlignmentCollection",
"UsxFileAlignmentCorpus",
"UsxFileText",
@@ -200,5 +202,5 @@
"ZipParatextProjectSettingsParser",
"ZipParatextProjectTermsParser",
"ZipParatextProjectTextUpdater",
"ZipParatextProjectVersificationErrorDetector",
"ZipUsfmVersificationAnalyzer",
]
Original file line number Diff line number Diff line change
@@ -4,10 +4,10 @@
from .file_paratext_project_file_handler import FileParatextProjectFileHandler
from .file_paratext_project_settings_parser import FileParatextProjectSettingsParser
from .paratext_project_settings import ParatextProjectSettings
from .paratext_project_versification_error_detector_base import ParatextProjectVersificationErrorDetectorBase
from .usfm_versification_analyzer_base import UsfmVersificationAnalyzerBase


class FileParatextProjectVersificationErrorDetector(ParatextProjectVersificationErrorDetectorBase):
class FileUsfmVersificationAnalyzer(UsfmVersificationAnalyzerBase):
def __init__(self, project_dir: StrPath, parent_settings: Optional[ParatextProjectSettings] = None) -> None:
super().__init__(
FileParatextProjectFileHandler(project_dir),
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
from typing import List, Optional, Set, Union
from typing import Dict, Optional, Set, Union

from ..scripture.canon import book_id_to_number
from .paratext_project_file_handler import ParatextProjectFileHandler
from .paratext_project_settings import ParatextProjectSettings
from .paratext_project_settings_parser_base import ParatextProjectSettingsParserBase
from .usfm_parser import parse_usfm
from .usfm_versification_error_detector import UsfmVersificationError, UsfmVersificationErrorDetector
from .usfm_versification_analyzer_handler import UsfmVersificationAnalysis, UsfmVersificationAnalyzerHandler


class ParatextProjectVersificationErrorDetectorBase:
class UsfmVersificationAnalyzerBase:
def __init__(
self,
paratext_project_file_handler: ParatextProjectFileHandler,
@@ -20,18 +20,28 @@ def __init__(
else:
self._settings = settings

def get_usfm_versification_errors(
self, handler: Optional[UsfmVersificationErrorDetector] = None, books: Optional[Set[int]] = None
) -> List[UsfmVersificationError]:
handler = handler or UsfmVersificationErrorDetector(self._settings)
def analyze_usfm_versification(
self,
books_and_chapters: Optional[Union[Dict[str, Optional[Set[int]]], Dict[int, Optional[Set[int]]]]] = None,
handler: Optional[UsfmVersificationAnalyzerHandler] = None,
) -> UsfmVersificationAnalysis:
book_nums_and_chapters = (
{
(book_id_to_number(book) if isinstance(book, str) else book): chapters
for book, chapters in books_and_chapters.items()
}
if books_and_chapters is not None
else None
)
handler = handler or UsfmVersificationAnalyzerHandler(self._settings, book_nums_and_chapters)
for book_id in self._settings.get_all_scripture_book_ids():

file_name = self._settings.get_book_file_name(book_id)

if not self._paratext_project_file_handler.exists(file_name):
continue

if books is not None and not book_id_to_number(book_id) in books:
if book_nums_and_chapters is not None and book_id_to_number(book_id) not in book_nums_and_chapters:
continue

with self._paratext_project_file_handler.open(file_name) as sfm_file:
@@ -45,4 +55,4 @@ def get_usfm_versification_errors(
f". Error: '{e}'"
)
raise RuntimeError(error_message) from e
return handler.errors
return handler.get_analysis()
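
The key normalization that `analyze_usfm_versification` performs on `books_and_chapters` can be tried in isolation. This is a minimal sketch, not the library code: `book_id_to_number` here is a three-book stand-in for `machine.scripture.canon.book_id_to_number`, and `normalize_books_and_chapters` is a hypothetical name for the inline dict comprehension in the diff.

```python
from typing import Dict, Optional, Set, Union

# Hypothetical stand-in for machine.scripture.canon.book_id_to_number (three books only).
_BOOK_NUMBERS = {"GEN": 1, "EXO": 2, "MAT": 40}


def book_id_to_number(book_id: str) -> int:
    return _BOOK_NUMBERS[book_id.upper()]


BooksAndChapters = Union[Dict[str, Optional[Set[int]]], Dict[int, Optional[Set[int]]]]


def normalize_books_and_chapters(
    books_and_chapters: Optional[BooksAndChapters],
) -> Optional[Dict[int, Optional[Set[int]]]]:
    """Key the mapping by book number, whether callers passed book IDs or numbers."""
    if books_and_chapters is None:
        return None
    return {
        (book_id_to_number(book) if isinstance(book, str) else book): chapters
        for book, chapters in books_and_chapters.items()
    }
```

With this stand-in, `{"GEN": {1, 2}, "MAT": None}` becomes `{1: {1, 2}, 40: None}`. A `None` chapter set is passed through unchanged; how the handler interprets it is not shown in this diff.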