
Hyperparameter tuning, FocalDiceLoss, and 5M/10M cross-regime transfer evaluation#21

Open
sridhs21 wants to merge 7 commits into main from feature/hyperparameter-tuning

Conversation

Contributor

@sridhs21 sridhs21 commented May 7, 2026

Summary

This PR modifies one file (XPointMLTest.py) and adds three new files (optuna_tuner.py, test_xpoint_transfer.py, build_transfer_cache.py).

What was done:

  • Optuna-driven hyperparameter tuning (optuna_tuner.py) — TPE sampler + median pruner over base_channels, dropout, weight decay, learning rate, positive-patch ratio, focal/dice weighting, scheduler choice, and SWA start fraction. Replaces the prior ad-hoc grid where base_channels was hard-coded to 32 while tuners assumed 64.

  • FocalDiceLoss + LR scheduling + SWA — new loss combining focal cross-entropy and Dice with configurable α / γ / dice weight; linear LR warmup followed by cosine annealing (or ReduceLROnPlateau); optional Stochastic Weight Averaging with a custom BN-update step compatible with the dict-based dataloader.

  • Cross-regime transfer evaluation (test_xpoint_transfer.py) — loads the best PKPM-trained checkpoint and evaluates zero-shot on 5M and 10M Gkeyll datasets (150 frames each), producing per-dataset and combined summaries. Re-evaluates the PKPM validation set as an in-domain reference.

  • Cache build pipeline (build_transfer_cache.py) — precomputes the deterministic X-point finder for all 150 frames of each transfer dataset, so subsequent evaluation runs read .npy caches instead of re-parsing .gkyl files. Supports --workers N for parallel processing and RC_EXTRACT_DIR / RC_CACHE_BASE env-var path overrides for ramdisk staging.

  • Augmentation correctness fixes in XPointMLTest.py — brightness/contrast jitter is now applied globally (not per channel) so the physical identities Bx = ∂y ψ, By = -∂x ψ, Jz = -∇²ψ/μ₀ stay consistent across the four input channels; cutout no longer mutates cached frame tensors in place.

  • Profiling and minor perf in getPgkylData — per-stage [PROFILE] timings around compactRead, gradient computation, getCritPoints, and getXOPoints; Hessian is now packed from already-computed second derivatives and passed to getXOPoints(hessian=…) to avoid recomputing gradients.
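The focal-plus-Dice combination described above can be illustrated framework-agnostically. The actual FocalDiceLoss is presumably a PyTorch module; the sketch below uses NumPy for binary masks, and the names `alpha`, `gamma`, and `dice_weight` simply mirror the α / γ / dice-weight knobs from the bullet, with illustrative defaults.

```python
import numpy as np

def focal_dice_loss(probs, targets, alpha=0.25, gamma=2.0, dice_weight=0.5, eps=1e-7):
    """Sketch of a combined focal + Dice loss for binary segmentation.

    probs   -- predicted foreground probabilities in (0, 1)
    targets -- binary ground-truth mask, same shape as probs
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    # Focal cross-entropy: down-weights easy examples via (1 - p_t)^gamma.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    focal = -(alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)).mean()
    # Soft Dice: overlap-based term, largely insensitive to class imbalance.
    inter = (probs * targets).sum()
    dice = 1.0 - (2.0 * inter + eps) / (probs.sum() + targets.sum() + eps)
    return (1.0 - dice_weight) * focal + dice_weight * dice
```

Both terms push the same way (lower is better), so a confident, well-overlapping prediction scores strictly lower than a confidently wrong one.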
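The schedule from the same bullet (linear LR warmup followed by cosine annealing) has a simple closed form. In the PR this is presumably wired through a PyTorch scheduler; the standalone function below only illustrates the shape, and its argument names are hypothetical.

```python
import math

def warmup_cosine_lr(step, total_steps, warmup_steps, base_lr=1e-3, min_lr=0.0):
    """LR at `step`: linear ramp over `warmup_steps`, then cosine decay to `min_lr`."""
    if step < warmup_steps:
        # Linear warmup from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine anneal over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```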
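A usage sketch for the cache build pipeline, combining the `--workers` flag and the env-var overrides named above; the paths are placeholders, not values from the PR.

```shell
# Stage extraction and cache output on a ramdisk (paths are examples only).
export RC_EXTRACT_DIR=/dev/shm/rc_extract   # where .tgz archives are unpacked
export RC_CACHE_BASE=/dev/shm/rc_cache      # where the .npy caches are written

# Run the deterministic X-point finder over all frames with 8 parallel workers.
python build_transfer_cache.py --workers 8
```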
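The augmentation fix (one jitter draw shared by all channels) can be sketched as follows. The function name and the contrast/brightness ranges are made up for illustration; the point is that a single shared draw keeps the four channels mutually consistent, whereas independent per-channel draws would rescale Bx, By, and Jz separately and break their derivative relations to ψ.

```python
import numpy as np

def global_intensity_jitter(frame, rng, contrast=0.1, brightness=0.1):
    """Apply ONE contrast/brightness draw to all channels of `frame`.

    frame -- array of shape (C, H, W), e.g. channels (psi, Bx, By, Jz).
    A per-channel draw would scale each channel independently and destroy
    the relations tying the derivative channels to psi; one shared draw
    perturbs all channels by the same affine map instead.
    """
    scale = 1.0 + rng.uniform(-contrast, contrast)   # one draw for every channel
    shift = rng.uniform(-brightness, brightness)
    return frame * scale + shift
```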

sridhs21 and others added 7 commits November 19, 2025 00:48
…osRatio, lossFunction, warmupEpochs, and swa so we can actually tune everything that was hard-coded before, especially base_channels, which was stuck at 32 while all the Optuna tuners assumed 64. Also added a FocalDiceLoss class that combines focal and Dice loss to help with the severe class imbalance, and hooked up linear LR warmup and stochastic weight averaging with a custom BN update that works with our dict-based dataloader. Created test_xpoint_transfer.py to evaluate our best PKPM-trained model on the 5M and 10M datasets, including a monkey patch for the double component-indexing bug in getData.py, since we can't modify files outside reconClassifier. Then made build_transfer_cache.py to precompute and cache the X-point finder results for all 150 frames of 5M and 10M data, so we don't have to wait 20 minutes per frame every time we want to run the transfer evaluation.
… RC_CACHE_BASE) for ramdisk staging, and repoint the transfer evaluation to the production checkpoint testdir_2026-04-02-13-23-05. XPointMLTest.py now profiles the getPgkylData stages and reuses precomputed second derivatives as the Hessian for getXOPoints.
Contributor

@cwsmith cwsmith left a comment


Thank you. A few comments are below.

Comment thread optuna_tuner.py
```shell
--xptCacheDir /path/to/cache \
--n-trials 50 \
--study-name xpoint-tuning \
--db sqlite:///optuna_xpoint.db
```
Contributor


Does Optuna automatically create the db, or are additional manual setup steps required?

Comment thread test_xpoint_transfer.py
Cross-domain inference: evaluate the best PKPM-trained model on 5M and 10M data.

This script:
1. Extracts 5M.tgz and 10M.tgz (if not already extracted)
Contributor


We should force this to use the cache if XPointMLTest.py requires it.

Comment thread XPointMLTest.py
```python
specify the path to the parameter txt file, the parent
directory of that file must contain the gkyl input training data
''')
parser.add_argument('--xptCacheDir', type=Path, default=None,
```
Contributor

@cwsmith cwsmith May 7, 2026


IIRC, this option will run the hessian based classifier and build the cache. How does this differ from the new build_transfer_cache.py? If they do the same thing we should likely remove the option, and supporting functionality, here and require the use of the cache prepared with build_transfer_cache.py.

On that note, we should probably rename build_transfer_cache.py to run_hessian_and_build_cache.py or something similarly explicit.

Comment thread XPointMLTest.py
Comment on lines 297 to 303
```python
[fileName, axesNorm, critPoints, xpts, optsMax, optsMin, coords, psi, bx, by, jz] = getPgkylData(self.paramFile, fnum, verbosity=self.verbosity)
fields = {"psi": psi, "critPts": critPoints, "xpts": xpts,
          "optsMax": optsMax, "optsMin": optsMin,
          "axesNorm": axesNorm, "coords": coords,
          "fileName": fileName,
          "Bx": bx, "By": by, "Jz": jz}
writePgkylDataToCache(self.xptCacheDir, fnum, fields)
```
Contributor


This looks like the call that runs the Hessian-based classifier and writes the cache.

Contributor

cwsmith commented May 7, 2026

IIRC, a patch was needed for an indexing bug in https://github.com/SCOREC/pgkylFrontEnd. If so, would you please create a PR with that change?
