Implementing a Bio-Physical Loss Function (Fisher-KPP/Diffusion) for 3D Swin UNETR or for 3D-unet #8840

Rut328 · 2026-05-03T21:04:27Z

Hi MONAI Team,
I am using Swin UNETR for 3D brain tumor segmentation (BraTS 2021). I want to add a Physics-Informed Loss term based on the Reaction-Diffusion equation to improve biological plausibility.
Does MONAI have existing utilities or examples for integrating PDE-based constraints into a loss function?
What is the recommended way to compute spatial gradients (like the Laplacian) on the model's output tensors within the MONAI framework?
Any pointers to examples or relevant documentation would be very helpful.

Thanks!

Lawson-Darrow · 2026-06-03T18:05:46Z

MONAI doesn't ship a PDE or physics-informed loss out of the box, so you'd build this as a small custom loss and add it to your Dice/CE term with a weight. The good news is the spatial-gradient part is easy to do in a way that stays differentiable.

Computing the Laplacian. Don't reach for SobelGradients here. That transform is meant for first-order edge maps, not for use inside a trainable loss. The clean approach is a fixed-weight F.conv3d. Register a 3D Laplacian stencil as a buffer (not a Parameter, so it never gets trained), then convolve the network's output with it. A buffer convolution is fully differentiable with respect to the input, which is all you need for backprop.

import torch
import torch.nn.functional as F

# 7-point 3D Laplacian stencil
k = torch.zeros(1, 1, 3, 3, 3)
k[0, 0, 1, 1, 1] = -6.0
for z, y, x in [(0,1,1),(2,1,1),(1,0,1),(1,2,1),(1,1,0),(1,1,2)]:
    k[0, 0, z, y, x] = 1.0

def laplacian(u, spacing=(1.0, 1.0, 1.0)):
    # u: (B, 1, D, H, W), a single probability channel
    u = F.pad(u, (1, 1, 1, 1, 1, 1), mode="replicate")
    return F.conv3d(u, k.to(u))  # divide by spacing**2 if anisotropic

BraTS 2021 is resampled to 1mm isotropic, so you can skip the spacing term. If you ever move to anisotropic data, scale each axis by its voxel spacing squared or the operator will be wrong.
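A quick way to convince yourself the spacing scaling matters is to test the stencil on a field whose Laplacian is known. The sketch below is illustrative only: make_laplacian_kernel and the (2.0, 1.0, 0.5) spacing are names and values made up for the check, not anything from MONAI. It uses u = z² + y² + x² in physical coordinates, whose Laplacian is exactly 6.

```python
import torch
import torch.nn.functional as F

def make_laplacian_kernel(spacing=(1.0, 1.0, 1.0)):
    """7-point stencil with each axis scaled by 1 / spacing**2; spacing is (dz, dy, dx)."""
    dz, dy, dx = spacing
    k = torch.zeros(1, 1, 3, 3, 3, dtype=torch.float64)
    k[0, 0, 1, 1, 1] = -2.0 * (1 / dz**2 + 1 / dy**2 + 1 / dx**2)
    k[0, 0, 0, 1, 1] = k[0, 0, 2, 1, 1] = 1 / dz**2
    k[0, 0, 1, 0, 1] = k[0, 0, 1, 2, 1] = 1 / dy**2
    k[0, 0, 1, 1, 0] = k[0, 0, 1, 1, 2] = 1 / dx**2
    return k

def laplacian(u, spacing=(1.0, 1.0, 1.0)):
    # u: (B, 1, D, H, W); replicate padding keeps the output the same size as u
    u = F.pad(u, (1, 1, 1, 1, 1, 1), mode="replicate")
    return F.conv3d(u, make_laplacian_kernel(spacing).to(u))

# Hypothetical anisotropic grid, chosen only for this check
spacing = (2.0, 1.0, 0.5)
shape = (6, 8, 10)
axes = [torch.arange(n, dtype=torch.float64) * s for n, s in zip(shape, spacing)]
z, y, x = torch.meshgrid(*axes, indexing="ij")
u = (z**2 + y**2 + x**2)[None, None]  # analytic Laplacian is 6 everywhere

# Compare away from the border, where the padding does not interfere
interior = laplacian(u, spacing)[..., 1:-1, 1:-1, 1:-1]
unscaled = laplacian(u)[..., 1:-1, 1:-1, 1:-1]
```

With the per-axis scaling the interior values match the analytic Laplacian; with the unit-spacing stencil on the same data they do not, which is the "operator will be wrong" case.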

One modeling note worth raising before you commit. The Fisher-KPP equation is ∂u/∂t = D ∇²u + ρ u (1 - u), and that time derivative is the catch. A single segmentation gives you one static volume, not a time series, so there's no ∂u/∂t to match. Two common ways around it: treat it as a steady-state constraint and penalize the residual of D ∇²u + ρ u (1 - u), or use it as a spatial regularizer on the predicted tumor field. If you actually have longitudinal scans, that's when the full time-dependent form pays off. Worth deciding which case you're in first, because it changes what the loss should even measure.
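Written out, the steady-state reading just sets the time derivative to zero and penalizes what is left. The symbols R, Ω (the set of voxels) and L_phys are notation introduced here for illustration; they are not taken from any library.

```latex
\frac{\partial u}{\partial t} = D\,\nabla^{2} u + \rho\,u\,(1-u)
\quad\xrightarrow{\;\partial u/\partial t \,=\, 0\;}\quad
R(u) = D\,\nabla^{2} u + \rho\,u\,(1-u),
\qquad
\mathcal{L}_{\text{phys}} = \frac{1}{|\Omega|}\sum_{x\in\Omega} R(u)(x)^{2}
```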

For the field u itself, use the softmax tumor probability rather than the hard argmax mask so the term stays smooth and differentiable. I'd also start the physics weight small. Early in training it tends to fight the Dice term while the predictions are still noise, and a large weight there can stall convergence.
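To make the softmax-versus-argmax point concrete, here is a minimal sketch on random logits (the tensor shape and the choice of channel 1 as the tumor channel are assumptions for the demo). The soft field sends a gradient back to the logits; the hard mask is an integer tensor with no autograd graph at all.

```python
import torch

torch.manual_seed(0)
logits = torch.randn(1, 2, 4, 4, 4, requires_grad=True)  # (B, C, D, H, W)

# Soft field: the softmax probability of the tumor channel stays in the graph.
u_soft = torch.softmax(logits, dim=1)[:, 1:2]
(u_soft * (1.0 - u_soft)).mean().backward()
soft_grad_norm = logits.grad.norm().item()

# Hard mask: argmax returns integer labels that are detached from the graph,
# so any loss built on it cannot update the network.
u_hard = logits.argmax(dim=1, keepdim=True)
hard_requires_grad = u_hard.requires_grad
```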

Happy to share a fuller nn.Module version that wraps this alongside DiceCELoss if that helps.

Rut328 · 2026-06-03T18:47:03Z

Hi Lawson, thank you so much for the detailed and insightful response! I truly appreciate you taking the time to break this down. I would love it if you could share the fuller nn.Module version that wraps this alongside DiceCELoss; that would be incredibly helpful for my project. As I adapt this to my workflow, I may have a few follow-up questions, and I would be extremely grateful for any further guidance you can spare. Thanks again for your support!

Lawson-Darrow · 2026-06-03T22:55:06Z

Here's a self-contained version. It wraps DiceCELoss as the data term and adds the steady-state Fisher-KPP residual on the softmax tumor field. I kept the Laplacian as a fixed buffer (so it's never trained) and added per-axis spacing scaling so it's still correct if you move off 1mm isotropic.

import torch
import torch.nn as nn
import torch.nn.functional as F
from monai.losses import DiceCELoss


class FisherKPPLoss(nn.Module):
    """DiceCELoss + a steady-state Fisher-KPP physics residual.

    Data term: MONAI's DiceCELoss. Physics term penalizes the residual of
    the steady-state reaction-diffusion equation

        R(u) = D * laplacian(u) + rho * u * (1 - u)

    on the predicted tumor probability field u (softmax, not argmax, so it
    stays smooth/differentiable). We minimize mean(R**2). No time derivative
    on purpose: one segmentation is a single static volume, so we treat
    Fisher-KPP as a steady-state spatial constraint. With longitudinal scans
    you'd swap this for the full time-dependent form.
    """

    def __init__(
        self,
        tumor_index: int = 1,      # channel of the tumor field u in the logits
        D: float = 1.0,
        rho: float = 1.0,
        spacing=(1.0, 1.0, 1.0),   # (dz, dy, dx) mm; default fine for BraTS 1mm iso
        physics_weight: float = 1e-3,
        warmup_steps: int = 1000,  # ramp physics weight so it doesn't fight Dice early
        **dice_ce_kwargs,
    ):
        super().__init__()
        self.data_loss = DiceCELoss(softmax=True, **dice_ce_kwargs)
        self.tumor_index = tumor_index
        self.D, self.rho = D, rho
        self.physics_weight = physics_weight
        self.warmup_steps = max(1, warmup_steps)
        self.register_buffer("_step", torch.zeros((), dtype=torch.long))

        # 7-point 3D Laplacian, scaled per-axis by 1/spacing**2
        dz, dy, dx = spacing
        k = torch.zeros(1, 1, 3, 3, 3)
        k[0, 0, 1, 1, 1] = -2.0 * (1/dz**2 + 1/dy**2 + 1/dx**2)
        k[0, 0, 0, 1, 1] = k[0, 0, 2, 1, 1] = 1.0 / dz**2
        k[0, 0, 1, 0, 1] = k[0, 0, 1, 2, 1] = 1.0 / dy**2
        k[0, 0, 1, 1, 0] = k[0, 0, 1, 1, 2] = 1.0 / dx**2
        self.register_buffer("lap_kernel", k)

    def laplacian(self, u):
        u = F.pad(u, (1, 1, 1, 1, 1, 1), mode="replicate")
        return F.conv3d(u, self.lap_kernel)

    def physics_residual(self, logits):
        u = torch.softmax(logits, dim=1)[:, self.tumor_index : self.tumor_index + 1]
        R = self.D * self.laplacian(u) + self.rho * u * (1.0 - u)
        return (R ** 2).mean()

    def forward(self, logits, target):
        data = self.data_loss(logits, target)
        phys = self.physics_residual(logits)
        if self.training:
            w = self.physics_weight * min(1.0, self._step.item() / self.warmup_steps)
            self._step += 1
        else:
            w = self.physics_weight
        return data + w * phys

A couple of usage notes:

- Your network should output raw logits (B, C, D, H, W); DiceCELoss(softmax=True) and the physics term both handle the softmax. Pass to_onehot_y=True (etc.) straight through as kwargs.
- tumor_index picks which class the PDE constrains. Set it to whichever channel is your tumor field (e.g. WT). You can also sum the residual over several subregion channels if you want it on more than one.
- Start physics_weight small (the 1e-3 + warmup is deliberately gentle) and watch that Dice still drops normally for the first few hundred steps before the physics term ramps in.
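One caveat on picking the channel: with mutually exclusive softmax classes, a whole-tumor field is not a single channel but the sum of the tumor subregion channels, which equals one minus the background probability. A minimal sketch, assuming channel 0 is background and the remaining channels are tumor subregions (whole_tumor_field is a hypothetical helper, not a MONAI function):

```python
import torch

def whole_tumor_field(logits, background_index=0):
    """Probability of any tumor class, 1 - P(background), shape (B, 1, D, H, W)."""
    probs = torch.softmax(logits, dim=1)
    return 1.0 - probs[:, background_index : background_index + 1]

torch.manual_seed(0)
logits = torch.randn(2, 4, 8, 8, 8)  # assumed layout: background + 3 subregions

u_wt = whole_tumor_field(logits)
# Same field, built by summing the non-background channels instead
u_sum = torch.softmax(logits, dim=1)[:, 1:].sum(dim=1, keepdim=True)
```

The resulting u_wt has the single-channel shape the Laplacian expects, so it can stand in for the tumor_index slice inside physics_residual.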

Happy to keep helping as you wire it into your training loop!

Rut328 · 2026-06-03T23:08:10Z

Thank you very much. By the way, your idea seems very similar to what is described in this article: https://arxiv.org/pdf/2403.09136
Please confirm whether I am right. I tried to implement it, and I am attaching a link to my code on GitHub. It covers only the idea from the article, without the model; if you want, I will also attach the code for my model (Swin UNETR). Here is the link to my code: https://github.com/Rut328/biophysics_regulariser.git
And again, thank you very much.

Lawson-Darrow · 2026-06-04T00:03:20Z

Yes, that's the paper. What you linked (Zhang et al., MICCAI 2024) is basically the formalized version of what we sketched. The diffusion plus proliferation core is the same D∇²u + ρu(1−u) reaction-diffusion term, with a Dice + λ·PDE + λ·BC total loss and Neumann zero-flux boundaries.

The one real difference is the time derivative I flagged before. We talked about dropping it and going steady-state, since a single scan has no time axis. The paper instead keeps ∂u/∂t by learning a time-conditioned field for u. That's what the SIREN periodic-sine network is doing. It gives you a differentiable u(features, t) so ∂u/∂t exists via autograd. Same physics. They just learn a surrogate for the time axis instead of removing it. So your instinct to reach for that paper was right.

I read through your notebook. The structure matches the paper well. You've got the SIREN estimator, the PDE residual, the Neumann BC, and d and ρ sampled from the biophysical ranges. A few things to fix before it trains correctly.

1. It won't run as-is. A few lines are over-indented by one space and throw IndentationError: the x-normalization line in TumourCellDensityEstimator, the first kernel assignment in get_laplacian_kernel, the if isinstance(d, ...) in BoundaryConditionLoss, and the du_dt = du_dt / ... line in BiophysicsRegulariser.
2. That same du_dt = du_dt / (du_dt.abs() + 1e-6) line is the bigger issue. It normalizes the time derivative down to its sign (±1), so you lose the magnitude. The residual then compares a unit-scale du/dt against physically-scaled diffusion and proliferation terms, which means the PDE term isn't really measuring the PDE anymore. I'd just delete that line.
3. du_dt isn't per-voxel yet. With a scalar t, autograd.grad(u_hat, t_tensor, grad_outputs=ones_like(u_hat)) returns the sum of ∂uᵢ/∂t over all voxels. That's one number, and it gets expanded back to every voxel. The residual wants each voxel's own ∂uᵢ/∂t. The clean fix is to make t a per-voxel or per-sample input and use torch.func.jacrev with vmap so the output's spatial dimension is preserved.
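The scalar-t behaviour is easy to reproduce in isolation. The sketch below uses a toy pointwise field, sin(features + omega * t), as a stand-in for the SIREN estimator; u_of, omega and the shapes are all made up for the demo. It also shows one simpler route that may apply: if the estimator really is pointwise (each voxel's output depends only on that voxel's own inputs and its own t), feeding t as a per-voxel tensor makes the same autograd.grad call return per-voxel derivatives. If the estimator mixes voxels, this shortcut no longer holds and the jacrev/vmap route is the general one.

```python
import torch

torch.manual_seed(0)
features = torch.randn(2, 1, 4, 4, 4)  # stand-in for per-voxel inputs
omega = 3.0

def u_of(features, t):
    # Hypothetical pointwise time-conditioned field, not the notebook's network
    return torch.sin(features + omega * t)

# Scalar t: autograd returns ONE number, the sum of du_i/dt over all voxels.
t_scalar = torch.tensor(0.5, requires_grad=True)
u = u_of(features, t_scalar)
(g_scalar,) = torch.autograd.grad(u, t_scalar, grad_outputs=torch.ones_like(u))

# Per-voxel t: u_i depends only on t_i, so the same call yields du_i/dt_i
# with the spatial shape preserved. create_graph=True keeps the result
# differentiable so it can be used inside a training loss.
t_voxel = torch.full_like(features, 0.5).requires_grad_(True)
u = u_of(features, t_voxel)
(du_dt,) = torch.autograd.grad(
    u, t_voxel, grad_outputs=torch.ones_like(u), create_graph=True
)

analytic = omega * torch.cos(features + omega * 0.5)  # exact derivative of the toy field
```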

One smaller thing. diffusion = d * laplacian equals ∇·(d∇u) only when d is constant in space. Since you're sampling d per voxel, the two differ by a ∇d·∇u term. If you want to match the paper's ∇·(d∇u) exactly with spatially-varying d, take the divergence of d∇u directly rather than d times the Laplacian. If d is effectively constant per forward pass, it's fine as-is.
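For reference, here is one way the divergence form could be discretized, as a sketch rather than the paper's implementation. It assumes a unit grid, averages d onto voxel faces, and sets the flux through the outer faces to zero; div_d_grad is a hypothetical helper name. With constant d it reduces to d times the replicate-padded Laplacian, and with spatially varying d the two disagree.

```python
import torch
import torch.nn.functional as F

def laplacian(u):
    """7-point Laplacian on a unit grid with replicate (zero-flux) padding."""
    k = torch.zeros(1, 1, 3, 3, 3, dtype=u.dtype, device=u.device)
    k[0, 0, 1, 1, 1] = -6.0
    for z, y, x in [(0, 1, 1), (2, 1, 1), (1, 0, 1), (1, 2, 1), (1, 1, 0), (1, 1, 2)]:
        k[0, 0, z, y, x] = 1.0
    return F.conv3d(F.pad(u, (1,) * 6, mode="replicate"), k)

def div_d_grad(u, d):
    """Finite-difference div(d * grad(u)); u and d are both (B, 1, D, H, W)."""
    out = torch.zeros_like(u)
    for dim in (2, 3, 4):
        n = u.shape[dim]
        # Diffusivity and flux on the faces between neighbouring voxels
        d_face = 0.5 * (d.narrow(dim, 0, n - 1) + d.narrow(dim, 1, n - 1))
        flux = d_face * (u.narrow(dim, 1, n - 1) - u.narrow(dim, 0, n - 1))
        # Zero flux through the two outer faces along this axis
        pad = [0] * 6
        pad[2 * (4 - dim)] = pad[2 * (4 - dim) + 1] = 1
        flux = F.pad(flux, pad)
        # Divergence at a voxel = outgoing face flux minus incoming face flux
        out = out + flux.narrow(dim, 1, n) - flux.narrow(dim, 0, n)
    return out

torch.manual_seed(0)
u = torch.rand(1, 1, 6, 6, 6, dtype=torch.float64)
d_const = torch.full_like(u, 0.3)
d_vary = 0.1 + 0.9 * torch.rand_like(u)

same_when_constant = torch.allclose(div_d_grad(u, d_const), 0.3 * laplacian(u))
differs_when_varying = not torch.allclose(div_d_grad(u, d_vary), d_vary * laplacian(u))
```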

Happy to look at the SWIN UNETR model code too whenever you want to wire this in.

Rut328 · 2026-06-04T13:10:57Z

Thank you very much, I really appreciate your help. I am attaching my model here, combined with the physics terms: https://github.com/Rut328/biophysics_regulariser_SwinUNETR.git
(There are two different implementations of the physics there; I still don't know which one is better.) This version of the notebook has not been run. If you want, I can also attach the executed version with its outputs.

The result I got when I ran the model without the physics terms: HD95 6.26 mm, Dice 0.911916.
The result with physics, after five epochs: HD95 5.82 mm, Dice 0.9141.

I have a few questions, and I would really appreciate it if you could answer them. First, I ran the version with physics as fine-tuning. Is that the right way to run it, or is it better to train the model with the physics term from the start? Second, as I said, I only ran five epochs with physics, because during the run I saw things I didn't really understand. For example, the BC loss had the same value throughout each epoch, and over the run it decreased until it reached zero.

I would be very happy if you could go through my code, both the model itself and the integration with the physics. I would really appreciate comments and answers to my questions. And again, thank you very much.

Lawson-Darrow · 2026-06-04T16:01:18Z

Nice, thanks for sharing the notebook. Quick answer first: go with Implementation A. It's the one actually wired into your training loop, and B has a small bug in the z-boundary term (u[:, :, -2] indexes the channel axis instead of the z boundary), though since B isn't used it's not urgent.

Honestly though, the A-vs-B choice matters less than a couple of things both versions share.

The biggest one: the physics term is computed on u_hat, the output of the SIREN MLP sitting on the bottleneck features, rather than on the segmentation itself. So right now it's regularizing that side network instead of your actual tumor prediction. If what you want is for the prediction to look biologically plausible, I'd run the residual on the softmax (or sigmoid) of the SwinUNETR output directly.
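As a rough sketch of what "run the residual on the output directly" could look like for a sigmoid-per-channel head: the function below applies the same stencil to every channel with a depthwise (grouped) convolution. The function name, the D and rho values, and the three-channel layout are placeholders for illustration, not values from your notebook.

```python
import torch
import torch.nn.functional as F

def physics_residual(logits, D=0.1, rho=0.05):
    """mean(R**2), R = D * lap(u) + rho * u * (1 - u), on u = sigmoid(logits)."""
    u = torch.sigmoid(logits)  # (B, C, D, H, W), one soft field per channel
    c = u.shape[1]
    k = torch.zeros(1, 1, 3, 3, 3, dtype=u.dtype, device=u.device)
    k[0, 0, 1, 1, 1] = -6.0
    for z, y, x in [(0, 1, 1), (2, 1, 1), (1, 0, 1), (1, 2, 1), (1, 1, 0), (1, 1, 2)]:
        k[0, 0, z, y, x] = 1.0
    # groups=c applies the stencil to each channel independently
    lap = F.conv3d(F.pad(u, (1,) * 6, mode="replicate"), k.repeat(c, 1, 1, 1, 1), groups=c)
    R = D * lap + rho * u * (1.0 - u)
    return (R ** 2).mean()

torch.manual_seed(0)
# Stand-in for the segmentation network's raw output (assumed 3 sigmoid channels)
logits = torch.randn(1, 3, 8, 8, 8, requires_grad=True)
loss = physics_residual(logits)
loss.backward()  # gradients reach the logits, i.e. the segmentation itself
```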

On the time piece: with a single static volume there's no real time axis to fit, so I'd drop the pseudo-time and the du_dt term. One honest caveat once you do: a steady-state D*laplacian(u) + rho*u*(1-u) isn't really capturing growth dynamics (Fisher-KPP is fundamentally time-dependent, and the steady state mostly drifts toward trivial or saturated solutions). It's better to treat it as a soft spatial shape prior than as literal physics. Real growth dynamics would need longitudinal scans or a properly calibrated time model.
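The "trivial or saturated" point can be checked directly. In this small sketch (residual_mse is a hypothetical helper, with D = rho = 1 chosen only for the demo), an all-background field and an all-tumor field both give a residual of exactly zero, while a flat 0.5 field does not. That is why the term only makes sense next to a data loss.

```python
import torch
import torch.nn.functional as F

def residual_mse(u, D=1.0, rho=1.0):
    """mean(R**2) for R = D * lap(u) + rho * u * (1 - u), replicate padding."""
    k = torch.zeros(1, 1, 3, 3, 3, dtype=u.dtype)
    k[0, 0, 1, 1, 1] = -6.0
    for z, y, x in [(0, 1, 1), (2, 1, 1), (1, 0, 1), (1, 2, 1), (1, 1, 0), (1, 1, 2)]:
        k[0, 0, z, y, x] = 1.0
    lap = F.conv3d(F.pad(u, (1,) * 6, mode="replicate"), k)
    return ((D * lap + rho * u * (1.0 - u)) ** 2).mean().item()

shape = (1, 1, 6, 6, 6)
all_background = residual_mse(torch.zeros(shape))   # u = 0 everywhere
all_tumor = residual_mse(torch.ones(shape))         # u = 1 everywhere
flat_half = residual_mse(torch.full(shape, 0.5))    # Laplacian is 0, reaction term is 0.25
```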

A few smaller things:

- D and rho are resampled randomly per voxel every step, so there's never one consistent PDE to satisfy. Pinning them to sensible constants (or small learnable scalars) fixes that.
- For a constant D, multiplying after the Laplacian is fine. If D ever varies in space, the diffusion term should be div(D*grad(u)) rather than laplacian(D*u).
- The conv3d uses zero padding, which treats everything outside the volume as 0 and makes artificial sinks at the borders. Replicate padding is much closer to the zero-flux (Neumann) behavior you want. And the Laplacian has no voxel-spacing term, so D is effectively in voxel units rather than mm.
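The padding effect shows up even on a perfectly flat field, where the true Laplacian is zero everywhere. A minimal sketch (the 5x5x5 volume and the 0.5 value are arbitrary):

```python
import torch
import torch.nn.functional as F

k = torch.zeros(1, 1, 3, 3, 3)
k[0, 0, 1, 1, 1] = -6.0
for z, y, x in [(0, 1, 1), (2, 1, 1), (1, 0, 1), (1, 2, 1), (1, 1, 0), (1, 1, 2)]:
    k[0, 0, z, y, x] = 1.0

u = torch.full((1, 1, 5, 5, 5), 0.5)  # constant field

# Zero padding: voxels outside the volume count as 0, so border voxels
# see a spurious drop and report a negative Laplacian.
lap_zero = F.conv3d(u, k, padding=1)

# Replicate padding: the outside mirrors the border value, so there is
# no flux across the boundary and the Laplacian stays zero.
lap_rep = F.conv3d(F.pad(u, (1,) * 6, mode="replicate"), k)

corner_zero_pad = lap_zero[0, 0, 0, 0, 0].item()   # three missing neighbours
face_zero_pad = lap_zero[0, 0, 0, 2, 2].item()     # one missing neighbour
centre_zero_pad = lap_zero[0, 0, 2, 2, 2].item()   # interior voxel, unaffected
```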

None of this is far off; mostly it's about pointing the term at the prediction and being realistic about what it represents.

Rut328 · 2026-06-04T20:19:57Z

Thanks again, you are really helping me. I have added a file to the Git repository called biophysics_regulariser_SwinUNETR_fixed.ipynb. I would really appreciate it if you could go through the corrections I made, based on what you told me to do, and tell me whether it is now written correctly (I haven't run it yet). Thank you very much.

Rut328 · 2026-06-05T07:49:04Z

I'm sorry, I didn't give you the link: https://github.com/Rut328/biophysics_regulariser_SwinUNETR.git
The file is called biophysics_regulariser_SwinUNETR_fixed.ipynb. Thank you so much again.

Lawson-Darrow · 2026-06-05T12:39:10Z

Lawson-Darrow
Jun 5, 2026

No worries on the link, I found the fixed notebook. This is looking a lot better, nice work. The important one is fixed: the residual now runs on sigmoid(logits), so it constrains the actual segmentation instead of a side network. You also dropped the time term and went steady-state, made d and rho scalars, switched diffusion to d*laplacian(u), moved to replicate padding, added the dx spacing term, and the boundary loss is now correct on all six faces.

Two cleanups worth doing.

The SIREN MLP (TumourCellDensityEstimator) and the bottleneck hook are now unused, since the loss takes sigmoid(logits) directly. They are still built and their parameters are still in the optimizer. I would remove the estimator, the hook, and those optimizer params so nothing trains or runs that the loss never reads.

d and rho are learnable and the only thing pulling on them is the mean squared residual. With nothing else anchoring them, the optimizer can lower that loss by shrinking or rebalancing d and rho rather than by making u more physical, so the term can quietly satisfy itself. Since seg_loss dominates and the physics weight is 1e-4 this will not derail training, but it leaves d and rho poorly identified and weakens the regularizer. Safer to fix them to constants (your 0.1 and 0.05 are reasonable), or keep them learnable only if you add something that anchors them.
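To make the "fix them to constants" option concrete, here is a minimal sketch of a steady-state residual term with d and rho stored as buffers instead of Parameters. The class name SteadyStateResidual and its arguments are placeholders of mine, not names from the notebook, and it assumes a single-channel logit volume with isotropic spacing dx.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SteadyStateResidual(nn.Module):
    """Mean squared residual of d * laplacian(u) + rho * u * (1 - u).

    Hypothetical helper: the name and defaults are illustrative only.
    """

    def __init__(self, d=0.1, rho=0.05, dx=1.0):
        super().__init__()
        # 7-point 3D Laplacian stencil, scaled by 1 / dx**2 (isotropic spacing assumed)
        k = torch.zeros(1, 1, 3, 3, 3)
        k[0, 0, 1, 1, 1] = -6.0
        for z, y, x in [(0, 1, 1), (2, 1, 1), (1, 0, 1), (1, 2, 1), (1, 1, 0), (1, 1, 2)]:
            k[0, 0, z, y, x] = 1.0
        # Buffers move with .to(device) and are saved in state_dict, but they are
        # not returned by parameters(), so the optimizer never updates them.
        self.register_buffer("kernel", k / dx**2)
        self.register_buffer("d", torch.tensor(float(d)))
        self.register_buffer("rho", torch.tensor(float(rho)))

    def forward(self, logits):
        # logits: (B, 1, D, H, W); the residual is taken on the probability field
        u = torch.sigmoid(logits)
        u_pad = F.pad(u, (1, 1, 1, 1, 1, 1), mode="replicate")
        lap = F.conv3d(u_pad, self.kernel)
        residual = self.d * lap + self.rho * u * (1.0 - u)
        return residual.pow(2).mean()
```

Because the module exposes no parameters, there is nothing for the optimizer to shrink or rebalance, and the only way to lower the term is to change u. If you do want learnable d and rho, what anchors them (a prior toward the initial values, for example) is a separate design choice that this sketch does not cover.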

Minor: lambda1 and lambda2 are small and fixed, which is fine. If the physics term ever fights Dice early in training you can ramp it in over the first epochs.
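If you do want the ramp, a linear warm-up is enough. This is only a sketch: physics_weight and warmup_epochs are names I made up, and the warm-up length is something you would tune.

```python
def physics_weight(epoch, target, warmup_epochs=5):
    """Linearly ramp the physics weight from 0 to `target` over `warmup_epochs`.

    Hypothetical helper; `epoch` is assumed to start at 0.
    """
    if warmup_epochs <= 0:
        return target
    return target * min(1.0, epoch / warmup_epochs)
```

In the training loop this would be used as loss = loss_seg + physics_weight(epoch, lambda1) * loss_pde, so Dice drives the first epochs alone and the prior phases in afterwards.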

0 replies

Rut328 · 2026-06-06T22:45:14Z

Rut328
Jun 6, 2026
Author

I uploaded a corrected file based on what you told me. It is called biophysics_regulariser_SwinUNETR_fixed (2).ipynb. I would really appreciate it if you could go through it and tell me if it is good now. By the way, the changes you suggested follow your solution, not what is presented in the article. Thank you very much.

0 replies

Rut328 · 2026-06-06T22:50:43Z

Rut328
Jun 6, 2026
Author

I forgot to mention: I ran this version, biophysics_regulariser_SwinUNETR_fixed (2).ipynb, and it showed no improvement in HD95 at all. DICE increased, but not significantly.

0 replies

Lawson-Darrow · 2026-06-07T15:08:57Z

Lawson-Darrow
Jun 7, 2026

The cleanup looks right: the SIREN estimator and the bottleneck hook are gone, d and rho are fixed constants now, and the optimizer only holds the model parameters. One harmless leftover: clip_grad_norm_ still lists biophy_regulariser.parameters(), which is now empty, so it does nothing and can be dropped.

On the results, the flat HD95 is expected, and it's worth being clear why. What you have is a plausibility prior on the probability field, not a boundary-accuracy objective. The reaction part rho*u*(1-u) is largest where the probabilities are uncertain, and the Laplacian part just penalizes curvature in the field, so together they act as a smoothness prior. That can move DICE a little, since DICE is a volume overlap, but HD95 is a tail boundary-distance metric and nothing here optimizes boundary placement, so it stays flat.

Two things specifically working against you:

- The boundary term runs on the faces of the 128^3 training crop, not on anatomical boundaries. Those faces are arbitrary RandCropByPosNegLabel windows, so you're penalizing differences in u at random crop edges, which isn't meaningful and can mildly hurt.
- lambda1=50 is worth checking against your logged loss magnitudes. Depending on the residual scale, the PDE term may still be tiny, or it may be large enough to over-smooth. Comparing loss_pde against loss_seg in your printouts will tell you which.
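For the magnitude check, a one-line helper is enough. This is a sketch with a made-up name (weighted_ratio); loss_pde and loss_seg stand for whatever scalars your loop already logs.

```python
def weighted_ratio(loss_pde, loss_seg, lambda1):
    """Ratio of the weighted PDE term to the segmentation loss.

    Hypothetical helper. Accepts Python floats or 0-d tensors.
    """
    return (lambda1 * float(loss_pde)) / max(float(loss_seg), 1e-12)
```

Logged once per epoch, a ratio far below 1 suggests the physics term is close to inert, while a ratio near or above 1 suggests it is competing with Dice and may be over-smoothing. Where you draw the line is a judgment call for your data.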

If HD95 is the metric you care about then a soft physics prior is the wrong tool. Losses that target the boundary directly move HD95: boundary/surface loss (Kervadec et al.), a Hausdorff-distance-based loss, or suppressing small false-positive components (usually what drives the 95th percentile up).
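Of the three, suppressing small components is the cheapest to try, since it is a post-processing step on the binarized prediction and needs no retraining. Here is a rough sketch using scipy.ndimage. The function name and the min_voxels threshold are placeholders of mine, and the threshold would need tuning on validation data.

```python
import numpy as np
from scipy import ndimage


def remove_small_components(mask, min_voxels=50):
    """Drop connected foreground components smaller than `min_voxels`.

    Hypothetical helper; `mask` is a binary 3D array for one class.
    """
    mask = np.asarray(mask).astype(bool)
    labeled, num = ndimage.label(mask)  # default structure: face connectivity
    if num == 0:
        return mask
    sizes = np.bincount(labeled.ravel())  # sizes[0] counts the background
    keep = sizes >= min_voxels
    keep[0] = False  # never turn the background into foreground
    return keep[labeled]
```

Small false-positive islands far from the tumor add large distances to the tail of the surface-distance distribution, which is why removing them can lower HD95 with little effect on Dice.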

The fixes we discussed made the regularizer correct, but a steady-state Fisher-KPP prior on an already-strong fine-tuned model isn't expected to improve boundary metrics much.

0 replies

Rut328 · 2026-06-07T18:33:19Z

Rut328
Jun 7, 2026
Author

Let me explain why I brought up HD95. The first variation I showed you, biophysics_regulariser_SwinUNETR.ipynb, where I followed the idea of the article (https://arxiv.org/pdf/2403.09136) more closely, gave this improvement:

Model without physics: HD95 6.26 mm, DICE 0.9011
Model with the physics from the article: HD95 5.82 mm, DICE 0.9141

These results were after one epoch. When I continued this run, HD95 rose to 6.18 mm with only a very slight improvement in DICE (0.9152), so I assumed the initial gain was not real. What do you recommend I do now: try to make biophysics_regulariser_SwinUNETR.ipynb more precise, or go further with your idea (the one I named with the suffix fixed (2))? I would love your help because I'm really stuck. Could the problem be in my initial model, the one without physics at all? I ask because I don't see any significant improvement. Thank you very much for the help.

0 replies

Rut328 · 2026-06-10T08:13:43Z

Rut328
Jun 10, 2026
Author

Hi again, I have a question that I would really appreciate your answer to. I showed you that there was an increase in DICE and a decrease in HD95, which suggests the physics helped. I ran tests, and most of the images improved with the physics, but some got worse. Does that make sense? A law of physics holds for everyone, right? If adding physics reduced the accuracy on some images, can I conclude that the law is wrong, or that my implementation is wrong? Or is this expected? Thank you very much.

0 replies

Lawson-Darrow · 2026-06-10T14:45:17Z

Lawson-Darrow
Jun 10, 2026

Yes, that makes sense. Even if Fisher-KPP is a useful tumor growth model, a soft residual penalty is not the physical law itself.

In your setup the term is a generic prior. It uses one fixed d and rho for every patient. It assumes a steady state and isotropic diffusion. And it is applied to a segmentation output rather than to a real tumor cell density. Those assumptions fit some cases better than others. When they fit, the term helps. When they do not, it can pull the prediction away from the image evidence.

So a per-image regression does not by itself mean the model is physically wrong or that your implementation is wrong. It means the physics term acts as a population-level bias and lambda sets the tradeoff. Some cases can lose while the average improves.

On the HD95 numbers, going 6.26 to 5.82 after one epoch and then back to 6.18 is not strong evidence of a stable gain yet. It could be checkpoint noise or run-to-run variation or just HD95 sensitivity. I would compare physics against no physics at matched checkpoints on the same validation set, ideally over several seeds. The most useful diagnostic is a per-image paired difference. Take physics minus baseline on the same images, then look at the mean and median and the spread across seeds. If the average improves while a few cases regress, that is normal regularizer behavior. If the same anatomy or failure mode regresses repeatedly, check the residual scaling and sign and boundary handling and whether the fixed parameters are too restrictive.
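The paired difference can be computed with the standard library alone. This is a sketch, assuming you have per-image metrics stored as dicts keyed by case ID; paired_summary and the dict layout are my own placeholders.

```python
from statistics import mean, median


def paired_summary(physics, baseline, lower_is_better=True):
    """Summarise per-image differences (physics minus baseline) on matched cases.

    Hypothetical helper. `physics` and `baseline` map case ID -> metric value.
    """
    common = sorted(set(physics) & set(baseline))
    diffs = [physics[k] - baseline[k] for k in common]
    if lower_is_better:  # e.g. HD95: a positive difference is a regression
        regressed = [k for k, d in zip(common, diffs) if d > 0]
    else:  # e.g. Dice: a negative difference is a regression
        regressed = [k for k, d in zip(common, diffs) if d < 0]
    return {
        "n": len(common),
        "mean": mean(diffs),
        "median": median(diffs),
        "regressed": regressed,
    }
```

Run it once per seed and compare the "regressed" lists. If the same case IDs show up every time, that points at a systematic failure mode, not noise.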

On which direction to take, I would not choose based on the one-epoch numbers from either notebook. The article version is more expressive than fixed (2), but both are still soft physics regularizers on an already strong segmentation model. I would expect limited HD95 headroom from either compared with a loss that directly targets boundaries. I do not see evidence that your no-physics baseline is the problem. Run a paired comparison against the same no-physics baseline and check whether any physics variant beats it by more than the run-to-run spread. If one does, keep it. If none does and HD95 is the target, a boundary or surface loss is the more direct lever, as I mentioned earlier.
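One crude way to phrase "beats it by more than the run-to-run spread" in code is below. This is a rule of thumb and not a statistical test. The name beats_spread is a placeholder of mine, and the inputs are assumed to be one validation metric per seed.

```python
from statistics import mean, stdev


def beats_spread(variant_runs, baseline_runs, lower_is_better=True):
    """True if the mean gain over baseline exceeds the baseline's seed-to-seed std.

    Hypothetical helper; needs at least two baseline runs to estimate the spread.
    """
    gain = mean(baseline_runs) - mean(variant_runs)
    if not lower_is_better:
        gain = -gain
    return gain > stdev(baseline_runs)
```

With only a handful of seeds the spread estimate is itself noisy, so treat a marginal True as inconclusive.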

0 replies

Rut328 · 2026-06-12T10:04:48Z

Rut328
Jun 12, 2026
Author

I really appreciate your help. I'm trying to run it again with more epochs and with some changes to lambda, but I'm not succeeding at all. All I get is a very small DICE increase of approximately 0.002, and at the same time HD95 gets worse than in my original version without any physics. I also tried the version that did work for me, where HD95 dropped to 5.82, but even there, if I run more epochs, HD95 gets worse and DICE only increases by about 0.002. So what do you think is the best thing for me to do now? Give up? Stop here and report this as the improvement I managed to make (the version with HD95 = 5.82)? Or is there anything else to try? I would appreciate your answer. Thanks again.

0 replies
