Add experimental Windows wheel build support #1359

Open
zym1998year wants to merge 3 commits into GalSim-developers:releases/2.8 from
zym1998year:windows-msvc-port

Draft PR against releases/2.8 (baseline ee177aa50, v2.8.4).
Branch on file: windows-msvc-port.

Status

  • Verified locally: setup.py build_clib + build_ext, pip wheel,
    delvewheel repair, fresh-venv install (no conda PATH), import galsim, Gaussian.drawImage, FFT-backed convolve,
    BaseDeviate(42) determinism, hsm.EstimateShear.
  • Not addressed in this PR: Windows-aware multiprocessing config
    flow, download_cosmos symlink fallback, full Windows pytest pass,
    GitHub Actions CI. See Follow-ups.
  • Windows-specific logic is gated by IS_WINDOWS branches (setup.py)
    and _WIN32 guards (C++). The portable rewrites (os.pathsep,
    or -> ||, VLA -> std::vector, Stopwatch.h) also compile on
    Linux / macOS and are intended to preserve GCC/Clang behaviour. I
    have not re-run the existing CI from this branch.

Summary

Three small commits make GalSim build, install, and import on native
Windows x64 with the MSVC toolchain. The result is a self-contained
win_amd64 wheel that installs into a fresh venv with no conda
dependency at runtime, and that produces correct numbers on a small
smoke surface (drawImage, FFT convolve, random determinism, HSM
EstimateShear).

The PR stays on the minimal-diff setuptools path. A
scikit-build-core rewrite is a larger discussion this PR does not
attempt.

What this PR changes

| # | Commit | Files | Diff |
|---|--------|-------|------|
| 1 | Make setup metadata and path handling Windows-safe | setup.py | +14 / -6 |
| 2 | Add MSVC and Windows FFTW/Eigen build support | setup.py | +190 / -84 |
| 3 | Port GalSim C++ sources for MSVC | 8 files in include/, src/ | +93 / -65 |

Branch tip: 9f4db829c.

Per-commit detail

1. setup metadata and paths

setup.py only. Four narrow changes:

  • Guard the from setuptools.command.test import test import.
    setuptools 72.0+ removed it; the cmdclass below never registered a
    test command.
  • Add IS_WINDOWS = sys.platform == 'win32'.
  • cpp_sources.remove('src/mmgr.cpp') -> compare via
    os.path.normpath, so the glob-returned src\mmgr.cpp on Windows
    still matches.
  • ':' -> os.pathsep when splitting LIBRARY_PATH,
    LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, C_INCLUDE_PATH, and the
    installed-script PATH check. Avoids treating drive-letter colons as
    separators.
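
The path-handling changes above can be sketched roughly as follows. This is a minimal illustration, not the code in setup.py: the helper names split_env_paths and drop_source are hypothetical, and ntpath is used only so the Windows behaviour can be shown on any OS.

```python
import ntpath
import os

# Guarded import: tolerate setuptools releases that no longer ship the module.
try:
    from setuptools.command.test import test as TestCommand
except ImportError:
    TestCommand = None


def split_env_paths(value, sep=os.pathsep):
    """Split a PATH-style variable on the platform separator, dropping empties."""
    return [p for p in value.split(sep) if p]


def drop_source(sources, unwanted, pathmod=os.path):
    """Return `sources` without `unwanted`, comparing normalized paths."""
    target = pathmod.normpath(unwanted)
    return [s for s in sources if pathmod.normpath(s) != target]


# Windows-style inputs, evaluated with ';' and ntpath (hypothetical values).
print(split_env_paths(r"C:\fftw\lib;D:\eigen\lib", sep=";"))
print(drop_source([r"src\mmgr.cpp", r"src\Image.cpp"], "src/mmgr.cpp", pathmod=ntpath))
```

Splitting the same string on ':' would cut each drive letter away from its path, which is the failure the os.pathsep change avoids.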

2. MSVC and Windows FFTW / Eigen

setup.py only. The Windows build path:

  • copt / lopt gain an 'msvc' entry: /O2 /std:c++14 /EHsc /openmp /Zc:__cplusplus /utf-8 /DNOMINMAX.
  • get_compiler_type short-circuits via getattr(compiler, 'compiler_type', None) == 'msvc' before touching compiler_so
    (Unix-only on MSVCCompiler).
  • try_compile adds an MSVC branch using compiler.compile() /
    compiler.link_executable() instead of constructing cc -c -o
    command lines.
  • fix_compiler skips ccache, -msse2, -stdlib=libc++ removal, and
    the -L linker_so rewrite on MSVC; forces single-process compile on
    Windows (parallel_compile is shaped around the Unix invocation
    pattern).
  • find_fftw_lib searches %CONDA_PREFIX%\Library\lib, vcpkg's
    installed/x64-windows/lib, and accepts fftw3.lib /
    libfftw3.lib / libfftw3-3.lib. The ctypes.cdll.LoadLibrary
    probe is skipped on Windows because the resolved path is the import
    library, not a runnable DLL.
  • find_eigen_dir gets the same conda / vcpkg additions for include
    directories.
  • The Extension uses libraries=['fftw3'] on Windows, keeps
    extra_link_args=['-lfftw3'] on the GCC/Clang path.
  • my_build_ext.run skips GALSIM_BUILD_SHARED on Windows: that
    codepath uses os.symlink and .so / .dylib naming.
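
As a rough sketch of the FFTW lookup described above, the search can be thought of as "candidate directories times accepted import-library names". The function names are hypothetical, and reading the vcpkg root from a VCPKG_ROOT variable and the FFTW prefix from FFTW_DIR are assumptions for illustration; the actual find_fftw_lib may locate them differently.

```python
import os
import tempfile

# Import-library names listed in the PR description.
FFTW_LIB_NAMES = ("fftw3.lib", "libfftw3.lib", "libfftw3-3.lib")


def windows_fftw_dirs(env):
    """Candidate lib directories (FFTW_DIR and VCPKG_ROOT handling is assumed)."""
    dirs = []
    if env.get("FFTW_DIR"):
        dirs.append(os.path.join(env["FFTW_DIR"], "lib"))
    if env.get("CONDA_PREFIX"):
        dirs.append(os.path.join(env["CONDA_PREFIX"], "Library", "lib"))
    if env.get("VCPKG_ROOT"):
        dirs.append(os.path.join(env["VCPKG_ROOT"], "installed", "x64-windows", "lib"))
    return dirs


def find_fftw_import_lib(env):
    """Return the first existing import library, or None. No LoadLibrary probe."""
    for directory in windows_fftw_dirs(env):
        for name in FFTW_LIB_NAMES:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None


# Demo against a throwaway directory laid out like a conda environment.
with tempfile.TemporaryDirectory() as prefix:
    lib_dir = os.path.join(prefix, "Library", "lib")
    os.makedirs(lib_dir)
    open(os.path.join(lib_dir, "fftw3.lib"), "w").close()
    print(find_fftw_import_lib({"CONDA_PREFIX": prefix}))
```

The sketch only checks that the file exists. That mirrors the point above that an import library cannot be loaded with ctypes.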

3. C++ sources for MSVC

8 files:

  • include/galsim/Std.h: #define NOMINMAX before <Windows.h> so
    std::min / std::max (and Eigen members) survive.
  • include/galsim/Stopwatch.h: switch from gettimeofday to
    std::chrono::steady_clock. No GalSim TU includes this header
    today; the rewrite is hygiene.
  • src/Image.cpp, src/Polygon.cpp: replace alternative token or
    with ||. MSVC parses or as an identifier unless /Za (or
    /permissive-) is set or <ciso646> is included.
  • src/Random.cpp: split the POSIX <sys/time.h> / <unistd.h> /
    <fcntl.h> includes behind #ifndef _WIN32; add _WIN32 branches
    for seedurandom() (std::random_device, which delegates to
    BCryptGenRandom on the Windows CRT) and seedtime()
    (std::chrono::system_clock with the same microsecond-modulo seed).
  • src/SBInterpolatedImage.cpp, src/WCS.cpp, src/SBTransform.cpp:
    replace GCC variable-length arrays with std::vector<...>; pass
    .data() to consumers expecting raw pointers (Horner2D,
    memset/std::fill, reinterpret_cast onto std::complex<T>,
    KValueInnerLoop).

Verification

Built and tested on Windows 11 x64, Python 3.11.15 (conda-forge), VS
2026 Community (MSVC 14.50.35717), conda-forge fftw 3.3.10 /
eigen 3.4.0 / pybind11 3.0.3.

Build / wheel

python setup.py build_clib -j1 build_ext -j1                # 0 errors
python -m pip wheel . -w dist -v --no-deps                  # 0 errors
python -m delvewheel show dist\GalSim-2.8.4-cp311-...whl    # see table
python -m delvewheel repair --add-path %CONDA_PREFIX%\Library\bin ^
                            -w wheelhouse dist\GalSim-2.8.4-cp311-...whl

| Bundled into wheel | Assumed present on host |
|--------------------|-------------------------|
| fftw3.dll | vcruntime140.dll, vcruntime140_1.dll |
| msvcp140.dll | python311.dll, kernel32.dll, api-ms-win-crt-* |
| vcomp140.dll | |

Fresh-venv smoke

python -m venv .venv-clean
:: PATH set to .venv-clean\Scripts;System32;... (no conda Library\bin)
.venv-clean\Scripts\pip install <repaired wheel> LSSTDESC.Coord astropy
.venv-clean\Scripts\python phase3_smoke.py

| Test | Result |
|------|--------|
| import galsim | 2.8.4, _galsim.cp311-win_amd64.pyd resolved from venv site-packages |
| Gaussian(sigma=1.2).drawImage(nx=16, ny=16, scale=0.2) | shape (16, 16), sum 0.6684 |
| FFT draw Convolve([Sersic(n=2.5), Gaussian]) | sum 0.9737 (flux=1 retained to within ~3%) |
| BaseDeviate(42) first GaussianDeviate sample | -0.23898784173300305, reproducible across instances |
| hsm.EstimateShear on g1=0.2, g2=0.1 | status=0, e1=0.381, e2=0.190 |

The cleaned PATH does not include %CONDA_PREFIX%\Library\bin, so
the bundled fftw3.dll is the one being loaded.

Reproduction transcript (full local commands)
:: From a Developer Command Prompt (vcvars64) with the conda env active
set CMAKE_BUILD_PARALLEL_LEVEL=
set FFTW_DIR=%CONDA_PREFIX%\Library
python setup.py build_clib -j1 build_ext -j1
python -m pip install -U pip build wheel delvewheel
python -m pip wheel . -w dist -v --no-deps
python -m delvewheel show dist\GalSim-2.8.4-cp311-cp311-win_amd64.whl
python -m delvewheel repair --add-path %CONDA_PREFIX%\Library\bin ^
                            -w wheelhouse ^
                            dist\GalSim-2.8.4-cp311-cp311-win_amd64.whl

python -m venv .venv-clean
set "PATH=%cd%\.venv-clean\Scripts;%SystemRoot%\System32;%SystemRoot%;%SystemRoot%\System32\Wbem"
set "CONDA_PREFIX="
set "FFTW_DIR="
.venv-clean\Scripts\python -m pip install -U pip wheel
.venv-clean\Scripts\python -m pip install ^
    wheelhouse\GalSim-2.8.4-cp311-cp311-win_amd64.whl LSSTDESC.Coord astropy
.venv-clean\Scripts\python phase3_smoke.py

phase3_smoke.py runs the five tests in the table above and exits
non-zero on any failure.
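
For readers without access to phase3_smoke.py, the "run every check, exit non-zero on any failure" pattern can be sketched as below. This is not the author's script: run_checks and the two check functions are hypothetical, they cover only two of the five table rows, and galsim is imported lazily so the harness itself has no dependencies.

```python
import traceback


def check_draw_gaussian():
    import galsim  # lazy import: only needed when the check actually runs
    image = galsim.Gaussian(sigma=1.2).drawImage(nx=16, ny=16, scale=0.2)
    assert image.array.shape == (16, 16)


def check_rng_determinism():
    import galsim
    first = galsim.GaussianDeviate(galsim.BaseDeviate(42))()
    second = galsim.GaussianDeviate(galsim.BaseDeviate(42))()
    assert first == second


def run_checks(checks):
    """Run each (name, callable) pair; return 0 if all pass, 1 otherwise."""
    failures = 0
    for name, check in checks:
        try:
            check()
            print(f"PASS {name}")
        except Exception:
            failures += 1
            print(f"FAIL {name}")
            traceback.print_exc()
    return 1 if failures else 0


GALSIM_CHECKS = [
    ("Gaussian.drawImage", check_draw_gaussian),
    ("BaseDeviate(42) determinism", check_rng_determinism),
]
# A script would typically end with: sys.exit(run_checks(GALSIM_CHECKS))
```

Catching exceptions per check lets one failing test be reported without hiding the results of the others.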

Wheel notes

  • Bundled DLLs come from conda-forge: fftw3 from
    conda-forge::fftw, msvcp140 / vcomp140 from the env's MSVC
    redistributable mirror.
  • FFTW is GPL-licensed; redistributing fftw3.dll inside a wheel is
    permitted under GPL-2.0-or-later but places the binary distribution
    under GPL terms. If a permissively licensed binary distribution is
    preferred, one alternative is to ship two flavours (FFTW vs. KissFFT
    or pocketfft); out of scope for this PR. Reviewer input requested.
  • vcomp140.dll is the older Microsoft OpenMP runtime;
    /openmp:llvm would pull libomp.dll instead. Reviewer input
    requested.
  • Wheel size in this branch: ~5 MB, dominated by the bundled FFTW.
  • The wheel is cp311-cp311-win_amd64. cibuildwheel will produce one
    wheel per Python minor version.

Follow-ups

Deliberately not in this PR. Each is a candidate for a separate PR
once this one lands.

  • galsim/config/util.py uses multiprocessing.get_context('fork'),
    which Windows lacks. Needs a spawn fallback plus an audit of
    config processing / phase-screen code for picklability under
    spawn. Not exercised by the smoke tests above; surfaces as soon as
    a config-driven run hits the multiprocessing branch.
  • galsim/download_cosmos.py uses os.symlink, which non-elevated
    Windows users cannot create unless Developer Mode is on. Needs
    copy / directory-junction / --nolink fallback.
  • Test markers: sweep tests/ for fork-only or symlink-only
    assumptions, mark with
    pytest.mark.skipif(sys.platform == 'win32', reason=...)
    where appropriate, and run the full pytest on windows-latest.
  • GitHub Actions Windows CI + cibuildwheel matrix
    (cp39 .. cp313, win_amd64). I have a working local recipe but want
    to align with project preferences (FFTW source, OpenMP runtime,
    delvewheel command) before sending it; see
    Reviewer questions below.
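
The first two follow-ups could take roughly the shape below. This is a sketch under stated assumptions, not a proposal for the actual patch: get_mp_context and link_or_copy are hypothetical helpers, the copy fallback is only one of the options listed above, and dst is assumed not to exist yet.

```python
import multiprocessing
import os
import shutil


def get_mp_context(preferred="fork"):
    """Use the preferred start method when available, otherwise fall back to spawn."""
    available = multiprocessing.get_all_start_methods()
    method = preferred if preferred in available else "spawn"
    return multiprocessing.get_context(method)


def link_or_copy(src, dst):
    """Symlink dst -> src; copy instead when the symlink cannot be created."""
    try:
        os.symlink(src, dst, target_is_directory=os.path.isdir(src))
        return "symlink"
    except (OSError, NotImplementedError):
        # e.g. a non-elevated Windows user without Developer Mode
        if os.path.isdir(src):
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)
        return "copy"


print(get_mp_context().get_start_method())
```

Under spawn, worker arguments and targets must be picklable, which is why the picklability audit mentioned above is still needed. Choosing the context this way does not remove that requirement.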

Risk and compatibility

  • Linux / macOS: Windows-specific changes are gated by IS_WINDOWS
    (in setup.py) or _WIN32 (in C++); the os.pathsep, or -> ||,
    Stopwatch.h, and VLA -> std::vector rewrites apply on every
    platform. I expect no behavioural regression there but did not
    re-run CI for those platforms from this branch.
  • VLA -> std::vector in SBInterpolatedImage.cpp, WCS.cpp,
    SBTransform.cpp: adds one heap allocation per call. I expect the
    per-element work inside the affected loops to dominate that cost,
    but I have not benchmarked. If micro-benchmarks regress noticeably,
    an alternative is _alloca / alloca on the Windows path with VLAs
    preserved on GCC. I went with std::vector for cross-platform
    clarity.
  • Random.cpp Windows seeding uses std::random_device. On
    MSVC's CRT this delegates to the OS CSPRNG, matching the
    /dev/urandom contract. POSIX builds are unchanged.
  • MSVC toolset vs runtime mismatch: delvewheel warns when the
    bundled msvcp140.dll is older than the toolset that built the
    .pyd. I built locally with VS 2026 (14.50) and bundled
    conda-forge's 14.44 redistributable. The wheel still loaded and
    passed the smoke tests; CI on windows-latest (VS 2022 14.4x)
    would not see this warning. No code change required.

Reviewer questions

  1. FFTW source for the binary wheel: conda-forge, vcpkg-built,
    or fftw.org pre-built. Affects the GPL-redistribution note.
  2. /openmp (MS) vs /openmp:llvm. The latter aligns better
    with modern OpenMP but adds a libomp.dll redistribution step.
  3. VLA -> std::vector change: keep it global (current PR), or
    keep VLAs on GCC and switch only on MSVC.
  4. PR split: land this PR first (build / wheel / import) and
    submit multiprocessing / symlink / Windows test markers
    separately, or fold in.
  5. CI: I can send a windows-latest + cibuildwheel workflow as
    a follow-up PR once questions 1-2 are resolved. A draft strategy
    doc is available locally if helpful.

Happy to split, rebase, or extend as the project prefers.

setuptools 72.0+ removed setuptools.command.test; guard the import.
Add IS_WINDOWS sentinel for subsequent platform branches. Normalize
the mmgr.cpp source-list match via os.path.normpath so glob's
backslash paths still resolve. Use os.pathsep when splitting
LIBRARY_PATH / C_INCLUDE_PATH / PATH-style env vars so the colon in
Windows drive letters isn't treated as a separator.
- copt/lopt: MSVC entry with /O2 /std:c++14 /EHsc /openmp
  /Zc:__cplusplus /utf-8 /DNOMINMAX (/openmp selects MSVC's older
  OpenMP runtime but suffices for the loops GalSim parallelizes today).
- get_compiler_type: short-circuit MSVC via compiler_type before
  touching compiler_so (Unix-only attribute).
- try_compile: route MSVC probes through compiler.compile() and
  compiler.link_executable() instead of hand-built cc -c / -o lines.
- fix_compiler: skip ccache, -msse2, -stdlib=libc++ removal and
  linker_so editing on MSVC; force single-process compile on Windows
  (parallel_compile pool path is Unix-shaped; MSVC can later regain
  per-extension parallelism via /MP).
- find_fftw_lib: search %CONDA_PREFIX%\Library\lib and the vcpkg
  layout, accept fftw3.lib / libfftw3.lib / libfftw3-3.lib, and skip
  ctypes.LoadLibrary on Windows (the located file is the import
  library, not the runtime DLL).
- find_eigen_dir: same conda/vcpkg additions for include dirs.
- Extension: use libraries=['fftw3'] on Windows; -lfftw3 stays for
  GCC/Clang.
- my_build_ext.run: skip GALSIM_BUILD_SHARED on Windows; that path
  uses os.symlink and bakes in .so/.dylib naming.
- include/galsim/Std.h: define NOMINMAX before <Windows.h> so std::min
  / std::max (and Eigen members) survive unshadowed.  The MSVC build
  also passes /DNOMINMAX, but defining it here protects direct header
  consumers.
- include/galsim/Stopwatch.h: switch to std::chrono::steady_clock from
  gettimeofday.  No GalSim TU includes this header today, but keeping
  the public surface portable avoids surprises.
- src/Image.cpp, src/Polygon.cpp: replace alternative tokens (`or`)
  with `||`.  MSVC parses these as identifiers without /Za + ciso646.
- src/Random.cpp: split the POSIX <sys/time.h> / <unistd.h> / <fcntl.h>
  includes behind #ifndef _WIN32 and add _WIN32 branches for
  seedurandom() (std::random_device, which delegates to
  BCryptGenRandom/CryptGenRandom on Windows CRTs) and seedtime()
  (std::chrono::system_clock with the same microsecond-modulo seed).
- src/SBInterpolatedImage.cpp, src/WCS.cpp, src/SBTransform.cpp:
  replace GCC variable-length arrays with std::vector<...>.  Pass
  .data() to consumers expecting raw pointers (Horner2D, memset,
  reinterpret_cast onto std::complex<T>, KValueInnerLoop).  Observable
  behaviour is unchanged: each call now allocates on the heap instead
  of the stack, and the inner loops are expected to dominate that
  allocation cost.