feat: Add DataFrame support in dy.Collection #335

Merged
Oliver Borchert (borchero) merged 17 commits into Quantco:main from gab23r:dataframe-support-collection on May 24, 2026

Conversation

@gab23r gab23r commented Apr 27, 2026

Contributor

Motivation

Closes #319

Changes

  • Add an is_lazy attribute to MemberInfo.
  • Add lazy_members and eager_members to dy.Collection.
  • Internals still work on LazyFrames via the new _to_lazy_dict method.
  • Modify Collection._init to collect the DataFrame in the eager case.
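
As a rough illustration of the first two bullets, the sketch below derives an `is_lazy` flag from member annotations and filters members on it. It uses stand-in classes only: `LazyFrame`, `DataFrame`, the reduced `MemberInfo`, and the `Billing` example are all hypothetical and are not dataframely's actual implementation. Optional members are ignored here for brevity.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, get_origin, get_type_hints

S = TypeVar("S")


class LazyFrame(Generic[S]):
    """Stand-in for dy.LazyFrame[...]; not the real class."""


class DataFrame(Generic[S]):
    """Stand-in for dy.DataFrame[...]; not the real class."""


@dataclass(frozen=True)
class MemberInfo:
    # Hypothetical reduction of the real MemberInfo to the one new flag.
    is_lazy: bool


class Collection:
    @classmethod
    def members(cls) -> dict[str, MemberInfo]:
        # Assumption: a member counts as lazy unless annotated DataFrame[...].
        hints = get_type_hints(cls)
        return {
            name: MemberInfo(is_lazy=get_origin(tp) is not DataFrame)
            for name, tp in hints.items()
        }

    @classmethod
    def lazy_members(cls) -> dict[str, MemberInfo]:
        return {k: v for k, v in cls.members().items() if v.is_lazy}

    @classmethod
    def eager_members(cls) -> dict[str, MemberInfo]:
        return {k: v for k, v in cls.members().items() if not v.is_lazy}


class InvoiceSchema:
    """Placeholder schema for the example."""


class PaymentSchema:
    """Placeholder schema for the example."""


class Billing(Collection):
    invoices: LazyFrame[InvoiceSchema]  # stays lazy
    payments: DataFrame[PaymentSchema]  # treated as eager
```

With these stand-ins, `Billing.lazy_members()` contains only `invoices` and `Billing.eager_members()` contains only `payments`.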

codecov Bot commented Apr 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (b339296) to head (245f0f4).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #335   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           56        56           
  Lines         3404      3427   +23     
=========================================
+ Hits          3404      3427   +23     

Copilot AI left a comment

Contributor

Pull request overview

Adds first-class support for eager dy.DataFrame[...] members in dy.Collection, enabling collections to expose Polars DataFrames directly (instead of always returning LazyFrames).

Changes:

  • Extend member metadata with an is_lazy flag and add lazy_members() / eager_members() helpers.
  • Introduce internal _to_lazy_dict() and switch internal operations (join/collect_all/storage/filter-result collection) to use it.
  • Update collection initialization to eagerly .collect() members annotated as dy.DataFrame[...], plus add dedicated tests for eager/lazy/mixed collections.
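
The interplay between eager initialization and `_to_lazy_dict()` described above can be sketched roughly as follows. Everything here is a stand-in under stated assumptions: `ToyDataFrame`, `ToyLazyFrame`, `MixedCollection`, its `EAGER` set, and the `row_counts` helper are hypothetical and only mimic the shape of the change; the toy frames imitate the `.lazy()` / `.collect()` pair that Polars frames provide.

```python
class ToyDataFrame:
    """Stand-in for an eager frame; holds its rows directly."""

    def __init__(self, rows):
        self.rows = list(rows)

    def lazy(self):
        # Wrap the already-materialized rows in a deferred computation.
        return ToyLazyFrame(lambda: self.rows)


class ToyLazyFrame:
    """Stand-in for a lazy frame; work is deferred until collect()."""

    def __init__(self, compute):
        self._compute = compute

    def lazy(self):
        return self

    def collect(self):
        return ToyDataFrame(self._compute())


class MixedCollection:
    # Hypothetical: names of the members annotated as eager.
    EAGER = {"payments"}

    def __init__(self, data):
        # Mirrors the described init change: only eager members are collected.
        self._names = list(data)
        for name, frame in data.items():
            value = frame.collect() if name in self.EAGER else frame
            setattr(self, name, value)

    def _to_lazy_dict(self):
        # Internal operations see every member as lazy, whatever its storage.
        return {name: getattr(self, name).lazy() for name in self._names}

    def row_counts(self):
        # Hypothetical internal operation written once against lazy views.
        return {
            name: len(lf.collect().rows)
            for name, lf in self._to_lazy_dict().items()
        }


collection = MixedCollection(
    {
        "invoices": ToyLazyFrame(lambda: [1, 2]),
        "payments": ToyLazyFrame(lambda: [3]),
    }
)
```

The design point is that user-facing attributes keep the type promised by the annotation, while internal code paths only ever have to handle one frame type.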

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| dataframely/collection/_base.py | Adds MemberInfo.is_lazy, derives eager vs lazy from annotations, introduces _to_lazy_dict(), and updates _init() to materialize eager members. |
| dataframely/collection/collection.py | Routes internal operations (join/collect_all/storage) through _to_lazy_dict() to support mixed eager/lazy collections. |
| dataframely/collection/filter_result.py | Ensures CollectionFilterResult.collect_all() operates on lazy views of members for mixed collections. |
| tests/collection/test_implementation.py | Updates annotation implementation tests to reflect dy.DataFrame[...] support. |
| tests/collection/test_dataframe_members.py | Adds new end-to-end tests for eager-only, lazy-only, mixed, and optional eager members. |

Comment thread dataframely/collection/_base.py Outdated
Comment thread dataframely/collection/collection.py Outdated
Comment thread dataframely/collection/filter_result.py
Comment thread dataframely/collection/collection.py
Comment thread tests/collection/test_dataframe_members.py Outdated
Comment thread tests/collection/test_dataframe_members.py Outdated

Member

Sorry it took so long to review this! Generally, I think that we can support this. I left a few comments to simplify the implementation a little.

Comment thread dataframely/collection/_base.py
Comment thread dataframely/collection/_base.py Outdated
Comment thread dataframely/collection/_base.py Outdated
Comment thread dataframely/collection/collection.py Outdated
Comment thread dataframely/collection/collection.py Outdated
dependabot Bot and others added 8 commits May 21, 2026 15:02
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: quant-ranger[bot] <132915763+quant-ranger[bot]@users.noreply.github.com>
Co-authored-by: Oliver Borchert <oliver.borchert@quantco.com>
Co-authored-by: Oliver Borchert <me@borchero.com>
@gab23r gab23r force-pushed the dataframe-support-collection branch from 04018ab to 4bf6d8b Compare May 21, 2026 13:03

Member

Thanks for the updates, gab23r!

Comment thread dataframely/collection/_base.py
@borchero Oliver Borchert (borchero) enabled auto-merge (squash) May 24, 2026 15:13
@borchero Oliver Borchert (borchero) merged commit be94364 into Quantco:main May 24, 2026
32 checks passed
@gab23r gab23r deleted the dataframe-support-collection branch May 24, 2026 17:53
Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Feature Request: dy.DataFrame Support in Collections

5 participants