Feature Request: dy.DataFrame Support in Collections #319

@gab23r

Allow Collections to declare dy.DataFrame members alongside dy.LazyFrame:

class MyCollection(dy.Collection):
    users: dy.DataFrame[UserSchema]      # Eager
    orders: dy.DataFrame[OrderSchema]    # Eager

Motivation

  • Small datasets: When working with small data, .collect() is unnecessary ceremony
  • EDA workflows: Inspecting results in notebooks/REPLs is more convenient with DataFrames
  • Consistency: Schema validation already accepts both DataFrames and LazyFrames, so Collections should too

# Current (verbose for small data)
collection = MyCollection.validate(data)
collection.users.collect()  # extra step just to see data

# Proposed (direct)
collection = MyCollection.validate(data)
collection.users  # already a DataFrame
