Nullable column missing fails validate(..., cast=True) #253

@mattijsdp

Description

If a column is nullable, why doesn't dataframely create it (filled with nulls) when it is missing and validation runs with cast=True? I realise you've probably considered this, but I couldn't find an explicit mention of it in the docs or the GitHub issues.

import dataframely as dy
import polars as pl

class TableSchema(dy.Schema):
    column_a = dy.String(nullable=False)
    column_b = dy.String(nullable=True)

df = pl.DataFrame({"column_a": 0})
TableSchema.validate(df, cast=True)

# Raises SchemaError
# SchemaError: 1 missing columns for schema 'TableSchema': 'column_b'
