
[Python] Typing polars dataframe with pandera and mypy validation

Discussion in 'Python' started by Stack, October 8, 2024.


    I am considering pandera to implement strong typing in my project, which uses polars dataframes.

    I am puzzled about how to type my functions correctly.

    As an example, consider:


    import polars as pl
    import pandera.polars as pa
    from pandera.typing.polars import LazyFrame as PALazyFrame

    class MyModel(pa.DataFrameModel):
        a: int

        class Config:
            strict = True


    def foo(
        f: pl.LazyFrame
    ) -> PALazyFrame[MyModel]:
        # Our input is unclean, probably coming from pl.scan_parquet on some files
        # The validation is dummy here
        return MyModel.validate(f.select('a'))


    When I run mypy, it reports the following error:

    error: Incompatible return value type (got "DataFrameBase[MyModel]", expected "LazyFrame[MyModel]")


    Sure, I can change my signature to declare the return type as DataFrameBase[MyModel], but then I lose the information that I'm returning a LazyFrame.

    Furthermore, LazyFrame is defined as a subclass of DataFrameBase in the pandera code.

    How can I fix my code so that the return type LazyFrame[MyModel] works?
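To illustrate why mypy complains, here is a minimal sketch that uses stand-in classes instead of pandera's real ones. It assumes, as the error message suggests, that validate is annotated to return the base type. The names DataFrameBase, LazyFrame, MyModel and validate below are simplified, hypothetical stand-ins. They are not pandera's actual definitions. Returning a base type where a subclass is expected is a downcast, which mypy rejects. One possible workaround, not necessarily the recommended one, is typing.cast:

```python
from typing import Generic, TypeVar, cast

T = TypeVar("T")


class DataFrameBase(Generic[T]):
    """Stand-in for pandera's generic base type (hypothetical, simplified)."""


class LazyFrame(DataFrameBase[T]):
    """Stand-in for the pandera LazyFrame type: a subclass of the base."""


class MyModel:
    """Stand-in for the schema model."""


def validate(frame: object) -> DataFrameBase[MyModel]:
    # Annotated with the base type, although the runtime object is a LazyFrame.
    return LazyFrame[MyModel]()


def foo_rejected(frame: object) -> LazyFrame[MyModel]:
    # Runs fine, but mypy reports an incompatible return value type here,
    # because DataFrameBase[MyModel] is not known to be a LazyFrame[MyModel].
    return validate(frame)


def foo_cast(frame: object) -> LazyFrame[MyModel]:
    # cast() only informs the type checker; it performs no runtime check.
    return cast(LazyFrame[MyModel], validate(frame))
```

Applied to the original function, the same idea would be to wrap the MyModel.validate(...) call in cast(PALazyFrame[MyModel], ...). This keeps the precise return type in the signature, at the cost of trusting that the runtime object really is a LazyFrame.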

