[Python] PyPolars, conditional join on two columns

Discussion in 'Python' started by Stack, September 28, 2024 at 14:13.

    How should one join two pl.LazyFrame objects on two pairs of columns, where the pair used for each row depends on which column of the left pl.LazyFrame is non-null?

    import polars as pl

    lf1 = pl.LazyFrame(
        data={
            "col_1": ["a", "b", "c"],
            "col_2": ["d", None, None],
            "col_3": [None, "e", None],
        },
    )

    lf2 = pl.LazyFrame(
        data={
            "col_a": ["d", "xyz"],
            "col_b": ["xyz", "e"],
            "col_c": ["relevant_info_1", "relevant_info_2"],
        },
    )


    Pseudo-code of the desired join:

    lf1.join(
        lf2,
        when(col("col_2").is_not_null()).then(left_on="col_2", right_on="col_a")
        when(col("col_3").is_not_null()).then(left_on="col_3", right_on="col_b")
        otherwise(do_nothing)
    )


    Expected result:

    shape: (3, 4)
    ┌───────┬───────┬───────┬─────────────────┐
    │ col_1 ┆ col_2 ┆ col_3 ┆ col_c           │
    │ ---   ┆ ---   ┆ ---   ┆ ---             │
    │ str   ┆ str   ┆ str   ┆ str             │
    ╞═══════╪═══════╪═══════╪═════════════════╡
    │ a     ┆ d     ┆ null  ┆ relevant_info_1 │
    │ b     ┆ null  ┆ e     ┆ relevant_info_2 │
    │ c     ┆ null  ┆ null  ┆ null            │
    └───────┴───────┴───────┴─────────────────┘
