
[Python] Optimizing timedelta calculation between two dates with different 'weekmask' logic in...

Discussion in 'Python' started by Stack, September 27, 2024 at 20:12.


    I am very impressed by the Polars library and am trying to learn it better. :)

    I am trying to calculate the number of days between two dates in Polars for millions of rows, but for some rows I need to exclude certain weekdays, depending on a condition. In Pandas/NumPy I have used np.busday_count, where I can define a weekmask of which weekdays to count per condition and also exclude holidays when needed.

    I'm having difficulty counting the days conditionally in a fast way, because I can't figure out how to do this as a Polars expression.
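For reference, the np.busday_count behaviour described above can be sketched on plain NumPy arrays. The dates and the holiday below are illustrative sample values, not taken from the real dataset; the point is only to show how weekmask and holidays change the count.

```python
import numpy as np

# Illustrative sample data (assumption): both ranges start on Monday 2022-01-03.
start = np.array(["2022-01-03", "2022-01-03"], dtype="datetime64[D]")
end = np.array(["2022-01-10", "2022-01-17"], dtype="datetime64[D]")  # 1 and 2 weeks later

# weekmask is a 7-character string, Monday first: "1110000" counts Mon, Tue, Wed.
# The start date is included in the count, the end date is excluded.
mon_to_wed = np.busday_count(start, end, weekmask="1110000")

# The default weekmask "1111100" counts Monday through Friday.
mon_to_fri = np.busday_count(start, end)

# holidays removes specific dates from the count (2022-01-05 is a Wednesday).
holidays = np.array(["2022-01-05"], dtype="datetime64[D]")
with_holiday = np.busday_count(start, end, weekmask="1110000", holidays=holidays)

print(mon_to_wed, mon_to_fri, with_holiday)
```

Because the function accepts whole arrays, a single call can cover every row that shares the same weekmask.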

    Example dataframe:

    import polars as pl

    df = (pl
          .DataFrame({"Market": ["AT", "DE", "AT", "CZ", "GB", "CZ"],
                      "Service": ["Standard", "Express", "Standard", "Standard", "Standard", "Standard"],
                      "Day1": ["2022-01-02", "2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07"],
                      "Day2": ["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08"]
                      }
                     )
          # Parse both string columns into Date columns
          .with_columns(pl.col("Day1", "Day2").str.to_date())
          )


    I was able to pass the data to np.busday_count through a struct and the map_elements method. However, execution is much slower on the real dataset (34.4 seconds) than with Pandas assign (262 ms).

    Below is the code I was able to come up with in Polars. I'm looking for an optimized way of doing this faster.

    import numpy as np

    (df
     .with_columns(
         pl.struct("Day1", "Day2")
         # Calls np.busday_count once per row; weekmask '1110000' counts Mon-Wed only
         .map_elements(lambda x: np.busday_count(x["Day1"], x["Day2"], weekmask='1110000'))
         .alias("Result"))
     )


    EDIT, expected output:

    ┌────────┬──────────┬────────────┬────────────┬────────┐
    │ Market ┆ Service  ┆ Day1       ┆ Day2       ┆ Result │
    │ ---    ┆ ---      ┆ ---        ┆ ---        ┆ ---    │
    │ str    ┆ str      ┆ date       ┆ date       ┆ i64    │
    ╞════════╪══════════╪════════════╪════════════╪════════╡
    │ AT     ┆ Standard ┆ 2022-01-02 ┆ 2022-01-03 ┆ 0      │
    │ DE     ┆ Express  ┆ 2022-01-03 ┆ 2022-01-04 ┆ 1      │
    │ AT     ┆ Standard ┆ 2022-01-04 ┆ 2022-01-05 ┆ 1      │
    │ CZ     ┆ Standard ┆ 2022-01-05 ┆ 2022-01-06 ┆ 1      │
    │ GB     ┆ Standard ┆ 2022-01-06 ┆ 2022-01-07 ┆ 0      │
    │ CZ     ┆ Standard ┆ 2022-01-07 ┆ 2022-01-08 ┆ 0      │
    └────────┴──────────┴────────────┴────────────┴────────┘
