1. Anuncie Aqui ! Entre em contato fdantas@4each.com.br

[Python] PANDAS group by with 30 minute intervals and calculate total difference

Discussão em 'Python' iniciado por Stack, Outubro 25, 2024 às 10:22.

  1. Stack

    Stack Membro Participativo

    I have a data frame that looks like this:

    date week id
    20/07/21 12:46:00 1 d1
    20/07/21 12:56:00 1 d1
    20/07/21 13:09:00 1 d1
    20/07/21 14:11:00 1 d1
    20/07/21 14:42:00 1 d1

    I want to group by date in in 30 minutes interval- so if 2 consecutive rows are more than 30 minutes apart they are on different groups. The output I need looks like this:

    week id min_date max_date
    1 d1 20/07/21 12:46:00 20/07/21 13:09:00
    1 d1 20/07/21 14:11:00 20/07/21 14:11:00
    1 d1 20/07/21 14:42:00 20/07/21 14:42:00

    I used this code in order to group by:

    x=df.groupby(['id','week', pd.Grouper(key='date', freq='30min',origin="start")]).agg({'date':[np.min, np.max]})



    Something isn't working with the grouper, any suggestions how to improve it?

    EDIT:

    Here's an example of my data that causes an issue:

    date week id
    20/07/21 12:46:00 1 d1
    20/07/21 12:56:00 1 d1
    20/07/21 13:09:00 1 d1
    22/07/21 07:11:00 1 d1
    22/07/21 07:14:00 1 d1
    22/07/21 07:27:00 1 d1
    22/07/21 08:34:00 1 d1
    22/07/21 08:36:00 1 d1

    The output required is:

    week id min_date max_date
    1 d1 20/07/21 12:46:00 20/07/21 13:09:00
    1 d1 20/07/21 07:11:00 20/07/21 07:27:00
    1 d1 20/07/21 08:34:00 20/07/21 08:36:00

    This is the output I get:

    week id min_date max_date
    1 d1 20/07/21 12:46:00 20/07/21 13:09:00
    1 d1 20/07/21 07:11:00 20/07/21 08:36:00

    I don't understand why it groups the last rows together when there is more than an hour difference between 20/07/21 07:27:00 and 20/07/21 08:34:00.

    Thanks!

    Continue reading...

Compartilhe esta Página