
[Python] Why does drop_duplicates with inplace=True not work in this case? [duplicate]

Discussion in 'Python' started by Stack, September 12, 2024.


    Sample data:

    import pandas as pd

    data = [[1, 'john@example.com'], [2, 'bob@example.com'], [3, 'john@example.com']]
    person = pd.DataFrame(data, columns=['id', 'email']).astype({'id':'int64', 'email':'object'})


    Reproducible code:

    (person.sort_values(by = ['email', 'id'], ascending = [True, True])
    .drop_duplicates(subset = 'email', keep = 'first', inplace = True))


    I expected the code above to modify person so that it looks like this:

       id             email
    1   2   bob@example.com
    0   1  john@example.com


    But instead, person is unchanged from its original form:

       id             email
    0   1  john@example.com
    1   2   bob@example.com
    2   3  john@example.com


    If I split the chained call into two statements, it works:

    person1 = person.sort_values(by = ['email', 'id'], ascending = [True, True])
    person1.drop_duplicates(subset = 'email', keep = 'first', inplace = True)


    In this case, person1 has the desired form:

       id             email
    1   2   bob@example.com
    0   1  john@example.com


    Why doesn't the first snippet remove the duplicated emails in place?
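
    A likely explanation, sketched below with the same sample data: sort_values returns a new DataFrame, so in the chained version inplace=True modifies that temporary sorted copy (which is then discarded) rather than person. The names sorted_copy and result are illustrative only and do not appear in the original post; the final reassignment is one possible fix, not the only one.

```python
import pandas as pd

data = [[1, 'john@example.com'], [2, 'bob@example.com'], [3, 'john@example.com']]
person = pd.DataFrame(data, columns=['id', 'email'])

# sort_values returns a new DataFrame; person itself is not touched.
sorted_copy = person.sort_values(by=['email', 'id'])  # illustrative name
assert sorted_copy is not person

# With inplace=True, drop_duplicates mutates the object it is called on
# (here the sorted copy) and returns None.
result = sorted_copy.drop_duplicates(subset='email', keep='first', inplace=True)
assert result is None
assert len(sorted_copy) == 2  # the copy was deduplicated
assert len(person) == 3       # the original still has all three rows

# One way to get the intended effect: skip inplace and assign the result back.
person = (person.sort_values(by=['email', 'id'])
                .drop_duplicates(subset='email', keep='first'))
print(person)
```

    In the chained form from the question, the temporary copy has no name, so the deduplicated result is lost as soon as the statement finishes, and the whole expression evaluates to None.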

