[Python] Why does drop_duplicates with inplace=True not work in this case? [duplicate]

Stack · September 12, 2024

Sample data:

import pandas as pd

data = [[1, 'john@example.com'], [2, 'bob@example.com'], [3, 'john@example.com']]
person = pd.DataFrame(data, columns=['id', 'email']).astype({'id':'int64', 'email':'object'})

Reproducible code:

(person.sort_values(by = ['email', 'id'], ascending = [True, True])
.drop_duplicates(subset = 'email', keep = 'first', inplace = True))

I expected the code above to modify person so that it looks like this:

   id             email
1   2   bob@example.com
0   1  john@example.com

Instead, person is unchanged and still has its original contents:

   id             email
0   1  john@example.com
1   2   bob@example.com
2   3  john@example.com

If I split the chained call into two statements, it works:

person1 = person.sort_values(by = ['email', 'id'], ascending = [True, True])
person1.drop_duplicates(subset = 'email', keep = 'first', inplace = True)

In this case person1 has the desired contents:

   id             email
1   2   bob@example.com
0   1  john@example.com

Why doesn't the first snippet remove the duplicate emails in place?
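
A likely explanation, sketched below using the same sample data: sort_values (without inplace=True) returns a new DataFrame, so the chained drop_duplicates(..., inplace=True) mutates that unnamed temporary rather than person, and the whole expression evaluates to None. The names sorted_copy, is_same_object, chained_result and deduped are introduced here only for illustration and are not part of the original question.

```python
import pandas as pd

data = [[1, 'john@example.com'], [2, 'bob@example.com'], [3, 'john@example.com']]
person = pd.DataFrame(data, columns=['id', 'email'])

# sort_values returns a new DataFrame; person itself is not modified.
sorted_copy = person.sort_values(by=['email', 'id'])
is_same_object = sorted_copy is person

# With inplace=True, drop_duplicates mutates the object it is called on
# (here, the temporary produced by sort_values) and returns None.
chained_result = (person.sort_values(by=['email', 'id'])
                  .drop_duplicates(subset='email', keep='first', inplace=True))

# One possible fix: drop inplace and keep the returned DataFrame.
deduped = (person.sort_values(by=['email', 'id'])
           .drop_duplicates(subset='email', keep='first'))
```

In the two-statement version, person1 is a name bound to the sorted copy, so the in-place change stays visible through that name. Assigning the result of the chain back (for example to person) is usually the simpler pattern.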
