python - How do I remove duplicates but keep row values in another column

Question

Marcin

April 26, 2022 10:35

I have a small df with two columns, ID and full name. I want to remove the duplicate IDs but keep all the names in new columns (a kind of explode), so if the same ID appears 3 times, the new df would have the columns: ID, name, name, name.

Please help.

Topic python

Category Data Science

Oxbowerce · Accepted Answer · April 26, 2022 10:35

This is more of a programming question than a data science question and would therefore be better suited to Stack Overflow. Given the following dataframe with an ID and a name column

import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
})

df.head()

ID	name
1	A
1	B
1	C
2	D
2	E

this can be achieved by combining groupby (to collect the names of each ID into a list) with tolist (to spread those lists over separate columns):

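# Collect the names belonging to each ID into one list per ID
grouped = (
    df
    .groupby("ID")
    .agg({"name": list})
    .reset_index()
)
# Spread each list over its own columns (name_0, name_1, ...) and put ID back in front
names = pd.DataFrame(grouped["name"].tolist()).add_prefix("name_")
pd.concat([grouped["ID"], names], axis=1)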

This returns the following dataframe:

ID	name_0	name_1	name_2
1	A	B	C
2	D	E
3	F	G	H
4	I	J
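
As an alternative sketch (not part of the original answer), the same wide layout can probably be reached without building intermediate lists: number the rows within each ID with cumcount, then pivot on that number. The column name position and the variable wide are made up for illustration, and IDs with fewer names end up with NaN in the unused columns.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
})

# Number the rows inside each ID group: 0, 1, 2, ...
# ("position" is a hypothetical helper column, not from the original post)
position = df.groupby("ID").cumcount()

wide = (
    df.assign(position=position)
    # One row per ID, one column per position within the group
    .pivot(index="ID", columns="position", values="name")
    # Turn the integer column labels 0, 1, 2 into name_0, name_1, name_2
    .add_prefix("name_")
    .reset_index()
)
# Drop the leftover "position" label on the column index
wide.columns.name = None

print(wide)
```

This version assumes each (ID, position) pair is unique, which cumcount guarantees by construction, so pivot does not raise on duplicate entries.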
