python - How do I remove duplicates but keep row values in other columns?

I have a small DataFrame with two columns, ID and full name. I want to remove duplicate IDs but keep all of the names by spreading them across new columns (a kind of explode). For example, if the same ID appears 3 times, the new DataFrame would have the columns ID, name, name, name.

Please help


This is more of a programming question than a data science question, so it would be better suited to Stack Overflow. That said, take the following dataframe with an ID and a name column:

import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
})

df.head()
ID  name
 1  A
 1  B
 1  C
 2  D
 2  E

The desired layout can be achieved with a combination of groupby and tolist:

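# Collect the names for each ID into a list
grouped = (
    df
    .groupby("ID")
    .agg({"name": list})
    .reset_index()
)
# Expand each list into its own columns: name_0, name_1, ...
names = pd.DataFrame(grouped["name"].tolist()).add_prefix("name_")
result = pd.concat([grouped["ID"], names], axis=1)
result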

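This returns the following dataframe. IDs with fewer names than the largest group have missing values in the trailing columns (shown blank here):

ID  name_0  name_1  name_2
 1  A       B       C
 2  D       E
 3  F       G       H
 4  I       J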
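
As an alternative to the approach above (not part of the original answer), the same wide layout can probably be reached by numbering the names within each ID and pivoting on that number. This is only a sketch under the assumption that the input looks like the example dataframe; the names position and wide are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
})

# Number each name within its ID group: 0, 1, 2, ...
# ("position" is a hypothetical helper column, not from the original answer)
position = df.groupby("ID").cumcount()

wide = (
    df.assign(position=position)
    # One row per ID, one column per position within the group
    .pivot(index="ID", columns="position", values="name")
    .add_prefix("name_")
    # Drop the leftover "position" label on the column axis
    .rename_axis(columns=None)
    .reset_index()
)
print(wide)
```

With this variant, IDs that have fewer names than the largest group end up with missing values (NaN) in the trailing name_ columns.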
