This is more of a programming question than a data science question, so it would be better suited to Stack Overflow. Given the following dataframe with an ID and a name column
import pandas as pd
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
})
df.head()
ID | name
1 | A
1 | B
1 | C
2 | D
2 | E
the names for each ID can be spread into separate columns by using a combination of groupby and tolist:
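# Collect the names of each ID into a single list
df = (
    df
    .groupby("ID")
    .agg({"name": list})
    .reset_index()
)
# Expand each list into its own columns and prefix them with "name_"
pd.concat([df["ID"], pd.DataFrame(df["name"].values.tolist()).add_prefix("name_")], axis=1)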
This returns the following dataframe:
ID | name_0 | name_1 | name_2
1 | A | B | C
2 | D | E |
3 | F | G | H
4 | I | J |
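For reuse, the same groupby and tolist approach could be wrapped in a small function. This is only a minimal sketch: the function name spread_names and the variable names are made up for illustration, and it assumes the input has exactly the ID and name columns shown above.

```python
import pandas as pd


def spread_names(frame: pd.DataFrame) -> pd.DataFrame:
    """Spread the names of each ID into name_0, name_1, ... columns.

    Hypothetical helper; assumes `frame` has an "ID" and a "name" column.
    """
    # One row per ID, holding that ID's names as a Python list
    grouped = frame.groupby("ID")["name"].agg(list).reset_index()
    # One column per list position; shorter lists are padded with missing values
    wide = pd.DataFrame(grouped["name"].tolist()).add_prefix("name_")
    return pd.concat([grouped["ID"], wide], axis=1)


df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
})
result = spread_names(df)
print(result)
```

IDs with fewer names than the largest group get missing values in the trailing columns, which is why the last cell is empty for IDs 2 and 4. Unlike the snippet above, this version does not overwrite the original df.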