Dataframe has no column names. How to add a header?

Question

user633599

May 12, 2022 15:44

I am using a dataset to practice building a decision tree classifier.

Here is my code:

import pandas as pd 
tdf = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', sep = ',', header=0)
tdf.info()

The columns have no names, and I am having trouble adding them. I have already tried reindex, pd.melt, rename, etc.

The column names I want to assign are:

Sample code number: id number
Clump Thickness: 1 - 10
Uniformity of Cell Size: 1 - 10
Uniformity of Cell Shape: 1 - 10
Marginal Adhesion: 1 - 10
Single Epithelial Cell Size: 1 - 10
Bare Nuclei: 1 - 10
Bland Chromatin: 1 - 10
Normal Nucleoli: 1 - 10
Mitoses: 1 - 10
Class: (2 for benign, 4 for malignant)

Thanks,

Topic pandas python

Category Data Science

Gyan Ranjan · Accepted Answer · May 12, 2022 15:44

For any dataframe, say df, you can add or modify column names by assigning a list of names to the df.columns attribute. For example, if you want the column names to be ['A', 'B', 'C', 'D'], use this:

df.columns = ['A', 'B', 'C', 'D']

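In your code, can you replace header=0 with header=None? header=0 tells pandas to take the first row as the column headers, and simply removing the argument has the same effect, because pandas infers the header from the first row by default. With header=None no row is consumed as a header, and you can then use the above to assign the column names.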
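As a minimal sketch of this approach, the snippet below reads a headerless CSV with header=None and then assigns df.columns. The three-row sample and the short column names are assumptions made for illustration, not the real dataset or its full column list.

```python
import io

import pandas as pd

# Hypothetical three-row sample in a headerless, comma-separated layout.
raw = "1000025,5,2\n1002945,5,2\n1015425,3,4\n"

# header=None: no row is consumed as a header; columns are numbered 0, 1, 2.
df = pd.read_csv(io.StringIO(raw), header=None)

# Assign names afterwards; the list length must match the number of columns.
df.columns = ["id", "clump_thickness", "class"]

print(df.shape)          # all three data rows are kept
print(list(df.columns))
```

Assigning a list of the wrong length raises a ValueError, so it is worth checking df.shape first.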

Priyanshu Khullar · Accepted Answer · August 10, 2019 14:02

df = pd.read_csv("Price Data.csv", names=['Date', 'Price'])

Use the names parameter of read_csv to add a header to your pandas dataframe.

daco · Accepted Answer · February 11, 2019 11:02
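A small sketch of how names behaves, using made-up inline data rather than the "Price Data.csv" file (whose contents are not shown here). When the data has no header row, names alone is enough. If the file already has a header row you want to replace, the pandas documentation says to pass header=0 as well, so that row is discarded instead of being read as data.

```python
import io

import pandas as pd

# Hypothetical headerless data: every line is a data row.
no_header = "2019-01-01,10.5\n2019-01-02,11.0\n"
a = pd.read_csv(io.StringIO(no_header), names=["Date", "Price"])

# Hypothetical data whose first line is an unwanted header ("d,p").
with_header = "d,p\n2019-01-01,10.5\n2019-01-02,11.0\n"
# header=0 marks that line as the header so that names can replace it.
b = pd.read_csv(io.StringIO(with_header), names=["Date", "Price"], header=0)

print(a)
print(b)
```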

I tried the code above, and it drops the first line of data because that line is used as the header.

1. Original

tdf = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', sep = ',', header=0)
tdf.shape

(698, 11)

2. Removing header=0 (the default still uses the first row as the header)

tdf = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', sep = ',')
tdf.shape

(698, 11)

3. Adding the column names while reading the CSV, which does get all the rows

tdf = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', sep = ',', names=['Sample code number: id number','Clump Thickness: 1 - 10','Uniformity of Cell Size: 1 - 10','Uniformity of Cell Shape: 1 - 10','Marginal Adhesion: 1 - 10','Single Epithelial Cell Size: 1 - 10','Bare Nuclei: 1 - 10','Bland Chromatin: 1 - 10','Normal Nucleoli: 1 - 10','Mitoses: 1 - 10','Class: (2 for benign, 4 for malignant)'])
tdf.shape

(699, 11)
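The pattern in the three row counts above can be reproduced offline. The sketch below uses a hypothetical three-line sample and shortened column names instead of downloading the real file, so the shapes are smaller, but the first two calls lose one row in the same way.

```python
import io

import pandas as pd

# Hypothetical headerless sample: three data rows, three columns.
raw = "1000025,5,2\n1002945,5,2\n1015425,3,4\n"
cols = ["id", "clump_thickness", "class"]

# 1. header=0: the first data row is consumed as the header.
first_row_as_header = pd.read_csv(io.StringIO(raw), header=0)

# 2. No header argument: pandas infers the header from the first row too.
default_header = pd.read_csv(io.StringIO(raw))

# 3. names=...: no row is treated as a header, so every row is kept.
with_names = pd.read_csv(io.StringIO(raw), names=cols)

for label, frame in [
    ("header=0", first_row_as_header),
    ("default", default_header),
    ("names", with_names),
]:
    print(label, frame.shape)
```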

You can assign the names of the columns when reading the CSV file:

import pandas as pd 
tdf = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', sep = ',', names=['Sample code number: id number','Clump Thickness: 1 - 10','Uniformity of Cell Size: 1 - 10','Uniformity of Cell Shape: 1 - 10','Marginal Adhesion: 1 - 10','Single Epithelial Cell Size: 1 - 10','Bare Nuclei: 1 - 10','Bland Chromatin: 1 - 10','Normal Nucleoli: 1 - 10','Mitoses: 1 - 10','Class: (2 for benign, 4 for malignant)'])

You can check the dataframe using

tdf.head()

and you get the first five rows, with the new column names as the header.

You can check the code at https://gist.github.com/e94b31914dbaebda7d11f6bfe0cfbdec
