
I am using the drop() function to remove column(s) of a DataFrame, but it gives an error.

e.g.

df.drop([1],axis=1)

The above code raises the following error: KeyError: '[1] not found in axis'. It seems that the function accepts only column names. How can I drop columns by their integer positions?

1 Answer


You can use either iloc or drop() to remove column(s) from a DataFrame by their integer positions. Both return a new DataFrame and leave the original unchanged.

Using iloc

The following code builds a list 'cols' of all column positions from the number of columns, then removes the positions in 'drop_col' that should be dropped. iloc selects the remaining positions 'i' and returns the resulting DataFrame.

>>> import pandas as pd
>>> df=pd.DataFrame({'x':['a','b','c'], 'y':['d','e','f'], 'z':['g','h','i']})
>>> df
   x  y  z
0  a  d  g
1  b  e  h
2  c  f  i
>>> cols=list(range(len(df.columns)))
>>> cols
[0, 1, 2]
>>> drop_col=[0,2]
>>> i=sorted(set(cols).difference(drop_col))  # a set is unordered; sorted() keeps the original column order
>>> df.iloc[:,i]
   y
0  d
1  e
2  f
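
The same selection can also be written with a boolean mask, which keeps the columns in their original order without any sorting. This is only a minimal sketch: the helper name drop_by_position is hypothetical (it is not a pandas function), and it assumes NumPy is importable, which is normally the case because pandas depends on it.

```python
import numpy as np
import pandas as pd

def drop_by_position(df, positions):
    """Return a new DataFrame without the columns at the given integer positions.

    Hypothetical helper for illustration; not part of pandas.
    """
    keep = np.ones(df.shape[1], dtype=bool)  # start by keeping every column
    keep[list(positions)] = False            # mark the positions to drop
    return df.iloc[:, keep]                  # a boolean mask preserves column order

df = pd.DataFrame({'x': ['a', 'b', 'c'], 'y': ['d', 'e', 'f'], 'z': ['g', 'h', 'i']})
result = drop_by_position(df, [0, 2])
print(result.columns.tolist())  # ['y']
```

Because the mask is indexed the NumPy way, negative positions such as -1 (the last column) work as well.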

Using drop()

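The drop() function matches column labels, not integer positions. Your columns are labeled with names, so the integer 1 is not a valid label, and that is why you get the KeyError. (If the columns were labeled with integers, as in a DataFrame created without column names, drop([1], axis=1) would work.) You can get all column labels from df.columns and index it with the positions of the columns that you want to delete.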

Here is an example using the drop() function:

>>> import pandas as pd
>>> df=pd.DataFrame({'x':['a','b','c'], 'y':['d','e','f'], 'z':['g','h','i']})
>>> df
   x  y  z
0  a  d  g
1  b  e  h
2  c  f  i
>>> drop_col=[0,2]  # positions of the columns to drop
>>> df.drop(df.columns[drop_col], axis=1)
   y
0  d
1  e
2  f
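
One difference between the two approaches shows up when column names are not unique. drop() works on labels, so translating a position into a label removes every column that carries that label, while iloc removes only the requested position. The sketch below uses made-up data with a duplicated label 'x' to illustrate this.

```python
import pandas as pd

# Two columns share the label 'x' (data made up for illustration).
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['x', 'y', 'x'])

# Label-based: position 0 maps to the label 'x', so both 'x' columns are dropped.
by_label = df.drop(df.columns[[0]], axis=1)

# Position-based: keep every position except 0.
keep = [i for i in range(df.shape[1]) if i != 0]
by_position = df.iloc[:, keep]

print(by_label.columns.tolist())     # ['y']
print(by_position.columns.tolist())  # ['y', 'x']
```

If your column names are unique, both approaches give the same result.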


...