13.11. Working with Excel files#
If you just want to manipulate excel’s data,use pd.read_excel() and convert it to dataframe and do the manipulation
openpyxl is the best module to work with excel’s other things,like changing sheet names,formatting,automating it etc
indexes start at 1 not 0
!pip install openpyxl
Requirement already satisfied: openpyxl in /Users/sahilchoudhary/opt/anaconda3/lib/python3.9/site-packages (3.0.9)
Requirement already satisfied: et-xmlfile in /Users/sahilchoudhary/opt/anaconda3/lib/python3.9/site-packages (from openpyxl) (1.1.0)
[notice] A new release of pip is available: 23.0 -> 23.0.1
[notice] To update, run: pip install --upgrade pip
import openpyxl
import os
13.11.1. Changing the working directory#
os.chdir('C:\\users\\Sahil Choudhary\\Desktop')
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 os.chdir('C:\\users\\Sahil Choudhary\\Desktop')
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\users\\Sahil Choudhary\\Desktop'
13.11.2. Open Excel file#
# Use load_workbook function and it will give the workbook object
wb=openpyxl.load_workbook('example.xlsx')
print(type(wb))
<class 'openpyxl.workbook.workbook.Workbook'>
13.11.3. Get sheet names#
wb.sheetnames
['Sheet1', 'Sheet2']
13.11.4. Create Sheet#
wb.create_sheet(title='My new Sheet',index=1)
<Worksheet "My new Sheet">
13.11.5. Get one of the Sheet#
sheet=wb['Sheet1']
print(type(sheet))
<class 'openpyxl.worksheet.worksheet.Worksheet'>
13.11.6. Changing sheet name#
sheet.title='new name'
13.11.7. Get the number of rows and columns#
print(sheet.max_row)
print(sheet.max_column)
4
3
13.11.8. Getting Values#
# Getting cell object
sheet['B1']
<Cell 'new name'.B1>
print(sheet['B1'].value)
print(sheet.cell(row=1,column=2).value)
Sahil
Sahil
# Getting its value
sheet['A1'].value
'=Today()+1'
13.11.9. Changing Values#
sheet['B1'].value='Sahil Choudhary'
# By defauly,saves it in internal memory
13.11.10. Save the result as excel file#
wb.save('example2.xlsx')
13.11.11. Looping through cells#
# Get the first 5 cells of 2nd column
for i in range(1,5):
print(sheet.cell(row=i,column=2).value)
Sahil
Sonia
Sourav
Vishal
ws=wb['Sheet2']
rows=ws.iter_rows(min_row=1,max_row=7,min_col=1,max_col=2)
# we have iter_cols as well
# returns generator object
for row in rows:
print(row)
# returns tupple of cell locations
(<Cell 'Sheet2'.A1>, <Cell 'Sheet2'.B1>)
(<Cell 'Sheet2'.A2>, <Cell 'Sheet2'.B2>)
(<Cell 'Sheet2'.A3>, <Cell 'Sheet2'.B3>)
(<Cell 'Sheet2'.A4>, <Cell 'Sheet2'.B4>)
(<Cell 'Sheet2'.A5>, <Cell 'Sheet2'.B5>)
(<Cell 'Sheet2'.A6>, <Cell 'Sheet2'.B6>)
(<Cell 'Sheet2'.A7>, <Cell 'Sheet2'.B7>)
for a,b in rows:
print(a.value,b.value)
# returns tupple of cell locations
13.11.12. Get column name by number#
print(openpyxl.utils.cell.get_column_letter(1))
A
13.11.13. Change format of cells#
# create font object and give it to cell
from openpyxl.styles import Font
sheet['B1'].font=Font(sz=14,bold=True,italic=True)
# save it as new file in end
13.11.14. Adding Border#
from openpyxl.styles import Border,Side
# Pass color and border style to Side
# top=Side(border_style='thin|thick|medium|dashed|double',color='hexcode')
top=Side(border_style='thin')
# Border(top=top,bottom=bottom,left=left,right=right)
border=Border(top=top)
sheet['B1'].border=border