Many people love spreadsheets. Also many people hate spreadsheets. It doesn’t really matter in which camp you are, you probably have to deal with them anyways because they are everywhere. If you prefer to work with Python, you can use some very powerful packages to deal with Excel files. With these packages, you could almost get to “Python Excel”. In this post we will briefly touch upon two of them and provide useful links for further reading material. You can install these package with ‘pip’:
pip install openpyxl pip install pandas
openpyxl
This is a very useful Python library for both reading and writing Excel files (xlsx/xlsm/xltx/xltm). Please find below a very basic example on how to create a Excel file:
from openpyxl import Workbook workbook = Workbook() # Set the active worksheet worksheet = workbook.active # Set the values of cells worksheet['A1'] = 'Country' worksheet['B1'] = 'Sales' worksheet['A2'] = 'JP' worksheet['B2'] = 200 worksheet['A3'] = 'UK' worksheet['B3'] = 100 # Append a row worksheet.append(['US', 150]) # Save the file workbook.save("sales.xlsx")
And the Excel file ‘sales.xlsx’ will look like this:
In addition, if you would like to open an existing workbook and manipulate it, just replace the second line
workbook = Workbook()
with:
workbook = openpyxl.load_workbook('example.xlsx')
Just make sure you are using the correct path name in case your Excel file is not located your current working directory.
pandas
The ‘Pandas’ package is one of the most used packages in Python for data manipulation and analysis. Pandas works with so called DataFrames, which are comparable to the worksheets in Excel. You can easily manipulate your data with Pandas’ built-in functions.
Let’s import the ‘sales.xlsx’ file we created with ‘openpyxl’ and see how Pandas can help us to easily double our sales:
import pandas df = pandas.read_excel('sales.xlsx') df['Sales'] = df['Sales'] * 2 print(df)
And we just doubled our sales:
These are just simple examples. Both packages offer a wide variety of functionalities. There are plenty of high quality (and free of charge) tutorials on the web:
- DataCamp (pandas & openpyxl)
- AutomateTheBoringStuff (openpyxl)
- ZetCode (openpyxl)
- TutorialsPoint (pandas)
- Dataquest (pandas)
Python Excel
Futhermore, Microsoft is considering adding the option to use Python in Excel. They are asking for feedback on https://excel.uservoice.com/:
- Question: How can we improve Excel for Windows (Desktop Application)?
- Answer: Python as an Excel scripting language
If you agree with this, make sure to fill out their survey. This will increase the chances of adoption of getting Python Excel.
Hopefully you enjoyed reading this post. Please let us know your thoughts through the comment form!