Python and Excel application
Python and Excel by using openpyxl
This post will show a very interesting thing that can be done by using Python and Excel together with a GUI that creates an application. At my workplace we often must extract or clean data, one example is shown below:
In the third column which is named “target” we can see text but also numbers, the meaning of this post is to create a fast way to extract these numbers.
To create a fast way to extract these numbers across multiple Excel files I decided to use Python and create a GUI (sort of application) together with openpyxl that lets Python open and write in Excel files. The GUI is created with the package PyQt5
This example is very easy, for each observation there is a number in the target cell, but in real life it can be more values and sometimes even no value. So, the application has an easy job, the stages can be viewed below:
Step 1:
Create an application with PyQt5. The application should take into several files but will only evaluate Excel files (all other files will not be processed).
Step 2:
The application will check each file and only accept Excel. It will then check if the Excel file contains the wanted column that contains numbers, in this case the column is named “target”.
Step 3:
Python than creates a dataframe of the data from Excel and loop through every cell in the target column, it will split the text and extract all the numbers. The numbers are saved in a list.
Step 4:
Now the program will use PyQt5 and open the Excel file and create a new column named numbers that only show the extracted number.
The application
The application is simple and has only one job.
The button “Extract numbers” performs the operation that view all the files and extract the numbers into Excel. The button “Clear list” removes all files from the listbox so that the user can drag new files. The button “information” provides information on how the application is used.
The gif below shows how the user easily can drag multiple files into the listbox:
The user can then click the button to extract all the numbers. A window will pop up when the action is complete and tell the user how many files was worked on.
And this is the result. Look at the new column called “numbers” which was created by the application. Python provides a fast and easy way to communicate with Excel, I believe that Python often can be used instead of VBA.
The full code can be viewed below: