Compare Excel Spreadsheets


Comparing Excel Spreadsheets - Why?

Business and Data analysts, Developers and QA testers often have to verify their data is correct. Microsoft Excel is used to create a snapshot of "good data". Reports are captured and copied into Excel, data is exported and distributed as the model to use.

SelectCompare allows you to compare Excel spreadsheets in the same way as other data sources. This post is going to show you the process of creating data connections to Excel workbooks and comparing data between two spreadsheets.

Where do I start?

First, you have to install appropriate driver to connect to the Excel workbooks. It is an easy step to get you up and running.

Install drivers

First thing is to install the drivers. Microsoft provides 2007 Office System Driver Data Connectivity Components which you should download to your machine if you don't have it installed yet. These components allow for connection to the Access database as well.

Check out the Install Instructions section on the link above - the installation is really straightforward!

When you have the drivers installed, you will see them in the connection setup dialog in SelectCompare. Read on!

Create new data connection to Excel workbook

Go to Connections page and Click the  icon. You will be brought to the data connection creator page.

To create a connection to an Excel file, carry out the following steps:

Click the Create button on the right-hand side of the Connection string edit box.

Select <other> Data Source and select .NET Framework Data Provider for OLEDB as shown on the illustration. 

Select Data Source

Click OK.

Select the "Microsoft Office 12.0 Access Database Engine OLE DB Provider" from the drop down list that appears on the screen. Don't worry, it works with Excel too.

Microsoft.ACE.OLEDB.12.0 OLEDB driver selection

 Enter name of the Excel Workbook you want to use.

Data connection properties - Excel file name

Note: I downloaded sample Northwind spreadsheet from Excel Dashboard Software for the needs of this article.

You can import Northwind sample database to Excel or use your own spreadsheet. I  converted the downloaded xls file to the newest xslx format.

Now, there is a trick: if you click "Test Connection" button on the dialog, you will see this error message:

Cannot open database ''. It may not be a database that your application recognizes, or the file may be corrupt.

In such case, close the error message and click "Advanced" button at the bottom of the dialog.

Create Excel connection - Advanced settings

 Enter "Excel 12.0" to the Extended Properties line:

Advenced settings, Extended Properties: Excel 12.0

Close the Advanced Properties dialog and test the connection again. You should see the confirmation of the successful connection.

Note: The installation instructions section on the driver download page lists what values are appropriate for certain formats of Excel spreadsheets:

When you close the connection wizard, the connection string will be populated in SelectCompare's Connection string field. You can click Test Connection button to verify that the connection string works as expected in SelectCompare.

Test connection to Excel spreadsheet

 The last step is to name the connection and assign it a color of your preference.

 Connection setup complete

OK  - your first connection to Excel is created!

Now it is time to create connection to the second Excel spreadsheet. 

Clone data source connection

The good news is that you don't have to repeat all the steps manually. You can clone the connection by clicking the Clone data connection definition icon. A new connection definition will be created, with name "Copy of " + your connection definition name.

Open the cloned connection definition and change its name and color if you like, and then change the connection details.

You don't have to open the connection configuration dialog windows. Just copy the connection string from the first connection and paste it to the cloned one!

In my case the connection string looks as follows:

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\Projects\DataFiles\Northwind1.xlsx;Extended Properties="Excel 12.0"

Paste the connection string to the appropriate place on the Edit connection page and modify the path of the file. Press Test Connection button to verify that you have the information correct. That's it!

Create data comparison project

The next step is to create the actual comparison project.  Navigate from Connections to Compare Projects page. Click Add new comparison project icon.

The full comparison project process has been described in the Data comparison project manual blog post. In short, we need to select data connection definitions (which we created above) and write select statements for each of the connections.

I created a copy of the Northwind1.xslx file named Northwind2.xlsx and made some changes to the order details, and customers list. I also made some changes in the Northwind1.xslx data to make the comparison more interesting.

The query I used is the same on both sides of the comparison - but obviously it doesn't have to be.

Note: The only condition is that column names in both queries have the same names.

Here's the query.

select c.CustomerID, c.CompanyName, a.OrderID, 
round(sum(UnitPrice * Quantity * (1 - Discount)), 2) as Subtotal from [Orders_details$] as a inner join
(select o.OrderID, c.CustomerID, c.CompanyName from [Orders$] as o inner join
[Customers$] as c on o.CustomerID = c.CustomerID) as b on a.OrderID = b.OrderID
group by a.OrderID, c.CustomerID, c.CompanyName
Note: For more examples of  SQL queries visit post Write Select Queries for Excel Spreadsheets.
 

The Spreadsheet comparison definition page looks as follows:

Data comparison project - two excel spreadsheets

The comparison project uses Northwind1 and Northwind2 connections, and defines OrderID column as comparison key.

 And here's screenshot of data comparison of two Excel sheets:

Excel spreadsheet comparison

 

The icons on the top of the results grid allow you to select visibility of data, so you can focus on the differences of specific kind. By default, rows with no differences between source and target are invisible. You can also quickly check the numbers of rows belonging to each difference category under the filter icon.

You can also export the data to Excel for further analysis - visit the post Create Exel Baseline For Your Data

For some examples of queries against Excel spreadsheets, please read the post Write Select Queries for Excel Spreadsheets

 

I see that you read the post until this point. Why don't you have a look at different editions of SelectCompare to find the one right for you?

 


SelectCompare can compare your data as long as you can connect to your data source and write a SELECT query. That's easy!

Write your queries with filters and aggregations and receive the comparison results in no time. Reuse the queries and comparison definitions - save time!

Try SelectCompare today!


Featured Products



Leave a comment


Please note, comments must be approved before they are published