Write Select Statements for Excel Spreadsheets


You are reading this post because there is a chance you want to compare two spreadsheets in Excel, and you are looking for a way to do it easily and effectively.

SelectCompare allows for data comparison of Excel spreadsheets, as well as other data sources with simple, unified interface. You can export your comparison to Excel for further analysis or sharing.

You may be initially anxious about writing an SQL statement to pull data from your Excel, but unnecessarily. It is not that difficult!

This post will show you a few examples of writing queries against your spreadsheet that you will be able to use right away for your Excel data comparison.

If you want to find out how to connect to an Excel workbook, please read another post on this blog - Compare Excel Spreadsheets

You will see below how to query Excel spreadsheets, specify ranges of columns and rows, and write simple filters that make SQL so powerful.

Excel spreadsheet with headers

Let's assume we have a spreadsheet with headers in the first row as on the following picture.

Excel data with headers

The simplest query you can issue is 

Select * from [Customers$]

Select all data from spreadsheet

The query just selects all columns and all rows from the spreadsheet. Excel automatically returns only populated rows and columns.

If you need to limit the amount of rows you select from the spreadsheet, use TOP clause in the select statement:

Select top 3 * from [Customers$]

Top 3 rows from Excel spreadsheets

If you are interested only in specific columns, you can select only those that you that you need in the following way:

select CustomerID, CompanyName, ContactName, ContactTitle, Country
from [Customers$]
order by ContactName

Selected columns from Excel spreadsheet

 

What if my data doesn't begin in the first row?

It often happens that your spreadsheet has a title and other information at the top. The actual data begin below the title:

Excel report with title

In such case you can specify the range of data you want to select from:

select * from [Customers$A5:K92] order by CompanyName desc

Or just simply

select * from [Customers$A5:K] order by CompanyName desc

 

Select from the range of data

Filtering data in spreadsheet

You can select only data that you need for comparison by adding a WHERE clause to the query:

select CustomerID, CompanyName, ContactName from [Customers$]
where ContactTitle = 'Owner'
order by ContactName

 Filter data for spreadsheet comparison

Spreadsheet without headers

Excel, by default, assumes that column names of your table are stored in the first row of the table you select from. In some cases you may come across spreadsheets that do not have header information, like on the following illustration:

Spreadsheet without headers

In this case, if you select data from the spreadsheet, first row will be treated as column names, which is not what you want.

You have to modify connection string for your Excel Workbook. Just add parameter "HDR=NO" to the connection string as displayed below:

Disable header interpretation in Excel connection string

Now, when you select from the spreadsheet, you will notice that Excel provided automatic names for the columns:

Automatic column names for Excel SQL query

Usually you want to have meaningful names for your columns. You can easily apply aliases to your column names, like on the following illustration.

select F1 as CustomerID, F2 as CompanyName, F3 as CustomerName, F4 as CustomerTitle
from [CustomersNoHDR$A1:K]
where F4 ='Owner'

 

Column aliases for SQL query from Excel spreadsheet

 

These are only a few simple examples of queries you can use to select data for your comparison of spreadsheets. You can write more complicated queries, like joining multiple spreadsheets or calling Excel functions in the query to perform certain operations on the attributes, but this is a topic for another occasion.

 

Featured Products



Leave a comment


Please note, comments must be approved before they are published