Compare a database table with CSV file
SelectCompare can be very useful to simplify the mundane task of comparing data stored in CSV files. The CSV format is very often used to exchange data between various applications and the ability to verify the correctness of the information in the CSV files is very important to the organization.
Traditionally, IT professionals - software developers, database administrators, and QA engineers write custom scripts to compare data or import the CSV files into a database or Excel spreadsheet and execute complex and error-prone operations to ensure the data quality.
SelectCompare allows you to compare data between a database and a CSV file almost effortlessly. Using the approach shown in the blog post about DSN-less ODBC data sources, you can set up your data comparison in just a couple of minutes.
First, let's create the PostgreSQL data connection.
Copy the following connection string, edit the server name, user and password and paste it into the Connection String edit box in SelectCompare.
Driver={PostgreSQL ANSI};Server=IP address;Port=5432;Database=myDataBase;Uid=myUsername;Pwd=myPassword;
Click the Test Connection button. If the connection string parameters are correct, you will receive confirmation that the connection was successful.
Save the Postgres data connection and return to the list of connections. You will see your new addition on the on the list:
Next, create a data connection for the CSV file. It is very straighforward - you need to remember to provide the folder name in the Dbq parameter of the connection string:
Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=myCSVFilesFolder;Extensions=asc,csv,tab,txt;
Again - after testing the connection, save it, then return to the screen with the list of comparison projects. Click the "+" button at the bottom of the screen.
The comparison editor screen will appear on the screen. Enter the project name and the appropriate description, then click the "Edit" button for the Source query.
A blank screen with place for entering the SQL statement and an option to chose the data connection is displayed. Click the "Change" button.
On the next screen, select the data connection for PostgreSQL that you created before.
After clicking the desired connection, you will return to the previous screen, where you can edit the SQL Statement for the selected data connection - in this case PostgreSQL.
After you write the query, press the "Play" button to verify that it returns expected results. Please note that this operation can take a while, depending on the amount of the selected data.
Once you are happy with the data, return to the data comparison configuration screen by pressing the back arrow in the upper left corner of the application.
Click the Edit button for the Target query and repeat the same actions - select a connection, write an SQL statement for the CSV file and return to the Comparison Project configuration screen.
Note that because the Dbq parameter in the connection string points to a folder, you can treat the files as 'tables' from which you can select.
Note:
There is an important information regarding the Comparison Project configuration screen. SelectCompare requires you to provide the comparison key attributes. One or more attributes must be selected to uniquely identify the rows in the data sets. If the selection is not correct, it may lead to invalid comparison results!
At least one column must be selected as a key, but on many occasions you need more than one attribute, depending on the logic of the SQL statements used in the comparison.
And here it is - a comparison of data between a database and a CSV file. Enjoy!
Once you get more familiar with SelectCompare and you have the data connections and the SQL statements to use in data comparison, you can set up such project in less than 5 minutes.
Please check other posts on our blog or send us a message using the Contact Us form..
Leave a comment
Please note, comments must be approved before they are published