Data blog

Table partitioning in relational databases

I would like to write briefly about table partitioning in relational databases. Table partitioning divides the data in a table into horizontal chunks that, depending on the DB technology you use, can be indexed separately and stored on different disks. This helps with certain performance issues when a table is large, receives many inserts, and also has to serve reports on its data. A typical example is a transaction registry from your retail network. Partitioning lets you separate 'read only' partitions from the active partitions. For example, if you insert a lot of transactions into your registry table, they usually have a timestamp associated with them. You...
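To make the idea concrete, here is a minimal in-memory sketch of range partitioning by month. It is a toy model only and does not use any particular database's partitioning syntax. The class `PartitionedTable` and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime


class PartitionedTable:
    """Toy model of a table range-partitioned by (year, month)."""

    def __init__(self):
        self.partitions = defaultdict(list)  # (year, month) -> list of rows
        self.read_only = set()               # keys of closed partitions

    @staticmethod
    def partition_key(ts):
        # The timestamp decides which partition a row belongs to.
        return (ts.year, ts.month)

    def insert(self, ts, amount):
        key = self.partition_key(ts)
        if key in self.read_only:
            raise ValueError(f"partition {key} is read-only")
        self.partitions[key].append((ts, amount))

    def close_partitions_before(self, ts):
        # Mark every partition older than the one containing ts as read-only.
        cutoff = self.partition_key(ts)
        for key in self.partitions:
            if key < cutoff:
                self.read_only.add(key)

    def monthly_total(self, year, month):
        # A report only has to scan the single partition it needs.
        return sum(amount for _, amount in self.partitions.get((year, month), []))


table = PartitionedTable()
table.insert(datetime(2024, 1, 15), 10.0)
table.insert(datetime(2024, 1, 20), 5.0)
table.insert(datetime(2024, 2, 3), 7.5)
table.close_partitions_before(datetime(2024, 2, 1))
```

In this sketch the January partition becomes read-only once February starts, so reports on January never compete with new inserts, which all go to the active February partition.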

Continue reading

Comparison of data processing architectures

John Ryan wrote an excellent post comparing different data architectures and their pros and cons. As it turns out, there are four major options you can choose from, and you should choose carefully. Symmetric Multiprocessing (SMP) - the architecture we all know and love. Basically, these are systems based on a single machine with attached storage. The machine can of course run in an HA configuration, and there are some multi-machine configurations as well, but the principle stays the same. Examples of such engines include Oracle, SQL Server, PostgreSQL, MySQL and many others. Massively Parallel Processing (MPP) - the idea behind it is that the data is distributed across multiple nodes using a number of algorithms, and each of these...
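One common way to spread rows across MPP nodes is hash distribution. The sketch below assumes that scheme purely for illustration; the excerpt does not say which algorithms the original post covers. The function names `node_for_key` and `distribute` are hypothetical.

```python
import hashlib


def node_for_key(key, node_count):
    """Map a distribution key to a node index using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def distribute(rows, key_column, node_count):
    """Assign each row (a dict) to one of node_count nodes by its key column."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for_key(row[key_column], node_count)].append(row)
    return nodes


rows = [{"customer_id": i, "amount": i * 1.5} for i in range(100)]
nodes = distribute(rows, "customer_id", 4)
```

Because the hash is stable, all rows with the same key land on the same node, so that node can process them without contacting the others.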

Continue reading

New version of SelectCompare is available!

I am delighted to announce that a new version of SelectCompare has been released. Version 1.2 contains a number of improvements and optimizations, and two of them are directly available to the user:

Support for SQL Server stored procedures
You can now call a stored procedure that returns a rowset directly in the comparison query definition window. The valid syntax is: exec my_stored_procedure [parameter1[, parameter2, ...]]

Sort columns in the comparison results
You can now select either a continuous or an alternating view of the source and target columns. The alternating layout allows for quick visual comparison of values belonging to the same column on both sides of the comparison. To switch between the views, just click the appropriate option on the right of...

Continue reading