Table partitioning in relational databases
I would like to write shortly about table partitioning in relational databases.
Table partitioning is dividing your data in a table into horizontal chunks, that can be (depending on the DB technology you use) indexed separately and stored on different disks. This allows you to address certain performance issues if a table is large, there are many inserts into it, and there is a requirement of providing reports on the data in this table. That table might be for example a transaction registry from your retail network.
Partitioning allows separate 'read only' partitions from the active partitions. For example, if you insert a lot of transactions into your registry table, they usually have a timestamp associated with them. You don't want the table to be heavily indexed, because new inserts will be more IO expensive - the data has to be written and the affected indexes need to be appropriately updated. Now and then, the index has to be rebalanced, triggering a wave of IO across your disks, slowing down your throughput and insert rates. If you have multiple indexes, created to address your reporting needs, you may end up in a pile of ... trouble.
The table partitioning in such a scenario may help you, by allowing you to index partitions that are no longer actively used for updates and inserts (for example yesterday, or last month) in a way supporting your reporting. So you can create as many indexes covering reporting queries as you see fit, and leave relatively un-indexed front partition (e.g. today!) for inserts. The reporting queries are going to use the covering indexes and index searches where appropriate, and only a small portion (hopefully) is going to hit the front partition with table scans.
Pretty much every leading RDBMS engine supports table partitioning. The topic is quite deep and there are multiple ways you can partition tables even within the same engine. For example, SQL server supports partitioned views and partitioned tables - deep dive - even deeper dive.
Here's the documentation for the PostgreSQL table partitioning approach and an extensive article about the use case for partitioning by Matvey Arye.
MySQL table partitioning documentation and an interesting blog post by Francisco Claria.
Of course, Oracle has a very mature table partitioning system. There is also an excellent blog outlining different partitioning techniques in Oracle.
Image By R. A. Nonenmacher - Created by me., CC BY-SA 4.0, Link
Leave a comment
Please note, comments must be approved before they are published