Partitioning tables is a great way to improve the performance and manageability of a table. Sometimes, non-partitioned tables are created that don’t contain much data, but over time the table becomes largely populated. Partitions can be used to manage the proliferation of data in the table. Common single-level partitioning strategies include list partitioning, range partitioning, and hash partitioning. Each partitioning strategy has its own advantages and disadvantages.

In order to demonstrate the different partitioning strategies, a table needs to be created with some rows inserted. This table will store sales information for a company:

CREATE TABLE sales
(
  customer_id NUMBER,
  sell_id     NUMBER,
  product_id  NUMBER,
  sold_date   DATE,
  price       NUMBER,
  quantity    NUMBER,
  state       VARCHAR(5)
);

INSERT INTO sales 
  VALUES(1, 10, 15, TO_DATE('20110110', 'YYYYMMDD'), 100, 5, 'CA');

INSERT INTO sales 
  VALUES(2, 15, 10, TO_DATE('20110116', 'YYYYMMDD'), 25, 2, 'OR');

INSERT INTO sales 
VALUES(3, 20, 5, TO_DATE('20110210', 'YYYYMMDD'), 10, 3, 'WA');

This table starts small and increases drastically as the company sales more products. The table is a good candidate to be partitioned.

List partitioning

This table stores which state the customer lives in. A list partition can be applied to the state column to group the states into regions. Most sales are coming from west United States. Since the table is non-partitioned, a new partitioned table can be created with the data from the non-partitioned table:

CREATE TABLE sales_list_partition
(
  customer_id NUMBER,
  sell_id     NUMBER,
  product_id  NUMBER,
  sold_date   DATE,
  price       NUMBER,
  quantity    NUMBER,
  state       VARCHAR(5)
)
PARTITION BY LIST (state)
(
  PARTITION west VALUES
  (
    'CA',
    'OR',
    'WA'
  )
  );

Now, the data from the non-partitioned table can be inserted into the new partitioned table:

INSERT INTO sales_list_partition
SELECT * FROM sales;

There is one problem that may occur when inserting the data. If a value in the column STATE does not contain ‘CA’, ‘OR’, or ‘WA’, an error will occur:

INSERT INTO sales_list_partition
  VALUES(3, 20, 5, TO_DATE('20110210', 'YYYYMMDD'), 10, 3, 'NV');

Error report:
SQL Error: ORA-14400: inserted partition key does not map to any partition
14400. 00000 -  "inserted partition key does not map to any partition"
*Cause:    An attempt was made to insert a record into, a Range or Composite
           Range object, with a concatenated partition key that is beyond
           the concatenated partition bound list of the last partition -OR-
           An attempt was made to insert a record into a List object with
           a partition key that did not match the literal values specified
           for any of the partitions.
*Action:   Do not insert the key. Or, add a partition capable of accepting
           the key, Or add values matching the key to a partition 
           specification

In order to fix this error, a partition can be added to the table using the DEFAULT keyword for any value in the STATE column that does not meet the criteria for the partition named “west”:

ALTER TABLE sales_list_partition ADD PARTITION default_partition VALUES
(
  DEFAULT
  );

Now, any value inserted into the STATE column that does not meet the criteria of the partition named “west” will be inserted into this partition:

INSERT INTO sales_list_partition
VALUES(3, 20, 5, TO_DATE('20110210', 'YYYYMMDD'), 10, 3, 'NV');

The table is now partitioned into two parts: one partition will hold the rows where the STATE column has the value of ‘CA’, ‘OR’, or ‘WA’ and a partition that will hold all other rows for any other value in the STATE column. A benefit to this is that data that needs to be retrieved from west United States customers can be queried without having to search the entire table. A disadvantage is that there still can be a lot of data in this partition since it includes all sales from west United States.

Range partitioning

Range partitioning can be used to partition data by a certain range of values. The sales table can be partitioned by sold_date. Sales in January and February have the most data. A partitioned table can be created:

CREATE TABLE sales_range_partition
(
  customer_id NUMBER,
  sell_id     NUMBER,
  product_id  NUMBER,
  sold_date   DATE,
  price       NUMBER,
  quantity    NUMBER,
  state       VARCHAR(5)
)
PARTITION BY RANGE (sold_date)
(
  PARTITION P_JANUARY VALUES LESS THAN (TO_DATE('20110131', 'YYYYMMDD')),
  PARTITION P_FEBRUARY VALUES LESS THAN (TO_DATE('20110228', 'YYYYMMDD'))
  );

Now, the data from the sales table needs to be inserted into the new table:

INSERT INTO sales_range_partition
SELECT * FROM sales;

Just like the list partitioning, a row that is inserted into this table where the SOLD_DATE column does not meet the criteria of the two partitions already created, an error will occur:

INSERT INTO sales_range_partition
  VALUES(3, 20, 5, TO_DATE('20110301', 'YYYYMMDD'), 10, 3, 'CA');

Error report:
SQL Error: ORA-14400: inserted partition key does not map to any partition
14400. 00000 -  "inserted partition key does not map to any partition"
*Cause:    An attempt was made to insert a record into, a Range or Composite
           Range object, with a concatenated partition key that is beyond
           the concatenated partition bound list of the last partition -OR-
           An attempt was made to insert a record into a List object with
           a partition key that did not match the literal values specified
           for any of the partitions.
*Action:   Do not insert the key. Or, add a partition capable of accepting
           the key, Or add values matching the key to a partition 
           specification

To fix this error, a partition can be added using the MAXVALUE keyword that will store all the rows where the SOLD_DATE column’s value does not meet the criteria of the two partitions created at table creation time:

ALTER TABLE sales_range_partition ADD PARTITION maxvalue_partition 
VALUES LESS THAN ( MAXVALUE );

Now, any value in the SOLD_DATE column that is not in January or February will be added to this partition:

INSERT INTO sales_range_partition
VALUES(3, 20, 5, TO_DATE('20110301', 'YYYYMMDD'), 10, 3, 'NV');

The table now has three partitions: a partition to hold the rows where the SOLD_DATE column’s value is in January, a partition to hold the rows where the SOLD_DATE column’s value is in February, and a partition to hold the rows for all other SOLD_DATE values. This is a good way to be able to query data from specific time frames. A disadvantage of this is that if data needs to be pulled from mid-January to mid-February, two partitions would still have to be queried.

Hash partitioning

Hash partitioning can be used to equally distribute data into different tablespaces for easier manageability, or the data can be distributed across a single tablespace.

CREATE TABLE sales_hash_partition
(
  customer_id NUMBER,
  sell_id     NUMBER,
  product_id  NUMBER,
  sold_date   DATE,
  price       NUMBER,
  quantity    NUMBER,
  state       VARCHAR(5)
)
PARTITION BY HASH (customer_id) PARTITIONS 2;

This will create a table with 2 partitions. The partitions are distributed across the default tablespace. Now, the data from sales needs to be inserted into the new table:

INSERT INTO sales_hash_partition
SELECT * FROM sales;

Hash partitioning is very beneficial for managing large quantities of data since the data can be distributed equally over different storage devices or accessed equally in parallel. Also, Hash partitioning is good when there is no logical manner in which the data needs to be separated. But, hash partitioning’s disadvantage is that the data is distributed in a non-logical manner. So, if a query needed to be executed that was looking for data in a certain date range, it may have to search each partition to find the correct data since the dates could be spread out between partitions.

Link to script that contains the examples in this post.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Comment