This post highlights a partition fault. Please find below the process on how to optimize it.
Impact
Vehicle traffic record tables are typical flow-type data tables. A flow table stores a huge number of records, usually exceeding 1 billion; and the number will increase linearly as time goes by. In video cloud scenarios, service queries are mainly on data records in a specific time window of a flow table. However, fully scanning such a table with huge data consumes great many system I/O resources, reducing the overall system performance.
Inspection method
Run the following statement to determine the partitioning type of a service table and the storage space occupied by the table:
SELECT T1.schemaname
,T1.tablename
,CASE WHEN T2.partstrategy = 'r' THEN 'range partitioning'
WHEN T2.partstrategy = 'v' THEN 'value partitioning'
ELSE ''
END AS partstr
,pg_table_size(cast(T1.schemaname||'.'||T1.tablename as VARCHAR(64)))
FROM pg_tables T1
LEFT JOIN PG_PARTITION T2
ON T1.tablename = T2.relname
AND T2.parttype = 'r'
WHERE T1.schemaname not in
('dbms_lob','dbms_output','dbms_random','utl_raw','pg_toast','pg_catalog','sys','i
nformation_schema','cstore')
ORDER BY T1.schemaname, T1.tablename
;
If a flow table occupies large storage space (for example, more than 50 GB), define it as a partitioned table.
Optimization method
The partition key of a flow table can be set to the import date (or the data generation date), and partitions usually are made by day.
Use the table sc_etcptjd as an example. The SQL statement for defining the table as a partitioned table is as follows (where the definitions of columns and partitions are omitted):
CREATE TABLE sc_etcptjd_row (
eid character varying(255) NOT NULL,
car_number character varying(255) DEFAULT NULL::character varying,
enter_time timestamp without time zone,
...
)
WITH (orientation=ROW, compression=low)
DISTRIBUTE BY HASH (eid)
PARTITION BY RANGE (enter_time)
(
PARTITION p_min_time VALUES LESS THAN ('2017-12-31 16:00'),
PARTITION p_20180101 VALUES LESS THAN ('2017-12-31 16:00'),
...
PARTITION p_20181231 VALUES LESS THAN ('2017-12-31 16:00'),
PARTITION p_max_time VALUES LESS THAN (MAXVALUE)
)
ENABLE ROW MOVEMENT
;
table sc_etcptjd as an example to describe how to change it to a partitioned table.
Create a table sc_etcptjd_ppi with the enter_time column as the partition key and no index created. Ensure that other table attributes (storage model, column attribute, and distribution key) are the same as those of the sc_etcptjd table.
Step 2. Suspend the data loading job of the sc_etcptjd table.
Step 3. Import all data from the sc_etcptjd table to the sc_etcptjd_ppi table.
INSERT INTO public.sc_etcptjd_ppi SELECT * FROM public.sc_etcptjd;
Step 4. Create an index for the public.sc_etcptjd_ppi table by referring to the index definition of the original table public.sc_etcptjd.
Rename the tables.
ALTER TABLE public.sc_etcptjd RENAME TO public.sc_etcptjd_noppi;
ALTER TABLE public.sc_etcptjd_ppi RENAME TO public.sc_etcptjd;
Step 6. Resume the data loading job suspended in Step 2.
Step 7. Observe the system for a period of time. If the system remains normal, delete the original table.
DROP TABLE public.sc_etcptjd_noppi;