This post is regarding an index fault. More details can be found below.
Impact
Indexes can improve the performance of query statements, such as exact query, range query, and sort query statements. However, too many indexes seriously reduce the table performance of insertion and update. This risk is more noticeable in video cloud scenarios where frequent,
small-batch insertion and update are involved.
Inspection method
Run the following SQL statement to obtain key information, such as the index name, base table name, number of index columns, unique index use, primary key index use, and index type in a cluster:
SELECT T4.schemaname
,T2.relname
,T3.relname
,T_IDX_INFO.indnatts
,T_IDX_INFO.indisunique
,T_IDX_INFO.indisprimary
,T_IDX_INFO.attname
,CASE WHEN instr(T4.indexdef,'USING psort')>0 then 'psort'
WHEN instr(T4.indexdef,'USING btree')>0 then 'btree'
WHEN instr(T4.indexdef,'USING gin')>0 then 'gin'
ELSE ''
END
FROM (
SELECT T1.indexrelid
, T1.indrelid
, T1.indnatts
, T1.indisunique
, T1.indisprimary
, string_agg(T2.attname, ',') AS attname
FROM (
SELECT indexrelid
, indrelid
, indnatts
, indisunique
, indisprimary
, unnest(indkey) as attnum
FROM PG_INDEX) T1
INNER JOIN PG_ATTRIBUTE T2
ON T1.indrelid = T2.attrelid
AND T1.attnum = T2.attnum
GROUP BY T1.indexrelid
, T1.indrelid
, indnatts
, indisunique
, indisprimary
)T_IDX_INFO
LEFT JOIN PG_CLASS T2 --Obtain the index name.
ON T_IDX_INFO.indexrelid = T2.oid
LEFT JOIN PG_CLASS T3 --Obtain the base table name.
ON T_IDX_INFO.indrelid = T3.oid
LEFT JOIN PG_INDEXES T4 --Obtain the index definition.
ON T2.relname = T4.indexname
WHERE T4.schemaname not in
('dbms_lob','dbms_output','dbms_random','utl_raw','pg_toast','pg_catalog','sys','i
nformation_schema','cstore')
ORDER BY T4.schemaname, t3.relname, T_IDX_INFO.indexrelid
;
FusionInsight LibrA Product Documentation has clearly provided recommendations for creating indexes on which columns or column combinations in a service table.
l Create an index on columns that are often searched and queried. This speeds up searches.
l Create an index on a column that functions as the primary key. This enforces the uniqueness of the column and the data sorting structure in the organized table.
l Create an index on columns that are often joined and function as foreign keys. This increases the join efficiency.
l Create an index on columns that are often searched by range. The index helps sort data and therefore the specified ranges are contiguous.
l Create an index on columns that often need to be sorted. The index helps sort data, reducing the time for a sort query.
l Create an index on columns that are often used by the WHERE clause. This speeds up the judgment of conditions.
l Create an index on columns that often appear after the keywords ORDER BY, GROUP BY, or DISTINCT.
To avoid misuse of indexes, ensure that the indexes created in a LibrA cluster meet the following constraints:
l The number of indexes created for one service table must not exceed 3. Too many indexes will increase the cost to perform DML operation on a table (INSERT/UPDATE/DELETE).
l It is recommended that the number of columns with an index created be 1, that is, a single column index. The more the index columns, the lower the probability that a query utilizes the index.
l Indexes must have high selectivity on records in a service table. The higher the selectivity, the more likely the query performance is improved when the optimizer uses the index for query.
Run the following SQL statement to check the selectivity of the index created on the car_number column in the sc_etcptjd table:
SELECT T1.car_number, to_char(count(*)/T2.tcnt,'990.9999') as pct
FROM sc_etcptjd T1
,(SELECT COUNT(*) FROM sc_etcptjd) T2(tcnt)
GROUP BY T1.car_number , T2.tcnt
ORDER BY pct desc
limit 100
;
The following figure shows the query results.
If most pct values corresponding to the car_number column are close to 0.0000, the index created on this column will have a good selectivity.
If most pct values corresponding to the car_number column are large (for example, greater than 0.0010), the index created on this column will have a poor selectivity and is better deleted.