Schema optimisation
PROCEDURE ANALYSE() examines the result from a query and returns an analysis of the results that suggests optimal data types for each column that may help reduce table sizes.
It is important that the type of each column be optimal. In general, smaller is better and NULL should be avoided (especially for columns which are going to be indexed) as the NULL value imposes a performance penalty in MySQL.
The work to do is to run each table in the sample database through PROCEDURE ANALYSE() and optimise the type of the columns if required.
Indexing
MySQL supports indexes which are created with the CREATE INDEX command.
Assuming we have created an index consisting of three columns, C1, C2 and C3, the query optimiser will use this index (and therefore speed up query execution) when it matches:
- fully (i.e. C1, C2 and C3 match)
- with a leftmost prefix (C1)
- with a column prefix (beginning of C1)
- with a range of values (range of C1 values)
- with one element of C1 and a range on C2
When writing an SQL query, it is advisable to pay special attention to the WHERE clause. If an indexed column appears there, it is better for it to be placed on its own on one side of the comparison function. When this is done, the MySQL query optimiser will use the index.
The SHOW INDEX FROM command can be used to get information on existing indexes. The OPTIMIZE TABLE command is used when fragmentation occurs (either in the table or in indexes).
The work to do is to identify slow SQL queries on the sample database and decrease their runtime by judiciously using indexes.
Partitioning
MySQL can store the data in tables in distinct user-defined partitions. As written in the MySQL manual,
“In effect, different portions of a table are stored as separate tables in different locations. The user-selected rule by which the division of data is accomplished is known as a partitioning function, which in MySQL can be the modulus, simple matching against a set of ranges or value lists, an internal hashing function, or a linear hashing function.”
The effects of partitioning are diverse. First, partitioning can help in making MySQL read less data when executing a query if the query only refers to the data in a distinct partition (this is called pruning). Secondly, partitioned data is easy to maintain i.e. when having to discard old data by dropping a whole partition. Thirdly, partitions can be on different disks and therefore can be accessed concurrently.
In fact, partitioning can be thought as being a kind of coarse-grained indexing.
MySQL supports different kinds of partitions: range, list, column, etc.
Unfortunately, MySQL suffers from one very serious limitation: partitioned tables do not support foreign keys.
Partitions can be specified when doing a CREATE TABLE or, later, with ALTER TABLE.
The work to do is to add partitions to the existing tables in the same database and examine how query execution is impacted.
The MySQL event scheduler
MySQL has an event scheduler which can be used to run user-specified queries in the future and on a regular basis. By default, the scheduler is off and can be activated with SET GLOBAL event_scheduler = 1;.
When this is done, CREATE EVENT and ALTER EVENT can be used.
The work to do is to explore the various types of events which can exist and the various ways events can be created.
Leave a Reply