Strong opinions, weakly held

Another angle on database scalability

So the traditional measures of database scalability have to do with query speed. Does your database run queries quickly enough? Do you have contention issues when one thread is trying to insert data and another is trying to run some kind of crazy five-table join? One problem I’ve run into with large MySQL databases is what I’ll call, for lack of a better term, maintenance scalability.

Let’s say I need to index a column in a table with three million rows. The CREATE INDEX statement can take 20 or 30 minutes to run, and the same goes for adding a new column to a large table. So how do people get around these maintenance scalability issues without downtime? In the MySQL world, is there an answer to this question?
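For concreteness, these are the kinds of statements in question (table and column names are hypothetical). On MySQL of this era, before InnoDB online DDL, each one rebuilds the entire table and blocks writes while it runs:

```sql
-- Hypothetical table; both statements copy the full table,
-- so on a three-million-row table they can run for many minutes.
CREATE INDEX idx_created ON orders (created_at);
ALTER TABLE orders ADD COLUMN notes TEXT;
```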


  1. It depends mostly on what the table is used for. For logging or other tables that have lots of INSERTs but no UPDATEs, we generally break them up into months, using a MERGE table as an overlay. Then you can copy all the non-active data to new tables, make your changes, and you only suffer the downtime penalty on reworking the most current table. If you plan a cutover for the first of the month, the downtime can be minimized to seconds.

    For tables that are UPDATEd, though, I don’t know of any magic solution. Reindexing an existing table is often slower than recreating the table from scratch, so if you have the room, make copies and try it both ways in advance of the switch. You might also lock the table to take a consistent copy, make your changes to the copy, and then use the binary log to replay onto the new table any changes made to the original since the snapshot.

  2. Downtime windows and/or redundancy with master-master replication.
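A sketch of the two techniques from the first comment, with hypothetical table names: the monthly tables overlaid by a MERGE table, and the copy-and-swap trick for rebuilding a table with a new index (RENAME TABLE swaps the names atomically):

```sql
-- Monthly log tables overlaid by a MERGE table (MyISAM only);
-- INSERT_METHOD=LAST sends new rows to the last table in the UNION list,
-- so older months can be copied and reworked without touching live inserts.
CREATE TABLE log_2019_01 (id INT, msg TEXT, ts DATETIME, KEY (ts)) ENGINE=MyISAM;
CREATE TABLE log_2019_02 LIKE log_2019_01;
CREATE TABLE log (id INT, msg TEXT, ts DATETIME, KEY (ts))
    ENGINE=MERGE UNION=(log_2019_01, log_2019_02) INSERT_METHOD=LAST;

-- Copy-and-swap: build the altered table on the side, backfill it,
-- then swap names atomically. Any writes that land between the backfill
-- and the swap must be replayed (e.g. from the binary log).
CREATE TABLE orders_new LIKE orders;
ALTER TABLE orders_new ADD INDEX idx_created (created_at);
INSERT INTO orders_new SELECT * FROM orders;
RENAME TABLE orders TO orders_old, orders_new TO orders;
```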


© 2019 rc3.org
