This document describes how MOCO takes a backup of MySQLCluster data and restores a cluster from a backup.
- Design goals
- Considered options
A MySQLCluster can be configured to take backups regularly by referencing a BackupPolicy in
spec.backupPolicyName. For each MySQLCluster associated with a BackupPolicy,
moco-controller creates a CronJob.
The CronJob creates a Job to take a full backup periodically.
The Job also takes a backup of binary logs for Point-in-Time Recovery (PiTR).
The backups are stored in an S3-compatible object storage bucket.
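As an illustration, the relationship between the two resources might look like this. This is a sketch only: the API version and the BackupPolicy field names below are assumptions and should be checked against the MOCO CRD reference.

```yaml
# Hypothetical manifests illustrating how a MySQLCluster references a BackupPolicy.
apiVersion: moco.cybozu.com/v1beta2
kind: BackupPolicy
metadata:
  namespace: default
  name: daily
spec:
  schedule: "@daily"              # passed to the CronJob that moco-controller creates
  jobConfig:
    serviceAccountName: backup-owner   # ServiceAccount provided by the user
    bucketConfig:
      bucketName: moco-backups         # S3-compatible object storage bucket
---
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: default
  name: test
spec:
  backupPolicyName: daily         # enables periodic backups for this cluster
  # (other required fields such as the Pod template are omitted)
```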
This figure illustrates how MOCO takes a backup of a MySQLCluster.
- `moco-controller` creates a CronJob and a Role/RoleBinding to allow the Job Pod to access the MySQLCluster.
- At each configured interval, CronJob creates a Job.
- The Job dumps all data from a `mysqld` using MySQL Shell's dump instance utility.
- The Job creates a tarball of the dumped data and puts it in an S3-compatible object storage bucket.
- The Job also dumps the binlogs generated since the last backup and puts them in the same bucket (with a different object key, of course).
- The Job finally updates MySQLCluster status to record the last successful backup.
To restore from a backup, users need to create a new MySQLCluster with `spec.restore` filled with the necessary information, such as the bucket name of the object storage and the object key.
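A restore request might be written roughly as follows. The field names under `spec.restore` are illustrative assumptions and should be checked against the MySQLCluster CRD reference.

```yaml
# Hypothetical manifest for restoring a new cluster from a backup.
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: default
  name: restored
spec:
  restore:
    sourceNamespace: default            # namespace of the original MySQLCluster
    sourceName: test                    # name of the original MySQLCluster
    restorePoint: "2024-01-02T03:04:05Z"  # point-in-time in RFC3339 format
    jobConfig:
      serviceAccountName: backup-owner
      bucketConfig:
        bucketName: moco-backups
  # (other required fields such as the Pod template are omitted)
```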
The next figure illustrates how MOCO restores MySQL cluster from a backup.
- `moco-controller` creates a Job and a Role/RoleBinding for the restoration.
- The Job downloads a tarball of dumped files of the specified backup.
- The Job loads the data into an empty `mysqld` using MySQL Shell's dump loading utility.
- If the user wants to restore data at a point-in-time, the Job also downloads the saved binlogs.
- The Job applies the binlogs up to the specified point-in-time using `mysqlbinlog`.
- The Job finally updates MySQLCluster status to record the restoration time.
- Users must be able to configure different backup policies for each MySQLCluster.
- Users must be able to restore MySQL data at a point-in-time from backups.
- Users must be able to restore MySQL data without the original MySQLCluster resource.
- `moco-controller` must export metrics about backups.
- Backup data should be compressed to save the storage space.
- Backup data should be stored in an object storage.
- Backups should be taken from a replica instance whenever possible.
These "should" goals are mostly about cost or performance.
Backup files are stored in an object storage bucket with the following keys.
- Key for a tarball of a fully dumped MySQL:
- Key for a compressed tarball of binlog files:
`<namespace>` is the namespace of the MySQLCluster, and `<name>` is its name.
`YYYYMMDD-hhmmss` is the date and time of the backup, where `YYYY` is the year, `MM` the two-digit month, `DD` the two-digit day, `hh` the two-digit hour in 24-hour notation, `mm` the two-digit minute, and `ss` the two-digit second.
This allows multiple MySQLClusters to share the same bucket.
Internally, times for PiTR are formatted in the UTC timezone.
The restore Job runs
As described in Overview, the backup process is implemented with CronJob and Job. In addition, users need to provide a ServiceAccount for the Job.
The ServiceAccount is often used to grant access to the object storage bucket where the backup files will be stored. For instance, Amazon Elastic Kubernetes Service (EKS) has a feature to create such a ServiceAccount. Kubernetes itself is also developing such an enhancement called Container Object Storage Interface (COSI).
To allow the backup Job to update the MySQLCluster status, MOCO creates a Role and a RoleBinding. The RoleBinding grants this access to the given ServiceAccount.
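The RBAC objects can be sketched roughly as follows. The object names and the exact verb list here are illustrative assumptions, not the precise rules MOCO creates.

```yaml
# Hypothetical Role/RoleBinding letting the backup Job update MySQLCluster status.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: backup-test            # illustrative name
rules:
- apiGroups: ["moco.cybozu.com"]
  resources: ["mysqlclusters", "mysqlclusters/status"]
  verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: backup-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: backup-test
subjects:
- kind: ServiceAccount
  name: backup-owner           # the user-provided ServiceAccount
  namespace: default
```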
For the first backup, the Job chooses a replica instance as the backup source if one is available. For the second and subsequent backups, the Job chooses the previously chosen instance as long as it is still an available replica.
The backups are divided into two: a full dump and binlogs.
A full dump is a snapshot of the entire MySQL database.
Binlogs are records of transactions.
Using `mysqlbinlog`, binlogs can be applied to a database restored from a full dump to achieve PiTR.
For the first backup, MOCO takes only a full dump of a MySQL instance and records the GTID set at the time of the backup. For the second and subsequent backups, MOCO also retrieves the binlogs generated since the GTID of the last backup.
MOCO then creates a tarball of the dump and puts it into an object storage bucket.
To retrieve the transactions executed since the last backup, `mysqlbinlog` is used with these flags:
- `--exclude-gtids=<the GTID of the last backup>`
The retrieved binlog files are packed into a tarball, compressed with zstd, and put into the object storage bucket.
Finally, the Job updates MySQLCluster status field with the following information:
- The time of backup
- The time spent on the backup
- The ordinal of the backup source instance
- The `server_uuid` of the instance (to check whether the instance has been re-initialized)
- The binlog filename in the `SHOW MASTER STATUS` output
- The size of the tarball of the dumped files
- The size of the tarball of the binlog files
- The maximum usage of the working directory
- Warnings, if any
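Put together, the recorded status might look roughly like this. The field names below are illustrative assumptions and should be checked against the MySQLCluster CRD.

```yaml
# Hypothetical shape of the backup status recorded by the Job.
status:
  backup:
    time: "2024-01-02T03:04:05Z"        # when the backup was taken
    elapsed: 42s                        # time spent on the backup
    sourceIndex: 1                      # ordinal of the backup source instance
    sourceUUID: "<server_uuid of the source instance>"
    binlogFilename: binlog.000123       # from SHOW MASTER STATUS
    dumpSize: 1073741824                # tarball of dumped files, in bytes
    binlogSize: 1048576                 # tarball of binlog files, in bytes
    workDirectoryUsage: 2147483648      # maximum working-directory usage, in bytes
    warnings: []
```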
To restore MySQL data from a backup, users need to create a new MySQLCluster with an appropriate `spec.restore`. `spec.restore` needs to provide at least the following information:
- The bucket name
- Namespace and name of the original MySQLCluster
- A point-in-time in RFC3339 format
Once an empty `mysqld` is running, `moco-controller` creates a Job to retrieve the backup files and load them into the `mysqld`.
The Job looks in the bucket for the most recent tarball of dumped files that is older than the specified point-in-time, and retrieves it.
The dumped files are then loaded into `mysqld` using MySQL Shell's dump loading utility.
If the point-in-time is different from the time of the dump file, and if there is a compressed tarball of binlog files, then the Job retrieves binlog files and applies transactions up to the point-in-time.
After the restoration process finishes, the Job updates the MySQLCluster status to record the restoration time.
`moco-controller` then configures the clustering as usual.
If the Job fails, `moco-controller` leaves the Job as is.
The restored MySQL cluster will also be left read-only.
If some of the data have been restored, they can be read from the cluster.
If a failed Job is deleted, `moco-controller` will create a new Job to give it another chance.
Users can safely delete a successful Job.
No automatic deletion of backup files
MOCO does not delete old backup files from object storage. Users should configure a bucket lifecycle policy to delete old backups automatically.
Duplicated backup Jobs
CronJob may create two or more Jobs at a time. If this happens, only one Job can update MySQLCluster status.
Lost binlog files
If `expire_logs_days` is set to a shorter value than the backup interval, MOCO cannot save binlogs correctly. Users are responsible for configuring `expire_logs_days` appropriately.
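For example, with daily backups, the binlog retention must be comfortably longer than one day. A sketch of the relevant `mysqld` setting (in MySQL 8.0, `binlog_expire_logs_seconds` supersedes the deprecated `expire_logs_days`; the 7-day value is an illustrative choice):

```ini
# my.cnf (sketch): keep binlogs for 7 days so a daily backup Job can
# always find the binlogs generated since the last backup.
[mysqld]
binlog_expire_logs_seconds = 604800
```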
There were many design choices and alternative methods for implementing the backup/restore feature for MySQL. Here is why we chose the current design.
Compared to file systems, object storage is generally more cost-effective. It also has many useful features such as object lifecycle management.
The AWS S3 API is the most widely supported API for object storage.
Backup and restoration can be CPU- and memory-consuming tasks.
Running such a task in `moco-controller` is dangerous because `moco-controller` manages many MySQLClusters.
`moco-agent` is not a safe place to run a backup job either, because it is a sidecar of `mysqld`: if the backup ran in the `mysqld` Pod, it would interfere with the `mysqld` service.
The biggest reason is the difference in how these tools lock the instance.
MySQL Shell's dump utility runs `LOCK INSTANCE FOR BACKUP`, which blocks DDL until the lock is released.
`mysqldump`, on the other hand, allows DDL to be executed during the dump. Once a DDL statement is executed, it acquires a metadata lock, which means that any DML for the table modified by the DDL will be blocked.
Blocking DML during a backup is not desirable, especially when the only available backup source is the primary instance.
Another reason is that `mysqlsh` is much faster than `mysqldump`.
Continuous backup is a technique to save executed transactions in real time.
For MySQL, this can be done with `mysqlbinlog --stop-never`. This command continuously retrieves transactions from the binary logs and outputs them to stdout.
MOCO does not adopt this technique for the following reasons:
- We assume MOCO clusters have replica instances in most cases. When the data of the primary instance is lost, one of the replicas can be promoted to a new primary.
- It is troublesome to control a continuous backup process on Kubernetes. The process would need to keep running between full backups, which means the entire backup process would have to be a persistent workload, not a (Cron)Job.