Main introduction: introduction to replication function, mysql binary log, mysql replication topology, high-availability framework, single point of failure, read-write separation and load balancing introduction, etc.
mysql replication function provides sharing of read load
Realize data distribution on different servers
Incremental use of binary log
Doesn't require much bandwidth
However, the use of row-based replication will put a certain pressure on bandwidth when making large-scale changes, especially in cross-IDC environments.
Realize data distribution on different servers
Achieve load balancing of data reading
Need to cooperate with other components to complete
Use DNS polling to connect program reads to different backup databases,
Use proxy methods such as LVS and haproxy
Non-sharing architecture, the same data is distributed on multiple servers
Enhanced data security
Use standby database backup to reduce the load of the main database
Replication is not a substitute for backup
Realize database high availability and failover
Realize online data upgrade
mysql service layer log
Binary log
Slow check log
General log
mysql storage engine layer log
innodb log
Redo log
Rollback log
Records all modification events to the mysql database, including addition, deletion, modification, and check events, and modification events to the table structure
Segment-based format (record sql statement)
binlog_format = statement
advantage
The amount of log records is relatively small, saving disk and network I/O, only modify or insert one record
Disadvantage
Context information must be recorded
Ensure that the execution results of the statement on the slave server and the master server are consistent
For specific functions such as uuid() and user(), such non-deterministic functions still cannot be replicated, which may cause inconsistent data in the primary and secondary servers replicated by mysql
Line-based format
binlog_format = ROW
When the same sql statement modifies 10,000 pieces of data, the segment-based log format will only record this sql statement, and the row-based log format will have 10,000 records to record the data modification of each row.
advantage
Make mysql master-slave replication more secure
Modifying each row of data is more efficient than segment-based replication
When the data in the database is modified by mistake, and there is no backup to restore, we can analyze the binary log and reverse the data modification operation recorded in the log to achieve the purpose of restoring the data.
Disadvantage
Large log format
binlog_row_image = [full|minimal|noblob]
Mixed log format
binlog_format = mixed
Features:
According to the sql statement, the system decides to choose between segment-based and row-based log formats
The size of the data is determined by the SQL statement executed
SQL statement-based replication (SBR)
advantage
The amount of logs generated is small, which saves network transmission i/o
It is not mandatory that the table definitions of the master and slave databases are exactly the same
More flexible than row-based replication
Disadvantage
For non-deterministic events, the consistency of master-slave replication data cannot be guaranteed
Modifications to stored procedures, triggers, and custom functions may also cause data inconsistencies
Compared to the row-based copy method, more row locks are required when executed from above
Row-based replication
advantage
Can be applied to any sql replication including non-deterministic functions, stored procedures, etc.
Can reduce the use of database locks
Disadvantage
The table structure of the master and slave data is required to be the same, otherwise the replication may be interrupted
Cannot execute trigger separately from above
The master writes changes to the binary log
Read the main binary log changes and write them to relay_log
GTID-based replication
Replay the log in relay_log from above
The log based on the sql segment is to re-execute the recorded sql from the library
The row-based log is to directly apply the changes to the database on the slave database
Log point-based replication
Create a replication account on the main DB server
create user'repl' @'ip段'identified by'password' grant replication slave on. to'repl' @'ip段'
Configure the slave database server
bin_log = mysql-bin server_id = 101 relay_log = mysql-relay-bin log_slave_update = on read_only = on
Initialize data from the server
mysqldump --master-data=2 -single-transaction xtrabackup --slave-info
Start the replication link
change master to master_host="master_host_ip",master_user='repl',master_password='password' master_log_file='mysql_log_file_name',master_log_pos=4;
Pros and cons
advantage
It is the earliest replication technology supported by mysql, with relatively few bugs
No restrictions on sql query
It's easier to deal with failures
Disadvantage
Failover is difficult to regain the log point information of the new master
What is GTID
GTID is the global transaction id, which guarantees that a unique id can be generated in the replication cluster for each transaction submitted on the master; GTID=source_id:transaction_id
bin_log =/usr/local/mysql/log/mysql-bin server_id = 100 gtid_mode = on enforce-gtid-consiste
Start GTID-based replication
change master to master_host="master_host_ip",master_user='repl',master_password='password', master_auto_position = 1;
Pros and cons
advantage
It can be very convenient to professionally troubleshoot
The slave library will not lose any modifications on the master library
Disadvantage
Fault handling is more complicated
There are certain restrictions on the executed sql
Questions to consider when choosing a copy mode
The mysql version used
Replication architecture and master-slave switching method
High-availability management components used
Support for the application
mysql replication topology
Before mysql5.7, a slave library can only have one master library
After mysql5.7, support one-slave multi-master architecture
advantage
Simple configuration
Multiple slave libraries can be used to share the read load
use
Use different slave libraries for different businesses
Put a slave database to the remote IDC for disaster recovery
Share the read load of the main library
Configuration considerations
It is best to separate the tables operated by the two masters
Use the following two parameters to control the generation of auto-increment id
auto_increment_increment = 2 auto_increment_offset = 1 | 2
Main matters of master-master replication configuration in master-slave mode
Only one main server provides external services
A server is in a read-only state and only used as a hot standby
The switch will only be performed when the main library that provides external services fails or is scheduled for maintenance
Make the original standby library the main library, and the original main library will become the new standby library, and handle the read-only or offline status, and then go online again after the maintenance is completed
Ensure that the initial data on the two servers are the same
Make sure that binlog has been started on the two servers and have different server_id
Enable read_only on the initial backup
You can also assign several slave libraries to the master library
Solution: Control the transaction size of the main database and split large transactions
Solution: Use the mixed log format or set set binlog_row_image=minimal
Solution: Use multi-threaded replication. In mysql5.7, SQL threads can be allocated according to the logic clock.
Configuration steps:
stop slave set global slave_parallel_type ='logical_clock' set global slave_parallel_workers = 4 start slave
Master-slave replication errors caused by data corruption or loss
Errors caused by unexpected downtime of the master library or the slave library
Solution:
Use skip binary log events
The method of injecting empty transactions first restores the interrupted replication link
Use other methods to compare the data on the master and slave servers
The binary log on the main library is corrupted
The relay log on the standby database is damaged
Master-slave replication error caused by data modification on the slave library
Share the write load of the database
Automatic failover and master-slave switchover
Provide read-write separation function
High availability HA (High Avalilability) refers to improving the availability of systems and applications by minimizing downtime caused by routine maintenance operations (planned) and sudden system crashes (unplanned)
Normal available time
Percentage of the whole year
Severe master-slave delay
Master-slave replication interrupted
A lot of blockage caused by locks
Server downtime caused by software and hardware failures, etc.
Avoid factors that cause the system to be unavailable and reduce the time that the system is unavailable
Establish a complete monitoring and alarm system
Perform recovery test on backup data
Correctly configure the database environment
Archive and clean up unnecessary data
Increase system redundancy to ensure that it can be restored as soon as possible when the system is unavailable
Avoid a single point of failure
Master-slave switchover and failover
A server has run out of disk space,
Poor performance sql
Table structure and index are not optimized
Master-slave data is inconsistent
Human error
Single point of failure means that there is only one component that provides the same function in a system. If this component fails, it will affect the normal use of the entire system. Each component that makes up the application system may become a single point.
How to avoid mysql single point of failure
Use sun shared storage or drdb disk replication to solve MySQL single point of failure
sun
drdb
Use multi-write cluster or ndb cluster to solve MySQL single point of failure
How to solve the single point problem of the main server
After the main server is switched, how to notify the application of the ip address of the new main server
How to check if the mysql master server is available
How to deal with the kind of replication relationship between the slave server and the new master server
Multi-Master Replication Manager
MMM monitors the health of mysql master-slave replication
Failover and automatic configuration of other slave-to-master replication when the main library is down
How to find the log synchronization point of the new master library log point corresponding to the slave library
How to deal with inconsistent data in multiple slave libraries
Provides read and write virtual ip, when the main server has a problem, the virtual ip can be automatically migrated
advantage
Developed using perl scripting language and fully open source
Provides read and write vip (virtual ip) to make server role changes transparent to front-end applications
MMM provides delay monitoring from the server
Disadvantage
The release time is relatively early and does not support the new replication function of MySQL
No read load function
When switching between master and slave, it is easy to cause data loss
MMM monitoring service has a single point of failure
Master High Avalilability
Monitor whether the main database service is available
When the master DB is unavailable, a new master database server is elected from multiple slave servers
Provides master-slave switching and failover functions
Attempt to save the binary log from the failed primary database
Election of a new candidate master server from multiple candidate slave servers
Synchronize differential binary data between the alternate master server and other slave servers
Apply the binary log saved from the original db server
One of the main purposes of mysql master-slave replication configuration: in order to share the read load of the master library
Can only be written on the master
Both master and slave read operations are possible
advantage
The developer controls what kind of query is executed in the slave library, so it is more flexible
There is a program that directly connects to the database, so the performance loss is relatively small
Disadvantage
Increase the workload of development and make the program code more complex
Think control is easy to make mistakes
advantage
The middleware analyzes the query syntax and automatically completes the separation of reading and writing
Transparent to the program, no adjustments to existing programs
Disadvantage
The middle layer is added, so there is a loss in query efficiency
Delay-sensitive services cannot be automatically executed in the main library
The difference between read-write separation and read load balancing
The problem of read-write separation is how to execute different SQL statements in different roles of the replication cluster
Read load balancing mainly solves how to share the same load for databases with the same role
How to achieve read load balancing
LVS
Haproxy
MaxScale
hardware
F5