Submitted by tensai on
I just solved a weird MySQL replication problem and thought I would share with you all. First, the background. I have a master and slave running with one-way replication. The slave just sits by waiting for his time to shine, but otherwise doesn't do anything. Well, last week the master had a problem with the filesystem. I can't recall exactly what it was, out of space or something. It confused the heck out of the master and so it basically shut down. One of my co-workers fixed the problem and got the master running again, but the slave was in a pickle. Here is the error it was showing:
Relay_Master_Log_File: mysql-bin.031 Slave_IO_Running: Yes Slave_SQL_Running: No Last_errno: 0 Last_error: Query 'DELETE FROM foo WHERE bar = 1' caused different errors on master and slave. Error on master: 'Got error %d from table handler' (1030), Error on slave: 'no error' (0). Default database: 'baz' Exec_master_log_pos: 118871
Because it thought the command failed on the master, it refused to continue. I can't say it's an altogether bad plan since data integrity is generally the main theme of a database (yes, cue the jokes about my using MySQL in the first place). The question became, how do I get the slave to start up again. "SLAVE STOP; SLAVE START" didn't have any effect.
The trick was suggested to me by a post at mysql.com which pointed out a tool new to me, mysqlbinlog. See I figured the simplest thing would be to restart replication at the step just after the "failed" transaction, since I knew that transaction had actually succeeded. But I have no idea how the binlog counters work, so I couldn't just make up numbers. It's some kind of binary offset. Well, mysqlbinlog will show it to you.
# mysqlbinlog mysql-bin.031 -j 118871 |less
Which of course showed me this:
# at 118955 #080605 18:59:09 server id 1 log_pos 118955 Query thread_id=3218 exec_time=0 error_code=0
So on my slave I restarted replication at offset 118955 and like magic, the slave ripped through the binlogs and caught up in practically no time at all.
Recent comments