This is where the new 'mysqldbcompare' utility comes in handy. The mysqldbcompare utility uses the mysqldiff functionality (mysqldiff allows you to find the differences in object definitions for two objects or a list of objects in two databases) and permits you to compare the object definitions and the data among two databases. Not only will it find the differences among database objects and their definitions, it will also find differences in the data!
The databases can reside on the same server or different servers. The utility performs a consistency check to ensure two databases are the same defined as having the same list of objects, identical object definitions (including object names), and for tables the same row counts and the same data.
Some scenarios where mysqldbcompare can be employed include:
- checking master and slave for consistency
- checking production and development databases for consistency
- generating a difference report for expected differences among new and old data
- comparing backups for differences
Let us take a look at the utility in action. Below are two examples of the utility comparing what should be the same database on two servers. I am using a simple detail shop inventory database used to manage supplies. It consists of two tables (supplier, supplies) and three views (cleaning, finishing_up, and tools).
In the first example, we see an example where the databases are consistent. When you examine the output, you will see each object is inspected in three passes. First, the object definitions are compared. Any discrepancies would be displayed as a difference in their CREATE statements. If the object is a table, a row count test is performed followed by a comparison of the data. Surely, if the row count test fails we know the data check will fail.
Note: This is the same output (and indeed the same code) that is used for mysqldiff. The mysqldbcompare utility has all of the same features with respect to difference type presented (unified, differ, and context) which you can select with the same option named ‘--difftype’.
mysqldbcompare --server1=root@localhost --server2=root@backup_host:3310 inventory:inventory
# server1 on localhost: ... connected.
# server2 on localhost: ... connected.
# Checking databases inventory on server1 and inventory on server2
Defn Row Data
Type Object Name Diff Count Check
---------------------------------------------------------------------------
TABLE supplier pass pass pass
TABLE supplies pass pass pass
VIEW cleaning pass - -
VIEW finishing_up pass - -
VIEW tools pass - -
Databases are consistent.
# ...done
Normally, the mysqldbcompare utility will stop on the first failed test. This default means you can run the utility as a safeguard on data that you expect to be consistent. However, if you suspect or know there will be differences in the database objects or data and want to run all of the checks, you can use the ‘--run-all-tests’ option. This option will run the tests on all objects even if some tests fail. Note that this does not include system or access errors such as a down server or incorrect login - those errors will cause the utility to fail with an appropriate error message.
In the second example, we expect the databases to be different and we want to know which data is different. As you can see, the utility found differences in the object definitions as well as differences in the data. Both are reported.
mysqldbcompare --server1=root@localhost --server2=root@backup_host:3310 inventory:inventory --run-all-tests
# server1 on localhost: ... connected.
# server2 on localhost: ... connected.
# Checking databases inventory on server1 and inventory on server2
WARNING: Objects in server1:inventory but not in server2:inventory:
VIEW: finishing_up
VIEW: cleaning
Defn Row Data
Type Object Name Diff Count Check
---------------------------------------------------------------------------
TABLE supplier pass FAIL FAIL
Row counts are not the same among inventory.supplier and inventory.supplier.
Data differences found among rows:
--- inventory.supplier
+++ inventory.supplier
@@ -1,2 +1,2 @@
code,name
-2,Never Enough Inc.
+2,Wesayso Corporation
Rows in inventory.supplier not in inventory.supplier
code,name
3,Never Enough Inc.
TABLE supplies pass FAIL FAIL
Row counts are not the same among inventory.supplies and inventory.supplies.
Data differences found among rows:
--- inventory.supplies
+++ inventory.supplies
@@ -1,4 +1,4 @@
stock_number,description,qty,cost,type,notes,supplier
-11040,Leather care,1,9.99,other,,1
-11186,Plastic polish,1,9.99,polishing,,1
-11146,Speed shine,1,9.99,repair,,1
+11040,Leather care,1,10.00,other,,1
+11186,Plastic polish,1,10.00,polishing,,1
+11146,Speed shine,1,10.00,repair,,1
Rows in inventory.supplies not in inventory.supplies
stock_number,description,qty,cost,type,notes,supplier
11104,Interior cleaner,1,9.99,cleaning,,1
11056,Microfiber and foam pad cleaner,1,9.99,cleaning,,1
11136,Rubber cleaner,1,9.99,cleaning,,1
11173,Vinyl and rubber dressing,1,9.99,cleaning,,1
11106,Wheel cleaner,1,9.99,cleaning,,1
11270,Carpet cleaner,1,9.99,cleaning,,1
Rows in inventory.supplies not in inventory.supplies
stock_number,description,qty,cost,type,notes,supplier
11269,Microfiber spray on car wash towel,3,16.99,cleaning,,1
11116,Microfiber wax removal towel,3,16.99,waxing,,1
10665,Glass polish pad,3,10.00,polishing,,1
VIEW tools FAIL - -
--- inventory.tools
+++ inventory.tools
@@ -1,1 +1,1 @@
-CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `inventory`.`tools` AS select `inventory`.`supplies`.`stock_number` AS `stock_number`,`inventory`.`supplies`.`description` AS `description`,`inventory`.`supplies`.`qty` AS `qty`,`inventory`.`supplies`.`cost` AS `cost`,`inventory`.`supplies`.`type` AS `type`,`inventory`.`supplies`.`notes` AS `notes`,`inventory`.`supplies`.`supplier` AS `supplier` from `inventory`.`supplies` where (`inventory`.`supplies`.`type` = 'tool')
+CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `inventory`.`tools` AS select `inventory`.`supplies`.`stock_number` AS `stock_number`,`inventory`.`supplies`.`description` AS `description`,`inventory`.`supplies`.`qty` AS `qty`,`inventory`.`supplies`.`cost` AS `cost`,`inventory`.`supplies`.`type` AS `type`,`inventory`.`supplies`.`notes` AS `notes`,`inventory`.`supplies`.`supplier` AS `supplier` from `inventory`.`supplies` where (`inventory`.`supplies`.`type` in ('tool','other'))
Database consistency check failed.
Take a moment to read through the report above. At the top of the report (the first object tested), we see a critical error in the suppliers table. Here we can see that there are two different names for the same supplier_id. All relational database theory aside, that could spell trouble when it comes time to reorder supplies.
Notice the report for the supplies table. In this example, the utility identified three rows that were different among the two databases. It also identified rows that were missing from either table. Clearly, that could help you diagnose what went wrong where in your application (or your data entry).
Lastly, we see a possible issue with the tools view. Here the view definition differs slightly. Depending the use of the view this may be acceptable but it is nice to know nonetheless.
What Does This All Mean?
If data consistency is important to you or if you need a way to quickly determine the differences among data in two databases, the mysqldbcompare utility is a great addition to your toolset. But don’t take my word for it - try it out yourself.
Download and Try It Out!
The MySQL Utilities project is written in Python and is a component of the MySQL Workbench tool. You can download the latest release of the MySQL Workbench here:
http://dev.mysql.com/downloads/workbench/
There are some limitations in this first release. Currently, if the storage engines differ among the tables in the compare, the object definitions will show this difference – which is exactly what you would expect. However, the utility will stop and report the object definition test as a failure. You can still run the data consistency check by using the --force option which instructs the mysqldbcompare utility to run all tests unless they fail from an internal exception (for example, the table files are corrupt).
Got Some Ideas or Want to See New Features?
One really cool feature would be if the utility generate SQL statements for synchronizing the data and objects. Would this be something you would want to see incorporated?
If you’re intrigued by this new utility and in your course of use find new features or new uses that you would like to see incorporated in future revisions, please email me and let me know!
Related Material
Th latest MySQL Utilities development tree (including mysqldbcompare) can be found here:
https://launchpad.net/mysql-utilities
Hi Chuck,
ReplyDeletethis sounds like a great tool, we were in great need for something like that lately when we migrated our database in order to get rid of a large and historical innodb file...
One thing missing here is the run times of the utility. Our database has some pretty large entities and all the different tools we tried out took a few hours per entity...
Another thing is that if there is a discrepancy in the data of a given table and there are many many rows differing then it would be hard to read the file.
Again, great tool, we'll definitely find a way to use it, even on an ongoing basis if it runs really fast :-)
How well does this tool scale? If you have millions of rows to compare, are you locking the table the entire time during the comparison on both sides? Does the tool take in to account replication lag and make sure both sides are caught up?
ReplyDeleteYou can download the latest code from: https://launchpad.net/mysql-utilities.
ReplyDeleteI have tested the tool up to about 350k records. If you have a larger dataset, please post your experiences with the speed of the utility.
ReplyDeleteThe lock is held only for the first phase of the operation which calculates checksums for each row. Once that is done, the locks are released.
Currently the utility does not synchronize a master and slave, but that would be a cool feature to include I will put that on the wish list.
Printing run times can be done - but I wonder if you (Gabi D) would like a progress indicator as well. I know I feel uncomfortable watching a blank command line cursor blink seemingly forever wondering if something is really happening. Would this be helpful too?
ReplyDeleteYes, for comparisons with many rows differing the output could get messy. In those cases it might be a good idea to pipe the output to a file to examine later. Or would a write-to-file or limit output feature be more useful?
How to run this tool Chuck on a Linux machine which has mysql.
ReplyDeleteOnce you install Connector/Python then MySQL Utilities, you should be able to simply open a terminal and enter the command mysqldbcompare [...] (with appropriate options specified).
ReplyDeleteIf you need a more specific example, please post it so I can comment.
C/Py: http://dev.mysql.com/downloads/connector/python/
Utilities: http://dev.mysql.com/downloads/tools/utilities/
This comment has been removed by the author.
ReplyDeleteHello Chuck,
ReplyDeletebut if my database has password?
#imNewbie
Check the documentation. You provide it on the command line with --server* options. Let's say your password is SECRET:
Deletemysqldbcompare --server1=root:SECRET@localhost --server2=root:SECRET@backup_host:3310 inventory:inventory --run-all-tests
I'm so grateful man!
DeleteI found together a friend this answer.
I confess that I also lost me with term "inventory:inventory", but now all was resolved.
Thanks for All.
I tried to use mysqldbcompare but returns
ReplyDeleteERROR: Cannot connect to the server2 server.
Error Authentication with old (insecure) passwords is not supported. For more information, lookup Password Hashing in the latest MySQL manual
What can I do?
What version of MySQL are you using? What version of MySQL Utilities are you using? Can you paste the actual command so that I can see how you are attempting to connect (you can mask out the user id/passwd)?
DeleteFinally, is your MySQL server setup to use the default login protocol? For example, can you connect using the mysql client like this:
> mysql -u -p
Thanks for your quick reply
DeleteServer1 is my localhost with MySQL 5.1, Server2 is my hosting with MySQL 5.0
I use HeidiSQL to connect them, but it hasnt the db compare feature, so I downloaded mysql utilities 1.3.5
This is my command
mysqldbcompare --server1=root:secret@localhost --server2=cyberese_2rsv3:secret@server2 cyberese_2rsv3:cyberese_2rsv3
Thanks for your help
Ok, it is clear there is nothing wrong with how you provide the parameters and the server(s) are sufficiently new(ish). The problem is on your server side.
DeletePlease refer to the MySQL Manual to disable old-style logins:
http://dev.mysql.com/doc/refman/5.1/en/password-hashing.html
http://dev.mysql.com/doc/refman/5.1/en/old-client.html
pompeyomex:
ReplyDeleteHi I had a similar problem - I found that old-style logins were switched off, but since I had migrated an earlier (pre 4.1) mysql installation, some of the passwords were still in the old (15 character) hash format, rather than the newer (41 char). By resetting the password for the user this was created in the new format. Unfortunately this still doesn't help since the next error message says:
The server2 version is incompatible. Utility requires version 5.1.30 or higher.
... stumped. I now need to upgrade the older installation to 5.1 in order to use this...
Hope you got there in the end... :-)
Hello. We have that restriction (server version) for several reasons but mainly it is because there is some feature the utility is exploiting that is only available in a certain version and later.
DeleteFor example, the 5.1.30 requirement for mysqldbcompare is because it uses CREATE TABLE ... SELECT which was improved in 5.1.30 to fix a bug for binary logged tables.
Chuck, you mentioned:
ReplyDelete"One really cool feature would be if the utility generate SQL statements for synchronizing the data and objects. Would this be something you would want to see incorporated?"
Yes! That would be absolutely fantastic - I know this post is really old but I'm currently experimenting with putting a DB under version control so looking for ways to produce an SQL diff between two versions of a DB. Right now this tool is almost perfect, and all that would be needed is to be able to output a file of the diff in SQL format (ideally with data and schema separately) which I can then add to the db migration tool I'm using (https://github.com/guilhermechapiewski/simple-db-migrate)!!
Could you please please please add this? :D?
I just realised that you did already incorporate this with the "--difftype=sql" parameter! Right now I'm the happiest man alive!!! :D
DeleteThanks for this great introduction of mysqldbcompare, just what I needed to get started!
ReplyDeleteI'm having trouble getting this to work. I'm comparing two databases on the same MAMP server. My command is:
ReplyDeletemysqldbcompare --server1=root:secret@localhost:8888 sim_ebus:sim_mamp
I get the warning about using a password on the command line, and then it times out after a minute or two. From the examples I've seen, there should be a message about connecting to the server, but it never appears.
I've not seen this issue. You can try specifying the --server1 and --server2 options even if they are the same.
DeleteDo the tables have lots of data? It is possible the utility is still running trying to build checksums for the rows (that's how it works) and has known to take a while for large tables.
Also, use verbosity (-vvv) to see if there are any additional statements.
Finally, use mysqldiff instead to just check the structures.
If you are still unable to get it to work, please file a bug on bugs.mysql.com with as much detail as you can including any output, structure of the tables, and sample data (if possible).
Thanks for the suggestions.
DeleteSpecifying server2 didn't make any difference, and neither did the -vvv flag. mysqldiff wouldn't work either.
My database has 53 tables with a total of 158K records. Would that be large enough to cause a timeout?
Ok, that is strange. mysqdiff has nothing in it which can cause a hang like that.
DeleteLet's take this off the blog so we can communicate in more detail.
Please email me the version number of Utilities you are using, the version of MySQL, and the version of Connector/Python and Python along with an example of your execution (command).
Also, do any of the other utilities work? Are you using Utilities via Workbench or standalone?
Send this to my work email.
c h u c k dot b e l l at o r a c l e dot c o m
Chuck,
DeleteI’m sure I’m making some obvious mistake, since my experience with the MySQL Workbench and Utilities is zero. I’ve always used phpMyAdmin to manage databases, but it doesn’t have a built-in function for comparing databases.
I’ve wasted enough time trying to get mysqlbdcompare to work. I’ll go back to creating exports from phpMyAdmin and then comparing them in Text Wrangler.
Thanks for taking an interest in my problem.
Lorin
Possibly I'm missing a trick here, but I can't get mysqldbcompare to CREATE TABLEs (using the sql difftype) when required.
ReplyDeleteI'm putting a db under version control and am hoping to automate it, but I'm kinda stuck if mysqldbcompare won't pick up new tables.
$ mysqldbcompare --quiet --compact --difftype=sql --run-all-tests -vv --server1=script:password@localhost:3306 a:b
[..]
# WARNING: Objects in server1.a but not in server1.b
# TABLE: foo
Hi. The utilities only identifies objects that are not in both. It doesn't currently dump the CREATE for missing objects. But that would be a nice feature. I invite you to open a feature request on bugs.mysql.com. :)
DeleteThanks, I was vaugely holding out for some undocumented option - oh well.
Deletehttp://bugs.mysql.com/bug.php?id=80593
Hi,
ReplyDeleteI ran this utility on my 1TB data. the process got struck but process is running when it reached to to my 200GB table which has 18Million rows and in utf8 format. And there is no error.
Checked the tmpdir space and has disk space. Not understanding why the process got hanged.
Hi. Sorry you are having trouble. Can you tell me which version of MySQL Utilities, the MySQL server, and possibly the schema of your table in question? You could try narrowing the scope by creating copies of the table partitioned into smaller chunks. That may help determine what row(s) are causing the problem.
ReplyDeleteThat said, with large data the utility can sometimes appear to be hung but is actually processing too many rows in memory. We don't have a solution for this yet, but it can become memory bound for large tables, especially if the primary keys are large, have wide distribution, or contain blob fields.
If you would like, we can continue the conversation via email.
c h u c k dot b e l l at o r a c l e dot c o m
I Hi Chuck,
ReplyDeleteI was looking at your blog because I am searching a solution for my problem when I do mysqldbcompare to a big database MySQL goes down and I have to start the service again. any succession.
NOTE: I am not a DB ADMIN :)
I'm seeing an odd result. My compare is telling me that three tables are not present in the target when they definitely are there. this makes me concerned about what else it might be getting wrong.
ReplyDeleteHi, Can you send me more details about the problem you’re seeing? Feel free to contact me at c h u c k dot b e l l at o r a c l e dot c o m.
DeleteNice & useful information...thanks
ReplyDelete-Satya
http://satya-dba.blogspot.com/2018/11/mysql-mysqldbcompare-compare.html