Mustafa Uysal

I'm traveling light, it's au revoir…

Advanced Database Interactions with HyperDB

This is the continuation post of my previous post “Advanced Database classes for WordPress“. Please read that post before continue unless you are not familiar with advanced database classes for WP.

Well, it’s time to see some configurations and ideas about how Hyper-DB or relevant DB classes might help you to scale.

There are a bunch of examples and explanations about the configuration on the db-config.php file.

Master-Slave Different Machines

<?php
// MASTER machine
$wpdb->add_database(array(
'host' => DB_HOST,
'user' => DB_USER,
'password' => DB_PASSWORD,
'name' => DB_NAME,
'write' => 1,
'read' => 0,
));
// SLAVE 1
$wpdb->add_database(array(
'host' => '192.168.1.2',
'user' => DB_USER,
'password' => DB_PASSWORD,
'name' => DB_NAME,
'write' => 0,
'read' => 1,
'dataset' => 'global',
'timeout' => 0.4,
));
// SLAVE 2
$wpdb->add_database(array(
'host' => '192.168.1.3',
'user' => DB_USER,
'password' => DB_PASSWORD,
'name' => DB_NAME,
'write' => 0,
'read' => 1,
'dataset' => 'global',
'timeout' => 0.4,
));
view raw db-config.php hosted with ❤ by GitHub

Multisite Example

<?php
$wpdb->add_database( array(
'host' => 'global.db.example.com',
'user' => 'globaluser',
'password' => 'globalpassword',
'name' => 'globaldb',
) );
$wpdb->add_database( array(
'host' => 'blog.db.example.com',
'user' => 'bloguser',
'password' => 'blogpassword',
'name' => 'blogdb',
'dataset' => 'blog',
) );
$wpdb->add_callback( 'my_db_callback' );
// Multisite blog tables are "{$base_prefix}{$blog_id}_*"
function my_db_callback( $query, $wpdb ) {
if ( preg_match("/^{$wpdb->base_prefix}\d+_/i", $wpdb->table) ) {
return 'blog';
}
}
view raw db-config.php hosted with ❤ by GitHub

Multisite blog tables are something like “wp4_posts” and the example above says that “hey this table belongs to a blog, send queries to blog dataset”

You can add some logic there, let’s say first 1000 blogs use blogset1 1000-2000 use blogset2 and goes on… (this is a terrible idea, there are some logic that makes this process better)

Another use case; imagine you are hosting one of the enterprise websites on your multisite. (wp.com hosts TechCrunch). You might want to use separate dataset for enterprise customers. So, check the blog id, and return different dataset.

BuddyPress and HyperDB

<?php
$wpdb->add_database( array(
'host' => 'global.db.example.com',
'user' => 'globaluser',
'password' => 'globalpassword',
'name' => 'globaldb',
) );
$wpdb->add_database( array(
'host' => 'bp.db.example.com',
'user' => 'bpdbuser',
'password' => 'bpdbpassword',
'name' => 'bpdb',
'dataset' => 'bp',
) );
$wpdb->add_callback( 'my_db_callback' );
// depends on your bp prefix bp_core_get_table_prefix() probably not working here
function my_db_callback( $query, $wpdb ) {
if ( preg_match("/bp_/", $wpdb->table) ) {
return 'bp';
}
}
view raw db-config.php hosted with ❤ by GitHub

Let’s say you have a successful social network top on the BP. Something like this config will help you to split data between WP and BP.

REST API

<?php
// MASTER machine
$wpdb->add_database(array(
'host' => DB_HOST,
'user' => DB_USER,
'password' => DB_PASSWORD,
'name' => DB_NAME,
'write' => 1,
'read' => 0,
));
// REST READ-ONLY
$wpdb->add_database(array(
'host' => '192.168.1.9',
'user' => DB_USER,
'password' => DB_PASSWORD,
'name' => DB_NAME,
'write' => 0,
'read' => 1,
'dataset' => 'rest',
'timeout' => 0.4,
));
$wpdb->add_callback( 'my_db_callback' );
function my_db_callback( $query, $wpdb ) {
if ( defined( 'REST_REQUEST' ) && REST_REQUEST ){
return 'rest';
}
}
view raw db-config.php hosted with ❤ by GitHub

In this case, we are sharding REST request to a different dataset.

Use a table from another DB

<?php
$wpdb->add_database( array(
'host' => 'global.db.example.com',
'user' => 'globaluser',
'password' => 'globalpassword',
'name' => 'globaldb',
) );
$wpdb->add_database( array(
'host' => '192.168.1.5',
'user' => DB_USER,
'password' => DB_PASSWORD,
'name' => 'mycustomdatabase',
'write' => 1,
'read' => 1,
'dataset' => 'customdataset',
'timeout' => 0.5,
) );
$wpdb->add_table( 'customdataset', 'wp_my_custom_table' );
view raw db-config.php hosted with ❤ by GitHub

The configuration like above, you can use a totally different database from another dataset for a table. Let’s say you are holding a large amount of trivial data, you probably don’t want to keep your main dataset busy. (Of course, we can use callback there too, just want to show add_table works)

Conclusion

HyperDB is a battle-tested and production-grade abstraction layer of the database calls. It’s flexible enough to handle scaling problems out of the box. But careful when planning the sharding scenario across the multiple datasets especially on multisite. (remember Barry’s talk again)

I saw so many wrong assumptions when people try to use this plugin. This is not a normal plugin, as mentioned its description it’s an “advanced database class”. I hope these examples help.

Also, I highly recommended LudicrousDB instead of HyperDB for the new installations.

I just remembered old glories days of the Zend_Db_Table …Which also supports different RDBMS

4 responses

  1. keystone j andrade Avatar

    “let’s say first 1000 blogs use blogset1 1000-2000 use blogset2…”
    how would the function be for this case?

  2. […] to say, this is a hard problem to solve properly (see also here, here, here and here for a few references regarding database replication across different wordpress […]

  3. Mary M Avatar
    Mary M

    You said: “Let’s say you have a successful social network top on the BP. Something like this config will help you to split data between WP and BP.”

    Isn’t there a simple way to split the first 1000 buddypress users similar to your blog example., such as adding some function logic, let’s say first 1000 bp users use buddyset1 1000-2000 use buddyset2 and so on.

    “Multisite blog tables are something like “wp4_posts” and the example above says that “hey this table belongs to a blog, send queries to blog dataset”

    You can add some logic there, let’s say first 1000 blogs use blogset1 1000-2000 use blogset2 and goes on…”

    Here again same query as Andrade above, how would the function look like in such a case? Splitting inside buddypress is a bigger challenge than splitting between WP and BP, IMHO and its much more likely to have more than 1000 BP users than 1000 blog users. Your thoughts?

    Thanks in advance.

    1. mustafauysal Avatar
      mustafauysal

      Hey Mary,

      Isn’t there a simple way to split the first 1000 buddypress users similar to your blog example

      You can’t split tables per user (data integrity); the user’s table is a global table and it will not be created per site – even on a multisite installation.

      Here again same query as Andrade above, how would the function look like in such a case? Splitting inside buddypress is a bigger challenge than splitting between WP and BP, IMHO and its much more likely to have more than 1000 BP users than 1000 blog users. Your thoughts?

      1000 users do not have much meaning for an average database. (of course the amount of data stored in the tables/indexes and request count matter) On the other hand, storing 1000 blogs means 1000 * individual site tables (https://docs.wpvip.com/technical-references/multisite-database-structure/) and easily reaching thousands of tables with millions of records – which could be challenging at a certain level.

      Shortly, I don’t recommend “splitting inside BP”, perhaps a DBA can reply to that question better. I think 99.99% BP installation will never ever need such a thing. Overengineering is a curse. Here is what I would do in your situation:

      * Move BP tables to a separate machine and setup LudicrousDB
      * If the separate machine is not enough, add a slave configuration to send read requests to those server(s). (you can add nth slave as needed, just adjust the config)
      * Once your master BP machine starts struggling with the write request. Congratulations, your project must succeed and you can hire talented folks to take care of scaling problems, or you can pay more to cloud services for managed DB services. (at this stage, assuming you are next twitter)

      I hope this helps 🙂

      Cheers,

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.