Data mirroring with DRBD

A Rackspace engineer asked me for some info on DRBD the other day.  We are heavy DRBD users.  Below is a summary of what I told him.  If you find this information useful, comment below…

We use DRBD to mirror our mail data between pairs of servers.  You can mirror any file system on top of DRBD.  We choose to use ReiserFS because it is optimal for handling large numbers of small files, which our maildir directory structures contain.  Any given server of ours has between 1 million and 2 million files.

On each pair of servers we have two separate DRBD mirrors running in opposite directions.  Only one server can mount a DRBD partition at a time.  The secondary server cannot even mount the DRBD partition read-only while the primary has it mounted.  On “a” servers /dev/sda is mounted as primary and mirrored to /dev/sda on the “b” server.  On “b” servers /dev/sdb is mounted as primary and mirrored to /dev/sdb on the “a” server.
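
As a rough illustration of the opposite-direction layout described above, a drbd.conf for one pair might look like the sketch below.  It uses 0.7-style syntax.  The host names (mail1a, mail1b), IP addresses, ports, resource names and the /dev/sdc1 meta-disk partition are placeholders assumed for the example, not our production values.  Which node is primary for each resource is not set in this file; heartbeat decides that at run time.

```
# Sketch only: host names, addresses, ports and /dev/sdc1 are assumed placeholders.

# r0: normally primary on the "a" server, mirrored to "b"
resource r0 {
  protocol C;
  on mail1a {
    device    /dev/drbd0;
    disk      /dev/sda;
    address   10.0.0.1:7788;
    meta-disk /dev/sdc1[0];
  }
  on mail1b {
    device    /dev/drbd0;
    disk      /dev/sda;
    address   10.0.0.2:7788;
    meta-disk /dev/sdc1[0];
  }
}

# r1: normally primary on the "b" server, mirrored to "a"
resource r1 {
  protocol C;
  on mail1a {
    device    /dev/drbd1;
    disk      /dev/sdb;
    address   10.0.0.1:7789;
    meta-disk /dev/sdc1[1];
  }
  on mail1b {
    device    /dev/drbd1;
    disk      /dev/sdb;
    address   10.0.0.2:7789;
    meta-disk /dev/sdc1[1];
  }
}
```

Each resource gets its own DRBD device and TCP port, so the two mirrors stay independent and either one can fail over without touching the other.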

We use heartbeat to manage two virtual IP addresses, one owned by “a” and one owned by “b”, and also to manage the DRBD primary/secondary status and mounts.  When heartbeat detects a failure, the remaining good server takes over the dead server’s virtual IP address and DRBD mount, and can then serve mail for the users whose data was on the failed server.

DRBD lessons learned:

– Create a dedicated partition for DRBD’s “meta-disk” and put it on a separate drive from the DRBD partitions themselves (we allocate 1 GB).  This should improve performance in theory, and at a minimum ensures that space is always dedicated to the meta-disk.  At first we did not allocate any space to meta-disks and DRBD defaulted to using the last 128 MB of its disk.  We believe that a full disk led to data corruption in at least one instance when running this configuration.

– Change the incon-degr-cmd setting to: "echo '!DRBD! pri on incon-degr' | wall ; exit 1".  By default it halts your system when DRBD thinks it is in a degraded state.  Since our servers are at Rackspace and we have no console access, halting is a major annoyance.

– When a server goes offline and then recovers, DRBD attempts to automatically reconnect the two and resync the data.  We have seen several cases where DRBD makes an incorrect decision about which server is primary, and the data sync occurs in the wrong direction, losing new data.  The DRBD authors have fixed several bugs related to this, but even with version 0.7.21 we have still seen this occur.  To work around this, we have configured heartbeat to handle failover, but not failback.  It requires manual intervention to reconnect the two servers and get them syncing.  As long as the engineer knows what they are doing, they can get it syncing correctly.

– DRBD is complicated software.  Try to keep everything else around it simple so that you can troubleshoot problems quickly.  For example, we used to run a local RAID 0 underneath DRBD in order to gain an I/O boost.  Don’t.  It is better to just run another instance of DRBD on the additional disk and partition your data between the independent DRBD mirrors.

– Again, DRBD is complicated.  If there are simpler alternatives, I would recommend exploring them.  For instance, a simple rsync script will do great in many situations, and csync2 is a good choice for multi-server synchronization of a relatively small number of files.  Both are easy to troubleshoot when things break because they run on top of any normal file system, whereas DRBD runs underneath the file system, where problems are much harder to diagnose and fix.
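
To make the meta-disk and incon-degr-cmd lessons above concrete, here is a minimal drbd.conf fragment, assuming the 0.7 syntax for external meta-disks.  The host name, address and the /dev/sdc1 partition are hypothetical placeholders, and the peer’s section is left out.

```
# Sketch only: mail1a, 10.0.0.1 and /dev/sdc1 are assumed placeholders.
resource r0 {
  protocol C;

  # Warn every logged-in user and return a non-zero status
  # instead of halting the machine.
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; exit 1";

  on mail1a {
    device    /dev/drbd0;
    disk      /dev/sda;
    address   10.0.0.1:7788;
    # Dedicated meta-disk partition on a separate drive, rather than
    # "internal", which takes the space from the end of the data disk.
    # [0] is the index of this resource's slot within that partition.
    meta-disk /dev/sdc1[0];
  }

  # ... matching "on" section for the peer host goes here
}
```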

11 thoughts on “Data mirroring with DRBD”

  1. Rayed

    Hi Bill,
    Thank you for the informative post.
    I am a FreeBSD fan, so DRBD isn’t for me, but I found geom ggate[cd], which can do the same.
    But I wonder why did you go with network block device replication instead of using SAN, any specific advantages?

  2. Bill Boebel

    SAN was an option that we looked at. However, we found that we could get the cost per GB lower by using lots of commodity SATA drives, provided we could find a way to make the overall system reliable despite this failure-prone hardware. DRBD + heartbeat helped us achieve that, along with other software that we wrote to manage lots of pairs of DRBD boxes. So the answer is cost.

  3. sun

    Hi,
    I am actually using DRBD for my cluster replication, but DRBD only syncs when I start it and then stays idle. Nothing happens, even though I have created loads of files on the partitions.
    The following is my DRBD.CONF..
    Please advise
    global {
        # we want to be able to use up to 2 drbd devices
        minor-count 2;
        dialog-refresh 1; # 1 second
    }
    resource r0 {
        protocol C;
        incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 2 ; halt -f";
        on drbd1 {
            device /dev/drbd0;
            disk /dev/sda2;
            address 192.168.1.69:7788;
            meta-disk internal;
        }
        on drbd2 {
            device /dev/drbd0;
            disk /dev/sda2;
            address 192.168.1.73:7788;
            meta-disk internal;
        }
        disk {
            on-io-error detach;
        }
        net {
            max-buffers 2048;
            ko-count 4;
            on-disconnect reconnect;
        }
        syncer {
            rate 10M;
            group 1;
            al-extents 257; # must be a prime number
        }
        startup {
            wfc-timeout 0;
            degr-wfc-timeout 10; # 10 seconds
        }
    }

  4. Schlika

    How do you manage backups ? To tape ? Using snapshots ?
    We are currently using drbd in a similar way for a high volume dating site, works very well !

  5. Eric

    Hi Bill,
    Do you mind posting your drbd config file? If it’s a security risk or whatever I understand but it would be helpful to me.

  6. Saqib Jang

    I’m new to DRBD. I’m wondering if it can be used for synchronous mirroring at the directory level, e.g. building an HA NFS config where the active node can fail over to one of multiple passive nodes. Specifically, is the performance such that it allows synchronous mirroring, and is directory-level mirroring granularity allowed?
