Tag Archives: tech

Data mirroring with DRBD

A Rackspace engineer asked me for some info on DRBD the other day.  We are heavy DRBD users.  Below is a summary of what I told him.  If you find this information useful, comment below…

We use DRBD to mirror our mail data between pairs of servers.  You can mirror any file system on top of DRBD.  We choose to use ReiserFS because it is optimal for handling large numbers of small files, which our maildir directory structures contain.  Any given server of ours has between 1 million and 2 million files.

On each pair of servers we have two separate DRBD mirrors going in opposite directions.  Only one server can mount a DRBD partition at a time.  The secondary server cannot even mount the DRBD partition as read-only while the primary has it mounted.  On “a” servers /dev/sda is mounted as primary and mirrored to /dev/sda on the “b” server.  On “b” servers /dev/sdb is mounted as primary and mirrored to /dev/sdb on the “a” server.

We use heartbeat to manage two virtual IP addresses, one owned by “a” and one owned by “b”, and also to manage the DRBD primary/secondary status and mounts.  When heartbeat detects a failure, the remaining good server takes over the dead server’s virtual IP address and DRBD mount, and is capable of serving mail access for the users whose data was on the failed server.
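The takeover described above can be modeled in a few lines of Python.  This is only an illustrative sketch, not heartbeat configuration: the `Node` class, the `take_over` function, the IP addresses and the mount labels are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One server in a pair, with the resources it currently serves."""
    name: str
    alive: bool = True
    # Each resource is a (virtual IP, DRBD mount) pair; the values are made up.
    resources: list = field(default_factory=list)


def take_over(failed: Node, survivor: Node) -> None:
    """Move everything the failed node was serving to its surviving peer."""
    failed.alive = False
    survivor.resources.extend(failed.resources)
    failed.resources = []


# Hypothetical addresses and mount labels, for illustration only.
a = Node("a", resources=[("192.0.2.10", "drbd-a")])
b = Node("b", resources=[("192.0.2.11", "drbd-b")])

take_over(failed=a, survivor=b)
print(b.resources)
```

After the call, “b” holds both virtual IPs and both mounts, which is the state in which it serves mail for the users of both servers.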

DRBD lessons learned:

– Create a dedicated partition for DRBD’s “meta-disk” and put it on a separate drive from the DRBD partitions themselves (we allocate 1 GB).  This should improve performance in theory, and at a minimum ensures that space is always dedicated to the meta-disk.  At first we did not allocate any space to meta-disks and DRBD defaulted to using the last 128 MB of its disk.  We believe that a full disk led to data corruption in at least one instance when running this configuration.

– Change the incon-degr-cmd setting to: "echo '!DRBD! pri on incon-degr' | wall ; exit 1".  By default it halts your system when DRBD thinks it is in a degraded state.  Since our servers are at Rackspace and we have no console access, halting is a major annoyance.

– When a server goes offline and then recovers, DRBD attempts to automatically reconnect the two and resync the data.  We have seen several cases where DRBD makes an incorrect decision about which server is primary, and the data sync occurs in the wrong direction, losing new data.  The DRBD authors have fixed several bugs related to this, but even with version 0.7.21 we have still seen this occur.  To work around this, we have configured heartbeat to handle failover, but not failback.  It requires manual intervention to reconnect the two servers and get them syncing.  As long as the engineer knows what they are doing, they can get it syncing correctly.

– DRBD is complicated software.  Try to keep everything else around it simple in order to quickly troubleshoot problems.  For example, we used to run a local RAID 0 underneath DRBD in order to gain an I/O boost.  Don’t.  It is better to just run another instance of DRBD on the additional disk and partition your data between the independent DRBD mirrors.

– Again, DRBD is complicated.  If there are simpler alternatives, I would recommend exploring them.  For instance, a simple rsync script will do great in many situations, and csync2 is a good choice for multi-server synchronization of a relatively small number of files.  Both are easy to troubleshoot when things break because they run on top of any normal file system, whereas DRBD runs underneath the file system.  It is difficult to troubleshoot and fix problems with software that runs underneath the file system.
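For the rsync alternative, a minimal sketch could look like the following.  It is an assumption about what a “simple rsync script” might do, not a script taken from production: the path and host name are placeholders, and the helper names are invented.  The `-a` (archive) and `--delete` (remove files at the destination that no longer exist at the source) options are standard rsync options.

```python
import subprocess


def build_rsync_command(src_dir, dest_host, dest_dir):
    """Build an rsync command that mirrors src_dir to dest_host:dest_dir."""
    # A trailing slash on the source means "copy the contents of the directory".
    src = src_dir.rstrip("/") + "/"
    return ["rsync", "-a", "--delete", src, f"{dest_host}:{dest_dir}"]


def mirror(src_dir, dest_host, dest_dir):
    """Run the mirror once; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(src_dir, dest_host, dest_dir), check=True)


# Placeholder path and host name, for illustration only.
print(build_rsync_command("/var/mail", "server-b.example.com", "/var/mail"))
```

Run from cron, something like `mirror()` gives periodic one-way copies rather than DRBD’s continuous block-level mirroring, so it only fits cases where losing the changes made since the last run is acceptable.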

Reply from Webster

In response to my previous post (and the link I emailed them), I received the email below today.  Hopefully we’ll see “prepend” added to the unabridged dictionary sometime this decade…

Dear Bill:

Thanks for your letter.  As you may know, we enter words in our dictionaries based on their use in current printed and edited sources.  A word is only entered in our dictionaries when it meets three criteria:  widespread usage in well-read publications; established usage over a certain period of time; and an easily discernable definition. For this sense of “prepend” to be entered, then, it will need to appear in a number of well-read print sources for a good number of years.

I did a quick check of our citational files, which house upwards of 17 million citations of words in context, and while we have some evidence of this sense of “prepend,” most of it is highly technical, which does not make it a good candidate for entry into an abridged dictionary like the Online Dictionary.  It may be a candidate for entry into the unabridged _Webster’s Third New International Dictionary_, however.  We will consider it when we next revise that hefty tome.

For more information on how a word is entered into our dictionaries, visit http://www.merriam-webster.com/help/faq/words_in.htm, and if you have any further questions or comments, please contact us again.

Sincerely,
Kory Stamper, Associate Editor
Merriam-Webster, Inc.

An open letter to Merriam-Webster

Dr. Frederick C. Mish
Editor-in-Chief
Merriam-Webster, Inc.
47 Federal Street
Springfield, MA 01102

Dear Dr. Mish,

Please consider adding the word "prepend" to the Merriam-Webster dictionary.

As you may be aware, folks in the technology world use this word quite frequently in situations where you "append" to the beginning of something.  Prepend’s meaning is synonymous with the definition of the word "prefix"; however, prepend is more commonly used by technology professionals.

The word is used within several popular software applications such as Postfix, as well as hardware appliances such as Cisco routers.  Googling for prepend returns several thousand examples of its use.

The frequency with which the word "prepend" is used today should justify consideration for inclusion in your dictionary.  Even though we are techies, we do not like speaking broken English.

Regards,

Bill Boebel
Chief Technology Officer
Webmail.us, Inc.

epoll() vs poll()

SERVER A:

Cpu(s):  7.3% us,  5.6% sy,  0.0% ni, 81.5% id,  0.0% wa,  5.6% hi,  0.0% si

SERVER B:

CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
            total    5.8%    0.0%   71.5%   1.9%     9.8%    0.0%   10.7%

SERVER A is one of our newer POP3/IMAP proxy servers, running Red Hat ES4 (2.6.9 kernel) with Dovecot compiled to use epoll() to handle network events.  SERVER B is one of our older POP3/IMAP proxy servers, running Red Hat ES3 (2.4.21 kernel) with Dovecot compiled to use poll() to handle network events.

Each server is handling an equal number of POP3, POP3S, IMAP and IMAPS connections.  Several thousand connections total.  Dovecot is handling all of the connections using one process per port, so 4 processes total.

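Notice that SERVER A is mostly idle, while SERVER B is spending 71.5% of its CPU in system time.  The epoll() and poll() man pages explain why this occurs: in short, every poll() call makes the kernel check the full set of descriptors the process is watching, while epoll() keeps that set in the kernel and returns only the descriptors that are ready.  This is why we are in the process of replacing all of our proxy servers with ES4 + epoll().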
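The difference between the two interfaces can be seen from Python’s standard `select` module.  The sketch below is not Dovecot’s code and does not measure performance; it only shows the two calls side by side, using local socket pairs as stand-ins for client connections.  The helper names and the connection count are invented for the example, and `select.epoll` exists only on Linux.

```python
import select
import socket


def make_connections(count):
    """Create `count` connected socket pairs to stand in for client connections."""
    return [socket.socketpair() for _ in range(count)]


def ready_with_poll(servers, timeout_ms=1000):
    """Register every socket with poll() and return the fds that are readable."""
    poller = select.poll()
    for sock in servers:
        poller.register(sock.fileno(), select.POLLIN)
    # Each poll() call hands the whole registered set to the kernel to check.
    return [fd for fd, _event in poller.poll(timeout_ms)]


def ready_with_epoll(servers, timeout_s=1.0):
    """Same check with epoll() (Linux only)."""
    ep = select.epoll()
    try:
        for sock in servers:
            ep.register(sock.fileno(), select.EPOLLIN)
        # The interest list lives in the kernel; only ready fds come back.
        return [fd for fd, _event in ep.poll(timeout_s)]
    finally:
        ep.close()


pairs = make_connections(100)
servers = [server for server, _client in pairs]
pairs[42][1].sendall(b"USER bill\r\n")  # only one "client" speaks

print(ready_with_poll(servers) == [servers[42].fileno()])
if hasattr(select, "epoll"):
    print(ready_with_epoll(servers) == [servers[42].fileno()])
```

A real server would register each connection with epoll once and then call `poll()` on the same object in a loop; the registration is repeated inside the function here only to keep the example short.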

Just got the VML patch

Nice.  I just got my IE VML patch.  I didn’t expect this for another two weeks.  Way to go Microsoft!  The next update wasn’t supposed to occur until October 10th.

I hope moving forward they continue to release critical updates when they are needed, rather than just during their normal once per month predictable cycle.  Save the less critical updates for the monthly download.

(I’ll still stick with Firefox though)

Previous post with HTML chars replaced

Apparently my previous post doesn’t show up well in some RSS readers, including Webmail.us’s reader.  The regexp rules are rewritten below, with the less-than HTML bracket characters replaced with LT.

/etc/postfix/body_checks.regexp:
  /LTv:rect/       REPLACE safety: MS VML tag removed
  /LTv:fill/       REPLACE safety: MS VML tag removed
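
To see what these rules do to a message, the sketch below emulates them in Python with the LT placeholder turned back into a real less-than character.  It is an approximation for illustration, not Postfix code: `apply_body_checks` is a made-up helper, and it assumes Postfix’s documented behavior that body checks are applied one line at a time, that regexp tables match case-insensitively by default, and that REPLACE swaps the whole matching line for the given text.

```python
import re

LT = "<"  # the rules above spell this character out as LT

# (pattern, replacement text) pairs mirroring the two rules above.
RULES = [
    (re.compile(LT + "v:rect", re.IGNORECASE), "safety: MS VML tag removed"),
    (re.compile(LT + "v:fill", re.IGNORECASE), "safety: MS VML tag removed"),
]


def apply_body_checks(body):
    """Replace every body line that matches a rule with that rule's text."""
    out = []
    for line in body.splitlines():
        for pattern, replacement in RULES:
            if pattern.search(line):
                line = replacement
                break  # the first matching rule wins for this line
        out.append(line)
    return "\n".join(out)


sample = "Hello\n" + LT + "v:rect style='width:10px'>\nBye"
print(apply_body_checks(sample))
```

Lines that do not contain either tag pass through unchanged, so ordinary mail is not affected.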