
MySQL replication and the sync_binlog option

Recently I’ve been focusing on MySQL replication for a project at work. On this particular project, I’m acting in a Solutions Architect role and have been since about September of 2008.

Because of my background in systems administration, I tend to get myself into situations where I become the Schematic-side sys admin on projects. This involves things like deployment processes, getting development, staging, and production environments set up, and now, setting up MySQL replication. This is probably because a) I’m bad at delegating these things to others, b) I’m kinda’ good at it, and c) let’s be honest, I’m a control freak, so I like knowing the servers hosting my apps are set up in a meticulous manner.

In short, we’re running MySQL 5.0.45 on RedHat Enterprise Linux 5 (I know, I know…RedHat and MySQL 5.0…boo…but it’s okay). We’re required to replicate our production database to a secondary machine for backup purposes. This way, if our production server dies, we can manually fail over to the slave (once we enable writes to it, of course), then swap the two back once the production server is back up.

All in all, this site is rather low-traffic at around 20,000 dynamic page views per day. Factoring in US-based users in an 11-hour time period, that’s about 1,800 requests per hour, or right around 0.5 page views per second. We’ve got a single server in production that’s acting as our webserver and database server; it’s got a RAID array in it that was all set up by the group hosting the application (so I don’t know tons about it, but it’s new, quality hardware).
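Spelled out, the back-of-the-envelope traffic math looks like this (assuming all 20,000 daily page views land inside the 11-hour window):

```latex
\[
\frac{20{,}000\ \text{page views}}{11\ \text{hours}} \approx 1{,}818\ \text{per hour},
\qquad
\frac{1{,}818\ \text{per hour}}{3{,}600\ \text{seconds per hour}} \approx 0.5\ \text{per second}
\]
```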

I’m using my master database for reads and writes in Production. Again, the slave is really only required for live backup-type purposes.

Now, I’m no expert on MySQL replication, but I’ve learned a lot these past few weeks. So I’m going to share one big caveat here. Please correct me as you see fit!

MySQL’s got a sync_binlog configuration option. You typically set it in my.cnf, and its value is a non-negative integer (0 to n). This value determines how many binary log writes need to occur before MySQL syncs the binary log to disk. With it set to zero, MySQL never forces a sync, and your operating system just determines when the binary log is flushed to disk.
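As a rough illustration, the relevant bit of my.cnf might look like the fragment below. This is a minimal sketch, not my actual config: the log base name "mysql-bin" is an assumed example, and your file will have plenty of other settings around it.

```ini
[mysqld]
# Turn on binary logging; "mysql-bin" is an assumed base name for the log files.
log-bin     = mysql-bin

# 1 = sync the binary log to disk after every write to it (safest, most disk activity).
# 0 = never force a sync; the operating system decides when to flush.
sync_binlog = 1
```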

I have a database migration process that copies table structures and their data from PostgreSQL into MySQL, then basically migrates that data into the appropriate tables in the new MySQL instance. It involves the transformation of a lot of data. It’s a sizable, complete data set for an 8+ year old system that’s not the prettiest, best normalized data model in the world.

Per recommendations in High Performance MySQL, I had my sync_binlog value set to 1 in Production.

When I was performing a test migration to Production recently, the process took about 3 hours. Wow. Thanks, MySQL! It normally takes about 90 minutes in Staging, if that.

In digging around Google and MySQL.com, I found that a non-zero value for sync_binlog causes more disk seeks to flush the binary log to disk. The benefit of having it set to 1 is that every transaction is written to the binary log, which is then flushed to disk upon commit. Then, if your server happens to die, the last completed transaction will always be present in the binary log on disk, so you never have to worry about, say, your slaves missing a transaction when they replay the log. However, this results in a lot more disk activity on your master.

I set sync_binlog to 0 and re-ran my migration. It ran in 90 minutes, which cuts the run time by 50%! If you think about it, this makes sense: with sync_binlog at 0, MySQL no longer forces a disk seek and write for the binary log on every transaction. Hooray for numbers, right?
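To be precise about what that 50% means, taking “about 3 hours” as 180 minutes:

```latex
\[
\text{run time saved} = \frac{180 - 90}{180} = 50\%,
\qquad
\text{speedup} = \frac{180}{90} = 2\times
\]
```

So the migration takes half as long, which is the same as saying it gets through the data twice as fast.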

I’m willing to gamble the integrity of data on my slave for the 50% cut in run time. (Remind me of this post in 6 months when I’m kicking myself over this for some reason, okay?)

With no binary logging enabled (i.e. in our dev environment), this process takes about 20 minutes. This makes sense: far fewer disk writes during the process.

Another way to work around this would be to keep your binary logs on a physically separate disk. However, I don’t have that luxury at this point, so that’s not an option for me. If I had my druthers, this is how I’d handle the problem, but…no dice for now.

Anyways, my main point: if you are willing to gamble on every single transaction being replicated to your slave in the event of a crash, perhaps you can set sync_binlog to 0. If you’ve got a separate disk to devote to your binary log, by all means, set it to 1! Other concerns around this relate to battery-backed disk cache, which you can read a bit more about in Jeremy Cole’s post on MySQL replication. You can also see some handy benchmarks that compare MySQL with and without binary logging.

Finally, I’ll admit this is a bit of a knee-jerk reaction post. I’ve done a bunch of research on this, but it’s not all quite fleshed out in my mind yet. I get the whole cause and effect in theory, but I haven’t dug into MySQL source or other materials to really understand what’s going on behind the scenes.

MySQL replication is a tricky thing. It’s great when it works, but understand that there are overhead tradeoffs in using it! I’m sure I’ll learn more in the weeks and months following our launch, so I look forward to sharing more of my successes and/or pains on this. Comments, feedback, and flames such as “OMG, you’re so wrong Brian!” and “Brian is a n00b!” are welcome.

“Rickroll…” goes to print with php|architect!

One of my accomplishments during 2008 was preparing and presenting a new talk, “Rickroll To Go With WURFL, PHP, and Other Open Source Tools”. This talk focused on some of the challenges with delivering content to mobile device users, such as limited bandwidth, limited resources on the device, and varying device support for video and audio formats. It illustrated how to use tools such as the imagick extension and FFmpeg to deliver content and an experience that was optimized for mobile devices.

Or, another way to refer to it: if you were ever at a PHP conference and heard of some dude Rickrolling a group during his talk, that was me.

I gave this talk at Atlanta PHP, ZendCon 2008 and PHP Appalachia, and received positive reception and feedback in all cases.

Now, in trying to knock out my New Year’s resolutions, I am adapting this talk into an article for php|architect, which will be published around October 2009.

I’ve only got 3,000 words to work with, so I may not be able to address all three major content areas — images, video, and audio — but I’ll do what I can. The focus of the issue will be “graphic manipulation,” so focusing on the image portion may end up making the most sense given the word limitations. You don’t want to read more than 3,000 words worth of my ramblings anyway.

So, keep your eyes open for that issue later in the year! Also, special thanks to Elizabeth Naramore, Beth Tucker Long, and Marco Tabini for the opportunity to grace the pages of their fantastic publication with my words, ideas, and experience.

Also, another special thanks to my co-workers JP Crevoiserat and Joseph Jorgensen who worked with me on a project that inspired the talk and this article. And as always, a special thanks to Schematic for always supporting the community-related efforts of its employees.

Factory Method pattern; ATLPHP tonight!

I’ll be giving a “mini-talk” on the Factory Method pattern at Atlanta PHP this evening.

Who doesn’t love the Factory Method pattern, right? Good stuff. A link to a PDF of the slides is below. I struggled to come up with a cool example, but one related to cars was the best that I could do. You’ll find some PHP-specific example code in the slides, too.

Grab the slides (PDF)

For further reading on the Factory Method pattern and other classic design patterns, you can always grab a copy of Design Patterns: Elements of Reusable Object-Oriented Software. Enjoy!
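If you don’t have the slides handy, here’s a minimal sketch of the pattern with a car-flavored example. To be clear, this is not the code from the slides: every class and method name below is hypothetical and made up for illustration.

```php
<?php
// Product interface: what every car can do.
interface Car
{
    public function describe();
}

class Sedan implements Car
{
    public function describe()
    {
        return 'four-door sedan';
    }
}

class Coupe implements Car
{
    public function describe()
    {
        return 'two-door coupe';
    }
}

// Creator: declares the factory method and uses whatever it returns.
abstract class CarDealer
{
    // The factory method: each subclass decides which Car to build.
    abstract protected function createCar();

    public function deliver()
    {
        $car = $this->createCar();
        return 'Delivering a ' . $car->describe();
    }
}

class SedanDealer extends CarDealer
{
    protected function createCar()
    {
        return new Sedan();
    }
}

class CoupeDealer extends CarDealer
{
    protected function createCar()
    {
        return new Coupe();
    }
}

$dealer = new CoupeDealer();
echo $dealer->deliver(), "\n"; // Delivering a two-door coupe
```

The point is that deliver() never names a concrete class; swapping the dealer subclass swaps the kind of car that gets built.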

Seven Things

I’m getting in on this “Seven Things” deal by way of being “tagged” by Ben Ramsey. And yes, Ben, while I am technically your “boss,” we’re all still on vacation, so…that doesn’t count now. Or something. Cut it out.

So, here are my seven things, which have been rolling around my head all day. Here we go:

  • I’m a triplet. My other two brothers, Brad and Brent, live in Baltimore and Kansas City respectively. We were born one minute apart. Brad is also in the IT industry working for the government, while Brent is working for my Dad at Midwest Towers in estimating, sales, repairs, etc. for cooling towers.
  • I started coding in BASIC on an Apple IIe around 4th or 5th grade. I later expanded into GWBASIC on DOS, and took Pascal and other programming classes in high school. Around age 14, I released various shareware applications written (and compiled!) in QuickBASIC under the name Untitled Software. I received a handful of registrations for my “most popular” product, “Master Menu III,” which was a DOS-based application launcher (complete with VGA mode screensavers!). My most distant registration was from the Netherlands, so I just gave him a free license in order to not deal with currency exchange stuff.
  • While working at Best Buy from Summer 1997 to December 1998, I had the opportunity to wear the Idea Box costume during a holiday season. You can’t see them, but I’m wearing blue and yellow elf shoe covers over my shoes. I also couldn’t talk; I’d just have to wave and shake hands with people. My friends would come in the store, so I’d see them outside the mesh on the “B” in “Best” and break the no-talking rule. Occasionally I’d roam around the store being goofy, while trying to not knock things over. It was a great time! Normally, I worked in the Media Department, which sold music, movies, and software. Loved that job!
  • I moved from the suburbs of Kansas City, MO to New York City at the ripe old age of 19. I was offered a job with VitalAging (now BenefitsCheckUp.org) as a Systems Administrator. I had nothing to lose, so…why not? Then I stayed there seven years before moving to Atlanta. Brooklyn, represent!
  • After moving to the NYC area, I continued and completed my undergraduate education at Baruch College, CUNY by going to school in the mornings and evenings, while still working full-time. It was tough, but worth it. I got my degree in Computer Information Systems (more business-, accounting-, and management-focused), which has turned out to be very practical. I wish I had algorithms, compilers, and all of those other low-level Computer Science courses, but the practical things have served me very well.
  • I met my wife, found my job at Schematic, and found my last Brooklyn apartment on Craigslist (NYC).
  • I have a long family history of being a huge fan of Hooters (the restaurant). I’ve been going there since I was pretty young. Whenever we’d travel as a family, we’d stop at locations during our trip. My Dad actually collects the menus for all of the ones he visits (he must have 50 to 60 at this point). I once drove from Boston to NYC via Rhode Island just to be the first DeShong to visit a location in that state. And yes, I took a menu and mailed it to my Dad. I go for the food, though. I’ll even get takeout (when my wife refuses to go), which proves my devotion to the food.

And here are my lucky seven!

  • Robert Swarthout because I want to see if he’ll actually step up and write this blog post…and because he’s a co-worker of mine and a fellow Atlantan
  • Graham Christensen, one of the youngest dudes in PHP, for having just visited Georgia, the state I now call home
  • Shawn Stratton for being another fellow Atlantan, and for that time we randomly bumped into each other at the Envelope Building downtown
  • Brian Moon for being my PHP Appalachia 2008 “Team Haystacks” teammate, and a fellow southerner
  • Jason Sweat for attending my talk(s) at php|tek 2008, and having some awesome feedback. And for having an awesome last name?
  • Lars Strojny for providing some comments and feedback on my Zend_Log_Writer_Mail proposal
  • Nate Abele for being the CakePHP and OmniTI dude, and for that time a handful of us went drinking and to the Comedy Cellar in NYC

And, of course, the rules:

  • Link your original tagger(s), and list these rules on your blog.
  • Share seven facts about yourself in the post – some random, some weird.
  • Tag seven people at the end of your post by leaving their names and the links to their blogs.
  • Let them know they’ve been tagged by leaving a comment on their blogs and/or Twitter.

Converting lowercase_with_underscores to camelCase

I have a case where I’m retrieving an array of associative arrays from a database as shown below:

Array
(
    [0] => Array
        (
            [id] => 1
            [some_id] => 100001
            [some_company_name] => Foo
            [name] => Bar
            [created] => 2008-12-25 21:13:58
            [last_updated] => 2008-12-30 23:32:43
        )

)

…but I need to represent this structure in XML. I have a pre-defined standard in my XML request/response that the element names are to be camel cased.

Aliasing the field names at the query level is just…a hack. So, I needed a way to convert “foo_bar_baz” to “fooBarBaz” before adding the XML element.

To do this, you can use preg_replace_callback() as shown below, which uses a callback function on every match found.

class FooController
{
    public function barAction()
    {
        $dao = new Some_Dao();
        $rows = $dao->getWhateverRows();

        foreach ($rows as $row) {
            foreach ($row as $key => $value) {
                // $key is "foo_bar_baz"
                $key = preg_replace_callback(
                    '/_(\w)/',
                    array($this, '_convertToCamelCase'),
                    $key);

                // $key is now "fooBarBaz"
            }
        }
    }

    // Callback: gets the match for each "_x" and returns the captured letter as "X".
    private function _convertToCamelCase(array $matches)
    {
        return strtoupper($matches[1]);
    }
    }
}
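If you’d rather not tie this to a controller, the same idea can be sketched as a pair of plain functions. The function names here are hypothetical (they’re not from my project), and this version only matches lowercase letters after an underscore rather than any \w character.

```php
<?php
// Hypothetical helper: converts "foo_bar_baz" to "fooBarBaz".
function underscoreToCamelCase($name)
{
    // Replace each "_x" with "X" using a named callback function.
    return preg_replace_callback('/_([a-z])/', 'upperCaseMatch', $name);
}

// Callback: $matches[1] is the letter captured after the underscore.
function upperCaseMatch(array $matches)
{
    return strtoupper($matches[1]);
}

// Hypothetical helper: returns a copy of a row with camelCased keys.
function camelCaseKeys(array $row)
{
    $converted = array();
    foreach ($row as $key => $value) {
        $converted[underscoreToCamelCase($key)] = $value;
    }
    return $converted;
}

$row = array(
    'id' => 1,
    'some_company_name' => 'Foo',
    'last_updated' => '2008-12-30 23:32:43',
);
print_r(camelCaseKeys($row)); // keys become id, someCompanyName, lastUpdated
```

Keys without underscores, like "id", pass through untouched, so you can run every column name through it before adding the XML element.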

There is an e modifier that can be used with preg_replace(), but that requires the replacement string to be a valid string of PHP code. This forces the interpreter to parse the replacement string into PHP on each iteration, which can be quite inefficient. Instead, preg_replace_callback() uses a callback function, which only needs to be parsed once.

I suppose you could easily do the reverse, though I haven’t had a need to write that code yet. :) But…there you go! A little end-of-year tip from myself and the preg_replace() examples.
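For what it’s worth, here’s an untested-in-production sketch of what the reverse might look like. The function names are hypothetical, and it assumes the input starts with a lowercase letter; a leading capital (e.g. "FooBar") would come out with a leading underscore.

```php
<?php
// Hypothetical helper: converts "fooBarBaz" to "foo_bar_baz".
function camelCaseToUnderscore($name)
{
    // Replace each uppercase letter "X" with "_x" using a named callback.
    return preg_replace_callback('/[A-Z]/', 'underscoreMatch', $name);
}

// Callback: $matches[0] is the uppercase letter that was matched.
function underscoreMatch(array $matches)
{
    return '_' . strtolower($matches[0]);
}

echo camelCaseToUnderscore('fooBarBaz'), "\n"; // foo_bar_baz
```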