Category Archives: Scalability

Top Ten List + CoderFaire Atlanta 2013

Back in March, I gave a new talk at Atlanta PHP: “Top Ten List: PHP and Web Application Performance”. This talk is a culmination of my ~14 years of experience primarily as a web application developer, but also as a systems administrator / DevOps-type.  After working with PHP and web applications for so many years, I have amassed quite a few tricks for squeezing maximum performance out of web applications, PHP or otherwise.

I’ll be presenting it again at CoderFaire Atlanta on April 20, 2013.  CoderFaire is organized by a fantastic crew of Cal Evans, Kathy Evans, Chris Spruck, Kevin Roberts, and Jacques Woodcock, so it’s going to be a great event. I’ve never attended a CoderFaire event before, but I’ve only heard positive things. Because it’s not limited to a single technology platform, you’re sure to meet a wide array of technical minds from all different backgrounds. I’m sure we’ll all walk away with some fresh, new ideas from this diverse crowd.

At only $50 per ticket, you’re not going to find a better deal on a technical conference this year. Register now!

As a little teaser, here are the 10 guests introducing the topics.  Come on out for the juicy details. Be prepared to go home, sit down, and optimize some aspects of your web application, though!  See you there.

10. Elizabeth Naramore, GitHub: Tweak your realpath cache settings

9. Scott Rocher, Tonx Coffee:  Whenever possible, use offline processing

8. Matthew Turland, Synacor: Write efficient SQL queries

7. Scott Lively, 3SI Security Systems: Don’t execute queries in loops

6. Jed Lau & Maggie Nelson, Findery: Know what your application is doing


5. Robert Swarthout, ShootProof: Use gzip compression on responses

4. Ian Myers, Findery: Do not use .htaccess files

3. Ken Macke, RockIP Networks: Cache all the data that you can

2. Davey Shafik, EngineYard: Use a content delivery network

1. Ben Ramsey, Moontoast: Use APC and set apc.stat = 0

On to CrowdTwist

TL;DR: I’m making a move to CrowdTwist, a New York City-based startup providing social and loyalty services for some of the world’s biggest brands. I worked with their co-founder and CTO, Mike Montero, from 2001-2005 at Community Connect Inc., where I got my start as a developer in PHP and open source tools. I’m thrilled to be joining him and the CrowdTwist team on what’s sure to be an incredible adventure.

Back in 2001, I joined Community Connect Inc. (now Interactive One) as a Senior Network Support Specialist. I was an internal sysadmin, spending my time managing Linux-based file servers, development servers and things of that nature. I had been living in New York just a few short months.

CCI operated what were, at the time, some of the most highly-trafficked social networking sites on the Internet. This was before MySpace and Facebook, of course. BlackPlanet.com was one of the most highly-trafficked PHP-based sites on the Internet. We were doing an insane amount of traffic. Our applications had to perform well and scale. We had no choice.

Even as a sysadmin-type, I was surrounded by some incredibly talented developers, who were all working with PHP on Linux and Apache using Oracle. They were caching, using CDNs, and doing things that were still relatively new on the Web. I caught the development bug. I started writing code in my spare time, taking on little projects on the side. How could you not get totally infected in an intense, exciting environment such as this?

In early 2002, I received a call out of the blue from my boss (also our CTO), Mike Montero. On that call, he asked me if I’d be interested in moving to the CCI development team. There was only one way to answer: “YES!”

Thus, in early 2002, just about 10 years ago, I became a developer. The most lowly of the low — “Associate Software Developer.” Over the next three and a half years, I worked my way up to Technical Lead, learning a ridiculous amount from my colleagues. I had amassed this strong set of experience in systems administration and development. I was really growing my skills, and I loved every second of it. I worked many late nights and weekends…and it was an absolute thrill.

Our work was literally being seen by millions of users every day. We were building quality products, all on a home-grown internal framework of sorts. We were doing code reviews. We were writing unit tests. This was how software was built. I never knew any lifestyle but this — it was my first development gig! This was just the way things were done. This period of time really shaped my personal stance on how to build quality software that was both performant and scalable. I consider myself so very fortunate to have started with this level of experience. It’s what gave me such a strong base of experience as a software developer.

In mid-2005, I moved on from CCI and spent almost five years in an interactive agency, Schematic (now Possible Worldwide). During this time, I gained exposure to new, different technologies like Zend Framework and Memcached. This was my first foray into leading major technical projects for clients, but still rolling up my sleeves, diving into architecture and code. I was using my skills from CCI with PHP, sysadmin duties, and databases, and applying them to client work time and time again. I was working in a world that was very different from what I had known at CCI, but bringing so much of that experience forward with me. In mid-2007, we moved to Atlanta, where I stayed with Schematic.

After almost five years at Schematic, I moved on to Yahoo! for a little over a year. This allowed me to get back to my development roots, focusing solely on code and architecture. I had a great time.

In mid-2011, I made a move to Half Off Depot to build an internal development team and grow the technical side of the company as Lead Software Architect. Here, I’ve been using all of my skills: systems administration, PHP, MySQL administration, managerial duties, recruiting, and working with other departments, such as marketing and design.

Over the past eight months, I’ve made a huge impact at Half Off Depot in terms of stabilizing the application and its Production environment. I’ve branched out into using Git and GitHub, Capistrano and Amazon Web Services. I’ve also had the opportunity to continue sharpening my Objective-C and iOS development skills. Overall, Half Off Depot has challenged me, and I’ve enjoyed it. I’ve reaffirmed to myself that I’ve got a breadth and depth of skills, and that I’m still pretty sharp with all of them. It’s also reminded me how much I enjoy a startup environment.

But about a month ago, Mike Montero came calling again — this time, with an opportunity for me to join CrowdTwist, a New York City-based startup where he’s a co-founder and CTO. CrowdTwist is an emerging, unique player in the loyalty space. Think “platform as a service.” APIs, user-facing sites, large amounts of data. And an incredible team that’s tapping into this data to provide real value for their clients.

When someone you trust and respect comes calling and seeks you out, you listen and explore. And that’s exactly what I did. And let me tell you, the CrowdTwist team is INCREDIBLE. I could not be more excited for this career change, both for the opportunity to work with Mike once again and for the chance to work with all of the brilliant team members and their clients.

I’m in a unique position where I had almost five years of CCI-level experience, coupled with seven years of experience since then. Now I’m going back to work with Mike and the CrowdTwist team, where I’ll be able to bring my strong foundation from CCI, along with all that I’ve learned in the years after CCI. My career has come full circle with respect to the last decade.

I typically like to make a job change, then stay there for at least four years as I did with CCI and Schematic. However, this opportunity with CrowdTwist is so rare that I had to take it. To be with this caliber of talent in such a promising space where they’re truly a pioneer? You just don’t say “no” to that. Or if you do, you regret it in a few years when they’ve been wildly successful.

So, on March 12th, I’m joining CrowdTwist full-time. I’ll be working remotely from Atlanta, but traveling up to New York City from time to time. I’ll be focusing on a mix of back end development and architecture, systems administration, and helping the team continue building quality software.

To my Half Off Depot colleagues, it’s been incredible! We’ve done some great things together. I wish you all the best of luck. Also, this has easily been one of the best teams of technologists I’ve ever worked with. Thanks, guys.

To my future CrowdTwist colleagues, thanks for welcoming me! I’m so thrilled at the opportunity to join you. This is going to be an incredible ride. I’m ready to rock.

See you soon, CrowdTwist! And if you’ve read this far, thanks. :)

“Rickroll To Go…” ZendCon Session audio posted!

My ZendCon 2008 talk, “Rickroll To Go With WURFL, PHP, and Other Open Source Tools”, was just released at Zend DevZone as ZendCon Sessions episode #23!

If you’re just now finding my blog from there, welcome! And thanks to Eli White, Community Relations Manager for Zend, for selecting it for posting.

You can get all of the relevant info using the links below:

Slides and videos of the presentation materials
ZendCon Sessions page with audio
MP3 audio of the presentation
iTunes DevZone podcast

Enjoy, and thanks for listening! Find me on Twitter or email me if you’d like to discuss the materials.

MySQL replication and the sync_binlog option

Recently I’ve been focusing on MySQL replication for a project at work. On this particular project, I’m acting in a Solutions Architect role and have been since about September of 2008.

Because of my background in systems administration, I tend to get myself into situations where I become the Schematic-side sysadmin on projects. This involves things like deployment processes, getting development, staging, and production environments set up, and now, setting up MySQL replication. This is probably because a) I’m bad at delegating these things to others, b) I’m kinda’ good at it, and c) let’s be honest, I’m a control freak, so I like knowing the servers hosting my apps are set up in a meticulous manner.

In short, we’re running MySQL 5.0.45 on RedHat Enterprise Linux 5 (I know, I know…RedHat and MySQL 5.0…boo…but it’s okay). We’re required to replicate our production database to a secondary machine for backup purposes. This way, if our production server dies, we can manually fail over to the slave (once we enable writes to it, of course), then swap the two back once the production server is back up.

All in all, this site is rather low-traffic at around 20,000 dynamic page views per day. Factoring in US-based users in an 11-hour time period, that’s about 1,800 requests per hour, or right around 0.5 page views per second. We’ve got a single server in production that’s acting as our webserver and database server; it’s got a RAID array in it that was all set up by the group hosting the application (so I don’t know tons about it, but it’s new, quality hardware).
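For anyone who wants to check my back-of-the-napkin math, here’s a quick sketch in Python. The variable names are mine, and the 11-hour window is only my rough assumption about when US-based users are active.

```python
# Rough traffic estimate using the figures quoted above.
page_views_per_day = 20_000   # approximate dynamic page views per day
active_hours = 11             # assumed window in which US-based users are active

requests_per_hour = page_views_per_day / active_hours
requests_per_second = requests_per_hour / 3600  # 3600 seconds in an hour

print(f"{requests_per_hour:.0f} requests per hour")
print(f"{requests_per_second:.2f} requests per second")
```

That lands within rounding distance of the 1,800 per hour and 0.5 per second I quoted.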

I’m using my master database for reads and writes in Production. Again, the slave is really only required for live backup-type purposes.

Now, I’m no expert on MySQL replication, but I’ve learned a lot these past few weeks. So I’m going to share one big caveat here. Please correct me as you see fit!

MySQL’s got a sync_binlog configuration option. You typically set it in my.cnf, and its value is an integer from 0 to n. This value determines how many binary log writes need to occur before the binary log is flushed out of the buffer and onto disk. With it set to zero, your operating system just determines when the buffer is flushed to disk.

I have a database migration process that copies table structures and their data from PostgreSQL into MySQL, then basically migrates that data into the appropriate tables in the new MySQL instance. It involves the transformation of a lot of data. It’s a sizable, complete data set for an 8+ year old system that’s not the prettiest, best normalized data model in the world.

Per recommendations in High Performance MySQL, I had my sync_binlog value set to 1 in Production.

When I was performing a test migration to Production recently, the process took about 3 hours. Wow. Thanks, MySQL! It normally takes about 90 minutes in Staging, if that.

In digging around Google and MySQL.com, I found that a non-zero value for sync_binlog causes more disk seeks to flush the binary log to disk. The benefit of having it set to 1 is that every transaction is written to the binary log, which is then flushed to disk upon commit. Then, if your server happens to die, the last completed transaction will always be present in the binary log on disk, so you never have to worry about, say, missing a transaction replay on your slaves. However, this results in a lot more disk activity on your master.
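To make the tradeoff a bit more concrete, here’s a toy model in Python. It is not how MySQL is implemented; it assumes exactly one binary log write per committed transaction, and the function names are made up for illustration. All it does is count how many explicit syncs a given sync_binlog value would trigger, and how many writes would not yet have been explicitly synced if the server died right then.

```python
def explicit_syncs(commits: int, sync_binlog: int) -> int:
    """Count explicit binary log syncs in this toy model (one write per commit)."""
    if commits < 0 or sync_binlog < 0:
        raise ValueError("commits and sync_binlog must be non-negative")
    if sync_binlog == 0:
        return 0  # flushing is left entirely to the operating system
    return commits // sync_binlog


def unsynced_at_crash(commits: int, sync_binlog: int) -> int:
    """Writes not yet explicitly synced if the server died after `commits` commits."""
    if commits < 0 or sync_binlog < 0:
        raise ValueError("commits and sync_binlog must be non-negative")
    if sync_binlog == 0:
        return commits  # worst case in this model: nothing was explicitly synced
    return commits % sync_binlog


if __name__ == "__main__":
    commits = 1050  # hypothetical number of committed transactions
    for setting in (0, 1, 100):
        print(
            f"sync_binlog={setting}: "
            f"{explicit_syncs(commits, setting)} explicit syncs, "
            f"{unsynced_at_crash(commits, setting)} writes at risk"
        )
```

In this model, a setting of 1 leaves nothing at risk but pays for a sync on every commit, while 0 pays for no explicit syncs and leaves the exposure up to the operating system. Real-world behavior also depends on things like the storage engine and disk cache, which the model ignores.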

I set sync_binlog to 0 and re-ran my migration. It ran in 90 minutes, which is a 50% reduction in run time! If you do the math, this makes sense: it’s one less disk seek and write per transaction. Hooray for numbers, right?

I’m willing to gamble the integrity of data on my slave for the 50% reduction in run time. (Remind me of this post in 6 months when I’m kicking myself over this for some reason, okay?)

With no binary logging enabled (i.e. in our dev environment), this process takes about 20 minutes. This makes sense: far fewer disk writes during the process.
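Putting the three timings side by side, here’s a small Python sketch. The durations are the approximate figures from this post (about 3 hours, about 90 minutes, about 20 minutes), so treat the results as ballpark numbers; the helper names are mine.

```python
# Approximate migration durations in minutes, as quoted in this post.
runs = {
    "sync_binlog=1": 180,
    "sync_binlog=0": 90,
    "binary logging off": 20,
}


def time_saved_pct(before: float, after: float) -> float:
    """Percentage reduction in run time going from `before` to `after`."""
    return (before - after) / before * 100


def speedup(before: float, after: float) -> float:
    """How many times faster the `after` run is than the `before` run."""
    return before / after


baseline = runs["binary logging off"]
for label, minutes in runs.items():
    # Overhead is the extra time compared with running without a binary log.
    print(f"{label:<20} {minutes:>4} min  ({minutes - baseline} min over baseline)")

saved = time_saved_pct(runs["sync_binlog=1"], runs["sync_binlog=0"])
faster = speedup(runs["sync_binlog=1"], runs["sync_binlog=0"])
print(f"sync_binlog 1 -> 0: {saved:.0f}% less run time, {faster:.1f}x faster")
```

Note that these environments are not identical (Production, Staging, and dev), so the comparison is only a rough guide to how much of the run time is tied to the binary log.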

Another way to work around this would be to keep your binary logs on a physically separate disk. However, I don’t have that luxury at this point, so that’s not an option for me. If I had my druthers, this is how I’d handle the problem, but…no dice for now.

Anyways, my main point: if you are willing to gamble with every single transaction being replicated to your slave in the event of a crash, perhaps you can set sync_binlog to 0. If you’ve got a separate disk to devote to your binary log, by all means, set it to 1! Other concerns around this are related to battery-backed disk cache, which you can read a bit more about in Jeremy Cole’s post on MySQL replication. You can also see some handy benchmarks that compare MySQL with and without binary logging.

Finally, I’ll admit this is a bit of a knee-jerk reaction post. I’ve done a bunch of research on this, but it’s not all quite fleshed out in my mind yet. I get the whole cause and effect in theory, but I haven’t dug into MySQL source or other materials to really understand what’s going on behind the scenes.

MySQL replication is a tricky thing. It’s great when it works, but understand that there are overhead tradeoffs in using it! I’m sure I’ll learn more in the weeks and months following our launch, so I look forward to sharing more of my successes and/or pains on this. Comments, feedback, and flames such as “OMG, you’re so wrong Brian!” and “Brian is a n00b!” are welcome.

Slides: ZendCon 2008, “Rickroll To Go…”

It’s the first day of ZendCon 2008! I’m giving my new talk, “Rickroll To Go With WURFL, PHP, and Other Open Source Tools” today at 4:00 PM PST.

The slides are below in a variety of formats:

PDF (no transitions)
PDF (one transition per page)
Quicktime movie

If you’re at ZendCon and reading this, be sure to drop on by at 4:00 PM — it’s sure to be a ball. Enjoy!