infrastructure

SAN Virtualization

Virtualization in the server farm relies heavily on storage, something I understand only to a certain level. This review by InfoStor discusses virtualizing the SAN itself to get the flexibility you really need to get the best value out of server virtualization. It's somewhat over my head at the moment, since I haven't gotten to this point yet, so I'm putting this link here for future reference.

Found via virtualization.info.

The hidden pitfalls of server virtualization

Where there's buzz, there's bullshit. Slashing hardware costs and the time I spend herding servers makes virtualization sound like a silver bullet, but I have no doubt it's not as easy to pull off as the salespeople tell me. I've spent some time researching and thinking it through, and have come up with a few things that I need to keep in mind when planning to get into virtualization properly.

Will I really spend less on hardware?

What's cool about virtualization?

How would you like to cut your annual server farm budget to a fraction of its current cost, reaping the glory and gold due to a corporate IT champion? Lop your data center to a third of its current size, no, a fifth, maybe less! At the same time, you can improve resiliency, flexibility, and potency!

The Virtual Buzz

Virtualization is all the buzz these days, especially for server farms. As my own collection of server hardware heads towards 20 boxes and is still hard pressed to handle all the tasks I need it to do, I'm finding the lure of the herd hard to ignore.

Whipping up a solid LDAP infrastructure

I've been much too quiet lately. I'm still hard at work putting together what I hope will be a very strong infrastructure for my company's application hosting operations, with about 15 servers for production, content management, and staging and testing.

One of the core components of this infrastructure is an OpenLDAP server, which I've been working on over the past week. Up until now it's been enough to have a couple of accounts which are created locally on all of the servers by puppet. I've got a chunk of disk space on a SAN which is shared across the machines, which is handy for having a common home area for key accounts I use to log in and administer the machines, as well as for the puppet templates and manifests.

The cool kids talk about operations

Tim O'Reilly, the boss of O'Reilly publishing and a key booster of the Web 2.0 meme, recently posted an article about operations.

One of the big ideas I have about Web 2.0 [is] that once we move to software as a service, everything we thought we knew about competitive advantage has to be rethought. Operations becomes the elephant in the room.

O'Reilly laments that most of the tools for deploying systems and applications on open source platforms (i.e. Linux) are not themselves open source. Luke Kanies and others have commented on the article with examples of open source deployment and operations management tools, including Puppet, and others I've mentioned for system configuration and network monitoring.

Network Monitoring Tools

This section has links and information about network monitoring tools. I've used Nagios, which is open source and pretty mature, a lot in the past 4 or 5 years. It is mainly for detecting and reporting problems, however, so it's useful to add something like Munin for tracking and graphing system resources and performance.

Another tool I'd like to try out for this is OpenNMS, which is written in Java, and includes the graphing as well as detection and reporting, and also auto-detects devices and services on a network.

Configuration management with Puppet

I've started tinkering with puppet for configuration management. It's a far more flexible and extensible tool than cfengine, so it looks like the best way to go.

Its main drawback is lack of maturity. The documentation is fair, there's a decent reference, but there are only two examples of configuration files that I've seen so far, and neither one is very complex. It's also fairly buggy, although the author is quick to respond when told about specific problems.

I'll most likely be using Puppet to build a J2EE infrastructure based on Red Hat. I'd like to be able to contribute bug fixes, but I'm not sure how many spare cycles I'll have, given that I don't know Ruby. But hopefully I can at least contribute some example files, and some manifests related to Tomcat and general J2EE web application deployments.

Ready for disaster

There are a lot of things you can do to make sure that when disaster strikes, you can get back online. Even in environments where you don't have automatic failover, you can take some basic steps so that when you get the alert or the phone call, you can bring things back online.

Let's say you have a single server running a web application with a local database. However, you need to have a second server available. Maybe it's doing something else normally, maybe it's in a less than ideal location, like in your office at the end of a slower Net connection, but as long as you can fire up your application, repoint DNS, and be online, it'll do in a pinch.

First, make sure you have the base server software ready, so your web, application, and database software are installed.

Second, make sure you have a copy of your application code and configuration files handy. I always like to have these in source control, on a server other than my live one, so in the worst case I can pull them down to my emergency location.

Third, you need your live data, that is, your database contents. Take frequent dumps of the data and have them handy, again away from your live server. Do this outside your system backups: use your database tools such as mysqldump to dump a file, then zip it and ship it somewhere else. How frequently you do this depends on how often the data changes, and how important it is to have fresh data. In the most extreme case, you might have the database continually dumping a log to a shared file store, where the backup server is reading it in.
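The dump, zip, and ship step can be sketched in a few lines of Python. This is only a sketch under some assumptions: mysqldump is on the PATH and picks up its credentials from an option file, and the destination directory is storage that lives away from the live server (a mounted remote share, say). The function names, the file naming scheme, and the dump_cmd parameter are all made up for illustration.

```python
import gzip
import shutil
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_name(database, now=None):
    """Build a timestamped file name, e.g. 'appdb-20240101T020000Z.sql.gz'."""
    now = now or datetime.now(timezone.utc)
    return f"{database}-{now.strftime('%Y%m%dT%H%M%SZ')}.sql.gz"


def dump_and_ship(database, dest_dir, dump_cmd=None, now=None):
    """Dump one database, gzip it on the fly, and leave it in dest_dir.

    dest_dir is assumed to be off the live server. dump_cmd lets you swap
    in another dump tool; by default mysqldump is assumed to be on PATH.
    """
    cmd = dump_cmd or ["mysqldump", "--single-transaction", database]
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / dump_name(database, now)
    # Write under a temporary name so a half-written dump is never
    # mistaken for a good one.
    partial = target.with_name(target.name + ".part")

    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    with gzip.open(partial, "wb") as out:
        shutil.copyfileobj(proc.stdout, out)
    proc.stdout.close()
    if proc.wait() != 0:
        partial.unlink()
        raise RuntimeError(f"dump command failed: {cmd}")
    partial.rename(target)
    return target
```

Run from cron as often as the data warrants, something like dump_and_ship("appdb", "/mnt/offsite/dumps") leaves a dated, compressed dump where the emergency server can reach it. Both the database name and the path there are placeholders.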

A sticking point may be the DNS. You can change the DNS, but users will have the old DNS information cached. How long it takes for them to get the new IP address depends on the TTL you have set in your DNS configuration, and changing this to a lower value after the crash ain't gonna help. A two hour TTL is probably a good setting.
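To put numbers on that, here is a rough worst-case calculation. It assumes resolvers honor the TTL, so a resolver that cached the old record just before you changed it keeps serving it for one full TTL. The function names and the example times are made up for illustration.

```python
from datetime import datetime, timedelta


def last_stale_lookup(changed_at, ttl_seconds):
    """Latest moment a well-behaved resolver could still hand out the old IP."""
    return changed_at + timedelta(seconds=ttl_seconds)


def everyone_on_new_ip(outage_at, minutes_to_repoint, ttl_seconds):
    """Worst-case time at which all cached copies of the old record have expired."""
    changed_at = outage_at + timedelta(minutes=minutes_to_repoint)
    return last_stale_lookup(changed_at, ttl_seconds)


# Outage at 14:00, 15 minutes to bring up the spare and change DNS,
# two hour (7200 second) TTL.
outage = datetime(2024, 1, 1, 14, 0)
print(everyone_on_new_ip(outage, 15, 7200))  # 2024-01-01 16:15:00
```

With a one day TTL the same outage would leave some users pointed at the dead server until the following afternoon, which is why the TTL has to be chosen before anything breaks.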

Of course, better yet is if you have multiple servers behind a firewall and/or load balancer, so you don't need to change your DNS at all, just reconfigure and go. But if you're running a budget setup, these are simple steps to follow to make disasters a little less stressful.

cfengine alternatives

I've been working up a cfengine-based setup to manage a new server infrastructure. This will be my third cfengine-based infrastructure, so I should have learned enough to make a cleaner, tighter configuration. Unfortunately I'm still finding cfengine to be too damned awkward.

So, I'd like to put together a list of alternatives to cfengine. I'll add them to this page, and hopefully add on notes and reviews as I learn more. If you have experience with these or others, please add a comment.

  • Puppet seems to be an up and comer. It looks to be designed to be much more extensible than cfengine is. It also lets you make sure each host only sees its own configuration, which is one of my peeves about cfengine. It's my leading candidate at the moment.