The Harvard Business Review article, “Reverse Engineering Google’s Innovation Machine” (April 2008), describes how Google is built for innovation. I highly recommend reading the article, and will focus on just a couple of its ideas here:
- Budgeting for innovation
- Superior infrastructure built for growth
Budgeting for innovation
Here is a test of whether your organization values innovative activity: does it pay you for it? Google’s engineers are required to spend 20% of their time on a side project of their choosing. This means they take 20% of their time away from search & advertising, Google’s bread & butter, and dedicate it to risky, innovative activity.
Google’s organizational structure is effective at managing innovation. They understand that in the long term, it’s riskier to focus solely on search & advertising. Investment in innovative activity is a hedge against potential downturns in their core business and a likely generator of future growth.
Superior infrastructure built for growth
Growing applications, from development to production, is challenging. Growing applications easily and on demand is downright hard. Your web application environment needs to be easy to set up, easy to share with other devs, fault-tolerant, easy to launch for users to see, and easy to scale up to meet production demands. Try building these capabilities into your infrastructure, then allowing your devs to do it without needing a sysadmin.
There are new tools that let us approach this level of superior infrastructure, even if the result may not be as fast, as easy, or as scalable as Google’s. Tools like virtualization, reusable web services and code libraries, and a solid architecture can bring us closer to this goal.
An Internet forum is a web application for holding discussions and posting user-generated content. This functionality is found in several types of applications, including message boards and blogs that allow for posting of comments.
One approach to creating a message board and a blog web application is to abstract out the shared functionality. First, we have to define the functionality of each, then see where the overlap is. If the functionality differs greatly, then sharing a common web service won’t make sense. If there is a lot of shared functionality, or we design our applications with shared functionality in mind, then a web service component would result in an effective software design.
Let’s look at what might be common across blogging and message board applications:
- Posts, including subject, author, and meta-data (reply-to, last modified, date created, etc…)
- Post buckets, which could take the form of a forum or blog post and comments.
URIs:
- List buckets: GET /appbase
- List posts: GET /appbase/1
- Retrieve post: GET /appbase/1/2
Supported methods:
- GET
- HEAD
- PUT
- DELETE
Where have I seen something like this before??? Hmmm…
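To make the URI scheme above concrete, here is a minimal shell sketch that classifies a request by method and path. The function name route and the action labels it prints are made up for illustration, and "appbase" is just the placeholder prefix from the list; a real service would sit behind a web server rather than a shell function.

```shell
#!/bin/sh
# Hypothetical dispatcher for the URI scheme listed above.
# /appbase        -> list buckets
# /appbase/1      -> list posts in bucket 1
# /appbase/1/2    -> post 2 in bucket 1
route() {
    method=$1
    # Only the four supported methods are accepted.
    case $method in
        GET|HEAD|PUT|DELETE) ;;
        *) echo "405 method not allowed"; return 1 ;;
    esac
    # Reject anything outside the /appbase prefix.
    case $2 in
        /appbase|/appbase/*) ;;
        *) echo "404 not found"; return 1 ;;
    esac
    path=${2#/appbase}   # "/appbase/1/2" -> "/1/2"
    path=${path#/}       # "/1/2" -> "1/2"
    case $path in
        "")    echo "$method bucket-list" ;;
        */*/*) echo "404 not found"; return 1 ;;
        */*)   echo "$method post ${path#*/} in bucket ${path%%/*}" ;;
        *)     echo "$method post-list for bucket $path" ;;
    esac
}

route GET /appbase        # prints: GET bucket-list
route GET /appbase/1      # prints: GET post-list for bucket 1
route GET /appbase/1/2    # prints: GET post 2 in bucket 1
```

The same three-level layout serves a forum (bucket = forum, post = message) and a blog (bucket = entry, post = comment), which is the overlap the shared web service would exploit.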
OpenVZ is a virtualization technology that allows many virtual private servers (VPS) to run on one hardware node. This post shows how to create a slimmed-down LAMP VPS using OpenVZ on a RHEL4 hardware node.
Outline:
- Install OpenVZ
- Install an OpenVZ template
- Create a temporary VPS container and initialize
- Set up NAT on your hardware node
- Install vzyum and related packages
- Update the temporary VPS OS with latest packages
- Replace OpenVZ template with the temporary VPS
Start with the fairly straightforward quick installation guide available on OpenVZ’s wiki. Next, download a pre-created OS template and place the tarball in /vz/templates/cache. At this point, you should be able to create a temporary container from the OS template you chose:
vzctl create 1001 --ostemplate centos-5-i386-minimal
Where 1001 is the CTID (you’ll use this number to manipulate your VPS), and centos-5-i386-minimal is the name of the pre-created OS template you downloaded. Note: I tried a user-contributed centos 5 template, which required the installation of an additional metadata RPM in order for OpenVZ to know the location of certain files (like the networking scripts, nameserver file, etc…)
Let’s set the IP and nameserver of the VPS:
vzctl set 1001 --ipadd 192.168.0.1 --nameserver "123.123.123.123 123.123.123.124" --save
Next, set up NAT on the hardware node so that our VPS can make outbound network connections:
iptables -t nat -A POSTROUTING -s src_nat -o eth0 -j SNAT --to-source ip_address
I specified a src_nat of 192.168.0.0/24, which gives me 254 usable NAT’d IPs to play around with. ip_address is your hardware node’s IP.
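As a sketch of how the placeholders in the NAT rule get filled in, the snippet below builds the rule from variables and prints it instead of applying it, so it is safe to run without root. The hardware node IP 10.0.0.5 is an invented example value, and the variable names are my own. It also works out how many usable host addresses the source range contains.

```shell
#!/bin/sh
# Assumed example values; substitute your own.
SRC_NAT=192.168.0.0/24   # private range handed out to the VPSs
HN_IP=10.0.0.5           # hypothetical IP of the hardware node
OUT_IF=eth0              # outbound interface on the hardware node

# Usable host addresses: 2^(32 - prefix), minus the network and broadcast addresses.
prefix=${SRC_NAT#*/}
usable_hosts=$(( (1 << (32 - prefix)) - 2 ))

# Build the rule as a string and print it rather than executing it.
nat_rule="iptables -t nat -A POSTROUTING -s $SRC_NAT -o $OUT_IF -j SNAT --to-source $HN_IP"

echo "$usable_hosts usable addresses in $SRC_NAT"
echo "$nat_rule"
```

Once the printed rule looks right, run it as root on the hardware node. Note that the node also has to forward packets (net.ipv4.ip_forward set to 1) for the VPS traffic to get out.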
Time to fire up the VPS:
vzctl start 1001
Test that networking works:
vzctl enter 1001
ping yahoo.com
exit
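The check above is interactive. For use in a script, a non-interactive variant could rely on vzctl exec, which runs a single command inside the container. The following is a minimal sketch; the function name check_vps_network and the default host are my own choices, and the guard simply bails out when vzctl is not installed.

```shell
#!/bin/sh
# Hypothetical helper: ping a host from inside a VPS without opening a shell in it.
check_vps_network() {
    ctid=$1
    host=${2:-yahoo.com}
    # Bail out early when not on an OpenVZ hardware node.
    if ! command -v vzctl >/dev/null 2>&1; then
        echo "vzctl not found; run this on the hardware node" >&2
        return 2
    fi
    # -c 3 sends three pings and then exits, so the script does not hang.
    vzctl exec "$ctid" ping -c 3 "$host"
}
```

Calling check_vps_network 1001 returns the exit status of ping, so it can be used directly in an if statement.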
At this point, we have a working VPS, but it’s running woefully out-of-date software. OpenVZ provides wrappers around yum and rpm called ‘vzyum’ and ‘vzrpm’. Since I am running a centos 5 VPS, I’ll use vzyum. First I need to install the required packages on the hardware node (remember the HN is RHEL4):
rpm -Uvh python-elementtree-1.2.6-7.el4.rf.i386.rpm python-sqlite-1.0.1-1.2.el4.rf.i386.rpm python-urlgrabber-2.9.6-1.2.el4.rf.noarch.rpm vzrpm43-python-4.3.3-7_nonptl.6.i386.rpm vzrpm44-4.4.1-22.5.i386.rpm vzpkg-2.7.0-18.noarch.rpm vzrpm44-python-4.4.1-22.5.i386.rpm vzyum-2.4.0-11.noarch.rpm
Many of the packages not provided by RHEL4 were found at DAG.
Finally, let’s update our centos 5 VPS with the latest packages:
vzyum 1001 -y update
vzyum 1001 clean all
vzctl stop 1001
Now we have all the updated packages and have cleaned up any headers left over from yum.
Now to replace the pre-created OS template with our up-to-date centos 5 version:
mv /vz/templates/cache/centos-5-i386-minimal.tar.gz /vz/templates/cache/centos-5-i386-minimal.tar.gz-old
cd /vz/private/1001
tar czvf /vz/templates/cache/centos-5-i386-minimal.tar.gz .
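Since the template file name ends in .tar.gz, it is worth confirming that the archive really is gzip-compressed and readable before creating containers from it. Here is a small sketch of packing and verifying in one step; make_template is a hypothetical helper name, not part of OpenVZ.

```shell
#!/bin/sh
# Hypothetical helper: pack a directory into a gzip-compressed tarball and verify it.
make_template() {
    src_dir=$1    # e.g. /vz/private/1001
    out_file=$2   # e.g. /vz/templates/cache/centos-5-i386-minimal.tar.gz
    # -C switches into src_dir first, so archive paths are relative ("./etc/...").
    tar -czf "$out_file" -C "$src_dir" . || return 1
    # gzip -t tests the compressed stream; tar -tzf confirms the listing is readable.
    gzip -t "$out_file" && tar -tzf "$out_file" >/dev/null
}
```

For the steps above, the call would be make_template /vz/private/1001 /vz/templates/cache/centos-5-i386-minimal.tar.gz, run after the VPS has been stopped.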
That’s it! Now you can vzctl create a slew of VPSs based on this template. Next time I’ll discuss how to create a web node template with LAMP built-in.
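As an illustration of creating a slew of VPSs, the sketch below prints the vzctl commands for three containers rather than running them. The CTIDs, the IP addresses, and the plan_vps helper are invented for the example; only the vzctl subcommands and flags come from the steps above.

```shell
#!/bin/sh
# Template name as used earlier in this post.
TEMPLATE=centos-5-i386-minimal

# Hypothetical helper: print the commands needed to bring up one VPS.
plan_vps() {
    ctid=$1
    ip=$2
    echo "vzctl create $ctid --ostemplate $TEMPLATE"
    echo "vzctl set $ctid --ipadd $ip --save"
    echo "vzctl start $ctid"
}

# Example: three containers on consecutive addresses in the NAT'd range.
i=2
for ctid in 1002 1003 1004; do
    plan_vps "$ctid" "192.168.0.$i"
    i=$((i + 1))
done
```

Review the printed commands, then pipe the output to sh as root on the hardware node to actually create the containers.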
Virtual Private Servers (VPS) provide a chunk of computing power, coupled with storage space and networking. Many commercial hosting providers offer “dedicated shared hosting”, meaning they split up a machine into multiple chunks and rent them out individually. Often customers receive “root-level” access to their VPS, which are designed so that root in one VPS cannot touch another VPS running on the same hardware.
Why should I care about this? I’m a part time sysadmin at best…
Purchasing, racking, installing, patching, obtaining support for, and generally managing physical machines is not a web technology, nor does it deliver academic value to our clients. Web applications built into an academic portal deliver academic value to clients. If there were some way to run our web applications on thin air, we could focus solely on our direct commitment to students and faculty.
As much as I would like to run our web applications on thin air, physical hardware is needed.
This is where virtual private servers come in. Virtual private servers, if managed by highly responsible system administrators, eliminate the need for application developers to think about the care and feeding of physical machines. Instead, developers get a reliable OS container that can move from one machine to another in response to hardware failures or changes in usage patterns. They get consistent, standardized chunks of computing power, so their application runs exactly the same regardless of what physical hardware supports it.
I dig the idea of utility computing. I think we’re approaching the point where the interface between a computing need and fulfillment is as simple as plugging a light into an outlet.
If I need more CPU units to satisfy high end-of-quarter usage, I’d love to have a new server up and running in a few minutes, with minimal admin. I would love to have spare CPU units ready to go at a moment’s notice. I’d also love to share some of “our” CPU units back to the spare pool during usage lulls, or better yet, let this happen dynamically without me or our users noticing.
Why would a computing department want to offer VPS as a service instead of letting each group manage their own physical machines?
- Greater department efficiency: some clients can share the same physical machine.
- Simplified accounting: one chunk of computing power is the same as any other.
- Higher-level capacity planning: aggregate all future computing needs into a pool.
- Hardware cost savings: yearly bulk purchases for entire department instead of independent purchases.
The next question: how would this impact our organization? What changes would we need to make?