The Theo Spears Blog

Blogging Considered Harmful (Considered Harmful)?

Sabbatical Day 41

It's been a long time since the last update. So long I can't track what number this should be, so I am calling it day 41.

So what's been going on since last time? Mainly going on holiday to Switzerland, Christmas, and dropping out of the course I was studying. And a few things that are actually useful.

  • I've turned off new customer acquisition (principally AdWords) for Steady Service, as I have a list of things that I need to address, and I don't believe it makes sense to carry on paying money to attract people when there are things I know the service really needs. I will need to avoid falling into the trap of making a large number of changes, turning things back on, and then not having any data.
  • I've signed up to play with the Rackspace Cloud Monitoring Beta. This should allow automatic monitoring of all the sites I host. Unfortunately it has a RESTful rather than a declarative API, so I'll need to do some work to take Puppet configurations and turn them into changes to be performed against Rackspace.
  • I've been doing some interesting reading about more computer science topics, including the zipper, derivatives of data types and the lambda calculus.
  • I'm due to go back to work as normal next week, so I've been starting to figure out what is happening there.
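
The gap between a declarative configuration and a RESTful API comes down to a diff: compare the checks you want with the checks the API reports, and work out which calls to make. What follows is a minimal sketch of that comparison only, not the Rackspace API itself. The `Plan` type, the `plan_changes` function and the dictionary layout are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple


class Plan(NamedTuple):
    """Names of checks to create, update and delete (hypothetical structure)."""
    create: List[str]
    update: List[str]
    delete: List[str]


def plan_changes(desired: Dict[str, dict], actual: Dict[str, dict]) -> Plan:
    """Compare the declared checks with the ones the API reports.

    Both arguments map a check name to its settings. The result lists
    which names need a create, update or delete call to make the remote
    state match the declared state.
    """
    create = sorted(name for name in desired if name not in actual)
    delete = sorted(name for name in actual if name not in desired)
    update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return Plan(create, update, delete)


if __name__ == "__main__":
    # Example data only; real settings would come from the configuration
    # and from listing the checks that already exist remotely.
    desired = {"blog": {"url": "http://blog.example"}}
    actual = {"old-site": {"url": "http://old.example"}}
    print(plan_changes(desired, actual))
```

Each name in the plan would then become one POST, PUT or DELETE request against whichever monitoring API is in use.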

Sabbatical Day 40 - Downtime

Two missing updates. I was feeling poorly and thus mainly spent the days in bed. Clearly I'm not hardcore.

Today saw the first significant bit of downtime for SteadyService - customer sites were unavailable for a while.

So what happened:

  • Amazon have made some changes to their infrastructure, which required restarting customer VMs. As scheduled, the VM hosting customer services was restarted.
  • Services were not scheduled to come online when the machine starts up, so they did not start when the machine came back a few minutes later. I knew that this would be the case when I wrote the original scripts, but in the time between writing them and now I forgot about this.
  • There was no active monitoring of the sites in place, so I didn't notice that this was the case.

So what am I doing to prevent this occurring again?

  • Adding active monitoring to all the sites so I will be notified immediately if anything breaks. I need to monitor many more hosts than most commercial solutions make possible, plus I want to automatically add customer hosts when they are created, so it looks like this will be something home grown with Nagios. Any better ideas?
  • I will be modifying my site creation scripts so sites are automatically brought online again when a box restarts.
  • In future if Amazon schedules a restart of machines hosting customer data I will handle it by bringing up a new box hosting customer sites, testing it works, transferring DNS over, and then bringing down the old server. This should avoid the downtime which would otherwise be caused by restarts.
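
To make "active monitoring" concrete, here is a minimal sketch of the simplest possible check: request each site and report anything that fails or returns an error status. It is a stand-in for illustration rather than a Nagios setup, and the function names (`fetch_status`, `find_failures`) and example URLs are assumptions, not part of any existing tooling.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url.

    Connection problems and timeouts propagate as OSError.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still useful.
        return err.code


def find_failures(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, str]:
    """Check every URL and return a mapping of failing URL to reason."""
    failures: Dict[str, str] = {}
    for url in urls:
        try:
            status = fetch(url)
        except OSError as err:
            failures[url] = f"unreachable: {err}"
            continue
        if status >= 400:
            failures[url] = f"HTTP {status}"
    return failures


if __name__ == "__main__":
    # Hypothetical site list; in practice this would be generated when
    # customer sites are created.
    for url, reason in find_failures(["http://example.com/"]).items():
        print(f"{url}: {reason}")
```

Run from cron every few minutes with the output mailed somewhere, this would have caught the outage above, although it says nothing about why a site is down.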

This is another reminder that running an always available service has a whole host of additional challenges you don't face when just writing and distributing software.

Sabbatical Day 39

I'm spending the day working on a visitor tracking and data recording system based around CouchDB, nodejs, and browser javascript. To my surprise I am broadly liking nodejs so far. The available libraries seem well built and easily composable, and not having to switch languages between client side and server side is very nice.

Clientside javascript is a different matter. I can never get used to just how little functionality is provided in the javascript standard library. Coupled with the way javascript files cannot easily depend on other javascript files, it makes development a real pain. Plus because this code all has to be transmitted over the wire it's much harder just to build on work which other people have already done. I'm shipping as part of my file some functions to get and set cookies. Why do I have to include this in my file rather than having some nice import statement? Why does this code have to be written at all rather than being part of the browser? Clientside javascript makes me annoyed.

Sabbatical Day 38

I've just finished reading The Lean Startup by Eric Ries. It's the book everyone says you need to read if you are involved in startups, although it is not without criticism. Let's be clear, this is not a book that is going to win awards for the quality of writing or exposition, but the ideas it contains certainly make a valuable contribution to the debate.

I'm not going to discuss the ideas in the book or their merits; many people much smarter than I have done so already. I'm going to talk a little about how we tend to bend language in the business community.

The Lean Startup is fundamentally about two ideas: science and agility. Ries advocates an approach to business strongly informed by the scientific method, and is a firm believer in following a repeatable and proven process for making business decisions. You come up with a hypothesis, you decide how to test it, and then you go about collecting the data to either validate or falsify this hypothesis. Follow this process and with some luck your business should slowly improve. By making decisions based on this process you can be more agile and flexible by avoiding internal politics (you are using data not decisions) and people stalling, and can get into the habit of changing things regularly.

This word agile is one many of us have come across before in another context, that of The Agile Manifesto. Here it is:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

Individuals and interactions over processes and tools. That sounds very different to Ries's scientific-method-driven approach. You might even say they are contradictory philosophies. And yet proponents of both approaches would say they are aimed towards bringing about a more agile business.

Neither of these approaches is wrong. It is extremely unlikely either is entirely correct. Both have value. Nevertheless, it is important to be clear on what we are talking about. 'Agile' is not the only popular business word to have been overloaded with multiple meanings that don't necessarily line up. Probably safest to sit down, think about what you really mean and want to say in concrete terms and say it, rather than appealing to generic schools or vocabulary.

Sabbatical Day 37

Today I finished getting automatic deployment of Bugzilla sorted. Bugzilla has an interesting install mechanism that requires running a shell script which prompts for information and then sets up the application. Fortunately you can provide a pre-written answer file with answers to all the questions it asks, so you don't need to try to train expect to fake out user input. It's easy enough to generate this and other configuration files from templates, and the script takes care of the rest of the process.
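
As an illustration of generating the answer file, here is a minimal sketch. It assumes the file consists of Perl assignments of the form `$answer{'NAME'} = 'value';`, which is the format Bugzilla's installer (checksetup.pl) reads; the key names in the example are illustrative and should be checked against the Bugzilla version being deployed. `perl_quote` and `render_answers` are hypothetical helper names.

```python
from typing import Mapping


def perl_quote(value: str) -> str:
    """Quote value as a single-quoted Perl string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def render_answers(settings: Mapping[str, str]) -> str:
    """Render settings as one `$answer{...} = ...;` line per entry."""
    lines = [
        f"$answer{{{perl_quote(key)}}} = {perl_quote(value)};"
        for key, value in settings.items()
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; a real deployment would fill these in per customer.
    print(render_answers({
        "db_host": "localhost",
        "db_name": "bugs",
        "ADMIN_EMAIL": "admin@example.com",
    }), end="")
```

Escaping the values matters here because the answer file is executed as Perl, so an unescaped quote in a customer-supplied name would break the install.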

I've also spent a bit of time reading through questions on the Code Review Stack Exchange to try to pick up some useful tips and practices. Sadly the overall quality there is very low, with most samples containing really very basic flaws. If anyone has any suggestions for places to read critiques of good, well-designed code in bite-sized chunks, I'd be interested to hear.

Sabbatical Day 36

Today has been another day working on puppet. The more I use it the more I find there is functionality I would expect to be present which does not exist, and has a dozen hacked up implementations from different people. Or a dozen plus one, when I then take some of the existing ones and combine them into what I end up using.

Here are some examples:

  • Resources to build a configuration file by assembling snippets defined in lots of different pages.
  • A good way to keep an entire directory tree up to date (the File resource can do this, but it is unacceptably slow compared to rsync).
  • A way to install perl modules from CPAN.

More generally the language is incredibly weak compared to a "real" programming language. I accept that puppet is about building a declarative model of your system rather than performing instructions, but that is no reason for the language used to build that model not to be powerful. I sometimes find myself wanting to stick a template language on the front to generate puppet configuration files, which is crazy. In particular handling collections is extremely weak.

In short right now I feel the situations where puppet adds the most value are the ones which are so simple you don't really need puppet. But I accept it is early days and with any luck in time PuppetLabs will fix many of these deficiencies.

Sabbatical Day 34

Today I've been fighting against Jekyll. It's a blogging engine written by the team at GitHub to help with managing blogs stored in Git. Or perhaps it's more accurate to describe it as a Domain Specific Language for writing blogging engines. It's very powerful, but you have to do a fair bit of work yourself to get it up and running.

Obvious question time. Why on earth am I fighting with an obscure and unsupported blogging engine, rather than just installing Wordpress and calling it done? As always, there is more than one reason:

  1. I didn't realise just how hard it would be.
  2. I believe that blog posts are part of a site's content, and therefore things that are important to all my other site content are also important for it. Blog posts need to be stored in version control so I can back out any stupid changes. They need to be backed up in case my computer or server blows up. I want to be able to edit them offline, and using my favourite editor (vim). I already have all this in place for the rest of the site in a git repository. Setting up wordpress means suddenly I've got a second interface to use, a second database to back up, and a second system to migrate if I ever move the site onto a different machine.

So it's a complexity tradeoff again. I am doing some more work now in order to have a simpler and easier to administer system in the future. But still, an entire day to get a blog set up and live. That's just pathetic.

Sabbatical Day 35

I finally got the blog for Steady Service up and running. The content is still pretty shoddy but the mechanics to publish are now working fine. Just a Small Matter of Copywriting now to get some better content up there. I'll share some more about choosing suitable topics to blog about once I have done so.

A few more people have signed up for the service. The act of turning signups into real accounts doesn't take too long right now, but it is quite tedious and dull. This means I don't enjoy it and put it off, which adds significant friction. This still seems lower priority to fix than getting more people to sign up, but it's work I really need to do soon to improve the customer experience.

I'm also starting on the backend support for Bugzilla. Most of the groundwork for supporting email and PHP apps is already there, but I'm sure new and unexpected challenges will rear their heads. Today's unexpected challenge: if you have an IP address that is mostly static but changes occasionally, when it does suddenly all the EC2 firewall rules you have set up to permit access to your machines stop working. I'm shocked there isn't already a script that updates your security groups accordingly for this situation. Google finds me other people looking for a solution but no one who has written one.
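
The core of such a script is small. As a rough sketch, assuming each rule permits a single TCP port from a single address, the logic is: when the address changes, authorize the new address on every port and revoke the old one. The code below only plans those changes; applying them means calling the EC2 API to authorize and revoke ingress rules, which is deliberately left out. `RuleChange` and `plan_rule_changes` are hypothetical names, and the addresses are documentation examples.

```python
from typing import List, NamedTuple, Set


class RuleChange(NamedTuple):
    """One change to make to a security group (hypothetical structure)."""
    action: str  # "authorize" or "revoke"
    port: int
    cidr: str


def plan_rule_changes(ports: Set[int], old_ip: str, new_ip: str) -> List[RuleChange]:
    """List the changes needed when the home address moves to new_ip.

    The new address is authorized before the old one is revoked, so
    access is never lost part way through applying the plan.
    """
    if old_ip == new_ip:
        return []
    changes: List[RuleChange] = []
    for port in sorted(ports):
        changes.append(RuleChange("authorize", port, f"{new_ip}/32"))
        changes.append(RuleChange("revoke", port, f"{old_ip}/32"))
    return changes


if __name__ == "__main__":
    for change in plan_rule_changes({22, 443}, "203.0.113.5", "203.0.113.9"):
        print(change.action, change.port, change.cidr)
```

The remaining work is discovering the current public address and remembering the previous one between runs.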

Sabbatical Day 33

Yesterday afternoon was study time. I'm studying a module on creativity with the Open University, currently looking at personality style. I'm fundamentally not a fan of personality psychology; I much prefer the work of Walter Mischel. Mischel advocated the idea of using person/situation combinations rather than purely personality factors to describe people. For example, rather than saying that Jack is more aggressive than James, you say that Jack is more aggressive than James in meetings. James may be more aggressive than Jack when attending football matches. And in fact the evidence suggests these sorts of dichotomies often exist.

But it's hard to write about that and display knowledge of the course material at the same time.

This morning was about looking at traffic to Steady Service. Signups have dropped as a result of traffic from Google Adwords dropping, so I spent some time tweaking the campaign and pushing more money towards effective keywords. I also moved everything to a new account I have set up with more free adwords credit. (Thanks, Google!) The Google Adwords Editor is great for this - you just export campaign settings from one account, and reimport them into another.

Given adwords is really expensive for some keywords I've started setting up a blog to generate some organic traffic. Writing some content for it is this afternoon's job.

Sabbatical Day 32

I've spent the last "day" working on a few tiny scripts to learn about programming in Ruby and writing extensions for Google Chrome.

My first Ruby script, port-install-from.rb, grabs all the MacPorts packages installed on a target remote machine, and installs them locally. This is to help me keep my laptop and desktop in sync with each other. It's available on GitHub, and I'd appreciate some more experienced Ruby programmers taking a look and pointing out anything I'm doing in a stupid way.

port-install-from.rb

The Chrome extension I wrote, Wikipedia Usage, looks at your Wikipedia usage for the month, and based on that suggests an amount of money to donate to the Wikimedia Foundation. It also displays some graphs, and information about most common and most recently read articles. I'd be interested in any feedback, particularly suggestions or designs for how to make it look better. Any graphics designers want to make an awesome design for me to code up? It's for a good cause, after all.

Install Wikipedia Usage (Chrome users only)

Wikipedia Usage on Github