railway

distance_of_time_in_words with proper grammar (and more flexibility)!

2013-03-20 00:00:00 +0000

If you’ve ever written an app that uses Rails’ distance_of_time_in_words (as well as the related time_ago_in_words) helper method and the app was either translated to locales other than English or wasn’t in English in the first place, you’ve probably been bitten by its lack of proper grammar.

I18n.locale = :en
time_ago_in_words(2.days.ago) + ' ago' # => 2 days ago
I18n.locale = :de
'vor ' + time_ago_in_words(2.days.ago) # => vor 2 Tage (should be "vor 2 Tagen")

People wrote all kinds of sick workarounds for this problem, but even then it was still a pain if you needed a few variations (such as a future_time_in_words helper).

Even if grammar wasn’t an issue (e.g. in English), it was annoying that you always had to add the “ago” either manually or using a second translation with an interpolation argument.

# assuming a locale file containing the following:
# en:
#   time_ago_in_words: '%{days} ago'
I18n.t('time_ago_in_words', :days => time_ago_in_words(2.days.ago)) # => 2 days ago

You had to do this even with the time_ago_in_words helper, whose very name suggests that “ago” is included although it isn’t.

These issues have been fixed in Rails’ master branch for a while (see commit e22e78545112eaad857ab1e02119e20ce10065d0) by simply making the translation scope configurable via the options hash. Thanks to Steve Klabnik, it has also been backported to Rails 3.2 and released in Rails 3.2.13. So now, even in Rails 3.2, you can do stuff like this:

# assuming a locale file containing the following:
# de:
#   datetime:
#     time_ago_in_words:
#       x_days:
#         one: vor einem Tag
#         other: ! 'vor %{count} Tagen'
I18n.locale = :de
distance_of_time_in_words(2.days.ago, Time.current, false, :scope => :'datetime.time_ago_in_words') # => vor 2 Tagen

And if you need something like the aforementioned future_time_in_words, it’s easy to write it yourself and just add an appropriate scope:

# assuming a locale file containing the following:
# de:
#   datetime:
#     future_time_in_words:
#       x_days:
#         one: in einem Tag
#         other: ! 'in %{count} Tagen'

def future_time_in_words(to_time, include_seconds = false, options = {})
  options.reverse_merge!(:scope => :'datetime.future_time_in_words')
  distance_of_time_in_words(to_time, Time.current, include_seconds, options)
end

I18n.locale = :de
future_time_in_words(2.days.from_now) # => in 2 Tagen

The only remaining pain is that the options support for time_ago_in_words hasn’t been backported yet. You can fix this by just overwriting it:

def time_ago_in_words(from_time, include_seconds = false, options = {})
  options.reverse_merge!(:scope => :'datetime.time_ago_in_words')
  distance_of_time_in_words(from_time, Time.current, include_seconds, options)
end

I18n.locale = :de
time_ago_in_words(2.days.ago) # => vor 2 Tagen

I will probably whip up a pull request to fix that (and maybe include the future time helper as well). Other than that, we can finally get rid of our scary workarounds.

Using SSL in your local Rails environment

2013-02-12 00:00:00 +0000

In every other project, the requirement to use SSL/HTTPS crops up. And I’m sure that, just like me, you were annoyed that there didn’t seem to be a simple way to test your SSL stuff in your local development environment.

Well: Not any more. Here’s how you set up your local Rails environment so that you can use SSL in development if you want or need to.

Note: I’m assuming you don’t want to mess with your local Apache but want to use the regular Rails server command.

A Gist to Make Your Life Easier

Today, when I googled for solutions, I found a gist outlining how to set up a self-signed SSL certificate for your localhost. It works like a charm.

I’ve put all the files in ~/.ssl so I can use them across projects.

Configuring Your Rails App

Next up is Rails.

The first flaw: The regular script/rails server seems to be unable to serve both SSL and non-SSL requests at the same port – which, of course, makes sense. We can circumvent this problem by simply starting two servers at two different ports (I use the thin webserver in this example):

  thin start -p 3000
  thin start -p 3001 --ssl --ssl-verify --ssl-key-file ~/.ssl/localhost.key --ssl-cert-file ~/.ssl/localhost.crt

Now we’ve got a non-SSL version of our app running on port 3000 and the SSL version on port 3001. You can, of course, also put these in your Procfile if you’re using the awesome foreman gem.
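
For reference, a matching Procfile could look like this (the process names are up to you):

  web: thin start -p 3000
  ssl: thin start -p 3001 --ssl --ssl-verify --ssl-key-file ~/.ssl/localhost.key --ssl-cert-file ~/.ssl/localhost.crt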

Now to the Rails app itself.

Since version 3.1, Rails ships with a controller macro named force_ssl so you don’t need the good old ssl_requirement plugin any longer.

However, the implementation of force_ssl that ships with Rails has a major flaw: It explicitly excludes the development environment. And it assumes that the frontend server can handle ports itself – which the regular Rails server can’t. Both issues can be alleviated by monkeypatching the ActionController::ForceSSL module that ships with Rails:

ActionController::ForceSSL::ClassMethods.module_eval do
  def force_ssl(options = {})
    config = Rails.application.config

    return unless config.use_ssl # <= this is new

    host = options.delete(:host)
    port = config.ssl_port if config.respond_to?(:ssl_port) && config.ssl_port.present? # <= this is also new

    before_filter(options) do
      if !request.ssl? # && !Rails.env.development? # commented out the exclusion of the development environment
        redirect_options = {:protocol => 'https://', :status => :moved_permanently}
        redirect_options.merge!(:host => host) if host
        redirect_options.merge!(:port => port) if port # <= this is also new
        redirect_options.merge!(:params => request.query_parameters)
        redirect_to redirect_options
      end
    end
  end
end

use_ssl and ssl_port are custom config settings that are not part of the Rails standard configuration. However, in Rails 3 you can simply add custom config settings by just defining them. So just add config.use_ssl = false to config/application.rb and set it to true in config/environments/development.rb and config/environments/production.rb. In config/environments/development.rb, you also need to add config.ssl_port = 3001 since this is the port we defined earlier. Note that I suggest turning SSL off again once you’re done testing – constant port switching can be quite confusing (I know it confuses me).
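
Put together, the relevant config lines look like this:

# config/application.rb
config.use_ssl = false

# config/environments/development.rb
config.use_ssl = true
config.ssl_port = 3001

# config/environments/production.rb
config.use_ssl = true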

And with that, you’re good to go. Simply add force_ssl to any controller you want to secure with SSL. If you navigate to the non-SSL version of a page that forces SSL (e.g. http://localhost:3000/some/page), your Rails app will automatically redirect you to the SSL version (https://localhost:3001/some/page).
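
A minimal usage sketch (the controller name is, of course, hypothetical):

class PaymentsController < ApplicationController
  force_ssl
end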

Note that depending on your browser you might receive a certificate warning because you’re using a self-signed certificate. Just tell your browser to shut up and go ahead with it.

A Note on Sexism

2012-05-20 00:00:00 +0000

A couple of weeks ago, I stumbled across a blog titled Programmers Being Dicks. The most recent entry was a link to an apology from Robert ‘Uncle Bob’ Martin who apparently felt he had been offensive or sexist in a keynote he gave that day. Reading the apology made me sad and also kind of annoyed me: Here’s a seasoned speaker who is renowned for his funny as hell presentations and keynotes who feels like he has to apologize for his style. But let’s have a closer look at this.

Apparently, Uncle Bob had a slide that said “C was for real men” and, when a female attendee asked “What about women?”, he answered something like “We didn’t allow women in those days”. Are those words sexist? Hell yeah. But you don’t have to have a degree in psychology to know that communication consists of a little bit more than just plain words: There are lots of nonverbal cues going on, and a whole body of research suggests that these nonverbal cues shape an act of communication way more than the verbal parts. In the aforementioned situation, I’m willing to bet that Uncle Bob was wearing his classic full-on-“I’m being sarcastic now” smile. And there sure as hell was sarcasm in his voice. Maybe there was even a hand gesture that waved it off as being sarcastic. He used words that, read by themselves, are definitely sexist – if you used them in written communication without any further comment, they could (and probably should!) be interpreted as sexism. But not in a presentation where there certainly were dozens of cues indicating that he wasn’t serious.

Take the counterexample: What if a speaker had a slide that depicts a (half-)naked woman in front of a computer and the speaker commented on the slide with “As you all know, I totally respect female computer scientists”. In this case, the words by themselves indicate respect for female computer scientists – but everything else does not.

Context is key, and one needs some situational awareness before shouting “You sexist!”. Inclusive language, which so many people propagate, is bogus – what we need is an inclusive mindset and inclusive behavior. What value is there in someone who thinks and acts as if he were pro equal rights but silently envies his father’s generation because they’d managed to keep women in the kitchen? I’d say none. And neither is there value in hardcore pseudo-feminism that sees sexism lurking around every corner. If our goal is equal rights and less sexism, it makes no sense to point our fingers at everything that could remotely be seen as sexism – especially when there’s strong indication that it’s not. Instead, we should focus on the real problem: We need to get this mindset out of our generation of men and women – no matter how old they are, where they’re from or what industry they’re working in.

Why Everybody Can Profit From Learning a Little Bit of Programming

2012-05-15 00:00:00 +0000

TL;DR

Most people would benefit from gaining some basic coding skills. Coding empowers them to build their own stuff and helps them to learn the structured and strategic approach to problem solving that we coders use every day to solve our programming problems.

Background

There are not many blogs that I read on a regular basis but Jeff Atwood’s (aka Coding Horror) blog is one of them. Just today he published a post titled Please Don’t Learn to Code – and reading it made me sad. Jeff argues against recent initiatives that suggest that everybody can and should learn to code (e.g. codeyear.com). While I agree with some of his arguments and while I can follow his reasoning, I don’t agree with the conclusion.

Sacha Greif wrote a response that is short and to the point: It’s not about mastering something so you can code professionally like we do – it’s about gaining some basic understanding of how these things work. I think this is spot on – but I also think it’s missing two vital points that I want to talk about in this post.

Creating Cool Stuff

Jan Lehnardt responded to my tweet that it’s about “building absolutely cool shit” and I couldn’t agree more. At some point, everybody has some cool idea about a website or application – but they don’t have the money and/or the skills to build it. Which is sad – very sad, in fact. Just think about all the potentially cool ideas that we lose every day and every hour because somebody who has a cool idea doesn’t have the means to make it happen. It reminds me of what Linda Liukas said in her talk at RailsBerry about girls who don’t know how to program: Imagine what young girls (and boys, for that matter) could do with all their energy and motivation if they had ways to channel their motivation into a website, app or even a commercial product. Pure awesomeness.

And even if people have the resources to let someone else do it: There is still a dissonance between the thing you think of and the thing somebody builds for you based on what you tell them. If you could do some prototyping yourself, well, that could change things.

If coding were a “basic skill” just like writing, reading or doing math, we wouldn’t have these problems. People could – at least rudimentarily – implement their own ideas and then maybe go find some professional coder to do the real thing or an investor to provide money or whatever. It would be empowering.

Problem Solving

Jeff mentions that to focus on learning to code “puts the method before the problem” and that coding is not the goal – problem solving is the goal. I agree – and this is the important point.

In my humble experience, lots of people lack the basic skills to tackle and solve even the simplest problems. How many people do you know that suck at math – especially word problems? If you haven’t realized it before: They don’t suck at math per se – which can often be seen by their issues with other things that require structured approaches like physics, chemistry or even Latin. They just have a hard time coming up with a strategy to understand and then solve the problem in a structured manner.

Think what we software developers do: Some customer explains the problem to us – and we immediately try to split the problem into smaller subproblems that are easier to understand. Then we rephrase it by writing a test. Then we solve the problem and verify it with the test and by showing it to the customer. In the process, we’re extracting variables and methods – which actually just means we put name tags on previously unnamed things – and we put them in context.

Most people don’t think like that. Instead, they try to solve the whole big ass problem in a brute force style – and quite often, they fail. How often have you looked over someone’s shoulder as they wrote down the first steps of an equation and thought “I can solve this problem easily but if I did it like him/her, I would also fail”?

It’s not about learning to code – it’s about learning to take a big problem or task and strategically dividing it up into subproblems or subtasks and then tackling them one by one in a structured way. It’s about a bottom-up instead of top-down approach to doing things – no matter if they’re complex or easy. Learning to code could help people get used to that strategic problem solving because, as I said before, it is extremely motivating and empowering to create things – as opposed to, say, learning this skill by doing math, which lots of people find boring and frustrating.

So that’s it: Two reasons why I think that most people would benefit from some basic programming skills.

Multiple Validations on One Field

2012-01-25 00:00:00 +0000

I’m sure you’ve had situations in Rails where you had multiple validations on a single field.

Problem

Say you have a field with a value to be chosen from a select field or radio button group where you need to make sure a value is selected and it’s an allowed value. So you’d have some code like this:

class User < ActiveRecord::Base
  validates_presence_of :gender
  validates_inclusion_of :gender, :in => %w(male female)

  # or with modern ActiveModel syntax:
  validates :gender, :presence => true, :inclusion => { :in => %w(male female) }
end

If a user submits the form without choosing a gender, they’d get two error messages:

Gender can't be blank.
Gender is not included in the list.

I don’t like that for two reasons:

  1. The second error is logically dependent on the first. In other words, it’s not in the allowed list because it’s blank.
  2. In this case, the second validation is just there for data validity: If we provide correct forms and the user doesn’t hack these forms, they will never even be able to submit an invalid value.

Solution

So how do we get rid of the “dependent” validation? Simple: Rails has a flag for that – :allow_blank. You can use it like this:

class User < ActiveRecord::Base
  validates :gender, :presence => true, :inclusion => { :in => %w(male female), :allow_blank => true }

  # or with the old-style validations
  validates_presence_of :gender
  validates_inclusion_of :gender, :in => %w(male female), :allow_blank => true
end

The :allow_blank option works on most (if not all) other validations (e.g. numericality, uniqueness etc.), too.

My advice would be to use it in a way that makes sense to the user. Example: Password confirmation. Assuming that you have a presence validation on the password itself, it is fairly obvious that an empty password confirmation field doesn’t match the password (again: logically dependent). In this case, you might as well immediately tell the user that it doesn’t match rather than first telling them to fill it in.
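
Applied to the password example, a minimal sketch (using modern ActiveModel syntax) might look like this:

class User < ActiveRecord::Base
  validates :password, :presence => true, :confirmation => true
  # deliberately no presence validation on password_confirmation:
  # an empty confirmation simply triggers the "doesn't match" error
end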

I hope this helps you if you happen to run into this use case in the future.

Improve Your Work Environment

2012-01-11 00:00:00 +0000

TL;DR

Good equipment provides increased productivity, motivation and health for a comparatively low fixed investment every year or two. If you’re a freelancer, get better equipment right now – if you’re an employee, talk to your boss and reason with them in terms of how much more money you could make them or how much money they can save by providing you with better equipment. And for your next career move: Check out your future work environment before you sign the contract: If the development team is seated in dark basement rooms in front of a 15 inch CRT, chances are you don’t want to work for that particular company – no matter how great it all sounds on paper.

Background

I read a chapter titled “Invest in a Good Environment” in The Developer’s Code (it’s awesome, buy it!) today which got me thinking about this topic.

As a freelancer, I get to see lots of people work and, obviously, I get to see lots of work environments – some good, some not so much. I can’t, for the life of me, figure out how one can possibly justify not creating a work environment for their programmers that is perfect (or at least tolerable).

The Usual Suspects

Let’s start with the workhorse itself: the computer. I’ve seen (and been told about) talented people hacking away on their outdated, underpowered Dell/IBM/You Name It laptops or desktops. You know, single core processors, 1-2 GBs of RAM etc. Booting takes ages. So does compiling or – in the case of Rails – starting the server, running the tests etc. You can get 16 GB of RAM for less than $100 today or a new decent system for a couple hundred more.

Next: Screens. Big-ass screens are fucking cheap these days. No, I’m serious: If you’re not doing design work where you need 100% authentic colors at all viewing angles, a good-quality TN+ display will be just fine. You can get a 27 inch screen starting at just below $300 (I recently bought one myself). Smaller displays aren’t that much cheaper – so why would you buy them?

Also: Keyboards. If you’re programming, chances are you spend a good deal of your work day typing. A decent keyboard is an absolute must. I prefer wireless keyboards because I can move them around without knocking things over or worrying about cord length. If you easily get sore wrists or shoulder pains, get a real ergonomic keyboard – I hear Kinesis make great ones.

Desks. You should have a big desk. Not so much for filling it up with piles of junk or that big-ass screen I’ve mentioned. It’s more about having enough space so that you can, if you want or need to, look something up in a book, start scribbling or pair program right away without having to push things aside first. Big desks are easier to keep in a usable, clean(ish) state.

Last but not least: Chairs. You sit all day long. And if you’re like me, you’re not taking nearly enough breaks and you’re not getting nearly enough exercise. You owe it to your body to at least rest comfortably during those long hours you work.

Why You (and/or Your Boss) Should Care

Why am I saying all this? If you’re a freelancer, you should just spend some money on decent equipment and if you’re employed you should get your company to do so.

Why? Well, equipment costs are usually low fixed costs, especially compared to what they have to pay you for your work. If a company pays that extra $100 for a bigger/better screen, your increased productivity (and probably motivation) will make up for that in a few weeks or even days.

Decent computers reduce wait time – that is, time you spend waiting for those tests to run, that page to re-render, this debugging tool to start up. If your test suite runs in 5 rather than 10 minutes, chances are you’ll save a few hours per week. That’s more than a hundred hours per year. Take your hourly rate or salary and multiply that by 100 – that’s how much money you’re not making (if you’re a freelancer) or that you cost your company extra without being productive (if you’re employed). If that money argument isn’t good enough, I don’t know what is.

Big screens are good for viewing things side by side or on top of each other. Think about it: How often do you look at a stack trace, figure out that the error is on line 102, switch to your editor and then switch back just to double check that it really is on line 102? Or you’re not 100% focused and forget the line number while switching. With a big screen, you can just go left/right or up/down – no context switch, no remembering/forgetting things – everything’s just there. Or if you’re doing HTML: Seeing the code and the result side by side can be a huge time saver. If you can handle it, two or three displays can be beneficial as well – me, I find multiple displays irritating and prefer one big screen instead.

Keyboards are so damn cheap that it feels hideous to even mention them. Programmers aren’t two-finger-typing phone operators – we touch-type up to several hundred keys per minute. Why would one want to be slowed down by something that costs less than an average dinner for two or – let’s face it – that stupid phone you have on your desk but never use anyway? The same goes for desks, actually.

And the chairs: Sure, good office chairs can be a lot more expensive than the above-mentioned screens. But what’s more expensive: Getting decent chairs or losing part of your workforce in their thirties or forties because they have to undergo multiple weeks of physical therapy?

General side note for all of the above: Companies not permitting their employees to bring their own (better) equipment because of some compliance concerns and other business pseudo-bullshit are just about as ridiculous – maybe even more so.

Invest in your own health, motivation and productivity and/or make your company do so. It will pay off for everyone.

Code with me! – continued

2011-05-03 00:00:00 +0000

The overall response to my offer to Code With Me was very positive – lots of retweets and encouragement. I also got about two dozen emails from people wanting to code with me – which may not sound like a lot but I was very happy with that number.

Anyway, before I share some details about the first 4 sessions, I just want to announce that, until further notice and unless I’m otherwise engaged, the offer continues every Sunday afternoon/evening. I will do one or two sessions (mostly depending on interest on your side) roughly between 1pm and 8pm Central European (Summer) Time, which is currently UTC+2 (check out Every Time Zone to find out what that is in your time zone) – so everyone between Hawaii and New Zealand can take the offer if they want to. ;-)

So the next session(s) will be on Sunday 8 May between 1pm and 8pm Central European Summer Time (UTC+2).

The story will basically stay the same:

  • We can talk via Skype or iChat.
  • We can use TeamViewer, iChat or Skype for sharing your screen (please ensure that your firewall/security settings permit this).
  • Please ensure that you have a working Ruby environment and a text editor/IDE. I use TextMate – if you don’t have it and/or prefer another editor, that’s fine, of course, as long as you don’t make fun of me for my low Vim/Emacs/whatever skills. ;-)
  • If you want to work on code for your company, please double-check with your superior that it’s okay to work on the code with me.
  • Some written feedback (positive, negative, neutral) after the session would be much appreciated but, of course, isn’t required.
  • If you’re interested, send me an email at “clemens at thisdomain”. Please include some relevant details about you (Ruby/Rails/general programming experience), how you would like to spend the time (topic(s) to talk about, project information etc.) and which date(s) and time(s) would be good for you (you should also include your time zone/UTC offset so we can work out times easily). Regarding projects: The first few sessions showed that it really pays to use an actual project as a basis for work/discussion. I don’t mind whether this is a personal or company project or what the current status is – the only requirement (of sorts) is that you have an idea what you want to work on.

I will definitely respond to every email I get – if I’m busy, that might take 1 or 2 days.

Now, let me share a few details about the first 4 sessions with you.

Andrea Singh

Andrea had a concrete problem with using Sphinx in the open source application she’s been working on for the last couple of months (the Zurich University of the Arts’ media archive). She’s been using Thinking Sphinx – which has been tedious, mainly because, owing to the application’s architecture, they had decided to use XML rather than SQL data sources.

The main point we figured out is that she probably wants to use Riddle (the Sphinx library underlying Thinking Sphinx) rather than Thinking Sphinx to query the indexes: While you could theoretically use Thinking Sphinx, it really is optimized for being used solely with ActiveRecord. Take a look at the gist that I wrote for a discussion on making Thinking Sphinx’ facets faster in newer versions of Sphinx to get an idea of how to build a search (or facet) interface that is similar to Thinking Sphinx’ API but uses pure Riddle. The payoff is that you have full control over what’s going on.

Here’s Andrea’s feedback:

I had a great pair programming session with Clemens yesterday. I was grappling with a problem that I’ve been working on for some time. I was quite impressed by his ability to grasp the scope and details of the project after just a brief introduction. It was also very interesting for me to see how another developer would approach the same issue. The code samples we discussed opened up a whole new perspective. Clemens was altogether very approachable and pleasant to work with. I would definitely like to keep contact in the future.

If you’re interested, follow her and check out her blog. She told me that she’s starting freelancing again – from what I’ve seen, she definitely knows how to code – so hire her (that is, if you decide to not hire me instead ;-)).

Chris Floess

Chris wanted to work with me because he had difficulties deciding where (model? view? controller?) to put a form generator (think Wufoo but simpler) and how to test it.

We discussed that in my opinion it should go into the model (after all, it’s business logic) – however, since Rails’ form helpers aren’t available to models (and trying to hack them in usually results in a mess – as does manually generated HTML), the most pragmatic choice would be to put it in a helper method because all Rails helpers are available there and it’s also easy to test. So we went ahead and extracted the logic that had previously lived in a partial into a helper method and wrote a proper test for it.
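
To give you a rough idea of the direction (all names here are made up), such a helper might look like this:

module FormFieldsHelper
  # renders a single user-defined form field based on its stored type
  def custom_form_field(field)
    case field.field_type
    when 'text'
      text_field_tag field.name
    when 'select'
      select_tag field.name, options_for_select(field.choices)
    end
  end
end

Since it’s just a method that returns HTML, testing it with ActionView::TestCase is straightforward.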

Overall, Chris and I had some interesting things we discussed in terms of application architecture – it often boiled down to how Rails interprets the Model-View-Controller pattern (which isn’t everybody’s cup of tea) and the classic problem of having too much logic in your (procedural) views.

If you’re interested, he twitters as @flooose.

Alejandro Cadavid

Alejandro had a couple of specific questions. The first was a “classic”. Consider this code:

# products_controller.rb
def create
  @product = Product.new(params[:product])
  brand = Brand.find_by_name(params[:brand_name])
  brand ||= Brand.create(:name => params[:brand_name])

  # ...
end

While it’s fairly clear what this code does, it’s not ideal. First of all, Rails has the find_or_create_by shortcut method to do these two things in one go. Second, the better solution in my opinion would be to use a virtual attribute:

# product.rb
def brand_name
  brand.try(:name)
end

def brand_name=(brand_name)
  self.brand = Brand.find_or_create_by_name(brand_name)
end

With that code you can just treat brand_name as another attribute of the product model in the form – no additional code in the controller is needed. If you’re unfamiliar with virtual attributes, I suggest you check out the Railscasts on that topic.
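
In the form, this boils down to a single extra field (assuming a standard form_for builder):

<%= form_for @product do |f| %>
  <%= f.text_field :name %>
  <%= f.text_field :brand_name %> <%# the virtual attribute %>
  <%= f.submit %>
<% end %>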

Another cool thing that we discussed was the use case where you use the same form to generate different models – for example with Single Table Inheritance or nested forms that contain an object from a Polymorphic Association. Alejandro had something like this:

<%# view %>
<%= select_tag 'type', options_for_select('Audio' => 'audio_file', 'Video' => 'video_file', 'Text' => 'text_file') %>
<fieldset id="fields_for_audio_file" style="display:none;">
  <%= file_field_tag 'audio_file[file]' %>
  ...
</fieldset>
# controller
case params[:type]
when 'audio_file'
  @file = AudioFile.new(params[:audio_file])
when 'video_file'
  @file = VideoFile.new(params[:video_file])
when 'text_file'
  @file = TextFile.new(params[:text_file])
end

The relevant fieldset is toggled via JavaScript when the select field changes. Thanks to the Convention over Configuration naming that Alejandro had already intuitively used, we could cut the controller code down to something like this:

valid_types = %w(audio_file video_file text_file) # or something like AbstractFile.subclasses.map { |klass| klass.name.underscore }
if valid_types.include?(params[:type])
  klass      = params[:type].classify.constantize
  attributes = params[params[:type]]
  @file      = klass.new(attributes)
end

Note the explicit whitelisting of allowed types to avoid both errors and misuse. The code hasn’t become that much shorter in terms of lines of code (unless you have many different types) but in my opinion it’s clearer, more flexible and more extensible.

There might have been something else that we’ve discussed but if so, both Alejandro and I forgot. ;-)

Alejandro twitters at @acadavid.

Chris Berkhout

Chris had a question regarding caching and one regarding screen scraping and how to test it.

Chris had a KML file to render an overlay on a Google Map. There’d be one KML file for all users, and it changes only once per day. The first option in this case would be to use Rails’ page caching: Cache the page when it’s accessed so it’s persisted on disk, then use a cron job to remove it as often as you want or need. A second – more flexible but less performant – approach is to use fragment caching with dynamic, self-invalidating cache keys. Consider this:

# show.kml.builder
cache("#{Date.current.strftime('%Y/%m/%d')}/overlay") do
  # ...
end

As soon as the date changes, you’ll have a cache miss and a new fragment is cached. There’s more to this approach – I’ll try to publish a blog post on it soon because I think it’s worth sharing.
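
For comparison, the page caching option mentioned above might look roughly like this (controller name and paths are hypothetical):

# overlays_controller.rb
class OverlaysController < ApplicationController
  caches_page :show
end

# crontab – expire the cached KML file once per day
0 0 * * * rm -f /path/to/app/public/overlays/show.kml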

Regarding screen scraping, I had two hints for Chris and now for you:

  • If you only do simple scraping, you can use open-uri together with a simple HTML/XML library like REXML, Hpricot or Nokogiri (depending on your preference and/or performance needs). Ilya Grigorik has written a cool tutorial for using open-uri and Hpricot to scrape pages.
  • For more complex projects (and/or if you prefer more declarative code), consider using a dedicated screen scraping library like scRUBYt! or scrAPI or even writing your own (most libraries I’ve seen don’t seem to be actively maintained). I’ve worked with scrAPI in a pet project and it works reasonably well.

Both are problems he ran into when looking at mapping travel safety advice. The source code is available on GitHub and deployed to CanGo.to.

Chris then showed me (awesome! – shows that my idea goes both ways!) some stuff regarding server administration and deployment. Definitely check out babushka which seems to be a great tool for quickly setting up test/demo servers (for production, I still prefer another tool: the good old system administrator ;-)). The author’s and Chris’ own deps can both be found on GitHub. Chris’ rake-based deployment solution called RAG Deploy is also on GitHub. It seems to have some features similar to Capistrano as well as more direct version control integration (think Heroku). To be honest, I’m not sure what to think about it but I think it’s great that people are constantly working on making the tedious process of deployment easier.

Here’s what Chris said:

Thanks for taking the time to chat and look at some code. There are a lot of great resources out there, but having a real person give some advice and talk through the issues is always helpful, and with you it was a pleasure!

Overall, I was very pleased with how the sessions turned out – which is, of course, why I’m continuing the offer.

Code with me!

2011-04-23 00:00:00 +0000

This is the original post. There is [a newer one](/2011/05/03/code-with-me-continued) that explains how I plan to continue with the project. Some summaries of the first couple of sessions are also included.

2011 has been a good year for me so far – financially as well as in terms of cool work in general. I feel like it’s time to share some of that luck with others and spend a couple of hours per week working pro bono, helping people.

Back in 2006 when I was starting out with Ruby and Rails (and, for that matter, back in 2001 when I started learning PHP), I wished I had someone who would just take the time to sit down and code a little with me: show me some tricks, help when I’d get stuck etc. Yes, there are forums (like ruby-forum.com and the WorkingWithRails forums), but like others I’ve never quite warmed up to forums – at least not in a support and learning context. Nothing’s better than someone with more or equal experience sitting down with you and working with you – in programming terms, that’s pair programming. I’ve never learned more (or at least felt that I’ve learned more) than during the couple of months pairing with Sven Fuchs who – apart from being an awesome person in general – was the greatest guy I could have wished to pair with. Recently, I’ve also discovered that I tend to learn a lot when explaining stuff to others.

So that’s what I want to do: Pair with you for some time, review/refactor code, write some tests, answer any questions … In short: Whatever you want or need. For free.

Here’s what I’m thinking:

  • I want to spend 1.5-2 hours twice a week (depending on my workload) on this.
  • It will (mostly) take place in the evening – that is, after 5pm Central European Time (UTC+2).
  • I’ll announce dates and times a couple of days before.
  • We can use Skype or iChat for talking.
  • If you want me to code with you rather than just talk, I’ll use TeamViewer to connect to your computer. Please make sure that TeamViewer is installed and functioning (Firewall etc.) and that your connection is strong enough to support voice + screensharing with acceptable quality. You also need a working Ruby environment.
  • If you want to work on code for your company, please double-check with your superior that this is okay. I assure you that you can trust me not to steal ideas and/or code from your company but I won’t sign NDAs for this.
  • It would be great to get some written feedback – both positive and critical – from you afterwards per email or published on your blog (if you have one). You don’t have to but I’d certainly appreciate it.
  • If you’re interested, send me an email at “clemens at thisdomain”. Please include some relevant details about you (Ruby/Rails/general programming experience), how you would like to spend the time (topic(s) to talk about, project information etc.) and which date(s) and time(s) would be good for you.

The first date, if someone wants to do that despite short notice, is Monday, 25 April between 5pm and 9pm UTC+2. I’ll announce the second date in the next couple of days, but it’ll probably be on Sunday, 1 May between 4pm and 8pm – so make a kind of “advance reservation” if you want. ;-)

Get in touch and/or spread the word if you like this!

Trend Whoring

2010-07-18 00:00:00 +0000

This will be my shortest blog post ever because I basically only have one thing to get off my chest:

A good programmer is someone who embraces the advent of new technology as an opportunity to learn something new rather than to replace something existent just because the new thing is oh-so shiny and hip.

Trend whoring is never trendy – what is trendy is critically evaluating new technology and using it where appropriate.

That is all. Now go back to whatever you were doing.

Facebook Trouble (And What We Can Do About It)

2010-05-15 00:00:00 +0000

I’ve been registered at Facebook since my semester abroad in Australia in 2007 – mainly to keep in touch with a few people I met there. Truth is, I’m not much of a user and I never was. I post the occasional status/mood message, comment on some people’s messages/photos and there are probably ~20 photos of me online (that I know of). I’m a member of probably a dozen groups/fan pages and there’s no information on my profile that I wouldn’t tell a stranger on the street – I don’t think that knowing who my girlfriend is or what my date of birth is constitutes an intrusion in terms of privacy. And I don’t play FarmVille, MafiaWars and whatnot, and I have set pretty much all my privacy settings to “Only my friends”.

Still, I’m having trouble accepting Facebook’s latest changes in terms of privacy (btw. there’s a cool visualization of what has changed in Facebook’s default privacy settings since 2005). This whole connecting your profile to websites is just scary. Not so much because of myself but because of other people – people not having the right sensitivity regarding online privacy and/or who – due to their lack of technical skills – can’t navigate through Facebook’s privacy settings or properly understand them.

Calls for an Open Alternative

Some people are calling for a change – for example, Wired magazine posted an article calling for an open alternative to Facebook a couple of days ago and it seems that some people have already been working on fairly interesting stuff. There’s a distributed social network named Diaspora, for example, which looks promising and there’s also OneSocialWeb – both featured in another article on Wired magazine.

Do I think these projects (and there are certainly others in the making as well) are interesting? Hell yeah. Do I think they’ll be able to become serious competitors to Facebook? Not really. Reasoning? Two things stand in the way: people’s ignorance regarding privacy and general problems with Facebook, and people’s adamant cry for activities offering instant gratification.

Facebook Doesn’t Have a Lot of Features

Under the covers, Facebook really doesn’t offer a lot: I’d guess the core features could be implemented in a few weeks even with a small (but capable) development team. Even scaling shouldn’t be that much of a problem before reaching a certain (big) number of users.

The real USP lies in the Facebook platform, providing a possibility for companies to offer Facebook users their (mini) applications. There are lots of games and more or less funny tools, offering users a little distraction from their work/studying/whatever.

Facebook’s core features actually have become a by-product of its application platform: People would use the apps without the Wall, Poking or Photo Albums, but most likely not vice-versa. Looking at current Facebook statistics – though they may be forged – may confirm this: Apparently, 70% of all Facebook users engage with the platform regularly in some way.

As Long as People Like to be Insulted by Dr. Cox, (Open) Competitors Don’t Have a Stinkin’ Chance

It all boils down to a simple equation: People like instant gratification through applications offered through Facebook by companies. Companies are creating Facebook apps to get a hold of people’s personal data. Thus: If a platform doesn’t offer people’s personal data, companies won’t write applications for it and people won’t use it because there aren’t any apps.

So as long as people like to be insulted by Dr. Cox (of Scrubs) on a daily basis, be offered free advice by the Wise Nut, nourish their crops in FarmVille or rob their friends in MafiaWars, social networks wanting to compete with Facebook will have to offer something similar to their users. Since companies don’t develop free applications for shits and giggles, I don’t see that happening anytime soon.

So What Can We Do?

As I’ve outlined, I fear that Facebook not only is here to stay but also that competitors don’t really have a chance to attract a relevant market share any time soon. And I don’t see global online laws being established anytime soon – so Mark Zuckerberg and others will be able to continue executing their criminal plans (yes, I think some of the things Facebook does are criminal!). So all we can do for the time being is damage limitation and try avoiding collateral damage.

I think our duty as tech-savvy people is to inform and maybe educate our less tech-savvy friends and family about online privacy in general and Facebook’s privacy issues in particular. We can tell them to take down all profile information that isn’t absolutely necessary and not join stupid groups or fan pages anymore. We can inform them about alternatives – why not put pictures on Flickr, travels on Dopplr or TripIt etc.? We can take a look at emerging projects like Diaspora and OneSocialWeb, and in due time try convincing people to switch – even though there might not be FarmVille, MafiaWars or Which Vegetable Are You (I was broccoli, btw.).

And we can hope for Mark Zuckerberg taking it a notch too far, resulting in either conviction (he’d certainly deserve it) or huge amounts of people leaving Facebook for good (he’d deserve that even more).

UPDATE: Don’t forget to uncheck the checkbox at http://www.facebook.com/settings/?tab=privacy&section=applications&field=instant_personalization so at least Facebook’s partners don’t get access to your information as soon as you just visit their website coming from Facebook.

Inheritance, Pseudo Classes and Mixins Gone Awry

2010-05-03 00:00:00 +0000

Somebody asked me a few days ago to give them an opinion on how they approached implementing a little library dealing with SVG. While it seemed mostly okay, something struck me: Although there were multiple types of points/shapes, functionality was mostly shared using modules. That by itself isn’t bad – it’s only bad if inheritance would be the better, more natural choice.

Rubyists Love Modules

We Rubyists love modules (mixins) because they allow us to share code between non-related types. In contrast to other programming languages – Java being a prime example – Ruby modules have a major advantage: Not only do they let us share an interface but they also enable us to inject actual implementation. This comes at a cost: Because sharing implementation is so remarkably easy using modules, we tend, at times, to overlook when we should actually favor inheritance.

Mixins Gone Awry

Take a look at this (fake) example, based on the SVG stuff mentioned above:

module Drawable
  def draw
    # ... whatever ...
  end
end

class Triangle
  include Drawable
end

class Rectangle
  include Drawable
end

class Square
  include Drawable
end

One of the things to remember about object-oriented programming is that we’re trying to depict reality by using objects. While using mixins certainly is possible, we do have actual real-life inheritance here: Triangles, rectangles and squares are all shapes; and squares are, of course, special rectangles.

If there is a relationship in reality, then we shouldn’t destroy that relationship in OOP.

Detecting Real Inheritance

The ultimate helper to find out whether or not one should use inheritance is quite simple: You should use inheritance for is a / is an relationships. “Rectangle is a shape?” => Yep. “Square is a rectangle?” => Yep.
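
Rewritten with real inheritance (keeping the Drawable module for the shared drawing behavior), the example from above could look like this:

class Shape
  include Drawable
end

class Triangle < Shape
end

class Rectangle < Shape
end

class Square < Rectangle
end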

Unfortunately, we can’t really count on Ruby’s own is_a? method because it also takes into account modules:

square = Square.new
square.is_a?(Rectangle) # => true
square.is_a?(Drawable)  # => true (and it really *is* true)

Whenever such a real-world is a relationship doesn’t hold, you’re essentially doing fake inheritance. Some prime examples can be found in Rails: Inheriting from ActiveRecord::Base is wrong – at least in the object-oriented sense.

class Person < ActiveRecord::Base
end

clemens = Person.new
clemens.is_a?(Person)             # => true
clemens.is_a?(ActiveRecord::Base) # => true (but only in Ruby/Rails, not in reality!)

In reality, a person just is not an ActiveRecord::Base (what is that anyway?).

Other object mappers – DataMapper, MongoMapper, CouchPotato etc. – are doing it right:

class Person
  include CouchPotato::Persistence
end

While certain persons certainly are couch potatoes in the metaphorical sense, they aren’t for real (again: What the hell is a CouchPotato::Persistence?) – so there should not be any inheritance.

Pseudo Classes

Case in point: ActiveRecord::Bases don’t exist in reality – I’ve never seen one walking down the street. So ActiveRecord::Base, in my opinion, is a pseudo class.

If something doesn’t exist in reality, there shouldn’t be a class for it.

On the other hand, I’ve certainly seen persons walking about and I’ve drawn a couple of squares and triangles in my time. These most certainly are real classes.

Using mixins is just fine: Certain objects share behavior – like being persistable. However, you should name them accordingly. I like Java’s often-seen approach of suffixing its interfaces with -able: Runnable, Serializable etc. It can be found in the Rails world, too: Devise uses it extensively. So in my opinion, the better name for the above would be CouchPotato::Persistable.

Summary

When building a class hierarchy, try to think about reality: Are both classes real (non-pseudo) classes and is there a real, actual relationship between the classes in question? If so, use inheritance. If not, use mixins (and name them properly to reflect that they provide certain functionality). It’s as simple as that.

Named Scopes Are Dead

2010-03-09 00:00:00 +0000

**Clarification:** This post has been sitting unfinished on my disk for a couple of weeks but the update issues I ran into recently finally made me finish and publish it. I didn’t write this post to insult or bash Rails 3 or its contributors for the scope/arel bug I’m mentioning somewhere along the road. I’ve created a [ticket in the Rails Lighthouse project](https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/4142-arel-methods-in-scope-cause-issues-with-non-existent-tables) that addresses the issue. If you have any insights or ideas with regards to the issue, please head over to the ticket and participate in the discussion.

The bug mentioned in this post was fixed quite a while ago. Please read the updates.

About 2 years ago, Nick Kallen’s awesome has_finder plugin was brought to Rails in the form of named scopes. And I loved them dearly and used them a lot. In Rails 3, so far, I’ve found them to be obsolete – even harmful. Which is why I think they’re dead.

In this article, I’ll explain how named scopes were used in Rails 2, what was awesome about them and what wasn’t quite so awesome. Then I’ll take a look at what has changed in Rails 3 – mainly due to the advent of arel – and explain why you should think twice before using scopes in Rails 3.

What Named Scopes Used To Be

Named scopes’ awesomeness mainly came from the possibility to pass them around as objects, stack them etc. without them being evaluated. They were only evaluated when you tried to use them like any other ActiveRecord collection – like calling an iterator on them. Folks call this lazy loading. Here’s an example:

class Post < ActiveRecord::Base
  named_scope :published, :conditions => { :published => true }
end

class PostsController < ApplicationController
  def index
    @posts = Post.published # ... no query here ...
    # @posts = Post.published.all would trigger a query
  end
end

# somewhere in the view:
@posts.each do |post| # here's where the query gets triggered
  # ...
end

One of the things this enabled was doing slightly more complex stuff in controllers without cluttering them up:

@posts = Post.published
@posts = @posts.by_author(@author) if @author
@posts = @posts.after(@start_date) if @start_date
# ...

Some people might argue that this stuff belongs into the model anyway (and I’m sure, we all remember having huge finder methods featuring the infamous with_scope method) but I say named scopes enabled a more balanced communication between the controller and its models.

But It’s Different Now

Different how? Well, thanks to arel, everything in Rails now is a scope of sorts anyway – a relation, to be more exact. A relation is only lazily loaded – just like a named scope used to be. Take a look:

# in the controller
@posts = Post.where(["created_at >= ?", Time.current]) # no query yet ...
# again, @posts = Post.where(["created_at >= ?", Time.current]).all would trigger a query

# in the view
@posts.each do |post| # here's where it triggers the query
  # ...
end

This is somewhat similar to what you could do in Rails 2 by using the scoped method:

@posts = Post.scoped(:conditions => ["created_at >= ?", Time.current])

So effectively, every relation stays unevaluated until you hit it with a non-arel method like each or all. Read more in Pratik’s blog post on the new ActiveRecord Query API – no need for me to go into too much detail with him explaining it so well.

If you want to read more about relational algebra – the basic mathematical rules that arel is based on – there’s a pretty good wikipedia article on it.

Where’s The Harm?

You might argue that using named_scope (or, as it is in Rails 3, just scope) makes your models a bit more declarative and I tend to agree: It certainly looks good to expose most of your models’ APIs in class level statements like scope, has_many etc. and put them all on top. But let’s look at the downsides.

For starters, I’ve never quite liked the syntax to define dynamic scopes using lambdas:

scope :by_author, lambda { |author|
  {
    :conditions => { :author_id => author.id }
  }
}

There’s just too much punctuation flying around for my taste, and you can’t use the do/end syntax unless you parenthesize the whole lambda, thanks to operator precedence. If the need for optional parameters arises, things can get really messy. Even more so since Ruby 1.9 changes some things about lambdas – if you develop in both 1.8 and 1.9, that can really wreck your brain.

But the real harm happens when you decide to ditch the hash syntax in favor of arel:

scope :published, where(:published => true)

Unknowingly, you might have just created an update blocker for your co-developers! Scopes – like associations – are evaluated when a class is loaded and arel checks that the underlying table (in this case posts) exists. Now imagine you check the new blog feature into your git repository, including the migration that actually creates the posts table. A co-developer tries to migrate and might just get the following:

~/tmp/scopes% rake db:migrate
(in /Users/clemens/tmp/scopes)
rake aborted!
Could not find table 'posts'

This happens if – by accident or intentionally – the post model is loaded somewhere as part of the environment startup. This happens, for example, if you define an observer on the post model or extend some model in an initializer. To work around this, your co-developer would have to comment out the offending scopes, migrate and comment them back in again. Seems tedious, huh? And what for? A little more expressiveness? And don’t forget that your continuous integration server needs to migrate every once in a while, too, and can’t just comment out a few lines …

Class Methods To The Rescue

Thanks to arel’s greatness, you can avoid the hassle and just go back to using good old class methods:

class << self
  def published
    where(:published => true)
  end
end

Remember that relations stay unevaluated – and thus still chainable! – until they’re first used in a non-arel context.

This also helps to avoid the tedious lambda syntax that was needed for dynamic scopes:

class << self
  def by_author(author)
    where(:author_id => author.id)
  end
end

No problems with optional parameters or differences between Ruby versions either – methods are just methods, after all. So after all, there is just one syntax for one thing again.

Should I Ditch Scopes Completely?

I say, for any new work you do, you definitely should avoid using scopes because of the update blockers they might become. Just use class methods instead as shown above.

If you’re porting a Rails 2 application to Rails 3, you’ll get a whole load of deprecation warnings because named_scope has been renamed to be just scope. If you want to avoid these deprecation warnings, you have to work through your code anyway, so you might as well just go ahead and turn scopes into class methods.

The only exception is if you want or need to use scope extensions. However, I’ve somehow managed to never have to use them so maybe you can avoid them, too.

If you want to keep some of scopes’ declarative nature, consider just putting them in a module named Scopes and extending the class like so:

class Post < ActiveRecord::Base
  module Scopes
    def by_author(author)
      where(:author_id => author.id)
    end
  end
  extend Scopes
end

This way, you keep scopes somehow separate from the rest of your models’ code.

If you feel that you absolutely must use scopes, make sure you don’t use arel syntax for them and stick with the good old hash syntax for the time being. Keep in mind: Apparently, the hash syntax will be deprecated in Rails 3.1. But I guess Rails 3.1 won’t be out for some time so you should be fine for now.
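
That is, a scope you absolutely want to keep would stick to this style:

scope :published, :conditions => { :published => true }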

In short: Named scopes are dead – long live arel!

Update (9/3/2010)

Iain Hecker correctly notes that using class methods was already possible in Rails 2 by using the scoped method. Example:

class << self
  def by_author(author)
    scoped(:conditions => { :author_id => author.id })
  end
end

You can, again, use this to circumvent the awkward lambda syntax but with Rails 2’s hash syntax you don’t do any harm in terms of update blockers.

Update 2 (21/4/2011)

Since some people have mentioned this post lately, I feel I should clarify that the bug that triggered this issue was fixed quite a while ago. I’ve since mostly used scopes for “simple” rules and class methods for more complex rules because in my opinion it reads better.

Mainly, whether you use scopes or class methods has, again, become a matter of personal taste. My recommendation is to choose your own style or agree on one style with your fellow developers.

Avoiding Rails 3 Dependency Hell With RVM

2010-02-13 00:00:00 +0000

You’ve probably read by now that the Rails 3 beta was released last week. I have to admit that I didn’t actively follow Rails’ development over the last couple of months, so I was quite excited to finally be able to get my hands on the beta gem.

It turned out to not be so easy to get going with Rails 3 – hence this post. I want to show you how to get going with Rails 3 without the hassle that I had – by using Ruby Version Manager aka rvm.

Don’t Mess With Your Current Environment

If you’re like me, you’ll still have a couple of Rails 2 projects you need to maintain and it will probably stay that way for at least a couple of months. Thus, one of your priorities should be to preserve your environment’s – i.e. your gem setup’s – healthy state.

It turns out that RubyGems is quite a bitch when it comes to runtime dependency management – which is one of the main reasons why Rails 3 recommends using Bundler instead, at least for runtime dependency management. However, you still run into issues with multiple versions of the same gem – say, ActiveSupport – installed. Or at least I did. You know, these annoying errors like “Gem::LoadError: can’t activate activesupport (= 2.3.5, runtime), already activated activesupport-2.3.4” (see Yehuda’s blog post over on the EngineYard blog).

RVM to the Rescue

These issues gave me a more than valid reason to finally test out rvm. If you’ve read the post about how I cleaned up my system, you’ll know that I hitherto used Ruby as it ships with Snow Leopard (1.8.7p72), putting my gems in my home directory under a hidden .gem folder.

To have separate environments, I installed rvm together with the newest Ruby 1.8.7 and activated it.

  ~% rvm use 1.8.7
  ~% gem list
  *** LOCAL GEMS ***
  rake (0.8.7)

Bingo – no gems except for rake (which gets installed with rvm) installed in the current environment.

Install the Rails 3 Gems

As the introductory post suggests, the next step is to install Rails’ dependencies manually (you won’t have to do this when the final gem is released) and Rails itself:

  ~% gem install tzinfo builder memcache-client rack rack-test rack-mount erubis mail text-format thor bundler i18n
  ~% gem install rails --pre

You can verify that you have two properly separated environments by checking gem lists and, for example, the rails command:

  ~% rvm 1.8.7
  ~% which rails  
  /Users/clemens/.rvm/gems/ruby-1.8.7-p249/bin/rails

  ~% rvm system
  ~% which rails
  /Users/clemens/.gem/ruby/1.8/bin/rails

Neat.

Switching between Rails versions

rvm makes it really easy to manage your different workspaces. A simple rvm use [VERSION] will enable a given environment. So whenever you’re working with Rails 3, you just activate the Ruby version you installed it with. And if you’re maintaining your Rails 2 projects, you just go back to another Ruby (or, as in my case, your system’s original Ruby).

Another Option: rvm’s Gem Sets

When I suggested on Twitter that by using different Ruby versions on rvm you could circumvent dependency hell, Prem Sichanugrist replied that you could also use another rvm feature called gem sets. This basically allows you to have several different gem setups within the same Ruby installation. You can do this with a slightly altered rvm command:

  # instead of ...
  ~% rvm 1.8.7

  # ... you use ...
  ~% rvm 1.8.7%rails3

This gives you a gem set called rails3 where you can keep your Rails 3 related gems safely tucked away from the rest.

And that’s it – Rails 3 without too many dependency issues.

Update (13/2/2010)

rvm’s author Wayne Seguin just pointed out that it is even easier to get going with rvm and Rails 3. You can read more about it in this gist. Note that this uses Ruby 1.9.1 which might be a good idea, anyway – if you want to use Ruby 1.8.7, just replace the relevant lines in the gist.

Relaunch

2010-01-10 00:00:00 +0000

After using Typo CMS and one of its standard themes for nearly 2 years, I’ve finally sat down and come up with a new design. The result is not remarkable, but I like it way better than the old one - especially because it actually carries the railway logo.

While I was at it, I decided that I didn’t like my lack of blogging activity in 2009. This was mainly because I’ve been working like crazy but also due to the fact that I simply didn’t like Typo anymore. After reviewing some CMS solutions - obviously adva-cms was part of the review - I decided to go with the simplest possible solution and settled on Jekyll. It’s simple, ultra fast and it enables me to just deploy my website on GitHub - what more could I ask for?

Anyway, I promise you a more active blogging year 2010. I’ve already got some articles in the works that are mostly or even completely written and just need some final polishing. I’ll probably also add Disqus comments when I find the time and migrate the old comments over.

Towards a Really Clean System

2010-01-10 00:00:00 +0000

Some time ago, Yehuda Katz posted some ideas on how to keep his system and his code directories clean. When I had issues with my MacBook Pro lately (which was still on Leopard), I finally decided to just erase everything and start a clean Snow Leopard setup like I did with my iMac a couple of months ago.

How I Fucked it up - Big Time

I bought my MBP back in 2007 and it was my first Mac - and although I had used Linux for a while before and compiled some things every now and then, I didn’t know a whole lot about compiling, let alone about Mac OS X’s internals, and just followed tutorials. So, naturally, everything got a bit messy quite soon - and even messier after I stumbled across MacPorts and decided that it might suit me just fine. So I ended up with probably 3 different versions of MySQL, Ruby etc. - scattered in /Library, /usr/local and /opt.

Pretty much the same thing held true for my code, documents, etc.: Everything was all over the place and while it had some kind of structure, it certainly wasn’t organized all that well. Finally, when I started to do full-time freelancing, I became increasingly frustrated by the fact that my code was in some place and specifications, examples etc. that I had received from my client were somewhere else entirely.

Towards a Better Environment

A couple of weeks ago, I used my iMac as a kind of test environment for better organization. I did this because I use my MBP every day at work and can’t really risk having issues with it. The iMac, on the other hand, is my home computer and while I do use it for development at home, I still have my MBP as a fallback if something goes wrong.

I had a couple of iterations where I tried out different things and switched to the next idea whenever I was unhappy with the result of the last iteration. After some iterations, I think I’ve finally found a setup that works for me.

My Clean Setup

Disclaimer: My environment suits me - that’s why it’s my environment. If it doesn’t suit you, it might at least give you some ideas for your own perfect setup. However, I’m glad to hear some opinions on what I might want to change to further improve it!

Basic Setup

First of all, of course, I installed Snow Leopard. Since I had messed up my setup so badly, I decided to completely erase my disk (after doing a full Time Machine backup, of course) and set up a clean OS X.

Next, I needed Xcode. I decided to get the newest version online instead of installing the older one from the Snow Leopard DVD.

Finally, since I use the shell a lot, I needed some decent setup there as well. I use zsh like pretty much every (Rails) developer I know because it’s awesome. Taking a shortcut with configuration, I went with Ryan Bates’ dotfiles as they provide a good starting place.

Homebrew

Thanks to some recommendations via Twitter, I tried out Homebrew. Apparently, it’s the new cool thing in terms of package management - at least if you believe the hype. One of the things that really ate up disk space when I was still using MacPorts was that it always installed a real shitload of unnecessary stuff - mostly optional packages and stuff I already had installed by hand or via a DMG (and MacPorts doesn’t care about anything but itself). Homebrew significantly decreases this overhead and it’s really clean.

After first having it installed in /usr/local (just as the README suggests), I decided to move it to my home directory - $HOME/local, to be more exact, because I wanted it to be separated from the rest of my home folder’s contents. I decided against /usr/local not because I anticipated that someone else might use my laptop (I’d have to kill them) but rather because some intrusive pieces of software occasionally dump their stuff in /usr/local (e.g. the LaTeX packages for OS X). I just want to keep an eye on what I installed and be able to easily remove it without thinking about potentially breaking other software. Another plus is that I can leave /usr/local as being owned by root and still can install everything via Homebrew without sudo-ing.

Don’t forget to add $HOME/local/bin and $HOME/local/sbin to your path.

Gem setup

Another thing I needed to avoid sudo-ing was setting gems to be installed in my home directory. Pat Allan’s gist shows you how to do that. After that, it’s just gem install whatever to install any gem you might need - no sudo, thank you very much.

You should also add $HOME/.gem/ruby/1.8/bin to your path so binaries of any gems are executable.

Documents

Although I don’t really need to share my documents between my iMac and my MBP, I still want to - and I don’t want to do it manually. After toying a bit with symlinking my iDisk contents on both machines, I decided that it was too slow and - even more importantly - wasn’t suitable because it doesn’t keep local copies which would have basically ruled out offline usage - not good.

Thus, I decided to use Dropbox. I don’t need to sync a whole lot of stuff, so the free plan with 2GB is more than enough for me at the moment (by the way, if you sign up because you’re reading this, I’d appreciate it if you could use my referral link - thanks!). I then symlink folders where they belong - say, I have presentations, I’d keep all local-only presentations in Documents/Presentations and symlink Dropbox/Presentations to Documents/Presentations/Shared.

Code

As I mentioned earlier, I’d lost track of all my local repositories because they were just everywhere.

With my new setup, everything (with the exception of Homebrew) now sits in a hidden folder called .repositories in my home folder. Seeing what code I have on my machine is now just a matter of calling ls ~/.repositories.

Flat structures don’t scale very well, obviously, so I added some structure. I placed a Development folder in my home directory and created some suitable subfolders like Rails and Ruby. In these, there are subfolders like Plugins and similar which contain symlinked directories that point to their respective repository in .repositories.

Projects

I already indicated that one of the things I didn’t like about my old setup was that a project’s code was in one place and specs, documents and other non-code stuff was somewhere in my Documents folder.

Now I have a Projects folder in my home directory that contains subdirectories for clients which in turn contain subdirectories for every single project. These projects then usually contain two symlinked folders, Code (pointing to some folder in Development) and Documents (pointing, obviously, to some subfolder of my main Documents folder). You can take the whole thing to the extreme and drag all your clients’ emails, relevant weblinks etc. to their respective project’s Documents folder and really have everything in one place.

I haven’t done this yet, but obviously it would be easy to write a little shell script or alias that would enable you to just type something like project PROJECT_NAME which would then open the Documents folder in the Finder and the Code folder in TextMate.
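
For instance, a tiny Ruby script along these lines would do (the flat folder layout and the command names are assumptions – adjust to your own structure):

#!/usr/bin/env ruby
# usage: project PROJECT_NAME
name = ARGV.first or abort 'usage: project PROJECT_NAME'
project = File.expand_path(File.join('~/Projects', name))
abort "Unknown project: #{name}" unless File.directory?(project)
system 'open', File.join(project, 'Documents') # Documents folder in the Finder
system 'mate', File.join(project, 'Code')      # Code folder in TextMate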

Miscellaneous

Speaking of TextMate: I don’t like shell scripts like mate or gitx lurking around in /usr/local, so I created a bin folder in my home directory, moved these scripts there and added $HOME/bin to the path.

If you use Passenger for development, you can keep your Apache’s httpd.conf clean by keeping the Passenger stuff in a separate file in /private/etc/apache2/other/passenger.conf. The Passenger Preference Pane does this too by keeping the vhost configurations in /private/etc/apache2/passenger_pane_vhosts.

The New Workflows

Installing packages:

  brew install PACKAGE_NAME

Installing a gem:

  gem install GEM_NAME

Adding a new code repository:

  git clone REPOSITORY_URL ~/.repositories/REPOSITORY_NAME
  ln -s ~/.repositories/REPOSITORY_NAME ~/Development/some/path/REPOSITORY_NAME

Adding a new project (after setting up the repository):

  mkdir ~/Projects/some/path/PROJECT_NAME
  ln -s ~/Development/some/path/REPOSITORY_NAME ~/Projects/some/path/PROJECT_NAME/Code
  ln -s ~/Documents/some/path/PROJECT_NAME ~/Projects/some/path/PROJECT_NAME/Documents

Sharing documents across my computers:

  cp /some/folder/or/file ~/Dropbox/Documents/some/folder/or/file
  # if it belongs to some existing folder:
  ln -s ~/Dropbox/Documents/some/folder/or/file ~/Documents/some/folder/Shared

Where Does it Go From Here?

The possibilities are endless. You could, for example,

  • create shell scripts and aliases that facilitate either of the tasks
  • symlink the logs of applications to their project’s folder so that you can view them more easily
  • buy more Dropbox space and also use it for music, movies, IRC transcripts, …
  • etc.

Personally, I’m happy with my setup for now. I can’t say that I’ll always be happy with it but at the moment it seems to be exactly what I want.

Update (13/2/2010)

As I’ve run into a real dependency hell while toying with Rails 3, I recently installed Ruby Version Manager (rvm). I can now warmly recommend using this awesome piece of software if you have (or want) to juggle with multiple Ruby versions and (gem) environments. You can read some details of what made me install it in my post on Avoiding Rails 3 Dependency Hell With RVM.

Using Cucumber to test a multilingual app

2009-09-12 00:00:00 +0000

Last week I gave my i18n talk from Rails Underground (slightly updated) at our local Ruby Usergroup Berlin (RUG-B). After the talk, a question came up that raised quite a bit of discussion: How do you test translations in Cucumber or - in general - in integration tests? While Henning has already summarized what was said, I want to take the opportunity to take a closer look at this issue.

The Problem

To illustrate the problem, let’s look at the example that Henning uses in his blog post:

When I go to my bookings page
And I press "Delete booking"
Then I should see "Booking deleted"

So far, nothing special - pretty much your average Cucumber scenario. Assume that you have a multilingual app - say, English and German - so you have extracted all localized strings with some kind of translation API (hopefully Rails i18n ;-)). Looking a bit closer, the problem here is twofold:

  1. You can’t change the English translations for “Delete booking” and “Booking deleted” (and there’s probably a couple of hundred more strings in your app anyway).
  2. You can’t run your Cucumber stories with any other locale but English (but you should!).

Some people might dismiss the first issue because this also holds true for monolingual apps (especially if Cucumber stories are written by developers and texts come from the marketing department or similar) but the latter is a real and undeniable issue: Not being able to run your stories in all provided locales means that you can’t prove that your app actually works for all of them.

While Cucumber provides us with the possibility to write our tests in our language of choice, there doesn’t seem to be a way to cleanly test multilingual apps across all their locales (yet). So let’s look at some potential solutions that were discussed after my talk.

Solution #1 - Using a separate testing locale

One of the first solutions we discussed was using a separate locale for testing (think en-testing instead of just en). Advantages: You’re still able to use good old strings in your scenarios and nothing breaks when translations in the “real” locales are changed. Disadvantages: Still no way to run your features across different locales and you’ll still break your tests if you change your keys. Plus you have to maintain an additional (unnecessary) locale.
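
Wiring this up would be trivial - something along these lines in a support file (the locale name is, of course, made up):

# features/support/env.rb
Before do
  I18n.locale = :'en-testing'
end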

In short: From my perspective, the only acceptable scenario for using this approach could be a monolingual app that still uses i18n so that translators don’t have to mess with your files directly - but even then, it’s hard to justify the additional effort of maintaining the otherwise useless testing locale.

Solution #2 - Use i18n keys instead of strings

Nico suggested the following approach to circumvent the problem:

When I go to my bookings page
And I press "bookings.index.delete"
Then I should see "bookings.index.deleted"

The corresponding steps would look something like this:

When /^I press "([^\"]*)"$/ do |key|
  click_button(I18n.t(key)) # or even I18n.t(key, :default => key) if you want to be able to use the key itself as the default
end

Then /^I should see "([^\"]*)"$/ do |key|
  response.should contain(I18n.t(key))
end

This has two obvious advantages: First of all, it works across different locales. Moreover, you only ever need to touch your scenarios if you change your i18n key structure (and if you need to do this, you probably have way bigger problems on your plate than fixing a couple dozen steps).

However, apart from my personal distaste - I think it just looks artificial and weird - this effectively takes away one of Cucumber’s biggest advantages over other integration testing approaches: its natural language look. This also means it gets more difficult to discuss your scenarios with non-technical people (“What the heck does bookings.index.deleted mean?”). Without that, why not just use plain Rails Integration Tests with Webrat?

If you’re fine with the (subjective) ugliness of using keys in your scenarios, I’d only recommend using this approach if you use Cucumber exclusively as a means to integration testing and if it’s not an important part of the communication with your (non-technical) customer.

Solution #3 - Use your old friend, the DOM

Finally, let’s take a look at the approach I suggested: using the DOM instead of relying on strings. What I had in mind actually slightly differs from the example Henning uses in his post - I thought of something like the following:

When I go to my bookings page
And I click the delete button for booking with title "My booking"
Then I should see a confirmation message

The corresponding steps would look like this:

When /^I click the (.+) button for (.+) with (.+) "([^\"]+)"$/ do |action, model, field, value|
  model = model.gsub(' ', '_')                    # so "forum topic" becomes "forum_topic"
  field = field.gsub(' ', '_')                    # same same
  klass = model.classify.constantize
  object = klass.send("find_by_#{field}", value)

  click_button("#{action}_#{model}_#{object.id}") # e.g. delete_booking_1
end

Then /^I should see a (.+) message$/ do |type|
  response.should have_tag("#flash.#{type}")
end

Note: I haven’t tested this code.

We do something similar in adva-cms (which is, I would guess, one of the bigger i18n-powered apps out there) - albeit without Cucumber; we use good old Rails Integration Tests. After first also relying on labels and texts, we pretty soon figured out that we broke our build whenever we decided to change some small and seemingly insignificant text (imagine changing “Upload »” to “Upload” because you use an additional image instead of »). After quite a bit of (heated) discussion, we decided to sacrifice test readability for robustness and switched to (mostly) using DOM ids instead of texts/labels. The problem with decreased readability doesn’t really apply to Cucumber, though, since it abstracts away the actual stories from the code that implements their steps.

Here’s what I see as advantages:

  • The Cucumber scenarios still read naturally. I would even argue that saying that I “click the delete button” is more expressive than saying that I “follow/press Delete”.
  • No arguments with the customer about wording at this point (“The message should say ‘We are sorry that you cancelled your booking’ instead of ‘Booking deleted’, that’s just too blunt.”) - it’s just a confirmation message (whatever that may mean).
  • Relies on the DOM which - at least in terms of CSS ids and classes - isn’t likely to change quite as easily as translations. This holds true even more if your app relies heavily on JavaScript/AJAX because changing your DOM then also means changing your JavaScript (and who wants to do that, really?). Plus, in my opinion you get improved markup for free.
  • The only string the steps depend on is, in this case, the booking’s title which is most probably set up in some kind of fixture or scenario Background and therefore not likely to change.
  • With a little tweaking and sticking to some conventions of your choice, this should work flawlessly across models …
  • … and more importantly across locales. Let the translators do their harm, err, work without having to worry about them breaking your tests. And, of course, you’re able to run your stories in any locale.

Disadvantages:

  • Obviously doesn’t validate that the delete button and the confirmation message actually contain texts. On the other hand, you can set up i18n so that it raises an exception if it can’t find a translation.
  • Of course, this diversion from the “Cucumber way” is debatable and definitely a matter of taste.
  • Some steps might need some additional logic and therefore be a little more brittle than they probably should be.
  • Most Cucumber steps have to either be tweaked or be completely rewritten.

So, if you are keen on getting maximum test robustness in terms of i18n and you are willing to slightly change your usual way of using Cucumber, I think this (or something based on this) definitely is the way to go.
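
To actually exercise all locales, you could combine this approach with a tiny hook and run Cucumber once per locale - a sketch (the CUCUMBER_LOCALE variable is my own convention):

# features/support/env.rb
Before do
  I18n.locale = ENV['CUCUMBER_LOCALE'] || I18n.default_locale
end

# on the command line: CUCUMBER_LOCALE=de cucumber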

What Do You Think?

I’m curious what you think - even more than usual because i18n is one of my favorite topics. I’m especially keen on getting some thoughts from people who might have already come up with solutions for the mentioned problem - please do share your insights!

I'll be speaking at Rails Underground

2009-06-02 00:00:00 +0000

A couple of days ago, I received the confirmation that I’ll be speaking about Rails i18n at Rails Underground in London. I’m really looking forward to this since it will be my first big speaking gig.

I'm speaking at Rails Underground!

If you have any suggestions on what I could/should include in my talk (I’ve got 45 minutes), please feel free to leave a comment or write me an e-mail.

Looking forward to seeing you in London!

New plugin: delocalize

2009-05-03 00:00:00 +0000

It’s been a long time (again) since the last blog post - lazy me, I know. But I’ve been crazily busy over the last couple of weeks: The move into my new flat still isn’t quite finished and there are quite a few projects I’ve been working on, most notably adva-cms which we pushed to version 0.2 last week, including a 100% rework of the whole admin interface. Since there are many people asking questions on the mailing list and we still don’t have any real documentation, you should check back over the next few weeks as I’ll try to do some hands-on tutorials on adva-cms.

In the meantime let me announce my latest plugin effort: delocalize.

Here’s a rundown of how delocalize came into being.

A customer wanted a simple JavaScript datepicker for their website. While I can see why some people might like them, personally, I’m not too fond of them - especially in a localized context. Some datepickers offer possibilities to localize dates in the sense that you can translate month and day names - however, they still output dates in some good old English format including English names which is, in my opinion, a usability nightmare, but it’s the only way it can be properly parsed. Some are even worse: They use English formats as their output but use localized month and day names - thus not only sucking in terms of usability but also nigh impossible to parse. A minority of more decent solutions - like the jQuery UI Datepicker - do however support displaying a different format from the one that’s actually submitted behind the scenes. While this is a more than decent solution, I still think that localized date parsing should also be supported on application level.

Plus, there’s another issue to tackle (also in the same customer project): Parsing of localized numbers. Not all countries use the decimal point as the decimal separator - in fact, all of Europe except the UK and all Latin American countries except Mexico use the decimal comma (see the Wikipedia map).

So far, what I used to do - and I’m sure you did pretty much the same - was to implement some custom attribute writer to handle the parsing:

def price=(price)
  write_attribute(:price, price.gsub(/[^0-9,]/, '').gsub(',', '.'))
end

def publish_at=(publish_at)
  # ...
  # some more or less complex parsing logic
  # ...
  write_attribute(:publish_at, publish_at)
end

And, of course, if you wanted your forms to show localized output, you had to play the same game in reverse. No more: Use delocalize instead!
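
To give you an idea of what that looks like in practice, here’s a sketch with made-up attributes (see the README for real usage):

I18n.locale = :de
product = Product.new(:price => '1.299,99', :released_on => '31. Dezember 2009')
product.price # => 1299.99 - parsed using the German decimal comma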

Below the Surface

delocalize mainly consists of a parser for localized dates and times and a few Ruby and Rails hacks. I’ve tried to implement everything at the lowest possible level to minimize side effects. Here are the hacks:

  • I use alias_method_chain on ActiveRecord’s convert_number_column_value which is called internally in write_attribute for all numeric columns. In this method, I do pretty much the same as I used to do in attribute writers: Replace everything except numbers and the decimal separator (this time, of course, looked up from the locale) and then replace the decimal separator with a decimal point. (There’s a rough sketch of this after the list.)
  • The date/time hack consists of two parts: I alias_method_chain write_attribute to branch out for dates and times and call the parser from there. Unfortunately, thanks to Rails’ time zone implementation, I also need a rather icky hack to parse time zone aware dates and times as the time zone parsing takes place before the actual call to write_attribute.
  • Finally, I hacked ActionView’s (or InstanceTag’s, to be more specific) to_input_field_tag which is responsible for building text fields through a form builder. All this does is reverse the whole process, i.e. localize numbers and dates again.
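
Here’s that rough sketch of the number part (simplified, and not delocalize’s actual code):

ActiveRecord::Base.class_eval do
  def convert_number_column_value_with_delocalizing(value)
    if value.is_a?(String)
      separator = I18n.t(:'number.format.separator') # e.g. "," for :de
      value = value.gsub(/[^0-9#{Regexp.escape(separator)}-]/, '').gsub(separator, '.')
    end
    convert_number_column_value_without_delocalizing(value)
  end
  alias_method_chain :convert_number_column_value, :delocalizing
end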

Information on how to use delocalize can be found in the GitHub repository where you can, of course, also download it.

Feedback

This is still highly experimental stuff and I don’t know all the facets of every exotic language and its special date and number rules. If delocalize doesn’t support some of your locale’s specific requirements, please let me know - ideally with a specific test case and (even better) a working patch. Also note that I’m grateful for every hint as to how I might be able to get rid of the ugly time zone-related hack I had to use.

In the meantime: Test it and let me know what you think. Also check out my other i18n plugins localized_dates and LaterDude. And if you like my work and/or my blog, please subscribe and consider recommending me on WorkingWithRails.

Don't Let Implementation Issues Affect User Experience

2009-02-07 00:00:00 +0000

Gregg Pollack of RailsEnvy recently started a series on Scaling Rails over at NewRelic’s RailsLab. Although I highly recommend watching these screencasts, Gregg says something in episode #5 that got my attention: He says that login and logout as well as some information about the login status (the omnipresent “Logged in as Clemens”) are overrated.

… Or Are They?

Although I often tend to agree with Gregg I think here’s a subject that’s a bit touchy with me - or rather, it scratches an itch that I’ve been scratching a lot over the last few months and let me tell you - it’s already pretty sore. But let me explain.

Login/logout links/buttons as well as a certain type of login status are used to indicate to the user whether or not they are logged in to the site. And if they aren’t then the site usually provides some kind of guidance on how to log in or create a new account plus an incentive why a user should sign up.

Now if we are fortunate enough to create a site for a certain geeky clientele - say, GitHub - we can safely assume that we’re dealing with users who can be trusted to have a certain level of experience with the Internet in general and websites (with account functionality) in particular. However, this type of homogeneous target group cannot always safely be assumed. We live in a time where Internet usage increases by the day and not only do we get children and adolescents looking at our websites, we also get middle-aged users. And these middle-aged users might have grown up without ever touching a computer and the Internet and - even worse - might be slightly inept and untalented, too.

Fear Of The Unknown

Now imagine for a second that you’ve been using the Internet for a couple of days and you come to a website that has a link “Your Account”. I guess I’d close the site as soon as I saw this text. Why? With all the paranoia created by the media about Internet security - phishing, dubious e-mails, trojans and the like - I’d be hesitant to trust a site that suggests I already have an account before I’ve even signed up for one.

It’s just about as frightening as that hoax years back that allowed you to list folders from your local hard drive in Internet Explorer (“Your computer isn’t safe. Look, we can easily access your hard drive”) - the main difference is, we as developers usually don’t have these bad intentions, but sometimes we’re just a bit ignorant about our users’ expectations. And we forget that it’s only human to have a certain fear of the unknown. I, as an experienced user, know (or at least assume) that if I click on that link without being logged in and/or actually owning an account, I’ll probably be redirected to a login prompt or a registration form. The inexperienced user, however, is more likely to see a threat, especially if they are on an e-commerce site and maybe even trying online shopping for the first time.

If You Are Uncertain, Always Expect The Worst

Often times, we developers just don’t know enough about the future users of our apps - maybe we don’t even know anything at all. The key then is to expect the worst: 100% of our future users are going to be complete idiots who are inept, inexperienced and paranoid all at the same time. It is our job (and the designer’s and the usability expert’s) to build a comfy website for them that explains everything that might seem totally superfluous from our point of view. Believe me, I’ve been at the point where I clearly labeled a button “Buy Now” and had to come back after some customer feedback and add an info text that said “Click on the ‘Buy Now’ button to buy the products you selected” because some users wouldn’t understand how to finalize their order.

Page Caching With Dynamic Content

Gregg talked about these issues when dealing with page caching. If you’ve tried to efficiently implement page caching for a site with a certain amount of dynamic features I bet you’ve run into all kinds of issues because it’s just really damn hard - and pretty much impossible if you also want your page to degrade gracefully for users with JavaScript and/or Cookies blocked.

One way to get around these problems is to remove everything dynamic from the layout and consolidate dynamic stuff in selected places that aren’t page cached, such as a dashboard. As I mentioned above, this might be OK for experienced users but as soon as you’re dealing with a heterogeneous group of users, you just shouldn’t do it.

It’s Not Fair To Ask The Value Question If You’re Dealing With Implementation Details

This is probably the place where my pal Sven Fuchs would chime in and say we’d have to ask the customer if they’d rather reduce their server load and use page caching or have their highly dynamic and gracefully degrading interface. To be honest, I think this question just isn’t fair to the client because even if we’re talking to a technical or semi-technical client, we should refrain from bothering them with implementation details.

Usually - if the design is developed first - a certain layout will have been agreed upon. With good reason: It’s the designer’s suggestion (and the designer hopefully has at least some experience when it comes to usability) that integrates what the client wants. It is then our job as developers not to point out to the client how they could reduce server load, improve performance and save a few bucks a month - instead, it’s our job to implement the design we got as efficiently as possible. If this, due to a heavily dynamic interface, rules out page caching and forces us to fall back on fragment caching, so be it.

You Are Allowed To Criticize

That doesn’t mean, of course, that we have to accept every design that is given to us without taking a moment to consider the options and maybe give the client and/or the designer some feedback. For example, there might be some kind of information website that really only has a forum when it comes to dynamic features but for some weird reason the designer has put the “Logged in as Clemens” stuff in every layout which effectively denies the use of page caching for areas of the site where it would actually make sense without affecting the user experience. This is where we should raise discussion. But again, avoid the implementation details and instead talk about leaving out that right menu for 90% of the pages and having more space for the actual content because this is a point the customer understands independent of their technical knowledge.

That being said, I still very much recommend watching Gregg’s screencasts and subscribing to the RSS feed - there are some really interesting topics coming up and Gregg is just a very good teacher.

Personal news & plugin release

2009-01-15 00:00:00 +0000

Personal news

It’s been a little quiet here over the last few months and there are two reasons for that: First of all, I was crazy busy finishing my diploma thesis and studying for my final exam at MCI. But now, after 4 years of (partially) hard work, I’ve bagged my diploma and I’m done with uni (at least for now). Second, and perhaps more importantly, I’ve prepared my relocation to Berlin, Germany. Some of you who follow me on Twitter and/or GitHub might have seen that I’ve been contributing actively to adva-cms. Now I’ve been offered a job in Berlin to work full-time on adva-cms with Sven Fuchs and others - and accepted it.

LaterDude

Now for the actual topic of this post. Yesterday, I pushed a new plugin to GitHub: LaterDude. LaterDude is a small calendar helper plugin for Rails with i18n support - which was exactly what we needed for adva-cms or, more specifically, the adva_calendar engine. It was heavily inspired by Jeremy Voorhis’/Geoffrey Grosenbach’s calendar_helper plugin, although - apart, of course, from the i18n support - there are some subtle differences.

I’ve written a pretty extensive README including examples which can be found in the LaterDude GitHub repo.

LaterDude is still work in progress so I’d appreciate any feedback or comments that you might have.

Memo-what? - A Guide to Memoization

2008-09-20 00:00:00 +0000

Today, I’m going to talk about a feature that’s not all new in Rails but it’s still got the smell of a freshly unwrapped present: Memoization. I’d like to thank Josh Peek from the Rails Core team for giving feedback on the article before I published it - thanks, mate!

It Really Just Means Caching the Result

The concept behind it is actually fairly easy to grasp: Instead of calling the same (possibly expensive to run) method that will return the same result over and over again, why not just store the result of the first method call and re-use it afterwards? Ryan Bates often refers to this as caching the result in his Railscasts. You can also read a more general description of the memoization pattern over at Reference.com.

Let’s take a look at a simple example. We have a City class that has a zipcode and a name. We’ll usually use both of these in our application, so we might write a method that looks like this:

def zipcode_and_name
  "#{zipcode} #{name}"
end

Memoize Away!

This is perfectly valid, however, every time we call this method during one request it always re-constructs the resulting string. So let’s memoize the call in the classic way that you might have already stumbled across:

def zipcode_and_name
  @zipcode_and_name ||= "#{zipcode} #{name}"
end

The ||= operator is a so-called conditional assignment that only assigns the value on its right to the variable on its left if the variable doesn’t already hold a value that evaluates to true. Since pretty much everything in Ruby evaluates to true when coerced into a boolean (the only exceptions being nil and, of course, false), this is exactly what we want: The first call returns the actual string and from the second time on, no matter how often the method is called on the same object, the “cached” result will be used.

A little note on naming here: Some people seem to prefer prefixing the memoizing variable’s name with an underscore to indicate that it’s not meant to be used as an actual instance variable. To be honest, I don’t think this is really necessary unless you define a whole bunch of instance variables and memoized variables.

Memoizable - The Macro-Style Way of Doing Memoization

On July 15, 2008 Josh Peek added the Memoizable module to Rails core. Originally, this was only meant to be used internally in Rails to prevent some annoying errors when manually memoizing methods in frozen objects. Here’s how Josh Peek explains the reasons for including the Memoizable module:

The initial motivation was to fix an issue with freezing memoizable methods. If you do it the old way, and freeze the object before calling the method first, the method will complain that it can’t modify the instance var. The fix was to call all the memoized methods on freeze.

I found this pattern really useful for creating a class that could be lazy loaded or eager loaded. So you write your classes up in a lazy load style, memoize the methods with the helper, and - if you need to eager load - use freeze to transition the object to a final fully cached state. I’m not sure if this is for everyone, but it’s been really useful internally in Rails.

Although it was meant for internal purposes, the inclusion of this module initially kick-started some discussion as to whether the responsibilities for memoization should be taken away from the programmer by putting it in a module. However, Josh decided to take some of the proposals and include them in the Rails core, so now all it does is abstract the functionality in a - in my opinion - really clean way.

To cut a long story short, here’s the example we used earlier, rewritten to use the Memoizable API:

# somewhere inside the class
extend ActiveSupport::Memoizable

def zipcode_and_name
  "#{zipcode} #{name}"
end
memoize :zipcode_and_name

The memoize macro method then takes care of caching the method call result and retrieving it as necessary.

Part of the initial criticism also was that there was no way to force the cached result to be reloaded and that memoize wouldn’t work with methods that accept parameters. Both of these issues were fixed later - so methods with parameters can now be memoized as well:

def some_method(*args)
  # some calculations
end
memoize :some_method

This roughly equals the following code (and is actually pretty much how the Memoizable module handles it internally):

def some_method(*args)
  @some_method ||= {}
  @some_method[args] ||= (
    # some calculation
  )
end

What happens here is that the cache is a hash instead of a mere scalar and the hash is indexed by the parameters that are passed to the method. So if I call a method twice with the same parameters, I’ll get the cached result on the second call.

Reloading a cached result is pretty easy, too:

object.some_memoized_method(:reload) # or, less self-explanatory: object.some_memoized_method(true)

Beware of the Dynamics!

I can already feel people finishing reading this article and then going back to their code and doing something like this to their User class:

def age
  today = Date.today
  today.year - birth_date.year + (today.month - birth_date.month + ((today.day - birth_date.day) < 0 ? -1 : 0) < 0 ? -1 : 0)
end
memoize :age

This may seem okay but imagine running this method one second before midnight on the day before a user’s birthday: The memoized value will be returned even if the process runs longer than midnight (two seconds, say) and it’s already their birthday!

The point of this certainly simple example is as follows: If a method uses data that is inherently dynamic, such as the current date/time, random numbers, lambdas/procs and the like, it’s not really a good fit for memoization. On the contrary - it might even lead to all kinds of weird behavior and bugs!
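
If you still want memoization for something like this, one way out is to turn the dynamic part into an explicit parameter - since memoize caches per argument list, a new date simply means a new cache entry. A sketch:

def age_on(date)
  had_birthday = date.month > birth_date.month ||
    (date.month == birth_date.month && date.day >= birth_date.day)
  date.year - birth_date.year - (had_birthday ? 0 : 1)
end
memoize :age_on

# age_on(Date.today) stays correct across midnight - the new day is a new cache key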

Tips and Tricks

Here are some tips and tricks that didn’t seem to fit anywhere else in the article.

Memoizing Class Methods

Since memoize is already a class method, if you want to memoize a class method, you have to put the memoize call in the class’ metaclass:

def self.a_class_method
  # some code
end
class << self; extend ActiveSupport::Memoizable; self; end.memoize :a_class_method

As you can see, this is not really clean - so here’s a better way:

class << self
  extend ActiveSupport::Memoizable

  def a_class_method
    # some code
  end
  memoize :a_class_method
end

Way better!

A little hint if you want to use the “old style” for memoizing class methods: You have to use class-level instance variables (with one @ rather than two) to cache the results. This is because inside the context of a class method, the instance is the class itself rather than an instance of that class. If you don’t get what I’m talking about, you might want to check out Dave Thomas’ screencast series on Ruby Metaprogramming.
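
In code, that hint boils down to this (a contrived sketch):

class << self
  def all_zipcodes
    # one @: this variable lives on the class itself, not on an instance
    @all_zipcodes ||= find(:all).map(&:zipcode)
  end
end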

Enable Memoization For All Active Records

If you want to use memoize in a lot of your ActiveRecord models, you might think about extending the Memoizable module inside ActiveRecord::Base:

# for memoizing instance methods:
ActiveRecord::Base.extend(ActiveSupport::Memoizable)

# for memoizing class methods:
ActiveRecord::Base.class_eval { class << self; extend ActiveSupport::Memoizable; end }

This way, every descendant of ActiveRecord (i.e. all of your models) will be able to call memoize without having to extend the module first.

Return Values Are Frozen!

There’s one more thing that tends to create both confusion and controversy: Return values of memoized methods are frozen - which basically means you can’t modify the returned object itself but rather have to dup it first. As Josh stated in the quote above, the freezing problems were the reason behind the Memoizable module - so I guess frozen return values are a trade-off that we just have to accept.
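
A quick illustration, reusing the zipcode_and_name method from above (city being some memoized City instance):

city.zipcode_and_name            # => "10115 Berlin" (memoized, hence frozen)
city.zipcode_and_name << '!'     # raises TypeError: can't modify frozen string
city.zipcode_and_name.dup << '!' # => "10115 Berlin!" - work on a copy instead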

When Should You Use memoize and When Should You Just Stick to the Classic Style?

You might also ask yourself how to decide when you should use memoize and when to stick to the “old way” of storing the result in an instance variable yourself. So here’s my rule of thumb: If you may need to reload the cached result during a request, you might want to use memoize because it offers a simple way for reloading - it’s way harder to implement that yourself! Otherwise, I tend to look at the complexity of the method: If the method is a simple one- or two-liner, I use the classic style - if it involves more complex logic, I prefer memoize.

What Do You Think?

I hope this article comes in handy for some of you who haven’t heard of memoization yet or who just didn’t really understand what’s going on there. As always, I’m looking forward to reading your comments - also, feel free to ask questions if something remained unclear!

Finally - I'm twittered!

2008-09-11 00:00:00 +0000

I’ve long resisted the temptation of signing up for this time-consuming thing that is Twitter. However, after talking to various guys at RailsConf Europe - who all went kind of “Whoa, seriously, dude - you don’t have a Twitter account?” - I finally decided that it’s time to be a late adopter. So if you want to follow me - feel free. You can find me here.

RailsConf Europe 2008 - Recap

2008-09-05 00:00:00 +0000

After three days in Berlin, what can I say about RailsConf Europe 2008 in retrospective? What were the ups and downs and, most importantly, was it worth the trip?

Well, I went there with mixed feelings and two different sets of expectations: On the one hand, I had pretty low expectations regarding the talks because - let’s face it - last year’s talks weren’t all that great. From what I overheard in other people’s discussions and from what David Black told me, most people seemed to have criticized the talks because they’d just been too basic - a majority seemed to have been targeted at Rails beginners. Fortunately, this year’s talks turned out to be really good on average. Yes, some of them were still aimed at a low level of Rails (and other technology) knowledge but overall they were evenly distributed between more basic and advanced topics - at least from what I can tell because, of course, I haven’t been to all sessions. From the talks I heard, I liked Jay Fields’ talk about functional testing best: All in all, it had the most value to me personally. Other really great talks were Sven Fuchs’ presentation of the Rails i18n efforts and Jonathan Weiss’ talk about Rails security. In short: My expectations were exceeded by far.

My second expectation was that the conference would offer great networking possibilities - especially since I’m now a little less green than I’d been last year. ;-) This expectation turned out to be fulfilled: I met lots of great people and had quite a bunch of interesting discussions, starting at Bratwurst on Rails and ending at the last conference day’s evening when I had the opportunity to have dinner with program chair David Black. Greetings and thanks to:

  • Sven Fuchs (who offered me a place to crash during the conference and because he’s probably one of the coolest people I’ve ever met)
  • Mathias Meyer and Jonathan Weiss (because they’re not only very proficient Rails developers but also extraordinarily nice guys)
  • Geoffrey Grosenbach (because I got this cool PeepCode shirt and because he looks pretty darn smart in a suit … and also because he’s got the sexiest voice in the Rails world and is generally a pretty awesome guy)
  • David Black (for being an awesome program chair)
  • Marko Seppä (for being the evil, yet silent, Finnish guy he just is)
  • Benjamin Krause (for showing us this cool Sushi place just around the corner)
  • the Maritim proARTE staff (for their unobtrusive catering and the food that was just plain awesome)
  • and all the others I met (the list would probably be too long).

There were really only two things that I’d criticize (and some people I talked to seem to agree with me on these):

  • Conference sessions should have some kind of scale/type like Beginner, Advanced, Informational session, etc. This would make it easier for people to choose sessions they’re interested in and that fit their level of experience. This should ideally also include better titles for sessions: While I really liked Michael Bleigh’s and Chris Selmer’s session, it wasn’t quite as “advanced” as the title suggested. I know this is hairy since pretty much everybody’s opinion about what’s advanced and what’s not is different - but in my opinion it’s worth putting some effort into it.
  • The conference was definitely missing a real closing session. After the last three sessions, there was just - well - nothing. I’d really love to have a closing keynote - ideally talking about a topic that’s relevant for the near future of Rails and web development in general. Another option would be to have some kind of official fare-well drink. Anyway, leaving without proper closure just doesn’t feel right to me.

Where’s RailsConf Europe headed from here? It seems that at least for next year it’s leaving Berlin. Other candidates? Maybe back to London, maybe something totally new (Prague, Madrid, Vienna, …)? We’ll see. If next year’s conference’s quality increases like this year’s we’ll definitely see a great RailsConf Europe 2009.

Thanks for sticking with my coverage and thanks a lot for the positive feedback I received at the conference and in IRC. Subscribe to my blog if you like what you’ve seen over the last few days.

Boarding my plane back home now — Clemens signing off.

RailsConf Europe 2008 Coverage - Day 3

2008-09-04 00:00:00 +0000

Day 3 of RailsConf Europe

Functional Testing - Lessons Learned

presented by Jay Fields

“Testing is hard” (so true …). Jay states that you can’t learn TDD by just reading a book.

Jay thinks that test code is just as important as application code. If your application is never going to change and only you yourself are working on it - well, don’t have tests. But Jay stresses that this is usually not the case: tests help keep the application working and also document what the application does for other (new) developers. No automated test suite can, of course, replace exploratory testing - you still need your QA team.

Jay suggests creating tests that test important things - 100% coverage isn’t necessarily always the right goal. You should test the features that make it a great application. Finally, tests are needed to prove that the system works as expected.

The problems with functional tests, according to Jay, are that they don’t verify corner cases and also that they tend to take pretty long to run, which is why he suggests not having too many functional tests. Instead, he suggests using unit tests to test functionality in isolation and running these unit tests constantly. In addition, you also have a couple of functional tests that integrate databases and/or APIs, and you can run these functional tests separately from the unit tests every now and then (e.g. before pushing changes to your source code repository).

test/unit, for Jay, is bad because it’s QA-unfriendly due to its ugly syntax. Also, tests are procedural rather than really object-oriented (the test suite is but a test itself isn’t). The good things are that it is easily understandable and that it provides good help to trace your errors (e.g. stack trace). All in all, Jay considers test/unit to be a decent choice.

Regarding Selenium, Jay thinks it’s awfully slow and he hates that browser bugs can cause failures. It’s also hard to find the defect location and you have to track bugs down yourself. The good thing is that Selenium tests run the full application - they actually ensure that users get what they expect. They’re also easy for developers and testers to use. Summarizing, Jay’s opinion is that Selenium is too brittle and slow to be used on a regular basis.

Bad things about RSpec: It’s got a pretty steep learning curve (coming from test/unit) and Jay says that even he needs to look up some details every now and again. Moreover, RSpec is pretty hard to extend - Jay considers custom matchers a way out but it’s still not a good code base. On the other hand, Jay loves that it encourages behavioral testing and, therefore, helps you write better tests in general. Developers and testers can use and even understand tests - it allows you to sit down with QA and even business people and they are likely to understand the test code. No surprise: RSpec is Jay’s preferred solution for tests.

Jay suggests writing smoke tests to ensure that the most important parts and paths of your app work.

Subsystems can be a pain in the … (well, you know). Jay stresses that you really should test subsystems you own (like databases) but not subsystems you don’t own (like Twitter, your credit card provider, etc.). Instead, stub 3rd party sub systems and create an external test suite - maybe even a semi-automatic one - that you run against the real 3rd party API. Jay showed a simple example for stubbing where he had a FeedBurnerGateway and stubbed out its methods for the test.
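
I imagine the stubbing looked roughly like this (my own sketch using mocha, not Jay’s actual code):

# in a unit test - stub the 3rd party gateway so no real HTTP request is made
FeedBurnerGateway.stubs(:circulation).returns(42) # method name made up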

Jay suggests that unit tests and functional tests should be top priority when bugs occur. Of course, you also have to work on failing smoke tests and external tests - but they shouldn’t be your top priority.

Regarding data, Jay stresses that you should not re-import reference data (e.g. the states of the US) for every single test but you definitely need fresh state data (e.g. a shopping cart) for every test. For state data, Jay suggests using a factory such as Dan Manges’ factory, and the tests should create the state data they need themselves. Jay also states that he really doesn’t like fixtures because they’re slow and unmaintainable, and in his opinion moving away from fixtures has always paid off.

Jay summarizes:

  • Move corner cases out of functional tests and into unit tests.
  • Remove painful tests that provide little value (writing them is easy, deleting them is hard).
  • As tests break, migrate them to better written RSpec tests.
  • Create stubs for sub systems that you don’t own.
  • Break data into reference data and state data and refactor tests so they share reference data but create their own state data.

Jay’s talk was the most valuable session for me so far and I’ve really taken away a lot from it. He gave a really well-rounded presentation (actually, I think he was the first speaker to provide a real summary at the end) that was full of useful content. If you’re not subscribed yet, definitely check out his blog.

Sidenote

I wrote about the great food yesterday. I took a picture with my iPhone today so you can check it out yourself on Flickr.

The Future of I18n/L10n in Ruby on Rails

presented by Sven Fuchs

Disclaimer: I’m kinda partial to Sven’s work so please forgive me if it turns out that I’m not entirely objective on his talk.

Sven started his talk with the “official Rails i18n hymn” - Judas Priest’s Painkiller and a short introduction about himself.

After that, he gave a brief overview of the past regarding i18n in Rails: Of course, there were Ruby Gettext, GLoc, Globalize, etc. But the problem with these was that they often were isolated, incompatible solutions with bad support and lots of monkey patching that tended to break with every new Rails release. The wheel was constantly re-implemented.

Sven proceeded to explain how the Rails i18n project was born: Pretty much everyone who had contributed in one way or another to i18n stuff was invited to join in the effort to finally create a working, unified i18n solution for Rails. After lots of discussion about implementation details, the team came up with a draft of what Rails i18n should be able to do. Here are the key decisions:

  • 100% swappable translation logic
  • Always do the simplest thing that could ever work.
  • Rails i18n doesn’t want to be a full-blown i18n/l10n solution.

Instead, Rails i18n aims to provide a powerful i18n API - it’s only a blueprint of what the real translation implementations should do. Additionally, Rails ships with the Simple backend to support Rails’ standard en-US localization.

Translations are stored in YAML files by default - optionally, you can also use Ruby files if you need more complex translations using lambdas and other complex Ruby logic.

Sven then showed the basic API of the i18n plugin: The methods translate and localize. He also explained how lookups work and how you structure your locales using scopes, interpolation of translations and pluralization rules. He also demonstrated how you can swap the Simple backend with your own customized backend.
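
For reference, the basic API boils down to calls like these (keys and locale data are made up):

I18n.locale = :de
I18n.translate :greeting, :scope => :home # looks up home.greeting for the :de locale
I18n.localize Date.today                  # formats the date using the locale's format strings
I18n.t :'home.greeting'                   # t and l are shorthands for translate and localize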

After that, Sven did a short live coding session where he showed how you can easily provide simple translation for static texts and load and set locales on the fly.

In Sven’s opinion, by using the Rails i18n API, we do win a lot because

  • we need less monkey patching,
  • we have one common API for all plugins,
  • we can share locale data
  • and we’ll finally get better plugins.

Sven proposes to continue work on Rails i18n by implementing features, collecting and reviewing them, and afterwards aiming for another big Rails core patch when the time is right.

After that, Sven showed a video interview with Joshua Harvey, the original author of Globalize and also mentioned some details about the upcoming Globalize 2.

He closed his talk by explaining how everybody can take part and contribute to the ongoing efforts in making i18n/l10n in Rails work even better than they already do.

I think, Sven managed to give people a good overview of what i18n can do for you right now and about its plans for the near future. I guess everything will become pretty great as soon as the plugin landscape becomes more diverse, especially with the advent of Globalize 2. Thumbs up, buddy!

Treading the Rails with Ruby Shoes

presented by Eleanor McHugh and Romek Szczesniak

I’ve spent some time letting Eleanor’s and Romek’s talk sink in because it was kinda weird. The title suggested that it was about using why’s Shoes framework together with Ruby on Rails but that made up only about 5 minutes of the talk. The rest was about, well, everything and nothing at the same time. Eleanor and Romek had a lively and funny dialogue going on between them, talking about the history of the Internet, cryptography, sockets, etc. without a really clear direction - at least it wasn’t clear to me. It seems they held a similar talk at last year’s conference so maybe this was a kind of prerequisite for really getting them.

All in all, it was - as I said - a really weird discussion. It wasn’t bad but I didn’t really take away anything from it. Eleanor and Romek went over lots of different topics but never really stuck with one topic long enough for people to actually understand where they were coming from. Plus, they showed slides with huge amounts of code that were barely legible from the third row where I was sitting - let alone in the back of the room - not good. In short: It was amusing but not really valuable.

Resources:

I Heart Complexity

presented by Adam Keys

Adam started his presentation talking about domain modeling: Customers talk about accounts, credits, etc. while developers talk about users, addition, subtraction, etc. The goal of domain modeling is to break this down into one common language or jargon that both business people and developers speak - also called a ubiquitous language. Adam says that this results in less documentation being needed because it avoids misunderstandings between the business and development teams - business people should actually be able to read code (similar to what Jay Fields talked about earlier). If you put the domain together with the intent of the application, you get the essence of the application - after this, we can focus on more complex (and more interesting) challenges. Adam then made his point by showing how he refactored a bad, anemic domain into a richer, stronger domain.

State, according to Adam, is the most important part of our application because it captures the life cycle of our actual objects. This is why you should spend a good amount of time thinking about your application’s stateful logic. Again, he made his point by refactoring an app to use a state machine to define state transition rules - according to Adam, this makes the code easier to understand and makes it easier to see what’s important.
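
A minimal hand-rolled sketch of the idea (my code, not Adam’s): the transition rules live in one place instead of being scattered across if/else checks.

class Article
  # each state lists the states it may transition to
  TRANSITIONS = {
    :draft     => [:published],
    :published => [:archived],
    :archived  => []
  }

  attr_reader :state

  def initialize
    @state = :draft
  end

  def transition_to(new_state)
    unless TRANSITIONS[@state].include?(new_state)
      raise ArgumentError, "can't go from #{@state} to #{new_state}"
    end
    @state = new_state
  end
end

article = Article.new
article.transition_to(:published) # fine
article.transition_to(:draft)     # raises ArgumentError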

Another challenge is handling money: Language designers are way too cool to package money classes with their standard libraries. Adam discourages the use of floats because they are imprecise. Instead, he suggests extracting money into its own class and storing the value in cents plus its respective currency.
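
A minimal sketch of that suggestion (again my code, not Adam’s): integer cents plus a currency avoid the classic float rounding surprises.

class Money
  attr_reader :cents, :currency

  def initialize(cents, currency = 'EUR')
    @cents, @currency = cents, currency
  end

  def +(other)
    raise ArgumentError, 'currency mismatch' unless currency == other.currency
    Money.new(cents + other.cents, currency)
  end

  def to_s
    "%.2f %s" % [cents / 100.0, currency]
  end
end

(Money.new(1050) + Money.new(495)).to_s # => "15.45 EUR"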

Adam then proceeded to talk about time travel: It turns out a lot of people need to travel back and forth in time. This basically refers to versioning your domain objects (e.g. using Rick Olson’s acts_as_versioned) to allow viewing given objects and their changes over time.
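
With acts_as_versioned, usage looks roughly like this (model and attributes made up; the plugin also needs a separate versions table):

class Page < ActiveRecord::Base
  acts_as_versioned
end

page = Page.create(:title => 'First draft')
page.update_attributes(:title => 'Final title')
page.versions.size # => 2
page.revert_to(1)  # time travel: back to the first version
page.title         # => "First draft"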

His final topic, asynchronous processing, was basically about queuing (because you really shouldn’t rely on multithreading in Rails just yet ;)). Adam used a moderation queue as an example where all product changes in an application should go through a moderation process before being published to the live site. For large queues, Adam suggests using tools like Beanstalk or Starling; for smaller scale projects, putting the queue directly in your domain model may just be the right choice.

In my opinion, Adam gave a great overview of methods to reduce complexity and also made a point of showing that there are also plenty of options out there that are ready to be used in your everyday Rails application. I kinda expected him to be a little funnier - but probably this is because so far I’ve only seen him in the MVC videos with RailsEnvy that are, after all, just hilarious. A well-rounded talk and all in all a really worthy end of the official RailsConf Europe 2008 presentation.

Resources:

Day 3 resumé

I started day 3 a little later (even geeks need sleep every now and again), so I missed the symposium about different Ruby implementations - I heard it was quite interesting. Regarding quality, day 3’s sessions were on par with day 2’s - at least the sessions I’ve been to. I heard that some of the other sessions weren’t quite that good - but then again, I guess you can’t really equally satisfy 700 people.

I especially enjoyed Jay Fields’ talk about his experiences with functional testing. Among other things, he was one of the very few speakers that ended their talk with a summary of what they’d discussed during their session. Sven Fuchs’ introduction to the Rails i18n plugin was also very interesting. I also really loved that David, Koz and Jeremy sat through Sven’s talk as well which, in my opinion, showed that they really do care about Rails’ latest internationalization efforts.

Something I definitely missed on the last conference day was a real closing event like a final keynote speech or at least some kind of official farewell drink. Other than that, it made a good last day for a really great conference.

RailsConf Europe 2008 Coverage - Day 2

2008-09-03 00:00:00 +0000

Day 2 of RailsConf Europe 2008

Keynote - David Heinemeier Hansson

(Missed the first 10 minutes - anything interesting happened?)

Legacy is not a technology and not an attribute of software or code. David thinks it’s what is perceived as “old” some time after it has been written. And, apparently, he now likes to draw graphs by hand! ;) Over time, code goes from “This is the best code ever” over “Well, this could be better” to “This sucks” and “I can’t take it anymore”. Finally, of course, there’s always the last resort: Rewrite. In short, the code stays the same but our perception of it changes - it’s us that change, not the code. David says that’s a good thing because reflecting on old code means becoming a better programmer.

What you write today will become legacy. We need to be prepared for that and Rails won’t save us from that.

The best programmers write the most legacy software because it means they are writing the most software. Switching to a newer, better platform should help you become a better programmer even faster. David argues that the process for “best code ever” becoming “biggest ball of mud” runs faster as you get better as a programmer so, all in all, you end up producing more legacy code over time.

When you’re green, everything turns legacy overnight because you’re new to the technology or platform - which David thinks is the “inherent dilemma of coming to Rails”. People are used to a certain pace, and when they come to Rails they experience a high level of euphoria - followed by a crash because everything turns stale overnight.

Good software takes 10 years to write (Joel Spolsky). David says that it probably doesn’t take 10 years to write but 10 years to become successful. The thing is that you have to deal with code over a long period of time, so it, again, comes down to always having to deal with what you perceive to be legacy software. But: “The single worst strategic mistake that any software company can make: rewrite the code from scratch” (also Joel Spolsky).

David continued by showing some code pieces that he now considers to be legacy or a trash can for code:

  • Basecamp’s Application Helper apparently has a couple dozen methods with over 500 lines of code because it was just easy to stuff everything in there when he didn’t have helper :all yet. David compares the Application Helper to a train station: People are not supposed to stay at the train station but to get on a train and leave. The same holds true for the Application Helper: It should hold helper methods only until you find a better place for them. The way to tackle refactoring a bloated Application Helper (or legacy code in general) is not to rewrite big chunks of code but instead to work on it slowly and always leave the code better than it originally was.
  • Basecamp’s Person class seems to have accumulated quite a bunch of legacy code, such as old-style filters (def before_create and the like). David suggests extracting what he calls “concerns” out of a class and putting them into modules. He proceeded to make his point by extracting a system notification service out of the Person class, putting it into a SystemNotice module and then including that in the Person class (see the sketch after this list). He thinks that this increases the coherence of the code because the Person class becomes both better structured and shorter. It’s less about reuse and more about structure.
  • Basecamp has a GlobalController (chch) because “it does global stuff” that David couldn’t “fit in somewhere else” and because before REST there wasn’t really any natural boundary for how bloated a controller could be.
  • Basecamp’s environment.rb was really bloated before the advent of initializers (note to self: need to find a screenshot because I’ve definitely seen it in one of David’s presentations).
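
Here’s a rough sketch of the kind of extraction David described in the Person example above (names invented, not actual Basecamp code):

# the "concern": the notification logic lives in its own module ...
module SystemNotice
  def deliver_system_notice(event)
    Notifications.deliver_system_notice(self, event) # hypothetical mailer
  end
end

# ... and the model simply includes it
class Person < ActiveRecord::Base
  include SystemNotice
end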

Things he learned by tackling these issues:

  • Extract concerns. Not because you want to reuse them but because they improve the structure and coherence of your code.
  • The DRY exception: Code that is ultra-DRY is really hard to extend and becomes inflexible. David says that sometimes it’s better to just split code across multiple files even if this means that there is going to be some duplication (not too much, mind you).

In short: Don’t get stuck worrying about legacy code because everything becomes legacy at one point.

I think David gave a great talk and addressed a topic that really affects pretty much every developer at some point during their career. What especially amazed me is that David seems to have changed some of his strong notions of what is good and bad, and that he seems to have, dare I say it, matured. I don’t think he’s the same guy that was once termed the potty-mouthed Dane and showed big fuck-you slides in his keynotes. Instead he impressed me (and probably some other folks) by showing some Basecamp code and pretty much saying “Look at the crappy code I once wrote”. Yes, there’s still some kind of Socratic notion to his talk where he states that he writes or wrote bad code - but at least he knows it’s bad, while others don’t know how bad their code is. But if David - as a developer with an enormous amount of influence on the whole development community over the last few years - can’t have this bit of arrogance, then really, who can?

Resources:

Hacking the Mid-End: Unobtrusive Scripting and Advanced UI Techniques in Rails

presented by Michael Bleigh and Chris Selmer

(Again, missed the first few minutes. What’s it with me today?)

Michael explains that it’s important that your application is accessible and that Rails isn’t necessarily ideal when it comes to being unobtrusive and accessible (e.g. delete links with link_to). You also want your app to be intuitive - which Michael thinks is an inherent advantage of web applications because users are used to working with the browser. Finally, Michael says that apps should be responsive and react quickly to users’ actions.

Chris went on talking about solving mid-end problems using a toolset of helpers, partials, jQuery (or Prototype) and LowPro. He likes to start building an app without JavaScript, then abstract some UI into helpers and partials and finally write unobtrusive JavaScript to enhance your application.

The guys then went on to do some live coding: The goal of the first example was to improve a report that takes quite some time to generate and doesn’t really provide any feedback to users. Users with JS disabled should see an info message (“The report may take up to 10 seconds to generate.”) plus a link to the actual report, while users with JS enabled should see a spinner, and the report should automatically be displayed as soon as it is created. As a second example, they showed how to build tabbed navigation that, again, degrades gracefully for users with JS disabled. (By the way, if you don’t know it: If you type “lorem” in TextMate and hit tab, you get a full paragraph of Lorem ipsum. Life can be so damn great.)

The guys from Intridea did a great job picking up where Jarkko Laine stopped yesterday, so if you were at the tutorial yesterday like I was and you missed the more advanced stuff in Jarkko’s content, this session definitely acted as a great complement. I especially liked the live coding where they showed how to use LowPro with jQuery since I’m thinking about ditching Prototype in favor of jQuery but haven’t gotten around to taking a closer look at jQuery just yet.

Resources:

The One-Two Punch: jQuery with Rails

presented by Yehuda Katz

Yehuda explained jQuery’s core philosophy: “Get some elements. Do some stuff.” (Jeez). jQuery uses lots of advanced CSS3-style selectors (Yehuda termed them “CSS3 Plus”) to support really powerful selection of elements. Example: table tr:nth-child(even) > td -> all table cells in every even table row. jQuery supports method chaining by always returning jQuery objects from its methods. The 5 parts of jQuery:

  • DOM: Yehuda showed a couple examples how to use different selectors.
  • Events: jQuery, like Prototype, has a unified event model in order to hide the issues with cross browser compatibility and, of course, support for handling custom events.
  • Effects
  • AJAX: jQuery has shortcuts for AJAX methods (load, $.get, $.post, $.getJSON) and one mighty $.ajax method that seems to be pretty similar to Prototype’s Ajax.Request. It seems you can also set AJAX options directly on objects, which looks very cool.
  • Metadata: Put JSON metadata directly in your HTML. Didn’t really get the advantage here.
  • LiveQuery: Didn’t really catch that either. (Btw. wasn’t this already the 6th part? ;-))

All in all, to me the presentation wasn’t quite as valuable as the other sessions I’ve been to so far. Yehuda managed to give a quick overview about the capabilities of jQuery but what was definitely missing was some live coding so we could have seen the whole thing in action. Also no mention of the jQuery plugin for Rails.

Resources:

RailsConf during off hours

One thing has to be said about this year’s RailsConf during break times: It’s just plain awesome. People are engaging in interesting discussions everywhere and the hotel’s staff and the catering are just amazing. Let’s face it: Even for real conference hotels like the Maritim proARTE, it’s still quite a challenge to cater lunch and coffee breaks for around 700 people. They do it well, they have good quality food and it seems they have some LowPro-style backend: They’re always around but they’re very unobtrusive.

EngineYard and Brightbox are giving away free t-shirts (although, unfortunately, Brightbox has already run out of stock) and the shirts are even suitable for wearing on the street - not like last year’s official RailsConf t-shirts, mind you. Speaking of official RailsConf t-shirts: They finally arrived today and, compared to last year’s fashion disaster, they really do look okay this year. In fact, they look almost as good as the EngineYard shirts. I’ll try to take a couple of pictures and put them up on Flickr if I can manage - unfortunately, the WiFi seems to be neither fast nor all that stable up until now.

Intellectual Scalability: Solving a Large Problem with Multiple Cooperating Rails Apps

presented by Frederick Cheung and Paul Butcher

Paul states that the least scalable component of every problem is the human brain. The problem at Texperts is that they have about 30000 lines of code in one of their apps and the tests take about 7 minutes to run. One solution is using multiple apps that still appear seamless to the user.

Texperts roll out their iterations in “lock-step” - i.e. they roll out all of their applications at the same time. They use a simple directory structure under a single root directory and share code through plugins. They also have hacked Capistrano to support their style of deployment. Finally, they use Selenium for cross-browser testing.

Each app runs its own database and they don’t communicate through databases at all. Server-side communication is handled by RESTful APIs and client-side communication via a special UI layer that makes the apps look like a single application by using Apache to rewrite URLs (which also resolves problems with the same origin policy). The whole application is served by a single portal application and all other applications are loaded as widgets.

Paul then handed over to Frederick who started doing some live coding to show how to create a widget application with the underlying UI framework. It seems to rely heavily on modern JS/CSS techniques so it pretty much rules out the use of IE or real accessibility and graceful degradation. However, all in all it seems to be a pretty neat framework for creating an application by distributing the logic across multiple small apps that are easier to maintain.

Texperts have been using the framework in production since June 2008 with 5 integrated applications. According to Paul, people who are using the system say that it’s got a responsive UI and increases productivity. However, it doesn’t seem to work well with high-latency connections because there’s lots of tiny AJAX requests happening all the time.

Fred and Paul gave a good talk - maybe the live coding in this case was a bit too expansive to be kept short and at the same time a bit too short to really show the inner workings of the actual framework. Paul also mentioned that there’s a possibility they might open-source it, so definitely be on the lookout over the next few weeks and months.

Resources:

RESTful Everything - Towards a Complete Resource-oriented Workflow

presented by Ingo Weiss

I was a little late for Ingo’s talk and the room was really packed - eventually they brought in additional chairs because a third of the people were just sitting in the back of the room - so I didn’t catch every last bit of what Ingo talked about.

His main statement was that Rails currently supports a REST-oriented workflow pretty well in the model and controller layers but, unfortunately, not in the views. Therefore, he developed a plugin on top of Rails’ REST implementation that facilitates the integration of views into the resource-oriented workflow. He then showed some example code to make his point. It looked kinda painful at first because I’m just used to handling my views in the good old Rails way, but after I got rid of my reservations I really started to see some potential in his whole approach. If I had to decide, I’d probably still stick with the classic approach - but we definitely need people with fresh ideas out there, so, personally, I’m grateful that Ingo seems to be one of these fresh people.

Granted, Ingo is not the most captivating of speakers but the interesting content definitely made up for the small deficits in presentation and after all, we’re programmers: Let the code speak (and speak it did - pretty well, as I said).

Resources:

Security on Rails

presented by Jonathan Weiss

(A lot of Weiss action going on today …) Jonathan started with the topic of information leaks. Convention over Configuration is good, but it indicates that an application is using Rails (e.g. default routes, default javascripts, default 404/422/500.html error documents, etc.). Sometimes you can even approximately deduce the Rails version that is used. You can also gather lots of information from the HTTP response headers, such as the HTTP server (e.g. Apache) and its modules (e.g. mod_rails) and the application server (Mongrel), including their respective versions. Capistrano usually also deploys .svn and .git directories, so you can just browse to .svn/entries or .git/whatever and read the repository data directly. With all this information, potential attackers can look up known vulnerabilities for your specific setup and use them to attack your site.

The Cookie Session Storage poses another security risk because the contents of your cookies are only Base64-encoded and can pretty easily be decoded - the (hopefully long and random) session secret only protects the cookie against tampering, not against reading. Jonathan recommends not storing sensitive data in the session at all or, if you really have to, at least using a different session store (e.g. ActiveRecord Store).
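
To see why, here’s a hedged sketch of reading such a cookie (cookie_value stands in for a captured session cookie): the payload is just a Base64-encoded Marshal dump followed by an integrity digest.

require 'base64'

data, digest = cookie_value.split('--')
session = Marshal.load(Base64.decode64(data))
session[:user_id] # readable by anyone holding the cookie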

Jonathan then went on talking about Cross-Site Scripting. Avoiding XSS is easy if you don’t allow user content to be formatted - you can just use the h helper method to escape HTML output. There are also two plugins, safeERB and the XSS shield plugin, that help you: safeERB raises exceptions when output is not escaped, while XSS shield inverts the principle, i.e. all output is escaped by default and you have to explicitly say that you want your output unescaped. If you need to allow formatting of user content, Jonathan suggests using custom filters (e.g. RedCloth), but he stresses that you still need to sanitize the input/output and you should definitely use Rails 2.0 because the sanitize method in older Rails versions used a blacklist approach whereas newer versions use a whitelist approach, so you have to explicitly list the tags you want to allow in user content. Another option would be to use Tidy.

Session fixation attacks are rarer with Rails applications than, for example, with PHP because PHP often passes the session id along in the URL. If you want to be really safe, Jonathan suggests resetting the session after every login and logout. Unfortunately, according to Jonathan, even popular plugins like restful_authentication don’t do this unless you manually add the respective command.

Cross-Site Request Forgery is usually exploited by inserting an image with a URL to a server where the user has a valid and active session. It’s also possible to use JavaScript, Java Applets, ActiveX, etc. Protecting applications from CSRF is really easy because you only have to use a protect_from_forgery call in a controller, preferably the ApplicationController. Jonathan said that in his experience this is a really underused measure in the wild. You really have to make sure that you only use GET requests for safe and idempotent requests and you also should ensure that session cookies are not persistent (i.e. they should really expire at the end of the browser session or even earlier).
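
In code, the protection really is a one-liner (the :secret option is only needed for some session store setups, if I remember correctly):

class ApplicationController < ActionController::Base
  # checks an authenticity token on every non-GET request
  protect_from_forgery :secret => 'a-long-random-string'
end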

To avoid SQL Injection (I guess most people have at least heard the term), you should either use prepared SQL statements (e.g. :conditions => ["id > ?", params[:id]]) or invoke the quote method, preferably on every table name, column name and value. In Rails versions <= 2.1 you also need to escape LIMIT and OFFSET parameters.

Jonathan then talked about JavaScript Hijacking, which is especially critical if you use JSON because JSON is actually valid JavaScript. It can, for example, be done by hacking JavaScript’s default Array definition. Ways around JavaScript Hijacking include not putting important data in JSON responses, using unguessable URLs (e.g. with MD5 hashes instead of plain text IDs) and prefixing the JSON response with garbage, thus making it invalid JavaScript. Of course, for the latter, you then need to manually remove the garbage in your own application.

Jonathan names mass assignment as another popular security issue in current Rails applications: If you’re not using attr_accessible/attr_protected, critical attributes can be accessed through mass assignment - think of an admin flag (true/false), for example. Jonathan definitely recommends using attr_accessible because it makes your attributes safe by default (as opposed to attr_protected, which only protects the fields that are explicitly marked as protected).
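
A quick illustration (model and attributes made up):

class User < ActiveRecord::Base
  # whitelist: only these attributes can be set via mass assignment
  attr_accessible :name, :email
end

# the admin flag from the params hash is silently ignored
user = User.new(:name => 'Eve', :admin => true)
user.admin? # => false - the attribute was never assigned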

Rails plugins also have their security problems - even plugins of seasoned developers, like Rick Olson’s restful_authentication. Jonathan states that you should always do a code review before you start using a plugin and keep your plugins up to date (e.g. subscribe to the author’s blog or watch their GitHub repository). You also shouldn’t rely on svn:externals or git submodules because you can’t deploy if the referenced repository isn’t available. Jonathan recommends using Piston (svn) or Braid (git) to manage your plugins.

Rails is susceptible to Denial of Service attacks because there usually is a limited number of Rails instances (e.g. through Mongrel) running, so if you have expensive operations (like uploads of large files), they are likely to block your whole application. There are several ways around this problem: Use the web server to serve static files (also cached files and even private files, e.g. by using X-Sendfile) and use separate clusters to handle expensive operations so they don’t block your main application cluster.

As I indicated yesterday, I really love Jonathan’s talks because they’re full of useful information and he presents it really well for both seasoned and less experienced developers. If you didn’t know about Rails’ potential security issues (most of them, after all, are relevant for web applications in general and not only Rails applications), this talk definitely gave you a broad insight into the whole topic of securing (Rails) web applications. Jonathan gave the same (I think) talk at 24C3 - you can view the video and check out the slides in his blog. Thumbs up (again) for Jonathan!

Resources:

Keynote - Performance on Rails - Jeremy Kemper

Jeremy talked about Rails 2.2, especially the performance work that has been done for the upcoming release. The question is: What does the term “performant” actually mean? Jeremy states that performant, in the end, means a good user experience. The user doesn’t see good performance, they only see bad performance (i.e. when things are slow).

With Rails, the standard way seems to be to throw in more hardware - if you can afford it. The problem, according to Jeremy, is that it’s really hard to find out where your performance bottlenecks are because the log file, after all, doesn’t tell you all that much.

He then went on talking about Steve Souders’ 10 rules and the related YSlow plugin by Yahoo that gives you some guidelines as to how you could potentially improve the overall performance of your application. You can find the guidelines over at Yahoo.

Jeremy said, one way to improve performance is to just do less - meaning that you should focus on the things you really need to do. Other ways are caching assets (like images or stylesheets) and using asset hosts. Using unobtrusive and/or asynchronous JavaScript can improve the rendering time of a given page. The good news is: We can already use all this stuff to improve our apps’ performance - very little work which can result in a massive improvement of user experience.

The important thing is to identify performance bottlenecks. Jeremy suggests using NewRelic’s RPM for finding these bottlenecks; it apparently prints out really cool graphs that show you where your application spends a lot of time and which URLs produce the most traffic. It is important to mention that you shouldn’t necessarily only work on improving long-running requests - if you speed up fast-running requests, you free up more resources for the requests that take longer to process and thus also manage to speed up your application.

Jeremy also mentioned FiveRuns’ TuneUp, which helps you measure the performance of parts of your application. He also pointed us to script/performance/request for executing integration tests, but in his opinion you tend to end up with unstructured code, especially when you start changing your app - it’s a kind of one-off solution. This is the reason why Rails 2.1 started shipping with performance tests, which behave pretty similarly to Test::Unit and Rails’ integration tests - except that, of course, they don’t print out dots but instead show you metrics about the test, e.g. process time, memory allocation, number of objects and garbage collector time.
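
A performance test looks roughly like this (the path is made up):

require 'test_helper'
require 'performance_test_help'

# behaves like an integration test, but reports process time,
# memory, object allocations and GC runs instead of printing dots
class HomepageTest < ActionController::PerformanceTest
  def test_homepage
    get '/'
  end
end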

RESTful thinking and awareness of the options HTTP provides you with help a lot with improving the performance of your site, according to Jeremy, because they enable caching and transmitting only content that has really changed since the last time the user visited your app. If you check for the “freshness” of a request early enough, you can cancel the request before expensive stuff happens and just send the user a Not Modified header so the browser uses its cached version.
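
With the conditional GET helpers this looks roughly like this (controller and model made up; these helpers arrived around Rails 2.2, if I remember correctly):

class ArticlesController < ApplicationController
  def show
    @article = Article.find(params[:id])

    # renders only if the client's cached copy is out of date;
    # otherwise Rails replies with 304 Not Modified right away
    if stale?(:etag => @article, :last_modified => @article.updated_at.utc)
      render
    end
  end
end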

A second big thing for Jeremy is garbage collection. He made his point by showing the output of a performance test that allocated 86000 objects, which made the garbage collector run 18 times, amounting to more than 90% of the total request processing time. You can speed up Ruby’s garbage collection by patching your Ruby with a patch by Stefan Kaes, which you can find in his GitHub repository.

Finally, it comes down to really digging into the app and profiling it from the core - preferably with ruby-prof (Jeremy’s fork at GitHub is compatible with Rails 2.2). With ruby-prof, profiling is just a simple rake test:profile away. He also showed call graphs using an X11 tool named KCacheGrind. Afterwards, he went on to demonstrate ways of fragment caching - which can be really hard because of all the interdependencies between views and helpers. Jeremy: “Expect to get it wrong!” Nevertheless, you should design for cacheability.

Jeremy states that he ended up not so much improving Rails itself but, rather, its interaction with the browser and HTTP. He closed by saying that, after all, it’s about the user’s experience and happiness and, ultimately, our own happiness as programmers - if users like our app (i.e. if they think it performs well), we’re equally happy.

Resumé Day 2

In retrospect, I think the second day can be deemed a success. Last year’s RailsConf was - in my opinion rightly - criticized for its beginner-level presentations and lack of advanced topics. From what I saw myself and heard from others, this year’s conference has a balanced mix of topics. Yes, there still aren’t that many really advanced topics - but then again, that’s not why one goes to a conference, right?

On the other hand, there are networking opportunities en masse. As I already mentioned, during coffee breaks and lunch break everyone is mingling, and people engage in interesting conversations about Rails and the whole ecosystem around it. Plus, there are people from all over Europe and the US, which makes it pretty damn interesting from a cultural perspective.

In short: So far, this year’s RailsConf is worth its money.

RailsConf Europe 2008 Coverage - Day 1

2008-09-02 00:00:00 +0000

As I mentioned a couple of hours ago I’m at RailsConf Europe 2008. So far, I’ve had a good time and the Conf seems to be better organized than last year (for starters, all the microphones worked from the start ;-) and the registration took only ~5 minutes). Unfortunately, O’Reilly seems to have cut back on the goodies: No RailsConf T-shirts this year and as far as I’ve seen there’s only one magazine that we got (last year it was, like, 10?!). But hey, as long as the talks are good, who cares about a TYPO3 magazine anyways?

So here it goes for today’s talks:

Accessible AJAX on Rails

presented by Jarkko Laine & Geoffrey Grosenbach

One of the three morning tutorials was about how to make AJAX and JavaScript in general less obtrusive and therefore also more accessible. Jarkko showed some basic techniques of progressive enhancement (really necessary, since most of the people didn’t even seem to know the term, let alone what it’s about) and introduced Dan Webb’s lowpro pretty early in the game. In my opinion, it was very important that he really elaborated on using the right markup in order to be able to access elements using their respective class name, id or even other CSS3-style selectors. Jarkko also didn’t only talk but did live coding and a couple of short “try it yourself” sessions where the participants got little tasks to implement. Finally, he showed delegated event handling with Event.delegate, which is hidden very well beneath the covers of lowpro and thus probably not used quite as often as it should be (because it’s good!).

All in all, since I’ve written two articles about unobtrusive JavaScript myself, the talk was a bit too basic for me. Jarkko presented well, though, and Geoffrey (who replaced Dan Webb, who unfortunately seems to be missing from this year’s Conf) chimed in every now and then and shared his experiences with several things related to the topic.

Resources:

Deploying and Monitoring Ruby on Rails

presented by Mathias Meyer & Jonathan Weiss

Jonathan started the talk and explained different ways of serving Rails apps: FastCGI, Mongrel and Phusion Passenger (aka mod_rails). He explained the differences, advantages and disadvantages really well and elaborated on the issue of load balancing. Jonathan said he thinks that Phusion Passenger will become the default way of hosting Rails applications over the next few months. He also discussed JRuby/Glassfish/WAR files for deploying Rails applications into existing Java environments.

Jonathan then went on to explain load balancing across multiple physical servers using Apache 2.2’s mod_proxy_balancer, nginx, lighttpd and HAProxy (and other pure web proxies), and their respective pros and cons. Here are his recommendations for some use cases:

  • Small sites (with only one physical/virtual server): Apache 2.2 with Phusion Passenger
  • Medium sites (with multiple servers): Apache 2.2 as frontend proxy and static host, Mongrel (or Phusion Passenger) as the backend.
  • Large sites: redundant load balancers, redundant proxies and static file hosts, plus Mongrel or Phusion Passenger
  • Heavy focus on static files: Static requests handled by nginx or Lighttpd, dynamic requests routed through Apache to Mongrel/Phusion Passenger.
  • Java: WAR files with Glassfish/Tomcat/etc.

He closed the first part of the presentation talking about Ruby Enterprise Edition, Thin/Ebb/Evented Mongrel (although he doesn’t seem to have favorable opinions on Thin & Co.) and up and coming technologies like Fuzed (Erlang-based load balancer), JRuby and, of course, Phusion Passenger.

The talk was handed over to Mathias who talked about deploying Rails applications with Capistrano. He especially focused on recipes and more advanced techniques like using callbacks, namespaces, Capistrano variables and the deprec gem.

Jonathan took over again and introduced Webistrano, the web GUI he created to facilitate deployment with Capistrano and Macistrano, a little tool written by Mathias, to have deployment tools handy on our Aqua desktops.

Afterwards, it was time to get our hands dirty. Jonathan and Mathias had brought a mighty quad-core machine with 40 virtual servers so people could try out deploying the Railscasts website by Ryan Bates. Jonathan also showed how to configure Apache with mod_proxy_balancer to proxy dynamic requests to a cluster of Mongrels. Finally, they demonstrated how to deploy the same application with Phusion Passenger (great for people like me who are thinking about switching to PP for production). After some initial difficulties (turns out an AirPort Express isn’t really suited for servicing a couple dozen simultaneous connections), I think all folks took away a lot from the practical part.

In the last part, Jonathan gave a brief overview about tools for monitoring Rails. It was especially cool to see how easily you can configure Monit with its ultra-easy DSL for basic monitoring on Rails (and other services).

Lastly, it was Q&A time. Jonathan said that in his opinion memcached is the best caching solution available for Rails if you need really fast caching. He also likes God (which has readable syntax because it’s Ruby) as an alternative to Monit but prefers Monit since it is “older” and therefore has a proven track record.

All in all, this was a really valuable talk and Jonathan and Mathias have a great way of presenting and also engaging with the audience during the practical part. Thumbs up.

Resources:

Panel discussion

David Heinemeier Hansson, Jeremy Kemper, Michael Koziarski

David started the panel discussion with talking about three upcoming releases that are due to be released during RailsConf (2.0.4 and 2.1.1, which will, among other things, contain REXML fixes, and also a 2.2 beta). We also found out that most of the folks at RailsConf use Rails 2 now and it’s only a few people who are still, dare I say, stuck with Rails 1.2 - which is good because the core team wants to cut back the efforts on 1.2 so they don’t really know how long it’s going to be supported.

What’s good and what’s bad about the way that the whole Rails system is moving?

David especially likes that the ecosystem is growing and that people are blogging about Rails and sharing their stuff. At the same time, he thinks there’s lots of companies and individuals with lots of great code just sitting in their lib and vendor directories and rotting away.

Future of Test::Unit in Rails, because there’s rspec and others?

David would like to get rid of long underscored test names and he’d like to see the word “should” more often. In fact, some of the new stuff in Rails edge even started using the “new style” where you supply the test name as a string. Jeremy, Koz and David agree that the learning curve of Test::Unit is more user-friendly than rspec because you don’t need to wrap your head around the whole idea of behavior driven development. David thinks that, after all, everybody should use what suits them best.

Future of web apps - Sproutcore/MVC in the browser?

David is not a big fan of the idea of pushing more logic onto the browser. “The more stuff you take out of the application, the more stuff you have to write in a language other than Ruby - not good!” (Couldn’t agree more ;-)). Jeremy thinks that Sproutcore is the framework that has evolved best and has actually done some real integration and bridging the gaps between the server and the browser.

Are Gems the best way to publish and manage plugins?

Jeremy thinks that lots of plugins are actually very small and independent of each other so they don’t really need to be Gems and probably shouldn’t be. So he thinks that plugins will stick around in the future but the bigger plugins should be distributed as Gems.

Influence of Merb on Rails? (thread safety, performance, etc.)

Jeremy thinks that there’s definitely light competition going on and Rails does definitely learn a thing or two from Merb (and vice versa). Koz says that Rails still tries to build the best web application framework there is. As far as performance goes, he hates “Hello World” benchmarks because they don’t show the real world performance.

Which Ruby implementation should be used?

Koz’ customers mostly use the current MRI (1.8.6), but he thinks that JRuby is coming along really well and that it’s important. Jeremy agrees and also mentions MagLev as a future possibility. RubySpec is the future because it shows how far and how well Ruby has been implemented by the respective implementations.

Other frameworks that are interesting to the core members and what are they playing with?

Koz evades the question (“I’m a Rails consultant”) but states that he’s definitely interested in alternative data stores and persistence technologies such as object-oriented DBs and CouchDB. David is interested in cloud and distributed system technologies (“Having your own machine is like carrying your own electricity”). Jeremy thinks that sysadmins should pick up convention over configuration to achieve this (chch).

Differences between European and US conferences and other ones?

Koz likes smaller conferences because it’s less frantic and you can have conversations with people. He also likes that it gives new people opportunities to speak. David loves that RailsConf US is a special experience because it’s about being part of the crowd which has a different kind of energy and atmosphere (“It has the big, block-buster kind of mentality.”). He likes that big conferences have lots of applications for talks which, all in all, increases the quality.

Sweet spots for Rails applications (apart from the whole scaling debate)?

Koz thinks that Rails usually fits the problem from a technical perspective but it doesn’t always fit the business - if the team or the company doesn’t accept Rails, it’s not right for them (“It’s not about technology, it’s about culture.”). David agrees and adds that he’d sometimes like it if people wouldn’t force Rails on other people.

Resources:

At RailsConf Europe 2008

2008-09-02 00:00:00 +0000

I just got back from Bratwurst on Rails which, like last year, was a really awesome opportunity to get to know some of the folks that are going to be at RailsConf Europe over the next few days. All those of you that I met: It was great meeting you! If you ever come to Austria, drop me a line - I can’t promise a Bratwurst but we can definitely get a beer (or two). ;-)

As for RailsConf itself, I’ll be attending the Accessible AJAX on Rails and Deploying and Monitoring Ruby on Rails on the first day. If you spot me (hint: this is how I look these days), say hi!

I’ll see if I can manage some coverage of the Conf - can’t promise, though, as I’ll probably be busy mingling. ;-)

Have fun at RailsConf Europe 2008!

i18n demo application grows up

2008-08-30 00:00:00 +0000

Thanks to the press my i18n demo application received lately (blog posts at Rails Inside and the Riding Rails Blog plus mention in the Rails Envy Podcast #42), there have been a number of contributions to the app. Mostly, these were additional locales so that the app now offers complete translations for the following languages:

  • US English
  • Portuguese (Brazilian)
  • Australian English
  • Japanese
  • Thai
  • Spanish (Argentinean)

In addition, Masayuki Nakamura and Simon Tokumine took the time to introduce content translations, so now not only the Rails parts are localized but also the tutorial content itself! To be honest, this is not what I had in mind when I wrote the demo application but it’s a great addition nonetheless! Thanks so much for the effort, guys!

If you want to add a new translation of the app, that would be great! It’s pretty simple to do:

  • Fork my repo at GitHub.
  • Add a translation file in config/locales.
  • Copy the contents of one of the complete translations (e.g. en-US.rb) and use this as a base for your translation.
  • Add your name and optionally a link to your website to the list in app/views/pages/about.html.erb.
  • When you’re done, push the changes to your repo and send me a pull request.
  • I will then review your additions and merge them into my original repo.

The demo app now also supports the latest additions to the Rails i18n plugin, namely loading locales from YAML files. I’ve implemented a demonstration for this in the en-US locale. Note that YAML files are the new default storage method for translations, except if the translations contain any complex Ruby constructs such as lambdas - see the :long_ordinal date/time format in the en-US.rb locale if you want an example.

If you want to take a peek at the demo app without installing it, it’s running live over at http://i18n-demo.phusion.nl. Thanks for hosting it!

I’m really looking forward to seeing the changes/additions that you guys propose for the demo app over the next few weeks. If you need any help on how to contribute, feel free to ping me in IRC - I usually hang in #rails-contrib under the name “clemensk”.

Localizing Rails

2008-08-10 00:00:00 +0000

After releasing the localized_dates plugin, I thought it was about time someone wrote a comprehensive tutorial on how to fully leverage the power of the current Rails i18n implementation.

Granted, Trevor Turk wrote a pretty cool tutorial that shows simple localization in Rails, but it’s missing details on how number formats, dates, etc. can be localized. This is where my tutorial starts.

For a change, I decided not to write a long article about it. Instead, I created a demo app that uses the i18n features (plus my localized_dates plugin) and has all the tutorial texts in the app itself. We’ll see if this kind of tutorial works - if so, I’ll probably continue working this way in the future.

Here’s what you do:

$ git clone git://github.com/clemens/i18n_demo_app.git
$ cd i18n_demo_app
$ ./script/server

Then fire up your browser and go to localhost:3000 - the index page should be loaded automatically.

For now, this tutorial app includes info on how to localize:

  • Date and Time Formats
  • DateHelper (forms)
  • NumberHelper
  • ActiveRecordHelper (incomplete)

Feedback, as usual, is appreciated!

Translations/Forks

Fabio Akita has forked the demo app on GitHub and translated it to Portuguese. If you want to translate it to other languages, that’s perfectly fine by me! Leave me a note in the comments and I’ll add your translation to the list.

Translations:

Plugin: localized_dates

2008-08-07 00:00:00 +0000

After nearly two years of Rails development it’s more than time that I released my first (official) plugin. So here it goes.

If you’ve watched Rails on Lighthouse and GitHub lately, you’ve probably noticed that I’ve proposed quite a number of patches, especially regarding the new Rails i18n plugin (which by the way got me the 2nd place in the July 2008 Rails Hackfest - congrats to Tarmo Tänav at this point).

However, the date and time API got only partial i18n/l10n coverage - date and time formatting wasn’t included - which is where my new plugin, localized_dates, might come in handy.

The quickest way to get going is to simply install the plugin from its GitHub repository:

$ ./script/plugin install git://github.com/clemens/localized_dates.git

As part of its installation, this will copy two locale files (en-US.rb and de-AT.rb) to your config/locales folder. en-US is basically there to keep the current functionality and date definitions in place. de-AT serves as a demo of how you can write your own locale.

If you need any further help, I suggest reading the README over at GitHub.

If you like it, consider rating it at its Agile Webdevelopment plugin page.

To Raise or Not To Raise

2008-08-01 00:00:00 +0000

Disclaimer: I’ve taken the Rails i18n implementation as an example because I’ve worked with it lately and because I think it serves as a good case study here. Personally, I think all the guys have done great work in realizing this i18n interface, and with this article I don’t want to attack them in any way. You may want to read Sven Fuchs’ writeup on the Rails i18n features first in order to better understand some parts of the article.

The Stumbling Block

A few days ago when I tried to implement the i18n functionality for Rails’ Date and Time classes I stumbled across an interesting part in the i18n source or, to be more specific, in the simple backend’s localize method:

def localize(locale, object, format = :default)
  raise ArgumentError, "Object must be a Date, DateTime or Time object. #{object.inspect} given." unless object.respond_to?(:strftime)
  
  type = object.respond_to?(:sec) ? 'time' : 'date'
  formats = translate(locale, :"#{type}.formats")
  format = formats[format.to_sym] if formats && formats[format.to_sym]
  # TODO raise exception unless format found?
  format = format.to_s.dup

  format.gsub!(/%a/, translate(locale, :"date.abbr_day_names")[object.wday])
  format.gsub!(/%A/, translate(locale, :"date.day_names")[object.wday])
  format.gsub!(/%b/, translate(locale, :"date.abbr_month_names")[object.mon])
  format.gsub!(/%B/, translate(locale, :"date.month_names")[object.mon])
  format.gsub!(/%p/, translate(locale, :"time.#{object.hour < 12 ? :am : :pm}")) if object.respond_to? :hour
  object.strftime(format)
end

What this code does is basically take a format, look it up in the dictionaries and use it as a format string for strftime. If it can’t find the format it will just convert the format parameter to a string and then use it in strftime directly.

Notice the comment! It’s basically a note from one developer to the other developers where they ask whether or not they should raise an exception if the format can’t be translated. And well, they should!

Why and When Raise Exceptions?

Over the years I’ve come up with a little rule of thumb for exceptions:

Libraries Should Raise, Applications Should Not Raise.

Yes, it’s as simple as that. To understand the reasons for my rule of thumb, we have to switch to the world of compiled programming languages for a second.

Lots of libraries in C/C++/Java/etc. come to you as compiled code, which basically means you can’t just look at the implementation and figure out what the code does and how it behaves in different situations. This, on the one hand, forces library developers to write good documentation for their code and, on the other hand, also means they must provide sensible error handling since the developer can’t just - like in Ruby - re-open a class and change a method according to their needs.

Ask yourself a little question: If you pour water in a glass that’s already full, would you expect the glass to react to that all by itself or would you rather take it into your own hands and stop pouring or maybe empty the glass (you don’t like what’s in it anyway)?

In other words: The tool should not decide about error handling but instead delegate it to its user. At the moment of a library’s inception, its programmer basically just can’t know how it’s going to be used and thus can’t make decisions about how errors should be handled.

Plus, the programmer of the library should apply the principle of least surprise: If I, as a user of the library, throw in an invalid value (i.e. a value that can’t be found in the dictionary), I don’t automatically expect that my value is just pushed into strftime no matter what.

Dynamic Typing and Error Handling

How do you actually define what’s valid or invalid for a given parameter?

Well, as long as you work with only one type of values, it’s usually pretty straightforward. Imagine a car with a pre-defined maximum speed and a setter for its current speed. If the setter is invoked with a speed that exceeds the defined maximum speed it is invalid and it can trigger error handling like setting the current speed to the maximum speed instead of raising an exception.
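
The car example in code - a minimal sketch: inside the application we know what should happen, so the setter handles the error itself instead of raising.

class Car
  MAX_SPEED = 180

  attr_reader :speed

  def speed=(value)
    @speed = [value, MAX_SPEED].min # clamp instead of raising
  end
end

car = Car.new
car.speed = 250
car.speed # => 180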

However, if you allow a parameter to have different types, things get a little more complicated in a dynamically typed language like Ruby. Where you would have interfaces and polymorphic methods in Java to ensure that all types are handled correctly, you need a full blown control structure in Ruby:

def method(param)
  case param
  when Symbol
    # do some symbol specific things
  when String
    # do some string specific things
  # etc.
  end
end

With lots of different options, this quickly gets a bit bloated but it’s a thing we accept because we get added flexibility that comes with dynamic typing.

The real challenge in this case is to conceive error handling that works. Where in Java each polymorphic method usually has its own validation rules, we’d have to throw them inside the code blocks in Ruby.

def method(param)
  case param
  when Symbol
    raise ArgumentError, ":default is not a valid option" if param == :default
    # do some symbol specific things
  when String
    raise ArgumentError, "param is too long - use max. 5 characters" if param.strip.length > 5
    # do some string specific things
  # etc.
  end
end

Sidenote: Premshree Pillai has written a very good Introduction to Static and Dynamic Typing over at SitePoint, so check it out if you want.

Apply it to Some Real Code

To cut a long story a little shorter, let’s take a look at why I don’t like the current implementation of Simple#localize. Here’s the code that I’d need to write in order to support both Rails’ existing behavior and i18n capabilities:

def to_formatted_s(format = :default)
  return DATE_FORMATS[format].respond_to?(:call) ? DATE_FORMATS[format].call(self).to_s : strftime(DATE_FORMATS[format]) if DATE_FORMATS.keys.include?(format)

  type = self.respond_to?(:sec) ? 'time' : 'date' # could be TimeWithZone, so I need to check this even for dates
  # need to check first if the format is defined - if not, fall back to default to_s
  I18n.translate(:"#{type}.formats") ? I18n.localize(self, format) : to_default_s
end

To shortly explain the flow: Rails looks up standard formats in its DATE_FORMATS constant first. If it can find the format, it either calls it (if it’s a proc) or uses it as a format string for strftime. If it can’t find the format in the constant, it does localization. The I18n.translate call is necessary because localize doesn’t raise an exception if it can’t find the supplied format but instead uses it as a format string for strftime. We can’t have that since Rails falls back to to_default_s if it can’t find the format.

Here’s what it would look like if localize raised an exception if it can’t find a format:

def to_formatted_s(format = :default)
  # fall back to default to_s if localize raises an exception
  DATE_FORMATS.keys.include?(format) ? strftime(DATE_FORMATS[format]) : I18n.localize(self, format) rescue to_default_s
end

Here’s the implementation for localize that I’ve proposed to the i18n team:

def localize(locale, object, format = :default)
  raise ArgumentError, "Object must be a Date, DateTime or Time object. #{object.inspect} given." unless object.respond_to?(:strftime)
 
  format = case format
  when Symbol
    type = object.respond_to?(:sec) ? 'time' : 'date'
    translate(locale, :"#{type}.formats.#{format}") # raises I18n::MissingTranslationData if format isn't found
  when Proc
    # allows for something like :long_ordinal => lambda { |date| "%B #{date.day.ordinalize}, %Y" }, # => "April 25th, 2007"
    # similar to current Rails functionality, see ActiveSupport::CoreExtensions::Date
    format.call(object)
  else
    format
  end.to_s
  
  format.gsub!(/%a/, translate(locale, :"date.abbr_day_names")[object.wday])
  format.gsub!(/%A/, translate(locale, :"date.day_names")[object.wday])
  format.gsub!(/%b/, translate(locale, :"date.abbr_month_names")[object.mon])
  format.gsub!(/%B/, translate(locale, :"date.month_names")[object.mon])
  format.gsub!(/%p/, translate(locale, :"time.#{object.hour < 12 ? :am : :pm}")) if object.respond_to? :hour
  object.strftime(format)
end

What happens here, exception-wise? First of all, I kept the ArgumentError that is raised if the object that is passed in doesn’t respond to strftime.

The code then makes an assumption about the value of format that is pretty common in Rails and also Ruby: If it’s a Symbol, it’s most likely some kind of Hash key - therefore we try to look it up in our translations. If the lookup fails, the translate method correctly raises an exception which is passed back to the caller - in this case the to_formatted_s method in the Date/Time classes.

The second option is to use a Proc - also fairly common in both Rails and Ruby itself - to evaluate a closure.

Finally, the default behavior is just to return the format parameter converted to a String (notice the to_s at the end). Both options rely on Ruby’s internal error handling so if an error is raised it will - again - be passed back to the caller.

We’ve also opened the possibility to add other options later in the game so we could, say, treat an Array differently.

In short, we’ve basically set up a couple of rules for any programmer who wants to use the library:

  • You can pass in any parameter as an object as long as it has a strftime method.
  • The type of format you pass in affects how it’s going to be handled by the library.
  • We delegate all error handling to you, so you yourself are responsible for handling all errors.

So now whoever uses the library can again decide whether they want to re-raise the error or provide a sensible default behavior - like Rails does by falling back to the original to_s method.

Why Applications Should Not Raise

Now that I’ve explained why I think that libraries should raise, it’s time to explain why application code should not.

First, let me explain what I mean by application code. Roughly speaking, everything that you write yourself and (usually) exclusively for one application is what I consider application code. So your models, controllers and views are application code. Some stuff in the lib folder, although considered a library, might be application code, too. Even some plugins might be considered application code: Rick Olson’s popular restful_authentication plugin, for example, generates a user model and some other stuff for you that sits directly in your application and can - and usually even should - be changed.

I’ve seen some code lately that raises a comparatively high amount of application code exceptions. I think this is due to the rescue_from method that was added to Rails core in September 2007 because it makes it really easy to unify error handling.
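
For reference, such code typically looks something like this (the handler name is made up):

class ApplicationController < ActionController::Base
  rescue_from ActiveRecord::RecordNotFound, :with => :render_not_found

  private

  def render_not_found
    render :file => "#{RAILS_ROOT}/public/404.html", :status => :not_found
  end
end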

Do I think that this is a valid approach for handling errors? Yes, absolutely. Do I think that it’s the best approach? Absolutely not.

While it may DRY up your code, it usually also obfuscates it. If you look at code that raises an exception while the handling of that exception happens somewhere else, you’ll either have to remember how you implemented the error handling or even go look at that other code again.

While this is perfectly acceptable and even natural for libraries because they don’t know anything about the code that’s going to use it, it feels awkward in application code. When you write application code that can potentially cause errors you already know how these errors are going to be handled which - by my definition - is a clear case against raising an exception.

Don’t get me wrong - I think DRY is a hell of a good rule. However, when it comes to handling application code errors DAMP code is just a lot more readable and meaningful.

What Do You Think?

I’ve told you about my opinions and my rule of thumb so now it’s time that you tell me about your experience with exceptions. Do you always/never raise exceptions? Is it merely a question of style or is it important to raise exceptions, especially in libraries that other people are going to use?

I look forward to hearing from you!

Guide to Unobtrusive JavaScript - Part 2

2008-06-30 00:00:00 +0000

This is part 2 of my Guide to Unobtrusive JavaScript. It’s been over a month since I published the first part - as you can see from the date of this article (btw. why does Typo show the article’s creation date instead of its publication date? If anyone knows, drop me a line - I’d appreciate it). Originally, I wanted to publish the second article just a few days after the first but - as usual - projects got in the way. So here it finally is. I hope the quality makes up for the delay. Sorry again for keeping you waiting.

If you haven’t read part 1 and aren’t familiar with the term Unobtrusive JavaScript and/or the basics of Dan Webb’s lowpro, I strongly recommend reading part 1 before going on with this part.

Taking it further - lowpro behaviors

Let’s re-visit the rollover example from part 1. We wanted to have a simple rollover effect for all our images with the CSS class rollover. Just for fun, I’ll start at the end and show you the code that results from using a behavior before we implement the actual behavior:

Event.addBehavior({
  // expects image names to be in the following format: some_image.extension
  'img.rollover': Rollover
});

Wow - we’ve compressed 20 lines into a single line of code. And it’s even more declarative than before! But of course this doesn’t come for free - we still need to define the behavior ourselves:

Rollover = Behavior.create({
  initialize: function() {
    this.split_src = this.element.src.split(/\./); // split main part and extension
    this.parts = this.split_src.first().split(/_/);
    this.extension = this.split_src.last();
  },
  onmouseover: function(e) {
    if(this.parts.last() != 'over') {
      this.parts.push('over');
      this.element.src = [ this.parts.join('_'), this.extension ].join('.');
    }
  },
  onmouseout: function(e) {
    if(this.parts.last() == 'over') {
      this.parts.pop();
      this.element.src = [ this.parts.join('_'), this.extension ].join('.');
    }
  }
});

Most of the code still looks the same as in part 1 but now we’ve encapsulated the behavior into a separate behavioral class - which is great, from an OO point of view. Every element on the page that satisfies the conditions (i.e. it is an image with class rollover) will instantiate its own Rollover object that handles its events (onmouseover and onmouseout).

Taking a closer look at the Rollover class, you’ll notice that there is an initialize method that we haven’t had before. As usual in Prototype, this is the constructor: It takes responsibility for setting up a basic Rollover object. In this case, it initializes the parts array and the extension for further use in the actual behavioral methods.

We can go further and apply a namespacing strategy to keep our behaviors neatly ordered: Just rename Rollover to Image.Rollover to be more explicit.

For practice, let’s also rewrite the labeled form example from the first article to be a behavior.

NOTE: I had a little error in the code of the first part. I set the value correctly (using .value for textfields and .innerHTML for textareas) but when reading the values I only used the .value version. This has been corrected for this article to use the $F utility method. I’ve also internalized the hiding of the associated label.

Form.WithInlineLabels = Behavior.create({
  initialize: function() {
    this.label = this.element.previous('label');
    this.label.hide();

    this.labelText = this.label.innerHTML;
    this.element.tagName == 'TEXTAREA' ? this.element.innerHTML = this.labelText : this.element.value = this.labelText;
    this.element.addClassName('with_label');
  },
  onfocus: function(e) {
    if($F(this.element) == this.labelText) {
      this.element.removeClassName('with_label');
      this.element.tagName == 'TEXTAREA' ? this.element.innerHTML = '' : this.element.value = '';
    }
  },
  onblur: function(e) {
    if($F(this.element).blank()) {
      this.element.addClassName('with_label');
      this.element.tagName == 'TEXTAREA' ? this.element.innerHTML = this.labelText : this.element.value = this.labelText;
    }
  }
});

Yet again, this makes the call in Event.addBehavior dead easy:

Event.addBehavior({ 
  // makes all textfields and textareas use inline labels
  'input[type=text], textarea': Form.WithInlineLabels
});

Don’t rewrite - reuse!

So what’s the advantage of using this approach apart from saving a few lines of code and making the body of Event.addBehavior way more declarative and concise? Well, a behavior is a set of definitions of how a certain element reacts to given events (e.g. when hovered with the mouse, when clicked, etc.). So each behavior object encapsulates functionality - and in OOP, encapsulated functionality usually means that the functionality is reusable and maintainable.

It’s like that for lowpro behaviors: You can easily re-use the behavior in other projects. Just put all your behaviors in a separate JavaScript file. After that, you could have a setup like the following:

<script src="/javascripts/prototype.js" type="text/javascript"></script>
<script src="/javascripts/lowpro.js" type="text/javascript"></script>
<script src="/javascripts/behaviors.js" type="text/javascript"></script>
<script src="/javascripts/application.js" type="text/javascript"></script>

behaviors.js keeps a collection of all your behavioral classes (you could even split it into multiple files once you’ve defined a bunch of behaviors that are related) and application.js connects them to DOM elements using Event.addBehavior. Great and clean separation of concerns!

If you find that you’ve created lots of behaviors, you can even split them further and have behaviors.image.js, behaviors.form.js etc.

Making it more dynamic - passing in arguments

Let’s look at an example where we extend our existing behavior, because this also gives us the opportunity to look at passing arguments to a behavior to make it a little more dynamic. Take a look at the form with inline labels again and assume that you want some forms to not hide the labels but instead show them together with the inline labels.

Form.WithInlineLabels = Behavior.create({
  initialize: function(options) {
    this.options = options || {};

    this.label = this.element.previous('label');

    if(!this.options["showLabels"]) {
      this.label.hide();
    }

    this.labelText = this.label.innerHTML;
    this.element.tagName == 'TEXTAREA' ? this.element.innerHTML = this.labelText : this.element.value = this.labelText;
    this.element.addClassName('with_label');
  },
  // ... onfocus and onblur definitions
});

The behavior now takes an optional hash named options as a parameter, so we can use both variants, with and without labels (the default is, of course, without labels):

Event.addBehavior({
  'form.withLabels input[type=text], form.withLabels textarea': Form.WithInlineLabels({ showLabels: true }),
  'form.withoutLabels input[type=text], form.withoutLabels textarea': Form.WithInlineLabels
});

Why did I use an options hash instead of just using a simple parameter? After all, we’re only passing in one argument, right? Well, for now, yes, there’s only one option: showLabels. But we don’t know what we might add in the future: There might be other options as well and there might be bugfixes. If we change methods without being careful, we might break backwards compatibility - and if we do, we might have to hack our bugfixes in if we want/need to use them. Wouldn’t it be nice if we could just drop the behavior in and replace the old, buggy one?

Turns out that hashes are a great way to do that. If you had deployed the first version of the behavior (the one without the showLabels option) and then later added the new option to it (and maybe some bugfixes), you could easily replace the old behavior. Why? Because nothing changes for the existing code: The behavior can still be used without a parameter and it defaults to the old behavior! If you make sure that you’re backwards compatible, anytime you find that you need new options or if you discover a bug, you can just go in, fix it and deploy it to all your existing projects. Now that’s really cool!

And by the way: This is how most cool frameworks in pretty much every programming language handle it, so it has become something of a best practice. Think of Ajax.Request in Prototype, for example: It takes the URL of the request as its first parameter (this makes sense - there absolutely needs to be a URL - otherwise there’s not going to be a request!) and all its options as its second parameter. If Sam Stephenson decides that the class needs more options, he can just add them without breaking existing installations.
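
The same pattern reads just as naturally on the Ruby side of a Rails app. A minimal sketch - the helper name and options are made up for illustration:

def remote_link(url, options = {})
  options = { :method => :get, :confirm => nil }.merge(options) # defaults keep old callers working
  # ... build the link from url and options ...
end

remote_link('/products')                    # old call sites keep working
remote_link('/products', :method => :post) # new option, same signature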

Pre-defined behaviors

Writing your own behaviors isn’t your only option, though. lowpro ships with a couple of useful AJAX behaviors: Remote.Link, Remote.Form, Remote and Observed. You can find some additional (partly non-AJAX) behaviors in Dan’s SVN repository. Right now, only the DateSelector, Calendar and the drag&drop behaviors are implemented. I also have to admit that I haven’t tested these yet, so I don’t know whether they work correctly or not!

I’d give you an example of using Remote.Link but there’s already a great example out there written by Matt Aimonetti: In his article AJAX pagination in less than 5 minutes he shows how to use the great will_paginate plugin with AJAX.

Speaking of AJAX: If your application relies heavily on AJAX, you’ll find that quite often you add or modify elements of your page. Now imagine you add content to the DOM that contains one or more images with class rollover. Since lowpro adds behaviors when the dom:loaded event fires, the behaviors won’t get attached to elements that are added afterwards. Of course, Dan realized that and provided you with an option to ensure that behaviors are assigned properly after AJAX requests:

Event.addBehavior.reassignAfterAjax = true;

This registers an additional onComplete handler with the AJAX responders that lowpro defines. When triggered, it reloads the behaviors and reassigns them to the matching elements. This way, the behaviors you define are applied to each and every element that shows up on the page - either right from the start or after some AJAX requests.

One more thing while we’re talking about pre-defined behaviors. I haven’t talked about extending existing behaviors by subclassing them. If you take a look at the lowpro source you’ll find the following implementation of the Remote behaviors:

Remote.Base = {
  // code
}

Remote.Link = Behavior.create(Remote.Base, {
  // code
});

Remote.Form = Behavior.create(Remote.Base, {
  // code
});

I’ve left out the code for the sake of brevity.

What you can see here is that the Remote.Link and the Remote.Form behavior both extend the behavior of Remote.Base. This behaves quite similarly to Prototype’s Class.create - it’s just plain old subclassing that you’ll find in every object-oriented language. Since I assume that you understand the basics of OOP I won’t go into detail here.

Summary

lowpro gives you the ability to easily define and apply behaviors for given elements on the page. Compared to the approach of the first article, this gives you not only a clear separation of concerns (i.e. HTML for structure, CSS for design, JavaScript for functionality - without them cluttering up each other) but also clearly encapsulates functionality and makes it reusable and easily maintainable. If you discover a bug or a browser incompatibility (mind you, IE8 is under way …) you can just go in and fix the bug in one place (i.e. the behavior class) and deploy the new behaviors for all your projects. Ideally, you don’t even have to modify anything else - now that’s great, isn’t it?!

Finally, here are a few suggestions about using lowpro:

  • The first rule is easy: Whenever you’re defining how an element should react to certain events, use lowpro. It makes your code more maintainable and gives you a clear separation of concerns.
  • If you feel that you’re defining a behavior that you might use in future projects, externalize it in a behavior class. After that, it’s best to put it in an extra JavaScript file (e.g. behaviors.js) to make it easier to use in other projects.
  • When defining behaviors, try to be as general as possible. It’s absolutely okay to make a few assumptions about page structure and CSS (remember: Convention over Configuration) but general applicability is what sets a really good behavior apart.
  • If you need to configure your behavior objects, use parameters. If there’s one or two arguments that your behavior absolutely needs (like Ajax.Request needs a URL), use scalars - otherwise use an options hash. If you want to ensure backwards compatibility, don’t ever change the method signature - only add new parameters at the end and make them optional.

What do you think?

As usual, I’d love to hear your comments and suggestions. Also, feel free to ask any questions related to lowpro or JavaScript in general. I’ll do my best to answer them if I find the time.

Tips for Learning Rails

2008-06-14 00:00:00 +0000

Most of you guys probably know Working With Rails - or even hang out at their forums like I do. There’s a board with the fitting name of Ask a Rails expert where beginning and slightly advanced Rails developers can ask the more experienced ones questions. Lately, there seem to be too many beginners for too few experts - this inspired me to write an article with some tips for picking up and starting Rails.

Experience = Pre-Requisite!

To start with: Don’t pick up Rails or Ruby if you haven’t ever programmed before or if you haven’t yet developed at least a couple of sites that actually earned someone money and/or recognition.

The reason is quite simple: Rails is a very complex and opinionated framework that hides a lot under its covers. If you’ve never experienced troubles with tainted user input (like SQL injection, XSS, CSRF, etc.) first hand and never had to handle them by yourself (i.e. without a framework doing it automatically), you’re quite likely to overlook some details even if the framework does a good job at offering basic help.
Moreover, Ruby, in my opinion, is way too mighty a language to be given to a beginner. All its metaprogramming and powerful object-oriented features are hard to grasp if you’ve never dealt with anything remotely as complex before.
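
To pick just the SQL injection example from that list: the difference between interpolating user input into SQL and letting ActiveRecord quote it is a single line, but you have to know why it matters (the Post model is hypothetical):

# Vulnerable: user input is interpolated straight into the SQL string.
Post.find(:all, :conditions => "title = '#{params[:title]}'")

# Safe: ActiveRecord quotes the bound value for you.
Post.find(:all, :conditions => ["title = ?", params[:title]])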

Obie Fernandez sums it up in an interview with Gregg Pollack at RailsConf 2008, so hop over there if you’re interested.

In short: Unless you have a very good grasp of OOP and probably a few years of experience in web development with other languages (PHP, ASP (.NET), Java, etc.), Rails is - in my opinion - definitely not right for you. You may get the job done eventually, but it’s quite likely that you’ve written bad code - i.e. it may have security holes, may not be tested thoroughly enough, etc.

Book(s)/Screencast(s) -> Real Project!

Don’t learn by theory - use it in a real project!

After you’ve gotten familiar with the basics of Rails - by reading a good book like Agile Web Development with Rails (2nd edition covers Rails 1.2, 3rd edition - to be released in October - will cover Rails 2) or watching some screencasts over at PeepCode or Railscasts - get your hands dirty on a real project. Toying with Rails isn’t fun if all you’re doing is “Hello World” stuff - plus the progress you make is non-existent.

Standard Project > Highly Specialized Bulls$$t

When I say “Use it in a real project”, I mean a project that is not overly big and doesn’t require lots of “non standard” features.

Lots of people at the forums who say they’re totally new to Rails (or have only just started programming at all) ask about very specific topics: They want ActiveRecord to generate some CSV or Excel spreadsheets, use task scheduling, display pretty graphs and what not. If you test-drive something (be it a new car or a new computer), you take it down a familiar road … SLOWLY! Use it for your next small-scale business application that has some basic user authentication and a few standard features (but, of course, combined in a great and creative way) that can be built without relying on too many plugins. After you’ve gotten comfy with Rails, you can always move on to bigger and more demanding non-standard tasks.

Plugins You Don’t Understand = Plugins You Shouldn’t Use

Talking about plugins: Never, NEVER rely on plugins without fully understanding them.

Every plugin comes with some kind of documentation - either a readme file, some RDoc or even a full-blown website. If there’s no documentation, don’t use it! With 99% certainty, it’s crap! Also, I’d suggest only using plugins that are actively developed (unless there’s a really good reason they’re not actively developed - like small-scale plugins such as acts_as_list, acts_as_paranoid, etc. with only a few files and practically no hacks) and preferring plugins developed by established and experienced Rails developers like Rick Olson and other (former) Rails core developers.

That being said, you should be aware that even well-known and heavily used Rails plugins from experienced developers, such as Rick Olson’s popular restful_authentication plugin have (or at least had) their flaws. It’s a good idea to at least take a good look at the source code and try to understand what it does before blindly installing it and providing malevolent users with new security holes.

Learning from Other People’s Code

Read other people’s code.

Of course, again, I would focus on people that have a certain reputation in the Ruby/Rails community. Otherwise you’ll end up hacking away like in the old days with JavaScript (hah, sweet nostalgia). Take a look at some of the great open source projects you find at RubyForge. You should also try, step-by-step of course, to get to know the code of Rails itself. It’s a complex framework but the more you know about it, the better you’ll use it to your advantage.

Je ne comprends pas!

Code in English, God damn it!

Dozens of people at the WWR forums ask questions and post code snippets in Spanish, Portuguese, French, German, … Don’t do that! First of all - doesn’t it look wicked to you if you have your variables in your mother tongue and all the language and framework keywords in English? Second, having English code makes it easier to ask other people for help because everybody knows at least a bit of English. Help others help you and write your code (including database stuff and everything else that you can) in English - preferably American English.

Test Your Ass Off!

Use TDD or BDD!

The first Rails application I put in production didn’t have a single unit test or spec (and it still doesn’t). If you’re a beginner, skipping tests is even worse than when you’ve reached a certain expertise with Rails. Why? You don’t know yet how the framework handles certain situations - it might behave unexpectedly. Tests/specs to the rescue! Get in the habit of testing your Rails code - and start with your first app! I’d recommend using shoulda or rspec rather than the classic Test::Unit because I think they’re easier to understand if you haven’t done TDD/BDD before.
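
A first spec doesn’t have to be big, either. Here is a minimal sketch in rspec (the Post model and its title validation are assumptions for the example):

describe Post do
  it "should not be valid without a title" do
    post = Post.new(:title => nil)
    post.should_not be_valid            # runs the validations
    post.errors.on(:title).should_not be_nil
  end
end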

Two Weeks Back = Old!

Keep up to date with what’s going on in the Rails world!

Every day, there’s a lot going on out there that you should be aware of and there are lots of resources that help keep you up-to-date, such as the Rails Weblog itself, Railscasts, the RailsEnvy Podcast, RubyFlow, Planet RubyOnRails, Ryan Daigle’s Blog (including his series What’s new in Edge Rails), etc. You should definitely add these resources to your feed reader.

Rails Isn’t Always The Right Choice!

Last but not least: A good Rails programmer not only knows how to get a job done with Rails but also knows when Rails doesn’t fit!

I see lots of people writing really tiny apps with Rails these days just to be hip. By tiny I mean apps with only one or two models or even just little scriptlets that focus on a single purpose. Considering Rails’ memory footprint (a Rails app with a couple of Mongrel instances can easily use some 300-400 MB of RAM in heavy usage times), this is pure overkill.

For small apps, plain old PHP may still be a good, maybe even the best, choice. If you definitely want to do Ruby, maybe take a look at Merb - it unites some of the things we love about Rails in a much lighter framework.

Remember: If you’re a good programmer, you don’t select your jobs based on the language you want to use but rather you pick the best language/framework to get each job done - which may well mean you’ll write some apps with Rails, some with Merb and some with PHP, Java or ASP.NET (gross, did I really just suggest that?!).

Conclusion

These are the best tips that come to my mind for learning how to use Rails (or pretty much any framework). Take your time and learn carefully. Keep up-to-date - Rails is moving along at a steadier pace than it did over the last few years, but it’s still moving quite rapidly compared to other frameworks/languages.

And don’t ever forget when using Rails: Have a lot of fun! ;-)

PS: Some of my more regular readers might wonder why the second part of my Guide to Unobtrusive JavaScript still hasn’t been published. I’m sorry for the delay but I have been busy over the last few weeks. The article is nearly finished and will probably be published by the end of next week, so please stay tuned.

Guide to Unobtrusive JavaScript - Part 1

2008-05-19 00:00:00 +0000

This is the first of two parts in this series about Unobtrusive JavaScript. The second part can be found here.

Way back in March 2005 (shortly after Jesse James Garrett gave birth to the term AJAX), Rails was probably the first framework to introduce built-in AJAX support: Sam Stephenson wrote the now famous Prototype library and, together with David, included helper methods in Rails’ ActionView. Back then, it was called the JavaScriptHelper, which was later refactored and split into 3 separate helpers - JavaScriptHelper, PrototypeHelper and ScriptaculousHelper. No doubt, the early and tight integration of AJAX capabilities in the framework was one of Rails’ great USPs that helped push it to where it is now.

AJAX on Rails

Roughly speaking, the PrototypeHelper provides you with ways to create AJAX-backed forms, links, buttons, etc. More specifically, it uses basic HTML elements and adds the relevant attributes (onclick, onsubmit, etc.) to them. In 2005, this was an acceptable way (but even then, it wasn’t quite great) whereas nowadays it’s considered a really bad thing to mix JavaScript into your HTML. People call it Obtrusive JavaScript. It’s bad because, like with MVC, concerns should be separated: HTML builds the basic page structure, CSS provides style and layout and JavaScript adds some behavior on top of everything else. Wherever possible, mixing these three should be avoided (a good example of an exception from this rule would be the inclusion of style="display:none;" in some tags because some browsers don’t get it right if it’s not an inline style). It’s also bad because it tends to make refactoring a real nightmare (unless you’re coding very carefully - and let’s admit it: Who really does code carefully with JavaScript? ;-)).

This article is the first of a few to come that cover the topic of how to make your web application use the power of unobtrusive JavaScript. It’s basically an introduction to the topic - we’ll start out simple and more complex AJAX topics will follow in later articles. Also note that this first part is not Rails-specific and can be applied to pretty much every website you might develop.

UJS4Rails and lowpro

Some people realized the problem with Rails’ inline JavaScript and did something about it: Dan Webb started the UJS4Rails project that provided helper methods to use his Prototype-based library lowpro in Rails. He went as far as saying that link_to_remote and other remote functions that produce inline JavaScript should be removed from the Rails core (can’t find the link right now - does anyone have it at hand?).

However, UJS4Rails turned out not to fit Dan’s plans for lowpro, so it was ditched and has been unmaintained for quite some time. lowpro itself, on the other hand, is actively developed and can be used quite effectively to remove JavaScript from HTML.

Meet Event.addBehavior

lowpro’s key feature is a method named addBehavior that’s added to the Event class. Let’s take a look at what an addBehavior call could look like in your application.js. We’ll add a simple rollover function:

Event.addBehavior({
  // expects image names to be in the following format: some_image.extension
  'img.rollover:mouseover': function(event) {
    src = this.src.split(/\./); // split main part and extension
    parts = src.first().split(/_/);
    if(parts.last() != 'over') {
      parts.push('over');
      this.src = [ parts.join('_'), src.last() ].join('.');
    }
  },
  'img.rollover:mouseout': function(event) {
    src = this.src.split(/\./); // split main part and extension
    parts = src.first().split(/_/);
    if(parts.last() == 'over') {
      parts.pop();
      this.src = [ parts.join('_'), src.last() ].join('.');
    }
  }
});

What this does is basically split the src attribute of the image tag into multiple parts and check whether the last part is over. This is a little convention over configuration: When using this, I usually have two icons, e.g. add_product.png and add_product_over.png. Two things worth mentioning here before I show you how to use it in your HTML:

  • Inside a behavior, this refers to the element the event is bound to, in this case an image.
  • Behaviors are applied to CSS-style selectors. In the example above we add the rollover behavior to every image that has the CSS class rollover. The event is appended after a colon, leaving out the on prefix (so onsubmit becomes submit, etc.). If you don’t specify an event, behaviors are applied to the referenced elements when the DOM is ready (we’ll use this in the next example and in later articles a lot).

To use the rollover functionality you’ll have something like the following in your HTML:

<img src="/images/icons/ok.png" class="rollover" />

Quite simple, huh? No JavaScript cluttering up your HTML - just plain old HTML with a little CSS class definition. I’d be lying if I told you that this wasn’t possible without lowpro. In fact, just with Prototype itself you can get the same functionality with a little less code:

$$('img.rollover').each(function(element) {
  src = element.src.split(/\./); // split main part and extension
  parts = src.first().split(/_/);
  element.observe('mouseover', function(event) {
    if(parts.last() != 'over') {
      parts.push('over');
      element.src = [ parts.join('_'), src.last() ].join('.');
    }
  });
  element.observe('mouseout', function(event) {
    if(parts.last() == 'over') {
      parts.pop();
      element.src = [ parts.join('_'), src.last() ].join('.');
    }
  });
});

So why use lowpro if it makes you use more code instead of less? My reason is simple: In this case (and in many others) I favor clarity over mere brevity. The pure Prototype way - at least in my eyes - focuses on the element whereas the lowpro approach stresses the behavior of the element by adding the element to the selector itself. You’ll have to find out for yourself which style you prefer in which case - I promise there’ll be more than enough reasons to use lowpro in the next episode of this series.

Unobtrusive and accessible!

Anyway, let’s take a look at a more complex example. We want to build a fancy form that doesn’t use labels but instead helps the user by writing the field name inside the text fields and text areas. Additionally, we want to make sure that users who have JavaScript disabled can still handle the form (ever heard of Accessibility and Progressive Enhancement? I certainly hope so!). Let’s take a look at the JavaScript first:

Event.addBehavior({
  'label.text': function(e) {
    this.hide();
  },
  'input[type=text], textarea': function(e) {
    label = this.previous('label').innerHTML;
    this.addClassName('with_label');
    (this.tagName.toLowerCase() == 'input') ? this.value = label : this.innerHTML = label;
  },
  'input[type=text]:focus, textarea:focus': function(e) {
    label = this.previous('label').innerHTML;
    if($F(this) == label) {
      this.removeClassName('with_label');
      (this.tagName.toLowerCase() == 'input') ? this.value = '' : this.innerHTML = '';
    }
  },
  'input[type=text]:blur, textarea:blur': function(e) {
    label = this.previous('label').innerHTML;
    if($F(this).blank()) {
      this.addClassName('with_label');
      (this.tagName.toLowerCase() == 'input') ? this.value = label : this.innerHTML = label;
    }
  }
});

Wooha - getting a little more complicated here …

Our first two code blocks don’t use an event selector - this means they are applied to the matching elements as soon as the dom:loaded event fires. First, we hide all labels with class="text". Second, we select all text fields (note the CSS3 selector!) and text areas, read their labels, add them as their value/innerHTML (depending on the tagName attribute we decide whether it’s a text field or a text area) and add the CSS class name with_label.

The last two code blocks handle the selection/deselection of text fields and text areas. We want to make sure that user input is preserved - that’s what the if clauses are for: When focused, only empty the text field/area if the text equals the label’s text, when blurred, only restore the label’s text if the field is blank. Moreover, we add/remove the with_label class as appropriate. If you want, you could also remove whitespaces from the fields using strip().

Let’s take a look at the HTML:

<form action="/products" method="post">
<p>
  <label class="text" for="product_name">Name</label>
  <input id="product_name" name="product[name]" size="30" type="text" />
</p>
<p>
  <label class="text" for="product_description">Description</label>
  <textarea id="product_description" name="product[description]"></textarea>
</p>
<p>
  <input id="product_submit" name="commit" type="submit" value="Create!" />
</p>
</form>

Not much going on here - we define a form with two items (a text field for the product name and a text area for the description) and their respective labels.

The interesting thing happens when you view the page in your browser: As intended, none of the labels are there and both form fields have their respective label value as their content. If you click on one of the fields, it will be emptied. If you leave the field without changing its value, the value will be reset to its label text.

The other interesting thing happens when you deactivate JavaScript in your browser (pretty easy if you’re using Firefox with Chris Pederick’s great Web Developer Toolbar - just click on Disable >> JavaScript >> All JavaScript): All of a sudden the labels are there and the form items don’t carry their values anymore! So deactivating JavaScript pretty much produces the output you’d expect from the above HTML snippet: We only hid the labels and wrote their values to their corresponding elements using JavaScript, so no JavaScript means no action.

Upcoming

That’s it for the first part of the series. I know the two examples I’ve shown you don’t really show why it’s a good idea to take a good look at lowpro. Still, I’ve used them as examples to get you familiar with the syntax and basic ways of using lowpro. In the next part of the series which I’ll release in a few days (it still needs some love), I’ll show you how to use lowpro more efficiently so you’ll see how great it actually is.

What do you think?

Let me hear your opinions on lowpro and unobtrusive JavaScript. I’d also love to hear from people who use lowpro and would like to share their experiences.

Do we really need Controller and View tests?

2008-05-14 00:00:00 +0000

Every project has unique requirements but one requirement always stays the same: It needs to be tested. Thoroughly.

In the last few projects, I’ve grown quite fond of Behavior Driven Development, or, more specifically, rspec. rspec - when used with Rails - allows you to independently test your models, controllers, views and even helpers. When used correctly, it serves as both test and documentation of your code and what it should or should not do.

So rspec gives you the possibility to test every bit of your application, including real user stories (using the stories feature). If you want, you can (and maybe should) even sit down and write the specs and stories together with your customer. This helps both you and the customer get a better understanding of what the application should and should not do.

But the question is: What do you really need to spec?

(Note: Usually when I’m talking about “specs” in this article, the same applies to unit/functional tests and vice versa.)

Rules of thumb

Many popular (and less popular) Rails figureheads have posted their opinions on what a well designed Rails application should look like. Here’s the gist of it:

  • Don’t overload controllers with responsibilities. Put logic in your models instead. This approach is usually called “skinny controller / fat model” (Jamis Buck wrote about it here and here, Courtenay of caboo.se also wrote an article) - see the sketch after this list.
  • Keep views clean of logic. Instead, put logic in your models and helpers.
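
To illustrate the first rule, here is a minimal sketch of what “skinny controller / fat model” looks like in practice (the Topic model and its close! method are made up for the example):

# The model owns the domain logic ...
class Topic < ActiveRecord::Base
  def close!
    update_attributes!(:closed_at => Time.now)
  end
end

# ... so the controller stays a thin layer of glue code.
class TopicsController < ApplicationController
  def close
    @topic = Topic.find(params[:id])
    @topic.close!
    redirect_to topic_path(@topic)
  end
end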

If you’re following these two rules, this leaves you with fat models and helpers and skinny controllers and views. Obviously, something that has got lots of logic in it should be tested thoroughly - so we should definitely test models and helpers.

The question is, should we also put efforts into testing the - comparatively - simple controllers and views?

Specing controllers and views

If you’ve ever done it, you know that testing views and especially controllers can be quite tedious. Let’s take a look at the specs that rspec creates when using the rspec_scaffold command:

./script/generate rspec_scaffold Product name:string description:text

Running rake stats gives me the following output:

+----------------------+-------+-------+---------+---------+-----+-------+
| Name                 | Lines |   LOC | Classes | Methods | M/C | LOC/M |
+----------------------+-------+-------+---------+---------+-----+-------+
| Controllers          |    95 |    64 |       2 |       7 |   3 |     7 |
| Helpers              |     5 |     4 |       0 |       0 |   0 |     0 |
| Models               |     2 |     2 |       1 |       0 |   0 |     0 |
| Libraries            |     0 |     0 |       0 |       0 |   0 |     0 |
| Model specs          |    11 |     9 |       0 |       0 |   0 |     0 |
| View specs           |    90 |    68 |       0 |       0 |   0 |     0 |
| Controller specs     |   374 |   288 |       0 |      11 |   0 |    24 |
| Helper specs         |    11 |     7 |       0 |       0 |   0 |     0 |
+----------------------+-------+-------+---------+---------+-----+-------+
| Total                |   588 |   442 |       3 |      18 |   6 |    22 |
+----------------------+-------+-------+---------+---------+-----+-------+
  Code LOC: 70     Test LOC: 372     Code to Test Ratio: 1:5.3

Controller specs

As you can see, a fully RESTful default controller (i.e. it responds to the 7 default REST actions) that serves two content types (HTML and XML) takes up 288 lines of test code for only 64 lines of actual implementation.

If you take a closer look at the generated specs, you can see that rspec splits the controller test in two parts, namely specs of the controller itself (./spec/controllers/products_controller_spec.rb, 313 lines) and specs of the routing (./spec/controllers/products_routing_spec.rb, 61 lines). Bear in mind that we’re talking about a really basic controller here.

What happens if we want to add other actions to the controller like - say - a way to order our products list? What if we want to build an AJAX interface? What if we want to support another content type (e.g. PDF)? Every little addition to the controller bloats the spec a little more, depending on whether all actions need additions and therefore additional specs (like adding AJAX support) or only certain actions do (like PDF support for maybe the show and index actions).

View specs

What about view specs? These tend to be a lot shorter. They usually “mock” the controller by assigning some instance variables in the before block and then test for the presence of certain HTML tags and structures. Most people (Geoffrey Grosenbach among others) seem to agree that view specs make sense as long as they don’t get too specific. Geoffrey mentions in his screencast that he usually only tests for things that really matter: If there needs to be a form on a page (e.g. the new template), you should check that it is present and maybe also check that all the fields are there:

response.should have_tag("form[action=?][method=post]", products_path) do
  with_tag("input#product_name[name=?]", "product[name]")
  with_tag("textarea#product_description[name=?]", "product[description]")
end

What if we want to spec our AJAX stuff? rspec provides us with the have_rjs matcher that basically wraps Rails’ assert_select_rjs. If you’ve ever worked with assert_select_rjs, you know that everything quickly gets out of hand, especially if a certain AJAX call updates multiple elements on the page.

So should we spec controllers and views?

This post is about whether or not - in my opinion - we should test controllers and views, and I haven’t answered that question yet. So here goes.

  • I tend to trust the framework a little: I don’t test the standard stuff that everyone uses in their everyday application. So if my update action only calls update_attributes on a certain model and then redirects to some page, I don’t write a spec for it. If I do, it tends to be short and to the point - compared to the rspec standard test that has over 50 lines of code!
  • When adding custom actions, it’s a good idea to test them. So if you add an action to reorder your products list, you should definitely write a test for it (see the sketch after this list). If you do, however, I’d suggest breaking one little rule of TDD/BDD: In controller specs, I don’t like the mantra of having only one assert or should per test. You can do this because in controller specs you usually only test that certain instance variables were set and that a certain template is rendered or the user’s being redirected.
  • Test routing whenever you use anything more sophisticated than basic map.resources. Rails parses the routes file in the order the routes are defined, meaning if a URL is caught by two different routes, the first route wins. Testing all available URLs/routes in your application ensures that you don’t accidentally map one URL to multiple routes (the sketch after this list includes a routing spec).
  • Testing views makes sense if you don’t go into too much detail. It’s a good idea to test for the presence and basic structure of a form or an error/success message.
  • It makes plenty of sense to test parts of your page that have restricted access. If your page has an admin area or some pages that only logged-in users can view, you should definitely test that this works correctly. It saves you the hassle to keep dozens of test accounts for every different user type and situation, especially if you also have some kind of role-based permission system where - theoretically - an infinite number of setups is possible.
  • You should absolutely test RJS templates. It’s hard enough to debug applications that make heavy use of AJAX anyway. Rails’ assert_select_rjs is one mighty tool to make it a little easier, once you’ve understood how to use it. Familiarize yourself with the options and test, test, test.
  • If you provide an API, put every effort into specing and testing it. Especially if you want to charge people for using it! ;) Test that XML/JSON/whatever responses work correctly. Test for status codes. Test everything. And most importantly: Within your application you can change specs if a spec turns out to be incorrect and/or outdated. Don’t do this with APIs once you’ve released them to the public, unless you absolutely know what you’re doing! Your API should always be backwards compatible; that means you shouldn’t change existing specs but only extend them.
  • Since this is an article about controller and view tests, I haven’t really talked about testing models and helpers. That doesn’t mean you shouldn’t test them thoroughly. Test all your models and helpers. If you have a code-to-test ratio over 1.0 (i.e. fewer lines of code in tests than in models/helpers), it’s quite likely that something’s terribly wrong.
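
To make the points about custom actions and routing concrete, here is a minimal sketch in the rspec syntax of the time - the reorder action, the Product.reorder! class method and the route are all assumptions made up for the example:

describe ProductsController, "POST reorder" do
  it "should reorder the products and redirect to the list" do
    Product.should_receive(:reorder!).with(%w(3 1 2))
    post :reorder, :order => %w(3 1 2)
    response.should redirect_to(products_path)
  end
end

describe ProductsController, "routing" do
  it "should map POST /products/reorder to the reorder action" do
    params_from(:post, "/products/reorder").should ==
      { :controller => "products", :action => "reorder" }
  end
end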

Check how well your code is covered with rcov

Finally, I’d like to recommend a tool that can be quite useful: rcov. rcov basically measures how well your code is covered by your specs or tests. It executes all specs/tests and analyzes which lines have been executed and which haven’t. It then creates some neat HTML documents that tell you exactly how well your code is covered and highlight the areas that haven’t been executed. Install the gem by typing

$ gem install rcov

You can then run it using the following command:

$ rake spec:rcov

This generates (or updates) a directory named “coverage” in your application’s directory, containing multiple HTML files. Just open the index.html in your browser of choice and take a look at the output.

What do you think?

What are your opinions on specing/testing controllers and views or BDD/TDD in general? Do you think I’m wrong? Do you want me to write a more extensive article about BDD/refactoring/etc.?

Let me know what you think!

Sidebars and other dynamic content

2008-04-25 00:00:00 +0000

In episode #100 of his screencasts, Ryan Bates shows - among other things - how to dynamically fill elements of your layouts.

Imagine your layout contains a <div id="sidebar"> where you want different sidebar content, depending on the page the user’s currently on. Ryan uses content_for(:side), which is definitely the way to go, but I think his approach isn’t as DRY as it could be.

Instead of using content_for(:side) in every template, I’d suggest extracting that bit into a helper method and putting it in your ApplicationHelper module like this:

def sidebar(&block)
  content_for(:sidebar, &block)
end

After that, you can just call the newly created helper method in your views:

<% sidebar do %>
This goes in the sidebar!
<% end %>
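
For completeness: the layout then has to yield the collected content somewhere. A minimal sketch, assuming the <div id="sidebar"> mentioned above:

<div id="sidebar">
  <%= yield :sidebar %>
</div>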

This approach also gives you the opportunity to include some logic in the helper. Consider the possibility of showing an admin area as a part of the page if (and only if) the user has admin privileges. In one of my projects, I used something like the following code:

def admin_area(section='', &block)
  if self.current_user.admin?
    concat(content_tag(:h2, "#{section.humanize} Admin", :class => "admin_options"), block.binding)
    concat(content_tag(:div, capture(&block), :class => "admin_options"), block.binding)
  end
end

In the views, it looks like this:

<% admin_area('forum') do %>
... display the admin area ...
<% end %>

This way, you keep logic out of the view so it stays nice and clean.

Database agnostic != database ignorant

2008-04-24 00:00:00 +0000

Every now and then I stumble across a blog post that triggers something in me. Lately, this was the case for an article I read over at RubyFleebie about a Rails programmer having trouble with the very basics when it comes to databases. While the conclusion of the post was quite satisfying (he thinks that Rails developers should start to care about the database they’re using, despite relying on ActiveRecord), the content itself scared me: This guy, so it seems, doesn’t know anything about foreign keys, indexes or even the differences between outer and inner joins. I don’t mean to insult him, but it really, really scared me.

Back when I started coding, a web developer had to be familiar with all aspects of the whole application: database, HTML/CSS/JavaScript and the code that links it all together. I don’t think that this requirement has changed a lot - or in fact, it probably hasn’t changed at all. But frameworks like Rails make it easy to hide some of the aspects behind their mighty layers of abstraction.

Anyways, I’m not here to complain about that. Instead, I’d like to use this post to give you a quick rundown of the most important database-related stuff that I think is mandatory for every web developer, whether they use Rails or not. While I’m certainly no professional DBA, I know a thing or two about database design and I think some of it might help you along the way.

JOINs

In databases, joins are used to link multiple tables of a database together so that multidimensional (so to speak) data can be retrieved using just a single query. This is useful in two major aspects:

  • First of all, every database query is a - comparatively - time consuming operation. More often than not, you’ll find that reducing the number of queries in your application will give you a major performance increase. This is basically what Rails’ eager loading is all about:
Person.find(:all, :include => :addresses)

This loads all people including their addresses in just one query (see the sketch after this list for the contrast with the naive approach).

  • Secondly, it depicts reality a little more accurately. Your data is hardly ever only two-dimensional. You’ll have people that have addresses, orders that consist of items, topics that have posts, etc. Using only one query to fetch all related data is not only more efficient, but also seems logical.
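
To see the first point in practice, here is a quick sketch of the N+1 query pattern that eager loading avoids (Person/Address models assumed, as above):

# Without eager loading: one query for the people, plus one query
# per person to fetch that person's addresses.
Person.find(:all).each do |person|
  puts person.addresses.size
end

# With eager loading: people and their addresses arrive together.
Person.find(:all, :include => :addresses).each do |person|
  puts person.addresses.size
end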

An important thing to know about joins is that a join always connects two tables. You can, of course, chain multiple joins to fetch data from more than one table. But a join, by itself, can only connect two tables at a time. How the tables are connected is decided by the type of join you apply: You can outer or inner join tables.

INNER JOINs

Probably the more common way to join tables is the so-called INNER JOIN. An INNER JOIN connects all matching rows of both tables.

Let’s take another look at the people/addresses example I mentioned earlier and assume that there are two people, George (id = 1) and Alice (id = 2). George has one address (i.e. there is one row in the addresses table with person_id = 1) and Alice has two addresses. Let’s join them using an INNER JOIN:

SELECT * FROM people INNER JOIN addresses ON addresses.person_id = people.id

This will output a set of three rows - one for George and two for Alice. The set includes all rows where the join condition (ON addresses.person_id = people.id) matches.

Now we add a third person, Bill, who doesn’t have an address. If we run our query again, the output doesn’t change because the INNER JOIN only includes rows with matches in both tables. If we want to have one of the two tables included in the result set, regardless of whether there’s a corresponding row in the second table or not, we have to use an OUTER JOIN.

Note that the order in which you’re joining the tables doesn’t make any difference with inner joins. You could write the following and the result would be the same:

SELECT * FROM addresses INNER JOIN people ON addresses.person_id = people.id

OUTER JOINs

An OUTER JOIN includes one of the two joined tables completely, even if it doesn’t find a matching row in the other table. Which table is going to be complete (i.e. not all fields are NULL values) and which may be incomplete is decided by adding a kind of “direction” to the join, making it a LEFT OUTER JOIN or RIGHT OUTER JOIN. But let’s take a look at an example that makes this easier to understand.

Let’s revisit George, Alice and Bill. If we change our INNER JOIN to a LEFT OUTER JOIN, Bill is included in the result set and all fields from the addresses table are NULL for his row.

So, in short, the added keyword LEFT/RIGHT decides, which of the joined tables will get fully included in the result set. Note that, contrary to the INNER JOIN, with the LEFT/RIGHT OUTER JOIN the order of the tables does actually matter - if you swap the tables, your result set will usually change.

When to use what

When to use which kind of join largely depends on what you’re trying to achieve. A general rule of thumb would be something like this:

  • If you can be sure that you only need records with corresponding rows in both tables or if you somehow enforce (at database or application level) that each record in one table must have a corresponding row in the other table, an INNER JOIN is the way to go, because it’s usually fast and doesn’t yield any NULL values you have to deal with.
  • If what you’re trying to achieve doesn’t fall in the category mentioned above, use an OUTER JOIN. Using LEFT/RIGHT OUTER JOIN is mostly a matter of personal preference (I prefer LEFT OUTER JOINs), unless you’re chaining lots of tables in one big query - then you might need to alternate between LEFT and RIGHT OUTER JOINs.

Indexes

Indexes (aka keys) are used to index records (who would have thought that?!). Basically, using indexes can result in an enormous performance boost if the indexes are applied correctly, because they help the database server to find the desired results more efficiently. This can be achieved because without an index, databases usually perform what is called a full table scan. Consider the following query:

SELECT * FROM people WHERE last_name = 'Smith'

If there is no index on the last_name field, the database will go through the whole table and filter every record with a last_name value of “Smith”. While this is not a problem with only a few hundred records, it quickly gets inefficient if you have multiple thousand records stored in the table. An index, in this case, will build a kind of virtual table that is ordered by last_name. When the database is queried like above, it will search until it finds the first occurrence of “Smith” and then select all records until it finds the first record where last_name doesn’t equal “Smith”. Obviously, this can save a lot of time.

When and where to use indexes is a science by itself (contrary to what people might tell you, DBA is actually not a redundant job for all those tech guys born before 1960), especially when it comes to combined indexes (i.e. indexes across multiple fields). Nevertheless, here are some indicators of when an index might be useful (a migration sketch follows the list):

  • Any field of a table that gets mentioned in WHERE clauses may make a good index. Let’s assume that you have a table containing blog posts and you want to filter all posts by a certain user, using something like
SELECT * FROM posts WHERE author = 'Clemens'

In this case, author would be a potential index. Same goes for fields used in ORDER BY clauses - they’re usually more efficient when indexed, especially with large amounts of data.

  • Any field that is part of a join condition is definitely a good choice for an index. In other words: Index your foreign keys. Always! Example:
SELECT * FROM people LEFT OUTER JOIN addresses ON addresses.person_id = people.id

In this example, addresses.person_id should definitely be indexed. (Note: people.id is, hopefully, indexed as primary key already.)

  • Any field that is used by an aggregate function like SUM, COUNT, etc. may be a good candidate as well. If you don’t write your own SQL and solely rely on ActiveRecord, you will hardly ever (if at all) use aggregate functions.
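
In Rails, adding these indexes is a one-liner each in a migration. A minimal sketch (table and column names follow the examples above):

class AddIndexes < ActiveRecord::Migration
  def self.up
    add_index :addresses, :person_id # foreign key used in join conditions
    add_index :people, :last_name    # frequently used in WHERE clauses
  end

  def self.down
    remove_index :addresses, :person_id
    remove_index :people, :last_name
  end
end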

With most database engines, there are multiple types of indexes. For example, in MySQL there are primary keys, unique indexes and “standard” indexes. I think this is really straightforward, but I’ll explain it briefly anyway.

  • There is only one primary key per table (hence the name) and it is unique for the table (i.e. there is only one record with a certain value in this field). In Rails, this field will usually be called “id” and be an auto-incremented integer of some sort. Don’t change Rails’ behavior here - it’s a real pain!
  • The “standard” index is the most common index. You’ll use it for most foreign keys and to speed up your WHERE statements by indexing the fields that are part of the clause.
  • A unique index can be used to make a second field (in addition to the primary key) unique for a table. A good example would be making sure that a given e-mail address can only be used once in your application’s users table - just make the field unique. Of course, you always need some application logic (preferably in the model layer) to handle the unique violation if a second record with an already existing value is inserted (see the sketch after this list).
  • Combined indexes are strictly speaking not a separate type of indexes - you can have combined primary keys, combined uniques and combined “standard” indexes. Since combined indexes are - I think - a very difficult topic and could easily make a full-blown article, I decided to put them out of scope.
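
For the unique index case, the database constraint and the model layer check pair up nicely. A sketch, assuming a users table with an email column:

# In a migration: enforce uniqueness at the database level.
add_index :users, :email, :unique => true

# In the model: catch duplicates with a friendly validation error
# before the database has to raise.
class User < ActiveRecord::Base
  validates_uniqueness_of :email
end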

Summing up

ActiveRecord and similar ORMs take a lot of pain and time-consuming tasks away from the developer. That doesn’t mean, though, that they exempt the developer from learning and knowing what’s going on behind the covers. I hope my article gave some people an insight into the inner workings of a database. Maybe, some day I’ll write another article on how to optimize your database.

Comments are much appreciated!

UPDATE: pjm suggested writing a little paragraph explaining why an index may be helpful. I added a short section to the article - I hope it’s clear enough. Further information can be found in the MySQL manual about how MySQL uses indexes - most of the stuff that is mentioned there isn’t MySQL-specific but can be applied to other popular database engines as well.

Ways of teaching/learning Rails

2008-04-04 00:00:00 +0000

I’ve had discussions with some lecturers at my university lately whether or not Rails would fit the needs of an Austrian University of Applied Science. One of the main topics we discussed was the teaching method - is Rails more suitable for bottom-up or top-down teaching? What do I mean by that?

Bottom Up

When learning/teaching a topic in a bottom-up style, you start with the very basics - meaning arithmetic operations, loops, basic OOP stuff (instances, classes, methods, class-/instance-level) and so forth. What this means for a student is that they don’t see anything “real” for the first few weeks or even months of their training (depending on how tight the schedule is). Instead they’re forced into learning stuff without seeing the actual value of what they’re learning. Good thing is, though, that probably nobody will be overwhelmed by the sheer complexity of a programming language because they take everything in small steps.

Top Down

Top down in the context of learning means that you define some things as given - “God giveth thee arrays so thou shalt use them and ask questions later”. Someone is walked through an application (say, the omnipresent blog) and confronted with some real code (at least in parts). They see some value in their work - and later, they can also ask questions. Although that probably won’t be necessary since we’re doing a top-down approach, meaning that after showing the whole thing we break it up into little comprehensible units and explain them one by one (“This is a loop that goes through all the posts - and here is how a basic loop works”). This, of course, forces people to accept a whole lot of rules before they even know any of the basic stuff. And it usually involves people being overwhelmed and giving up because all that shit is way too complex for them.

Mixed style

Probably a mixed style is the way to go. Let’s take the example of teaching Rails, including the basic Ruby stuff. I’d probably give a quick overview of Ruby, assuming that everyone in the class has at least a vague background in any modern programming language. The overview would probably be quite similar to what the Agile book features as an appendix. I’d definitely do this bottom-up, doing stuff like arithmetic and simple OOP in an IRB shell.
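
The very first IRB session might be no more than something like this (expected output shown as comments):

1 + 2 * 3           # => 7
"ruby".capitalize   # => "Ruby"

class Greeter
  def initialize(name)
    @name = name
  end

  def hello
    "Hello, #{@name}!"
  end
end

Greeter.new("class").hello # => "Hello, class!"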

As soon as I’m switching over to Rails, I’d go for the top-down method first, showing the students a small but full-blown project and then breaking it down into small units of specialized topics (Views, Controllers, Models, Basics of Routing/REST, etc.) to make them understand how all the elements of the framework are tied together to work flawlessly.

Oh, yeah, and I’d definitely make them do a real project with a real customer (i.e. me), giving every team of, say, 4-8 developers a real application and having them work on it step by step.

What do you think?

I know we don’t have a real readership right now, but I’d be happy to see some comments if you happen to stumble across this post! ;-)

Three recurring challenges in web development

2008-03-29 00:00:00 +0000

While there are many things that keep us web developers busy, I think there are three challenges that keep coming back to bug us all the time. I’m not talking about any special pitfalls when it comes to programming itself. Mostly, I’m talking about limitations that we just can’t overcome - the user’s choice.

What do I mean by choice?

Browsers

For starters, a user can choose which browser they use. When I started out coding some years back, usage logs showed us some 90%+ use of Internet Explorer 5(.5)/6. Although it was always good to also think of those poor non-IE users, there were also considerations as to whether or not it was actually worth the hassle (and the money) to fix IE-induced bugs in other browsers.

Along came Firefox to make the world, err … web, a better place - meaning that all of a sudden it made sense to actually stick to the rules set by the W3C. Soon, more than 25% of users actually used Firefox, going up to more than 35% right now - which makes it the market leader if you count IE6 and IE7 separately (usage statistics taken from here).

Now, what does this mean for web developers (i.e. programmers and designers)? Nowadays, we have to worry about satisfying the (X)HTML/CSS needs of all kinds of different browsers. Sites like Position Is Everything keep us up to date with the latest tricks for getting IE to produce valid visual output. Today’s web developer really needs multiple environments to validate that their code produces acceptable results on Windows, Linux and Mac OS X with all their different browsers. Günther probably spends a good third of his time making his perfectly valid HTML and CSS work in IE6 and IE7.

JavaScript

I’ve had huge discussions about using JavaScript, especially since AJAX surfaced and made our lives on the web more interactive. While one can safely assume that a good deal of people have JavaScript enabled in their browsers, it is still mandatory to think of those few souls that don’t - either because they don’t want it active or because they’re, for example, using a mobile device to surf the web.

I’ve gotten used to a certain approach when it comes to relying on JavaScript. My answer is: don’t. When developing a site, I usually start out building it for use without JavaScript. As soon as the basic stuff works, I start adding AJAXy stuff (if I use it at all) and some things to make a user’s life easier, such as pre-filled form fields that automatically clear their values when clicked. This way, people with JavaScript deactivated can still use the site while others are rewarded with some fancy effects and added functionality - see the sketch below. I would generally suggest sticking to that rule when building a site for a user base you can hardly determine or instruct upfront.
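
Rails (2.x, at the time of writing) makes this approach pretty easy. Just as a sketch - actions and instance variables are made up - link_to_remote accepts a regular href as a fallback, so the link degrades gracefully when JavaScript is off:

<%# the link works as a plain GET request without JavaScript (thanks to :href) %>
<%# and is upgraded to an AJAX request when JavaScript is enabled %>
<%= link_to_remote "Show comments",
      { :url => { :action => "comments", :id => @post } },
      { :href => url_for(:action => "comments", :id => @post) } %>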

The case is different for stuff like admin areas. With backends and admin stuff, you can usually rely on the fact that only a few people use them, and you can set up a constraint telling those people that they need JavaScript enabled. While it may still be a good idea to maintain a fallback for use without JavaScript (say, the user is on a business trip and uses a computer in an Internet café or logs in using their iPhone), I think in this case it actually is valid to require JavaScript for functionality.

Cookies

While fixing browser bugs is a necessity due to market shares and incompatibilities, and the whole JavaScript optimization doesn’t necessarily mean a lot of extra work if done properly (especially when using Rails), dealing with users who have deactivated their cookies can quickly become a real pain in the ass. Most users think of cookies as vehicles for advertising and other crazy stuff; they don’t realize that even a simple thing such as a shopping cart usually relies on cookies. Even if the site doesn’t store the shopping cart itself in cookies, it almost certainly uses a cookie to store the user’s session ID.

Of course, there are alternatives, like PHP’s way of adding its wicked PHPSESSID to each and every URL. Not only does this look ugly, it also bears some crucial security issues - all of a sudden, you have to validate that the supplied session ID is actually used by the user it was generated for (i.e. store session IDs and their corresponding user IP addresses in the database), expire sessions manually and whatnot. This usually just gives me a headache and isn’t worth the fuss.

I tend to be harsh when it comes to users disabling their cookies - I just don’t allow them to view the site. I’d take the standard approach: check whether or not cookies are enabled and redirect non-cookie users to a page that politely asks them to accept cookies for the site (see the sketch below). I do this because I simply think it’s not economic to spend hours just to pave the way for some 0.something per cent of extra-paranoid users (especially since these users are - in my humble experience - usually the ones that buy/use the services least and complain about it in the forums/support area most … so you don’t even really want them to use your site).
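
Here’s roughly what that standard approach looks like as a before_filter - a sketch with made-up filter, controller and parameter names: set a test cookie on the first request and check whether it comes back on the next one.

class ApplicationController < ActionController::Base
  before_filter :require_cookies

  private

  # sets a test cookie and bounces the request once to see whether the
  # cookie actually comes back - if it doesn't, cookies are disabled
  def require_cookies
    return if cookies[:cookie_test]       # cookie came back - all good
    if params[:checking_cookies]          # second request and still no cookie
      redirect_to :controller => 'pages', :action => 'cookies_required'
    else
      cookies[:cookie_test] = 'ok'        # set the test cookie ...
      redirect_to params.merge(:checking_cookies => '1') # ... and retry once
    end
  end
end

# note: the controller rendering the "please enable cookies" page has to
# skip this filter (skip_before_filter :require_cookies), otherwise those
# users would end up in a redirect loop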

Summing up

These are three recurring issues that I see in my everyday work. As I said, I’m strongly in favor of ensuring cross-browser compatibility (although you should really take a peek at Andy Clark’s Transcending CSS, where he makes some valid arguments against hacking HTML/CSS just to make sure web pages look the same in every browser). I usually also agree when people say that a page should be functional without JavaScript. But: I definitely don’t want to spend hours fixing security holes just because some paranoid users deactivated their cookie support.

That being said, I’m looking forward to hearing your opinions about it.

Rails 2 German released!

2008-03-19 00:00:00 +0000

As I announced some time ago, my German translation of Ryan Daigle’s Rails 2 book is finally published at PeepCode.

Check it out here. If you have any comments or suggestions, don’t hesitate to drop me a line.

Template languages - the easier alternative?

2008-03-06 00:00:00 +0000

I’ve been thinking about template languages lately.

Some years ago, when I was still coding PHP, I decided to give Smarty a try. Why? Well, it seemed to have a nice enough syntax. And if it’s not real code, it sure is easier for a designer without deep knowledge of PHP to edit, right?

Turns out that it isn’t. If you’re working with a template language, you still need to grasp basic programming concepts, such as loops and output formatting. As soon as your structures become a little more complex, the once beautiful syntax quickly becomes cluttered and error-prone.

The reason why I’ve been thinking about template languages is that I’ve stumbled across Ruby Waves - which apparently uses Markaby and will also support HAML and SASS pretty soon. Markaby lets you write your view templates in pure Ruby code, while HAML and SASS replace HTML and CSS with a terser, indentation-based syntax.

Don’t get me wrong here - I like what these guys do from a programmer’s perspective. But I don’t think it’s a really DRY approach when it comes to workflow. The designer (who usually doesn’t know the template language of your choice) will build their stuff in plain old HTML and hand it over to a programmer who then converts it into Ruby code (roughly as sketched below) - so basically the same work is done twice, once by the designer and once by the programmer. And I hear there are still people out there that like to design their web pages using a WYSIWYG editor - which, of course, only speaks HTML.
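
To illustrate, here’s roughly what a designer’s plain HTML post list might end up as once a programmer has ported it to Markaby (hypothetical markup, of course):

# the designer's <div id="posts"> with an <h2> and a <p> per post becomes:
div.posts! do
  @posts.each do |post|
    h2 post.title
    p  post.body
  end
end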

Performance is yet another concern: template languages need to be parsed, which slows things down, especially on servers with high traffic or older hardware.

And if this isn’t enough already, you may just ask yourself: What is really the big difference between these two:

  <h2>My Title</h2>
  h2 "My Title"

It’s subtle, isn’t it?

Peepcode Rails 2 PDF German - DONE!

2008-03-03 00:00:00 +0000

I finished translating Ryan Daigle’s bestselling Rails 2 book (published by Peepcode) over the weekend. As soon as it’s been through a final layout check, it will be available on the Peepcode site for $9 or as part of your subscription.

If you have any feedback regarding my translation, please be sure to drop me a line (either via e-mail or in the comments).

railway.at online!

2008-01-02 00:00:00 +0000

Hello world and welcome to railway.at,
(it’s cheesy, isn’t it?)

here we are, finally online. We love our clients, which is why we did their stuff first before dealing with our own homepage.

Read all about who’s behind railway in our about section. The other sections (portfolio, services, etc.) will be updated soon.

In the meantime, feel free to drop us a line at office@railway.at. We’ll be back with some updates in the next few days.

Cheers to you all,
G & C