
ActiveWarehouse/ETL and Reflections on BI for Rails

I've recently been considering the opportunity to apply Ruby and Rails goodness to mainstream Business Intelligence applications.

During my research into prior art I discovered Anthony Eden's ActiveWarehouse and ActiveWarehouse-ETL projects, and gave them a test drive using a fictitious "Cupcakes Inc" site.

I presented this at the Jan 2010 Singapore Ruby Brigade meetup held at hackerspace.sg. My "point-of-view" slides are embedded below, and you can find the sample project and doco on github.

Conclusions?

  • ActiveWarehouse is a textbook implementation of classic data warehousing techniques. That was clearly Anthony's intention, but it also means it does not really attempt to explore how data warehousing might be approached quite differently with Ruby and Rails

  • ActiveWarehouse/ETL are not for the faint-hearted. When you get them working, they work well, but the lack of documentation means you will inevitably end up reading the sources to figure it all out

  • I have concerns about scalability. Having worked on terabyte warehouses using "classic" technology, I know just how far you have to push databases in order to scale. This bears more investigation and testing before it would be sensible to commit to ActiveWarehouse for a large-scale DWH implementation

Nevertheless, ActiveWarehouse and ActiveWarehouse-ETL are interesting projects, and the underlying implementations make for some educational code reading. Hopefully my slides and the Cupcakes sample project will add a bit to the available documentation, and give a bit of a leg up to anyone interested in checking out these projects;-)


Soundtrack for this post: Information Overload - Living Colour

Two Ruby Books To Own..

If I had to pick two..

Design Patterns in Ruby by Russ Olsen is the first technical book in a very long time that I have enjoyed reading from cover to cover.

It's more than just a naïve translation of the classic GoF patterns. Olsen manages the dual trick of not only demonstrating how the classic patterns can still be relevant in Ruby, but also how to approach them with the full power of ruby at your disposal.

I liked the way that Olsen avoided doing bare minimum implementations. So when looking at the Composite pattern, he spruces things up with a little operator overloading. And where ruby affords a number of possible approaches, these get discussed and compared (like with the Decorator pattern).

The final chapters in the book present a few additional patterns that go beyond the GoF and are particularly topical and relevant for ruby: DSLs, meta-programming, and convention over configuration.

In short, Design Patterns in Ruby is a grand tour, an effective tutorial in a selection of ruby practices, and ultimately a very enjoyable, rewarding, and sometimes even funny book to read.


The second book I'd stow away with is Ruby Best Practices by Gregory Brown.

It doesn't pretend to be encyclopedic in the manner of The Ruby Way. However, where sometimes I find The Ruby Way curtails topics just when they start to get interesting, Brown dives deep with Ruby Best Practices.

Clear examples are accompanied by thoughtful and full treatments of the subject at hand. It has a particularly useful focus on "Mastering the Dynamic Toolkit", "Text Processing", "Functional Programming Techniques", and "Designing Beautiful APIs".

So they're my picks. Now, obviously these are not ideal books for learning ruby from scratch, but once you're past the basics these are the two at the top of my pile;-)

Anyone willing to counter with their top two picks? Agree or disagree with my choice?


Soundtrack for this post: I Like Your Old Stuff Better than Your New Stuff - Regurgitator from the album Unit Re-Booted


#Amazon, #Audible: can you get your global act together?

I bitched about Audible for not doing a good job of serving the global audience.

Well. I just got an email today that reminded me not to forget to lambast Amazon (now Audible's parent company).

Over 800 Albums for $5 Each..

..from the Amazon mp3 store. Or so it said. It was a lie and grand deception.

I so want to buy from Amazon's mp3 store - heaven save me from even considering the Apple iTunes Store - but guess what? I can't. Not authorized outside the US (even though I can buy the exact same thing on a bit of plastic and have it shipped to me).

Now, I know it is not Audible and Amazon that set these policies. It's the RIAA and the rest of the old-fashioned publishing industry (be it books or music). And judging by The Washington Post's recent article "E-books spark battle inside the publishing industry", it seems things may get worse before they get better.

But I wish Audible and Amazon were a little more aggressive in championing consumer rights. In particular, take close aim at the notion of regional distribution deals.

Once upon a time, it was reasonable to ink regional deals. After all, someone needed to provide the warehouse, retail frontage and so on. In far off, foreign lands. But in the digital age, we have global retail frontage. Local distribution deals (and all their attendant evils such as DVD region coding) are an anachronism.

To put it simply: When Amazon, Audible or any other internet distributor puts a product in their stores, it should be available on (and have been sold on) a global basis. If publishers are not able to make such a deal, don't stock their stuff. Send them packing and tell them to come back when they've got a deal that works for a global audience.

But is there an incentive for Amazon, Audible and the like to take such a stand against the publishers? Well here's one: the other 80% of the world market. I loo-ve Audible (props @jason), and Amazon has been a favoured source for years. But if you keep jilting me under the control of US-centric publishers, I'll be the first to jump to a regional/truly-global competitor. Your future growth will be limited to the shores of the continental US.



Soundtrack for this post: Can't Take Me Home - Pink

Understanding Authlogic Plugin Dynamics

authlogic is by far and away my favourite authentication framework for Rails. I've raved enough in my slides on Authlogic_RPX.

Its true beauty is making authentication so unobtrusive for application developers.

However, the same can't be said for Authlogic plugin developers. I spent quite a bit of time meandering through the authlogic source and other plugins in order to produce Authlogic_RPX (the RPX plugin for authlogic, to support JanRain's RPX service).

I recently returned to the Authlogic_RPX in order to provide an update that finally adds identity mapping (with contributions from John and Damir; thanks guys!).

Luckily my previous exploits were recent enough that much of what I learned about authlogic was still pretty fresh. But before I forget it all again, I thought it would be worthwhile to write up a few of the "insights" I had on the authlogic source.

Hence this post. I'm just going to focus on one thing for now. Since authlogic is so "unobtrusive", one of the big conceptual hurdles you need to get over if you are attempting to write an authlogic plugin is simply:

Just how the heck does it all get loaded and mixed in with my models??

(To follow this discussion, I'd recommend you have a plugin close to hand. Either my previously mentioned Authlogic_RPX, or another like Authlogic_OAuth, or Authlogic_openid)

By unobtrusive, I mean like this. Here is the minimal configuration for a user model that uses Authlogic_RPX:
class User < ActiveRecord::Base
  acts_as_authentic
end

Pretty simple, right? But what power lies behind that little "acts_as_authentic" statement?

What follows is my attempt at a description of what goes on behind the scenes..

First: get loaded


The main file in an authlogic plugin/gem is going to have the relevant requires for the library files. But on their own, the requires do squat. We only start mixing in our plugin with the includes and helper registrations that follow them:
require "authlogic_rpx/version"
require "authlogic_rpx/acts_as_authentic"
require "authlogic_rpx/session"
require "authlogic_rpx/helper"
require "authlogic_rpx/rpx_identifier"

ActiveRecord::Base.send(:include, AuthlogicRpx::ActsAsAuthentic)
Authlogic::Session::Base.send(:include, AuthlogicRpx::Session)
ActionController::Base.helper AuthlogicRpx::Helper

Note that your plugin's ActsAsAuthentic module gets mixed in with ActiveRecord itself (not just a specific ActiveRecord model). That's crucial to remember when considering class methods in your plugin (they are basically global across all ActiveRecord).

What including ActsAsAuthentic in ActiveRecord::Base does..


What happens when the previous lines included the plugin's ActsAsAuthentic module?
The self.included method handles the initial bootstrap..


module AuthlogicRpx
  module ActsAsAuthentic
    def self.included(klass)
      klass.class_eval do
        extend Config
        add_acts_as_authentic_module(Methods, :prepend)
      end
    end
    ..

Here we see we do a class_eval on the class that the module is included in (i.e. ActiveRecord::Base). You'll immediately get the sense we're doing some kind of mixin with the Config and Methods modules. The Config / Methods module structure is a common pattern you will see throughout authlogic.

extend Config takes the Config module (AuthlogicRpx::ActsAsAuthentic::Config) and adds it to the ActiveRecord::Base class definition, i.e. methods defined in Config become class methods of ActiveRecord::Base. (If you add a def self.extended(klass) method to Config you'll be able to hook the extension.)
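To see the extend behaviour in isolation, here is a minimal plain-Ruby sketch with no Rails or authlogic involved. Everything in it is a hypothetical stand-in: RecordBase plays the part of ActiveRecord::Base, SketchConfig plays the part of a plugin's Config module, and api_key is an invented setting.

```ruby
# Hypothetical stand-in for a plugin's Config module.
module SketchConfig
  EXTENDED_ONTO = []

  # Ruby calls this hook whenever an object extends SketchConfig.
  def self.extended(klass)
    EXTENDED_ONTO << klass
  end

  # Invented setting: acts as a setter when given a value, a getter otherwise.
  def api_key(value = nil)
    @api_key = value unless value.nil?
    @api_key
  end
end

# Hypothetical stand-ins for ActiveRecord::Base and one model.
class RecordBase; end
class Account < RecordBase; end

RecordBase.extend(SketchConfig)

puts RecordBase.respond_to?(:api_key)   # class method on the base class
puts Account.respond_to?(:api_key)      # and therefore on every subclass
puts Account.new.respond_to?(:api_key)  # but not an instance method
p SketchConfig::EXTENDED_ONTO           # the hook saw the base class only
```

The point to notice is that the hook fires once, for the base class, yet the class methods are reachable from every subclass.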

add_acts_as_authentic_module(Methods, :prepend) adds the Methods module (AuthlogicRpx::ActsAsAuthentic::Methods) to the authlogic modules list. That's all. Take a look at add_acts_as_authentic_module:


def add_acts_as_authentic_module(mod, action = :append)
  modules = acts_as_authentic_modules
  case action
  when :append
    modules << mod
  when :prepend
    modules = [mod] + modules
  end
  modules.uniq!
  write_inheritable_attribute(:acts_as_authentic_modules, modules)
end
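If you want to play with the append/prepend bookkeeping outside Rails, something like the following sketch should do. It is not authlogic's code: PluginRegistry, plugin_modules and add_plugin_module are invented names, and I've swapped write_inheritable_attribute for a plain class-level instance variable so it runs in bare Ruby.

```ruby
# Hypothetical registry that only mimics the append/prepend/uniq logic.
class PluginRegistry
  class << self
    def plugin_modules
      @plugin_modules ||= []
    end

    def add_plugin_module(mod, action = :append)
      modules = plugin_modules.dup
      case action
      when :append
        modules << mod
      when :prepend
        modules = [mod] + modules
      end
      # Duplicates are dropped, keeping the first occurrence.
      @plugin_modules = modules.uniq
    end
  end
end

module CoreMethods; end
module OtherMethods; end
module RpxMethods; end

PluginRegistry.add_plugin_module(CoreMethods)
PluginRegistry.add_plugin_module(OtherMethods)
PluginRegistry.add_plugin_module(RpxMethods, :prepend)
PluginRegistry.add_plugin_module(RpxMethods, :prepend) # registering twice is harmless

p PluginRegistry.plugin_modules # => [RpxMethods, CoreMethods, OtherMethods]
```

Nothing has been included anywhere at this point. The list is just a to-do list for later.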


Ready to launch..


It is only when we add the acts_as_authentic in our model class that things start to happen. This method loads all the modules from the list built up by all the call(s) to "add_acts_as_authentic_module". Note the include in the last line of the method:

def acts_as_authentic(unsupported_options = nil, &block)
  # Stop all configuration if the DB is not set up
  return if !db_setup?

  raise ArgumentError.new("You are using the old v1.X.X configuration method for Authlogic. Instead of " +
    "passing a hash of configuration options to acts_as_authentic, pass a block: acts_as_authentic { |c| c.my_option = my_value }") if !unsupported_options.nil?

  yield self if block_given?
  acts_as_authentic_modules.each { |mod| include mod }
end
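A stripped-down way to convince yourself that registration alone does nothing is the sketch below. It assumes made-up names throughout: LazyBase stands in for ActiveRecord::Base, acts_as_sketch for acts_as_authentic, and TrackedMethods for a plugin's Methods module. The db_setup? check and the argument validation are deliberately left out.

```ruby
# Hypothetical base class with a simplified "acts_as" macro.
class LazyBase
  REGISTRY = []

  def self.acts_as_sketch
    yield self if block_given?
    # self is the model that called the macro, so the include lands there.
    REGISTRY.each { |mod| include mod }
  end
end

module TrackedMethods
  INCLUDED_INTO = []

  def self.included(klass)
    INCLUDED_INTO << klass
  end

  def greet
    "hello"
  end
end

# Registering the module does not include it anywhere yet.
LazyBase::REGISTRY << TrackedMethods

class Invoice < LazyBase; end   # never calls the macro

class Member < LazyBase
  acts_as_sketch                # the include happens here
end

puts Member.include?(TrackedMethods)    # true
puts Invoice.include?(TrackedMethods)   # false
puts LazyBase.include?(TrackedMethods)  # false
p TrackedMethods::INCLUDED_INTO         # => [Member]
```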


Ignition..


Once the include is invoked, our plugin will usually hook the event and do some setup activity in our module's def self.included method.


module Methods
  def self.included(klass)
    klass.class_eval do
      ..
    end
    ..
  end
  ..

Unlike the Config extension, the class you are including in (the klass parameter in the example), is the specific ActiveRecord model you have marked as "acts_as_authentic".

In other words, the methods in the Methods module get included as instance methods for the specific ActiveRecord model class (User in the example I presented earlier).

Hanging it on the line..


Let's hang it all out in a simplified and contrived example. Take this basic structure:

module AuthlogicPlugin
  module ActsAsAuthentic
    def self.included(klass)
      klass.class_eval do
        extend Config
        add_acts_as_authentic_module(Methods, :prepend)
      end
    end
    module Config
      def config_item
      end
    end
    module Methods
      def self.included(klass)
        klass.class_eval do
          def self.special_setting
          end
        end
      end
      def instance_item
      end
    end
  end
end

If we add this to our User model, then the result we'd end up with is this:

  • config_item: will be a class method on ActiveRecord::Base

  • instance_item: will be an instance method on User

  • special_setting: will be a class method on User
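You don't have to take my word for where each method lands. The sketch below wires the same contrived plugin structure to a tiny fake base class so it runs in plain Ruby. FakeBase and its two class methods are my own simplified stand-ins for ActiveRecord::Base plus authlogic (they are assumptions, not the real implementations), and Order is an extra model that never calls acts_as_authentic.

```ruby
# Simplified stand-in for ActiveRecord::Base with authlogic's hooks.
class FakeBase
  MODULES = []

  def self.add_acts_as_authentic_module(mod, action = :append)
    action == :prepend ? MODULES.unshift(mod) : MODULES.push(mod)
    MODULES.uniq!
  end

  def self.acts_as_authentic
    yield self if block_given?
    MODULES.each { |mod| include mod }
  end
end

module AuthlogicPlugin
  module ActsAsAuthentic
    def self.included(klass)
      klass.class_eval do
        extend Config
        add_acts_as_authentic_module(Methods, :prepend)
      end
    end

    module Config
      def config_item; end
    end

    module Methods
      def self.included(klass)
        klass.class_eval do
          def self.special_setting; end
        end
      end

      def instance_item; end
    end
  end
end

FakeBase.send(:include, AuthlogicPlugin::ActsAsAuthentic)

class Order < FakeBase; end

class User < FakeBase
  acts_as_authentic
end

puts FakeBase.respond_to?(:config_item)       # true: class method on the base
puts Order.respond_to?(:config_item)          # true: so every model sees it
puts User.new.respond_to?(:instance_item)     # true: instance method on User
puts Order.new.respond_to?(:instance_item)    # false
puts User.respond_to?(:special_setting)       # true: class method on User
puts FakeBase.respond_to?(:special_setting)   # false
```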



Conclusions & Implications?


I've covered the main points in bootstrapping authlogic. There's obviously a lot more that goes on, but I think once you get these basics it makes authlogic-related code so much easier to read and understand. It's a pretty neat demonstration of dynamic ruby at work.

Understanding the loading process also makes it possible to be definitive about how your application will behave, rather than just treating it as a heuristic black box.

Take authlogic configuration settings for example. Say we have a configuration parameter in our plugin called "big_red_button" that takes values :on and :off.

Syntactically, both of these user model definitions are valid:


class User < ActiveRecord::Base
  acts_as_authentic do |c|
    c.big_red_button :on
  end
end

class User < ActiveRecord::Base
  acts_as_authentic
  big_red_button :on
end

However, the behaviour is slightly different, and the difference will be significant if you have any initialisation code in the plugin that cares about the setting of the big_red_button.

In the second case, it should be clear that setting big_red_button :on only happens after all the plugin initialisation is complete.

But in the first case, it is a little more subtle. If you go back to review the acts_as_authentic method you'll see that setting the big_red_button occurs at yield self if block_given?. Implications:

  • Config extension of ActiveRecord::Base takes place before the big_red_button is set

  • The instance methods in Methods are included in the User model after the big_red_button is set (the include is the line that follows the yield)

  • Methods' def self.included is called after the big_red_button is set (meaning you can safely do conditional initialisation here based on the big_red_button setting)
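The ordering difference between the two styles is easy to demonstrate with a toy. In this sketch ButtonBase is a hypothetical stand-in for ActiveRecord::Base, its acts_as_authentic only copies the yield-then-include order from the method shown earlier, and big_red_button is implemented as a simple class-level setting that defaults to :off (my assumption, since it's a made-up parameter anyway).

```ruby
# Hypothetical base class: yield first, include second.
class ButtonBase
  MODULES = []

  def self.acts_as_authentic
    yield self if block_given?
    MODULES.each { |mod| include mod }
  end

  # Invented setting: setter when given a value, getter otherwise.
  def self.big_red_button(value = nil)
    @big_red_button = value unless value.nil?
    @big_red_button || :off
  end
end

module ButtonMethods
  SEEN = {}

  # Record what the setting looked like at the moment of inclusion.
  def self.included(klass)
    SEEN[klass.name] = klass.big_red_button
  end
end

ButtonBase::MODULES << ButtonMethods

class BlockUser < ButtonBase
  acts_as_authentic do |c|
    c.big_red_button :on
  end
end

class LateUser < ButtonBase
  acts_as_authentic
  big_red_button :on
end

p ButtonMethods::SEEN["BlockUser"]  # => :on  (set before the include)
p ButtonMethods::SEEN["LateUser"]   # => :off (include ran before the setting)
p LateUser.big_red_button           # => :on  (the setting did take, just too late)
```

So if your plugin's self.included needs to branch on a setting, only the block form is guaranteed to have it in place in time.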


How's that? Pretty cool stuff, but thankfully as I mentioned before, these details only really concern plugin authors and anyone who just loves to read dynamic ruby code.

There's much more to authlogic than what I've discussed here of course (and RPX). Perhaps good fodder for a future post? Let's see..


Soundtrack for this post: Because it's There - Michael Hedges