Partially bypass ActiveRecord instantiation when using Memcached
With the recent scaling woes and Blaine Cook's Scaling Twitter talk, I think slide 23 is a welcome, and bold, statement.
Memcached is generally used as an intermediate cache to lessen database load. It becomes even more crucial when using an ORM (e.g. ActiveRecord), as bypassing object instantiation gives you additional mileage in the app tier as well.
The context:
Three example models: Account, Customer and Booking ( some omitted for simplicity ).
class Account < ActiveRecord::Base
  acts_as_cached :version => 1, :include => [:omitted], :ttl => 2.hours
  has_many :customers
  has_many :bookings
  has_many :contacts
  has_many :addresses
end

class Customer < ActiveRecord::Base
  acts_as_cached :version => 1, :include => [:address, :contact, :bookings, :account], :ttl => 2.hours
  belongs_to :account
  has_one :address, :as => :addressable, :dependent => :destroy
  has_one :contact, :as => :contactable, :dependent => :destroy
end

class Booking < ActiveRecord::Base
  acts_as_cached :version => 1, :include => [:date, :progress, :user, :customer, :events, :booking_extras, :booking_products, :notes, :payment, :account, :coupon], :ttl => 2.hours
  belongs_to :customer
end
The problem
We cache the Customer model together with its direct and most often used associations (Contact, Address and Bookings) and would like to maintain this integrity. The goal is to reference all Customers or Bookings for a given Account without instantiating additional AR objects and their associations, which is exactly what the usual association calls do:
account.customers.to_a #or account.customers(true)
account.bookings.to_a #or account.bookings(true)
However, Rails does have association proxy reader methods:
account.customer_ids # => [1]
account.booking_ids # => [6, 1, 5, 2, 3]
... which isn't very helpful if we'd like a specific subset of customer or booking identifiers.
The solution
Extend ActiveRecord::Base with a find_ids singleton method that has the exact same usage as AR::Base#find but never instantiates any objects. We only fetch the IDs from the raw connection:
module ActiveRecord
  class Base
    class << self
      def find_ids(*args)
        options = extract_options_from_args!(args)
        logger.debug("Find by ID:" + options.inspect)
        validate_find_options(options)
        case args.first
          when :first then find_initial_id(options)
          when :all then find_every_id(options)
        end
      end

      def find_by_sql_ids(sql)
        connection.select_all(sanitize_sql(sql), "#{name} Load").collect! { |record| record['id'] }
      end

      private

      def find_initial_id(options)
        options.update(:limit => 1) unless options[:include]
        find_every_id(options).first
      end

      def find_every_id(options)
        records = scoped?(:find, :include) || options[:include] ?
          find_with_associations_ids(options) :
          find_by_sql_ids(construct_finder_sql(options))
        records
      end
    end
  end
end
module ActiveRecord
  module Associations
    module ClassMethods
      def find_with_associations_ids(options = {})
        catch :invalid_query do
          join_dependency = JoinDependency.new(self, merge_includes(scope(:find, :include), options[:include]), options[:joins])
          # Run the query once and reuse the rows for both the log line and the result.
          rows = select_all_rows(options, join_dependency)
          logger.debug("All rows: " + rows.inspect)
          return rows.collect { |row| row[join_dependency.joins.first.aliased_primary_key] }
        end
        []
      end
    end
  end
end
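To make the saving concrete, here is a minimal plain-Ruby sketch of the same idea outside Rails. FakeConnection, the status filter and the two finder methods are hypothetical stand-ins invented for illustration, not ActiveRecord API; the only point is that reading the 'id' column from raw rows never calls the model's constructor.

```ruby
# Plain-Ruby sketch. FakeConnection and this Booking class are hypothetical
# stand-ins and do not use ActiveRecord.
class FakeConnection
  ROWS = [
    { 'id' => '2', 'status' => 'pending' },
    { 'id' => '1', 'status' => 'pending' },
    { 'id' => '3', 'status' => 'confirmed' }
  ].freeze

  # Mimics the shape of a raw select: an array of column => value hashes.
  def select_all(status)
    ROWS.select { |row| row['status'] == status }
  end
end

class Booking
  @instantiated = 0

  class << self
    attr_accessor :instantiated

    def connection
      @connection ||= FakeConnection.new
    end

    # Conventional path: builds one object per row.
    def find_all_by_status(status)
      connection.select_all(status).map { |row| new(row) }
    end

    # ID-only path: reads the 'id' column straight from the raw rows.
    def find_ids_by_status(status)
      connection.select_all(status).map { |row| row['id'] }
    end
  end

  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
    self.class.instantiated += 1
  end
end

ids = Booking.find_ids_by_status('pending')
count_after_ids = Booking.instantiated

objects = Booking.find_all_by_status('pending')
count_after_objects = Booking.instantiated
```

The IDs come back as strings because nothing casts the raw column values, which matches the string IDs in the usage output further down.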
Usage examples
The following association extension illustrates compatibility with AR::Base#find:
module BookingsExtension
  def upcoming(page = 1)
    self.find_ids(:all, default_find_options(page))
  end

  def recent(page = 1)
    self.find_ids(:all, default_find_options(page).merge!(:order => 'booking_dates.date_from DESC'))
  end

  def by_status(status = 'pending', page = 1)
    self.find_ids(:all, default_find_options(page).merge!(:conditions => ['bookings.status = ?', status]))
  end

  def since(date = Time.now.utc, page = 1)
    self.find_ids(:all, default_find_options(page).merge!(:conditions => ["bookings.status != ? AND booking_dates.date_from >= ?", 'in_progress', date.to_s(:db)]))
  end

  def until(date = Time.now.utc, page = 1)
    self.find_ids(:all, default_find_options(page).merge!(:conditions => ["bookings.status != ? AND booking_dates.date_from <= ?", 'in_progress', date.to_s(:db)]))
  end

  def by_user(user, page = 1)
    self.find_ids(:all, default_find_options(page).merge!(:conditions => ["bookings.status != ? AND bookings.user_id = ?", 'in_progress', user.id]))
  end

  def by_reference(reference, page = 1)
    self.find_ids(:all, default_find_options(page).merge!(:conditions => ["bookings.status != ? AND bookings.reference = ?", 'in_progress', reference]))
  end

  def default_find_options(page)
    {
      :include => [:customer, :date],
      :conditions => ['bookings.status != ?', 'in_progress'],
      :order => 'bookings.created_at DESC',
      :page => { :size => 10, :current => page, :first => 1 }
    }
  end
end
Standalone examples:
account.bookings.by_status(:pending).to_a # => ["2", "1"]
account.bookings.by_user( User.get_cache(1) ).to_a # => ["2", "1", "3"]
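One detail of the extension worth spelling out: Hash#merge! is shallow, so a finder that passes its own :conditions replaces the default conditions rather than appending to them. That is why most finders repeat the bookings.status != ? clause. A small sketch, using a trimmed, hypothetical copy of default_find_options with only two keys:

```ruby
# Hypothetical, trimmed stand-in for the default_find_options helper in the
# extension; only :conditions and :order are kept for illustration.
def default_find_options
  {
    :conditions => ['bookings.status != ?', 'in_progress'],
    :order => 'bookings.created_at DESC'
  }
end

# :conditions is replaced wholesale; :order keeps its default.
by_status = default_find_options.merge!(:conditions => ['bookings.status = ?', 'pending'])

# :order is replaced; the default :conditions survive untouched.
recent = default_find_options.merge!(:order => 'booking_dates.date_from DESC')
```

Because the helper builds a fresh hash on every call, the in-place merge! never leaks one finder's options into another.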
Memcached friendly examples
I use cache_fu to interface with memcache-client. The multi_get_cache extension method is particularly useful here:
Booking.multi_get_cache( ["2", "1", "3"] ) # => lots of output
In the above example we attempt to fetch the Bookings with IDs 1 to 3 from Memcached. Should the objects already be cached, the only DB overhead is one relatively cheap query, while cache integrity is maintained without duplicating any processing or data.
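The multi-get step can be sketched roughly as follows. This is not cache_fu's implementation: CACHE is an in-memory hash standing in for Memcached, load_from_db stands in for a single lookup of the missing IDs, and the "Booking:ID" key format is an assumption made for illustration.

```ruby
# Hypothetical sketch of a multi-get with a fallback for cache misses.
# CACHE stands in for Memcached; load_from_db stands in for one DB query.
CACHE = {
  'Booking:1' => { 'id' => '1' },
  'Booking:2' => { 'id' => '2' }
}
DB_CALLS = []

def load_from_db(ids)
  DB_CALLS << ids # record each simulated query so it can be inspected
  ids.map { |id| { 'id' => id } }
end

def multi_get(ids)
  hits = {}
  ids.each do |id|
    key = "Booking:#{id}"
    hits[key] = CACHE[key] if CACHE.key?(key)
  end

  # Only the IDs that missed the cache go to the database, in one batch.
  missing = ids.reject { |id| hits.key?("Booking:#{id}") }
  unless missing.empty?
    load_from_db(missing).each do |record|
      key = "Booking:#{record['id']}"
      CACHE[key] = record
      hits[key] = record
    end
  end

  # Return the records in the order the IDs were requested.
  ids.map { |id| hits["Booking:#{id}"] }
end

bookings = multi_get(%w[2 1 3])
calls_after_first = DB_CALLS.dup
multi_get(%w[2 1 3])
```

On the first call only ID 3 misses and is loaded; the second call is served entirely from the cache, so no further load is recorded.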
Conclusion
The above is a slight anti-pattern which, true to the 90/10 principle, would only ever be useful to those users of the framework with Memcached in their production stack.
It's OK to break free from constraints, including religiously DRY development that may shoot you in the foot later, and even to denormalize, as per Blaine's slide number 23, if and when the framework doesn't natively solve the problem at hand (performance? design constraints? production environment?).