Most relational databases support the concept of a view, a pre-defined relation that’s essentially a stored query. You can query a view just like any other database table, but the fields and values in a view are defined based on the query used to create the view.
Views can be used for a number of reasons, most of them related to abstraction. For example, a view could be defined to allow easier querying of a complex join or to filter a relation to include only a subset of data.
In this post, we’ll take a look at using a Postgres view with Rails to encapsulate data in a repeated aggregate query.
The Project and the Problem
As my capstone project for Metis, I created BdwyCritic (source on GitHub), a review aggregator for Broadway shows, similar to Metacritic. Every show (I called them “events”) has many user reviews and many media (or critic) reviews.
When building the site, I found that I was frequently displaying the average user and media review score for events. In my original implementation, I used a class method to gather these statistics:
# app/models/event.rb class Event < ActiveRecord::Base has_many :user_reviews, dependent: :destroy has_many :media_reviews, dependent: :destroy def self.with_review_statistics select("events.*"). select("AVG(media_reviews.score) AS average_media_review"). select("AVG(user_reviews.score) AS average_user_review"). joins("LEFT JOIN media_reviews ON events.id = media_reviews.event_id"). joins("LEFT JOIN user_reviews ON events.id = user_reviews.event_id"). group("events.id") end end
There were at least two pain points I encountered with this code.
- That’s a lot of ugly SQL to have in the model, and it made it difficult to read and understand the method.
Eventlogically has the concept of an “average user review” or an “average media review.” With this implementation, those values were only available for a given
Eventif I had explicitly queried the model using this class method.
Both of these code smells prompted me to explore alternative implementations.
Choosing a Database View as the Solution
I considered a couple of solutions before settling on using a database view:
Change this class method into the default scope for the
Eventmodel. That would resolve the second issue above, because every instance of
Eventwould be automatically scoped with fields representing the average review values. The difficult-to-read SQL code would still remain in the model, however, and default scopes can sometimes lead to other unexpected complications with queries.
De-normalize the database and keep a cache of the average review scores for each event. This would potentially come with a performance increase for database reads (which, in this application, are more frequent than writes). It would also likely involve using ActiveRecord callbacks to manually maintain the cached values, something I prefer to avoid when at all possible.
A database view solves both of the challenges of my previous implementation. It removes the SQL statements from the model and allows every
Eventto have methods for average user and media reviews. (I achieved the latter through delegation, as we’ll see.)
(There is one caveat to call out here: because ActiveRecord doesn’t natively support views, using a view removes the flexibility that ActiveRecord provides to interface with different database backends. There could be a cost later if we want to switch over to, say, MySQL. Given what I know about this project and the production environment, that’s a trade-off I’m currently comfortable making.)
Creating the Database View
Like any other database change in Rails, creating the view involves a migration.
Because the ActiveRecord DSL doesn’t provide methods for creating views, we actually have to write the SQL ourselves, but it’s not too complicated. The
SELECT is basically a translation of our earlier class method into raw SQL.
# db/migrate/TIMESTAMP_create_review_statistics_summary_view.rb class CreateReviewStatisticsSummaryView < ActiveRecord::Migration def up execute <<-SQL CREATE VIEW review_statistics_summary AS SELECT events.id as event_id, AVG(user_reviews.score) AS average_user_review, AVG(media_reviews.score) AS average_media_review FROM events LEFT JOIN user_reviews ON events.id = user_reviews.event_id LEFT JOIN media_reviews ON events.id = media_reviews.event_id GROUP BY events.id SQL end def down execute "DROP VIEW review_statistics_summary" end end
Then, just run
Setting up the Models
Defining a Model for the Database View
Despite ActiveRecord’s lack of support for views, it does treat the view as any other relation, so we can easily hook it up to a model.
# app/models/review_statistics_summary.rb class ReviewStatisticsSummary < ActiveRecord::Base self.primary_key = "event_id" belongs_to :event end
This view doesn’t have it’s own
id column because
event_id is the value that uniquely identifies records in the relation. ActiveRecord expects the primary key column to be called
id, though, so we explicitly define the primary key column in the model.
Technically, because this model will only be used in relation to an
Event object, we don’t even need to specify the primary key. I’m inclined to include it here anyway, since there’s a chance we might run into unexpected ActiveRecord errors in the future without it (if we try to call
ReviewStatisticsSummmary.find, for example).
Updating the Event Model
The second step is to relate the
Event model to the
ReviewStatisticsSummary model. This one is easy:
# app/models/event.rb has_one :review_statistics_summary
Delegating the Average Methods
Finally, we’d like to avoid chained method calls (like
event.review_statistics_summary.average_user_review), which are difficult to read and violate the Law of Demeter.
The simple solution here is to delegate the
average_media_review methods to the
# app/models/event.rb delegate :average_user_review, :average_media_review, to: :review_statistics_summary
Now we can call
event.average_media_review as expected, and those method calls are delegated to that event’s instance of
This syntax is a Rails shorthand equivalent to defining these two methods on
# app/models/event.rb def average_user_review review_statistics_summary.average_user_review end def average_media_review review_statistics_summary.average_media_review end
We can also delete the entire
self.with_review_statistics class method from earlier. Hooray!
Postgres Views and the Rails Database Schema
One final caveat to note here: because of the way we’ve defined the view with raw SQL, ActiveRecord has no way of adding that information to Rails’s
However, if you’re deploying an app, you often don’t want to run all of the migrations to set up the database. Instead, the preferred way to deploy the database is to just load the schema itself. Since the schema doesn’t contain a definition for the view we’ve just created, we’ll run into problems.
This is of particular concern when using RSpec with Rails 4.1+, because RSpec loads the test environment database from the schema file by default. Since we’ve manually created the view and it’s not in
db/schema.rb, Postgres will complain that the view does not exist when running the tests.
The workaround is to tell Rails to use raw SQL for its schema in a file called
db/structure.sql. This file replaces
db/schema.rb and dumps the structure directly from the database instead of using the ActiveRecord DSL. The Rails Guide on Migrations provides detailed instructions for making that change. It’s pretty painless.