Beyond True & False: Enumerated Data Types in Postgresql & Rails

Multiple Choice

One of the great advantages of Postgresql in comparison to more limiting databases like SQLite is the number of data type options available for database customization. The current version of Postgresql offers dozens of data types as options, compared to SQlite3's fairly limiting five data types, of which one (BLOB) is nearly useless.

One data type that I have found useful for implementation is the enum type. Enums present yet another way to for a data architect to store values that have a limited number of potential options. By setting your accepted values for a column in advance, you have built in some validations, signified to other developers what options are acceptable for the column, and nicely cleaned parts of your data structure.

Before getting too far into enums, it is appropriate to look at what other options are available to those not ready to make the plunge.

  1. Boolean: If your options can be summed up as a simple or-else condition with only two potential values, a boolean often makes sense, particularly if your logic can be reasoned linguistically as 'my-instance-is-in-this-state-or-not'. Problems can however arise when we mix trues and falses with NULLs, since both false and NULL will be declared 'falsy' by most coding languages. true or false
  2. String: Another option to develop a multiple choice field for your database is to only provide a limited number of specific values. While this works closest to how an enumerated data type will, there a number of extra steps involved. Notably, you will likely have to provide at least one custom validation for your instances that may require quite a bit of typing. You also lose much of the built-in logic that comes with the boolean type of data, since all strings evaluate to truthy. Strings are not alright
  3. Join Table: You could give your option values their own table in the database and utilize foreign keys to connect one with your instance. This is probably the preferred method if your options will need other data associated with them and you plan to nest other models of your options. Rails will also build in some of the logic, allowing you to quickly and easily get all instances with one option without having to building your own search query or enumerator method, for example. However, if you are not building more data off your options, it can lead to deeper nesting in your data serializers than you would otherwise have.

Setting Up Your Postgresql Database in Rails to Accept An Enumerated Data Type

In exploring the enumerated data type (henceforth, 'enum') of Postgresql for the first time (alongside the uuid data type, which is a blog for another day), I made frequent use of other blogs and stack overflow posts that delivered step-by-step instructions. Most of them, frankly, contained a number of unnecessary steps.

The step that cannot be overlooked, because, well, I overlooked it the first time I tried to use enums and I had a hell of time figuring out the problem, is to set up your rails project to use Postgresql instead of the default database. That's `rails new |project-name| --database=postgresql`.

Once you have your models ready, you can add an enum option either when you develop the rails database migration that creates the table or a migration that adds a column to the table.

    
class CreateTransactions < ActiveRecord::Migration[6.0]
    def change
        create_table :transactions do |t|
            t.float :amount
            t.datetime :trans_date
            t.integer :status
            t.integer :trans_type
            t.string :notes
            t.references :account, null: false, foreign_key: true

            t.timestamps
        end
    end
end
    
  

In the migration above, we set up our transactions table to have two enum fields: status and trans_type. Note that ActiveRecord needs to be told these are integers. When we set up our enums for the first time in our model, that will be prepared in an array, and Postgresql will need to know those indexes to identify what options are available.

    
class Transaction < ApplicationRecord
    belongs_to :account 
    enum status: [:pending, :settled, :rejected]
    enum trans_type: [:debit, :credit]
    
  

And that's it for the basic set-up. Our transaction model is ready to accept values of only the limited number of options we have provided. On top of that, our rails app has access to logic that could not be afforded to strings. Upon seeding our transactions table:

    
base_account = Account.find(1)
trans1 = Transaction.create(
    amount: 50.87, 
    trans_date: Date.today, 
    notes: 'gas bill', 
    status: :settled, t
    trans_type: :debit, 
    account: base_account)
trans2 = Transaction.create(
    amount: 16.95, 
    trans_date: 
    Date.today, 
    notes: 'dry cleaners', 
    status: :pending, 
    trans_type: :debit,
    account: base_account)

puts trans1.pending? # => false
puts trans2.pending? # => true

trans2.status = 'settled'
puts trans2.pending? # => false
puts Transaction.where(status: 'settled').count # => 2

trans2.pending!
puts trans2.pending? # => true
puts tran2.settled? # => false
    

We can use both the question mark to determine whether the transaction status matches and the exclamation mark to forcibly change the transaction's status in our database. We can use both symbols and strings to set or reset the values. If we wanted to, we could even use an integer if we were to memorize each index position that we initially put into our transaction model.

Preparing for Enumerable Expansion

Having all these options to change our transaction's status helps in manipulation, but we are still in the end using an array. If we were to add more and more status options or transaction types to our array later... for example, adding a refund status or type... we could continuously add values to our array, but be stuck with our original order. Furthermore, we have not actually told Postgresql that we want to use its enum feature yet. We are still just giving it integers. If we were to directly query for data in SQL rather than through ActiveRecord, we would see numbers only.

so many credit card transaction types

If we do not really need to access Postgresql directly and are comfortable letting ActiveRecord do the heavy lifting, we can covert our original array into a hash with values of integers:

    
class Transaction < ApplicationRecord
    belongs_to :account
    enum status: {refunded: 3000, pending: 0, settled: 1, rejected: 2}
    enum trans_type: {debit: 0, credit: 1}
    

It certainly feels a bit hacky, but it solves our problem (and kind of represents the seeming randomness of credit card transaction codes anyway.) Getting Postgresql to sync is a bit more out there, though, requiring us to more directly fire the SQL command rather than rely on ActiveRecrd. We should probably create a separate migration for this column.


class AddStatusToTransactions < ActiveRecord::Migration[6.0]
    def up
        execute <<-SQL
            CREATE TYPE transaction_status AS ENUM ('refunded', 'pending', 'settled', 'rejected')
        SQL
        add_column :transactions, :status, :transaction_status
    end

    def down
        remove_column :transactions, :status
        execute <<-SQL
            DROP TYPE transaction_status
        SQL
    end
end

There's a lot going on there, but basically we are creating our own data type called transaction status as an enum inside the database. We are then adding a column that will take accept that data type, instead of an integer or boolean or some other standard data type.

As a final step, we can completely get rid of the integers in our model and replace them with strings equal to the keys:

    
class Transaction < ApplicationRecord
    belongs_to :account
    enum status: {refunded: 'refunded', pending: 'pending', settled: 'settled', rejected: 'rejected'}
    

We lose the ability to update the status using integers in our ruby files and rails console, but we gain the ability to query using ActiveRecord's '?' syntax:

    
(Transaction.where{'status = ?', 'pending'}) # => only available once we match Postgresql up with our model.
    

Conclusion

Moving beyond standard data gives us flexibility to define our own data types. In a case where we have limited,but multiple options for a database column, utilizing Rails and Postgresql's enum option provides access to rails-based methods to create, seed, edit, and query our databases using only ruby.