Lingceng's Blog

Migrate Legacy Data to Another Rails Project: Practical Tips About Using ActiveRecord

I recently migrated legacy data from one Rails project to its refactored successor. Here are some tips I picked up while doing the job.

I fetched the legacy data directly from the old DB and saved it into the new DB in a rake task. I used the following code to connect to the old DB:

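class DB < ActiveRecord::Base
  self.abstract_class = true
  if ENV['DEBUG']
    # The original logger target was lost; logging to standard output is an assumption.
    self.logger = Logger.new($stdout)
  end

  # Fill in the host (and any credentials) of the old database.
  establish_connection adapter: 'mysql2', encoding: 'utf8',
    host: '', port: '3306', database: 'databasename',
    username: 'username'
end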

Then I selected rows with raw SQL using the select_all method:

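DB.connection.select_all("select * from users").each do |user|
  # select_all yields string-keyed hashes; symbolizing the keys is an assumed
  # stand-in for the lost original line, so the symbol whitelist below matches.
  record = user.symbolize_keys
  new = UserInNewDB.find_or_initialize_by(id: record[:id])
  new.attributes = record.slice(*%i[phone name created_at updated_at])
  new.save! if new.changed?
end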
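One detail worth spelling out: the rows that select_all yields are hashes keyed by strings, so slicing them with symbol keys silently returns nothing. The sketch below shows this in plain Ruby, without a database. LEGACY_ROW, WANTED and whitelisted_attributes are hypothetical names used only for illustration.

```ruby
# Hypothetical row, shaped like one element yielded by select_all (string keys).
LEGACY_ROW = {
  'id' => 7, 'phone' => '555-0100', 'name' => 'Ann',
  'password_digest' => 'x', 'created_at' => '2015-01-01', 'updated_at' => '2015-02-01'
}.freeze

# Columns we want to carry over to the new model.
WANTED = %i[phone name created_at updated_at].freeze

# Convert the keys to symbols, then keep only the whitelisted columns.
def whitelisted_attributes(row, wanted = WANTED)
  row.transform_keys(&:to_sym).slice(*wanted)
end

p LEGACY_ROW.slice(*WANTED)          # symbol keys never match string keys
p whitelisted_attributes(LEGACY_ROW) # only the four whitelisted columns remain
```

Inside a Rails app, symbolize_keys from ActiveSupport does the same key conversion.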

Complicated queries are hard to write in raw SQL, though, so I copied the models from the old project into the new one. To keep the new project clean, I reproduced mainly the associations between the models.

class User < DB
  has_many :orders
end

class Order < DB
  belongs_to :user
end

Then life got better: I could query through ActiveRecord models and get all their benefits.

User.find_each do |record|
  # Do the migration
end

Tip0 Skip some callbacks

Process.skip_callback(:save, :before, :log_changes)

Tip1 Skip some validations

only_batch_id_error = new.invalid? && 1 == new.errors.size && new.errors[:batch_id].present?
new.save!(validate: false) if only_batch_id_error

Tip2 Cache basic table in a hash

brands = Brand.all.index_by(&:name)
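Order.find_each do |record|
  new = NewOrder.find_or_initialize_by(id: record.id)
  # brand_name is an assumed column; use whatever holds the brand's name on the legacy order.
  new.brand = brands[record.brand_name]
  new.save!
end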
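The point of the cache is that a small lookup table is read once and then consulted in memory, instead of issuing one query per migrated record. Here is a minimal plain-Ruby sketch of what index_by(&:name) builds. BrandRow, index_rows_by_name and BRANDS_BY_NAME are hypothetical names, and the two brands are made-up sample data.

```ruby
# Hypothetical stand-in for a row of the small, rarely changing brands table.
BrandRow = Struct.new(:id, :name)

# Build a name => row hash in one pass, like index_by(&:name) does.
# If two rows share a name, the later one wins.
def index_rows_by_name(rows)
  rows.each_with_object({}) { |row, lookup| lookup[row.name] = row }
end

BRANDS_BY_NAME = index_rows_by_name([BrandRow.new(1, 'Acme'), BrandRow.new(2, 'Globex')])

# Each lookup is now a hash access instead of a database query.
puts BRANDS_BY_NAME['Globex'].id
```

A name missing from the table returns nil, so it is worth deciding up front whether the migration should skip or fail in that case.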

Tip3 Show current progress

def show_process(name, total, index)
  printf "%s %.2f %%, %d / %d\r", name, index * 100.0 / total, index, total
end
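The trailing \r returns the cursor to the start of the line, so each update overwrites the previous one instead of scrolling. A small sketch of how the helper might be driven from a loop follows. progress_line and each_with_progress are hypothetical names, and the four-element array stands in for the records being migrated.

```ruby
# Same format string as show_process, but returned instead of printed so it can be inspected.
def progress_line(name, total, index)
  format("%s %.2f %%, %d / %d\r", name, index * 100.0 / total, index, total)
end

# Hypothetical driver: yields each item, then redraws the progress line.
def each_with_progress(name, items)
  total = items.size
  items.each_with_index do |item, position|
    yield item
    $stdout.print progress_line(name, total, position + 1)
  end
  $stdout.puts # move past the progress line once finished
end

each_with_progress('users', %w[a b c d]) { |item| item.upcase }
```

With find_each there is no array to take the size of, so the total would come from a separate count query before the loop starts.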

Tip4 Find records with no associated records

Settlement.joins("left join orders on orders.settlement_id = settlements.id").
  where("orders.id is null")

Tip5 Update a column without triggering callbacks

PayRecord.find(10059).update_column(:amount, 5986)

Tip6 Do nested inner joins with ActiveRecord

query = ProcessesChange.joins(process: [:technic, batch: :workgroup])

Tip7 Use squeel to do outer join


Tip8 Use the pluck method to return an array of data

Order.joins(:item).group('items.brand').pluck("items.brand, count(orders.id) as order_count")

Tip9 Use MySQL GROUP_CONCAT to return all items in a group

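# Each settlement row repeats once per joined order, so the order total is
# compared against MIN(settlements.amount) rather than SUM(settlements.amount).
sql = Settlement.joins(:orders).group(:id).
  having("sum(orders.final_price) != min(settlements.amount)").
  select("settlements.id, GROUP_CONCAT(orders.number),
  GROUP_CONCAT(orders.final_price), MIN(settlements.amount)").to_sql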
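To make clear what this query reports, here is a plain-Ruby analogue over in-memory rows. It is not the SQL itself, only a sketch of the same logic: group orders by settlement, compare their total with the settlement amount, and list the grouped values joined by commas as GROUP_CONCAT would. SettlementRow, OrderRow and mismatched_settlements are hypothetical names.

```ruby
SettlementRow = Struct.new(:id, :amount)
OrderRow = Struct.new(:number, :settlement_id, :final_price)

# For each settlement whose orders do not add up to its amount, return
# [settlement id, comma-joined order numbers, comma-joined prices, amount].
def mismatched_settlements(settlements, orders)
  by_settlement = orders.group_by(&:settlement_id)
  settlements.each_with_object([]) do |settlement, report|
    group = by_settlement[settlement.id]
    next if group.nil? # inner join: settlements without orders are excluded
    next if group.sum(&:final_price) == settlement.amount

    report << [settlement.id, group.map(&:number).join(','),
               group.map(&:final_price).join(','), settlement.amount]
  end
end
```

Seeing the order numbers and prices side by side for each mismatched settlement is what makes the grouped output useful for tracking down which order is off.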


Tip10 Manage rake task dependencies with an empty task

task :user_module => [:users, :sources, :customers, :operators]
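The task above has no body of its own; invoking it simply runs its prerequisites in the listed order. The sketch below demonstrates this as a standalone script, assuming the rake gem is installed. It uses Rake::Task.define_task, which is what the task keyword in a Rakefile calls, and the four task bodies are placeholders that only record their names in the hypothetical RAN_TASKS array.

```ruby
require 'rake'

# Records the order in which the placeholder tasks run.
RAN_TASKS = []

%i[users sources customers operators].each do |name|
  Rake::Task.define_task(name) { RAN_TASKS << name }
end

# No block: the task only groups its prerequisites.
Rake::Task.define_task(user_module: %i[users sources customers operators])

Rake::Task[:user_module].invoke
p RAN_TASKS
```

Rake invokes each task at most once per run, so a prerequisite shared by several grouping tasks is not repeated.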