Ruby on Rails
ActiveSchema (Version #47)

Active Record already enforces a lot of implementation details of the SQL schema; why not take advantage of this coupling to allow Active Record classes to build their own tables?

Benefits

The schema would live in Ruby code, rather than in a separate SQL file per database per supported RDBMS

Tests could set up their own tables as soon as the database is configured; or, if the privileged user specified in database.yml is able, the databases themselves could be created too in some environments

Would be great for projects desiring to support multiple configs—they could run on any backend that’s supported by an AR adapter

Could possibly automate versioning/data refactoring moves in the future

Could give new_model a lot more power

Hand-tuned vs. generated SQL

It’s an abstraction away from SQL, therefore it’ll probably be a binary choice: either use Active Schema and or create your table using SQL-based tools, but not both. However, this should be a choice the developer could make per table, not per project or database.

Also, Active Schema could be cautious and only specify what kind of persistence schema must be present, using existing hand-coded columns as long as they meet the minimum schema requirements specified in the Active Record class. Then developers would have a way to get up and running on Madeleine or another DB quickly, without giving up any benefit by hand-tuning the SQL schema.

Wow, the next step from that would be that schema migration might somehow be able to be expressed somewhere. I don’t know if it would make sense to keep that in the AR class, but if Active Schema already respected existing schema, schema migration may be just a logical feature-add in the future.

General implementation ideas

require ‘activeschema’ # to be replaced with DI cleanliness?

class Person < ActiveRecord::Base
string_field :first_name, :last_name
date_field :date_of_birth
end

ActiveSchema.generate # creates an appropriate table using the current database adapter

Database adapters would be responsible for the implementation of the necessary mechanics (creating tables including their columns & relations, or analogous steps in the case of Madeleine or whatever)

There should be a way to separate the create-table privileges from the normal AR privileges for the table. This will probably mean new elements in database.yml that only Active Schema code uses; it’ll be up to the programmer not to call those methods inside normal application code, or not provide them for non-development installations.

Even if someone didn’t trust Active Schema to actually muck with the schema as a privileged user, this could output a databasename.sql file that would help the user build the tables by hand

Add explicit “macro?? class methods for fields other than foreign keys, and have the schema-generation code look for objects kind_of? ActiveRecord::Base, or respond_to? :some_method. The Active Record descendants would each talk to a collector object. When finished, the collector would tell the DB adapter what it learned, and the adapter would make it so, or at least spit out SQL which the user could inspect and use to build the works.

This will probably take the form of patches to Active Record itself, but implementing condition builders that use Ruby operators and methods rather than SQL strings will help decrease the database coupling. One implementation of this is Criteria; I have some independently developed code of my own that isn’t LGPL’d.

Development

Just created the RubyForge project:

http://rubyforge.org/projects/activeschema

Discuss

Sound good? Can you find holes to poke in the concept? A lot of this just jelled as I wrote this page, so it’s by no means a set-in-stone concept. Please provide lots of feedback.

—Ryan Platte

But all the great tools to design database layouts are made for SQL and not ruby. Its actually AR’s best feature that it does not work like you want. For example can i add a new field in my database and have it available to me in code without touching a thing. Even if you don’t mind adding all the extra meta data to the model classes to describe which fields are available i still think that it would be of no advantage. In the best possible case AR would do an ok job creating the DB Schema in reality it would create a sub par database schema like other ORM tools do which implement this (see Hibernate). I think this feature would be a clear violation of DRY.

—Tobias Luetke

Thanks for the comments,

I can definitely imagine important categories of Active Record users that wouldn’t want this. If I were to go write this, I’d develop it as an alternative to maintaining separate SQL files, certainly not as a change to the way AR already works. I envision many folks continuing on with the current techniques, but prototypers and folks for whom portability is more important than tight SQL code could use Active Schema and its new field-definition macro methods instead of writing SQL.

Seen in this light, does this proposal still violate DRY somehow?

—Ryan Platte

(Folks on #rubyonrails gave Python’s SQLObject (Rails-ish example) and n/Db as references to folks doing similar work outside Rails in some way)

Check out Og (http://www.rubyforge.com/projects/nitro) included in Nitro. It is a fuzion of ideas from Active Record and NDB.

—George Moschovitis

I found it difficult synchronizing DB schemas between my production and developpment application (under other frameworks)
It would be great if rails could have a way to update tables, so that you just have to upload the code to the production server and run a sync script.
If I understand Active Record well, it has to know the table structures anyway, so why not store it in SQL or YAML files ?
—Anonymous Coward

Have a look in /yourrailsapp/db/production_structure.sql. Your entire db structure should sit there…
—-Johan Sorensen

ActiveRecord already violates DRY. You want constraints in your controllers, so that you catch errors before they go to sql and for simpler/more flexible handling of validation. Writing those same constraints into the database is a duplication. A growing number of people seem to be quite happy writing db schemas for rails apps that function as a fast, queriable store, without any sort of constraints (postgresql users in particular). Check out the comments David made in the Weblog a while back about the diffference between databases that are inteded to be integrated against and databsaes that are the private store of an OR layer.

I think something like this would be great. Double so if it can sniff the validates_* statements in your models and adds the equivalent constraints to the database. Protection on both ends without repeating yourself.

There’s many interesting possabilities as far as managing versions of schema and packaging deployments of rails applications such that your production database is adapted/extended as necessary without you having to manually find the difference between your app_dev database and app_production database and write the equivalent sql.

—Jason Watkins

Actually, constraints belong in the database itself. The benefits of constraings in the database are twofold.

  1. The database’s query optimizer uses these constraints to better plan joins. For example, if it sees a unique constraint or a foreign key constraint it can produce much better estimates of whether different join algorithms will be more efficient.
  1. It’s rare that only Ruby would be interacting with a database of any size or complexity. Almost certainly you’ll have third party reportng tools and almost certainly you’ll be using database-specific tools for bulk loads and schema migrations. Having the constraints in the database ensures that these guarantees are maintained.

What I’d like to see is for Active Record to inspect the foreign-key constraints in the database and automatically create the “has one”, “has many”, etc. members for me. That would be the ultimate in not repeating myself, IMHO.

- RonM?

Absolutely agree with the last comment. And one more thing: Data in a database typically have typically a longer live than the application that access them. Even if it’s a ruby app ;-). So better keep the constrains with the data, and not with the app…

-??

We stumbled onto this idea as well today talking about Django’s model. I think it’s an okay thing really to make your schema when you’re developing, but the lack of schema for various DBs can hinder deployment. I like the idea of building it in the model (and having the adapter create the actual SQL), since that restores primacy to the model than the database (and allows for iffy cases like booleans to be handled consistently irregardless of the backend DB).

That said, I wonder if it might be easier in some sense to have an external YAML file to represent the tables for the DB rather than embedding the information in the model. This still separates the SQL schema from the model to some degree if that’s helpful.

- Jacob Harris

Isn’t the ActiveRecord::Migration system doing this already? I actually have no schema in my system and use “rake migrate” to create my schema. You can also generate migration script with ruby scripts/generate migration . It even allow me to send specific SQL command based on the adapter in used to create my foreign constraints statements.
—Yannick Koehler

Also rails has a rake task to dump an existing database to a db/schema.rb file that can be portably used against any DB supported by AR. However, the generated schema.rb does not has any constraint from the database.

cd project_dir<br> rake db_schema_dump

The generated (or manually created schema.rb) can then be imported to your database using:
rake db_schema_import
— vic

The ActiveRecord::Migration system is a big piece of the puzzle, although further refinement is possible. The problem is that the model needs to be kept in sync with the migration scripts and often you need to specify the same concept in both places. For example if you reduce a string column’s :limit, you need to remember to update validates_length_of for the same column. That’s un-DRY.

It would be better if the model represented a snapshot of the most current DB state. It would then be possible to start at version 1 and run all the migration scripts (without executing any commands against the DB) to find out the last known DB state. Comparing the two states you could then auto-generate a migration script with the proper up/down methods to move from one state to the other.
— Dan Kubb

In many cases I tend to use the database to enforce constraints. Even when it is just used as a private store for a Rails app, I think the duplication of constraints in the db is worth it because it gives you a safety net for any code bugs.

To eliminate duplication, one option could be to let the Rails developer pick the strategy explicitly. For example:

class Customer inherits_constraints_from_database end

This would automatically have the object inherit the constraints to enforce correct data types, not nulls, value lengths etc.

Hopefully this would keep both camps happy and reduce the duplication for those that do chose to enforce business rules in the database.

Note that this subject has also been mentioned on the DatabaseConstraintReflection page.

TobinHarris

Have you read this thread?
http://lists.rubyonrails.org/pipermail/rails/2006-January/009640.html

It looks like what you are talking about.

I also fell in love with migrations and then directly thought: “heay, im slightly repeating myself for up() and down() —> this can be generated —> this belongs in the Model file!”

Have a look at the thread it contains a beautifull example of what could be future Rails code:

class Author < ActiveRecord::Base
  has_many :posts

  attribute :name, :string

  attribute :email, :string do |e|
    e.validate_format :with => /regex/
  end
end

# [...]

class Post < ActiveRecord::Base
  belongs_to :author # automatically add author_id to the table

  attribute :title, :string do |t|
    t.validate_presence
    t.validate_length :in => 3..100
  end

  attribute :summary, :text
  attribute :content, :text
end

I’d love to work this way… Rails can then (in development mode off course) automagically:
1) propagate changes in the model files to migrations,
2) run those migrations,
3) maybe even copy the relevant excerpt of the generated schema.rb file – as a comment – back into the model file for debugging purposes.

Look at this code: this is really elegant — everything where it logically belongs, no repetition: HARDCORE D.R.Y

— Cies Breijs

I’d like to jump back in and mention that I started this page before migrations existed (in 2004 or early 2005). I was very happy to see them added, but as others have said on this page, I’d love to see this DRYed up to something even better. —RyanPlatte

Active Record already enforces a lot of implementation details of the SQL schema; why not take advantage of this coupling to allow Active Record classes to build their own tables?

Benefits

The schema would live in Ruby code, rather than in a separate SQL file per database per supported RDBMS

Tests could set up their own tables as soon as the database is configured; or, if the privileged user specified in database.yml is able, the databases themselves could be created too in some environments

Would be great for projects desiring to support multiple configs—they could run on any backend that’s supported by an AR adapter

Could possibly automate versioning/data refactoring moves in the future

Could give new_model a lot more power

Hand-tuned vs. generated SQL

It’s an abstraction away from SQL, therefore it’ll probably be a binary choice: either use Active Schema and or create your table using SQL-based tools, but not both. However, this should be a choice the developer could make per table, not per project or database.

Also, Active Schema could be cautious and only specify what kind of persistence schema must be present, using existing hand-coded columns as long as they meet the minimum schema requirements specified in the Active Record class. Then developers would have a way to get up and running on Madeleine or another DB quickly, without giving up any benefit by hand-tuning the SQL schema.

Wow, the next step from that would be that schema migration might somehow be able to be expressed somewhere. I don’t know if it would make sense to keep that in the AR class, but if Active Schema already respected existing schema, schema migration may be just a logical feature-add in the future.

General implementation ideas

require ‘activeschema’ # to be replaced with DI cleanliness?

class Person < ActiveRecord::Base
string_field :first_name, :last_name
date_field :date_of_birth
end

ActiveSchema.generate # creates an appropriate table using the current database adapter

Database adapters would be responsible for the implementation of the necessary mechanics (creating tables including their columns & relations, or analogous steps in the case of Madeleine or whatever)

There should be a way to separate the create-table privileges from the normal AR privileges for the table. This will probably mean new elements in database.yml that only Active Schema code uses; it’ll be up to the programmer not to call those methods inside normal application code, or not provide them for non-development installations.

Even if someone didn’t trust Active Schema to actually muck with the schema as a privileged user, this could output a databasename.sql file that would help the user build the tables by hand

Add explicit “macro?? class methods for fields other than foreign keys, and have the schema-generation code look for objects kind_of? ActiveRecord::Base, or respond_to? :some_method. The Active Record descendants would each talk to a collector object. When finished, the collector would tell the DB adapter what it learned, and the adapter would make it so, or at least spit out SQL which the user could inspect and use to build the works.

This will probably take the form of patches to Active Record itself, but implementing condition builders that use Ruby operators and methods rather than SQL strings will help decrease the database coupling. One implementation of this is Criteria; I have some independently developed code of my own that isn’t LGPL’d.

Development

Just created the RubyForge project:

http://rubyforge.org/projects/activeschema

Discuss

Sound good? Can you find holes to poke in the concept? A lot of this just jelled as I wrote this page, so it’s by no means a set-in-stone concept. Please provide lots of feedback.

—Ryan Platte

But all the great tools to design database layouts are made for SQL and not ruby. Its actually AR’s best feature that it does not work like you want. For example can i add a new field in my database and have it available to me in code without touching a thing. Even if you don’t mind adding all the extra meta data to the model classes to describe which fields are available i still think that it would be of no advantage. In the best possible case AR would do an ok job creating the DB Schema in reality it would create a sub par database schema like other ORM tools do which implement this (see Hibernate). I think this feature would be a clear violation of DRY.

—Tobias Luetke

Thanks for the comments,

I can definitely imagine important categories of Active Record users that wouldn’t want this. If I were to go write this, I’d develop it as an alternative to maintaining separate SQL files, certainly not as a change to the way AR already works. I envision many folks continuing on with the current techniques, but prototypers and folks for whom portability is more important than tight SQL code could use Active Schema and its new field-definition macro methods instead of writing SQL.

Seen in this light, does this proposal still violate DRY somehow?

—Ryan Platte

(Folks on #rubyonrails gave Python’s SQLObject (Rails-ish example) and n/Db as references to folks doing similar work outside Rails in some way)

Check out Og (http://www.rubyforge.com/projects/nitro) included in Nitro. It is a fuzion of ideas from Active Record and NDB.

—George Moschovitis

I found it difficult synchronizing DB schemas between my production and developpment application (under other frameworks)
It would be great if rails could have a way to update tables, so that you just have to upload the code to the production server and run a sync script.
If I understand Active Record well, it has to know the table structures anyway, so why not store it in SQL or YAML files ?
—Anonymous Coward

Have a look in /yourrailsapp/db/production_structure.sql. Your entire db structure should sit there…
—-Johan Sorensen

ActiveRecord already violates DRY. You want constraints in your controllers, so that you catch errors before they go to sql and for simpler/more flexible handling of validation. Writing those same constraints into the database is a duplication. A growing number of people seem to be quite happy writing db schemas for rails apps that function as a fast, queriable store, without any sort of constraints (postgresql users in particular). Check out the comments David made in the Weblog a while back about the diffference between databases that are inteded to be integrated against and databsaes that are the private store of an OR layer.

I think something like this would be great. Double so if it can sniff the validates_* statements in your models and adds the equivalent constraints to the database. Protection on both ends without repeating yourself.

There’s many interesting possabilities as far as managing versions of schema and packaging deployments of rails applications such that your production database is adapted/extended as necessary without you having to manually find the difference between your app_dev database and app_production database and write the equivalent sql.

—Jason Watkins

Actually, constraints belong in the database itself. The benefits of constraings in the database are twofold.

  1. The database’s query optimizer uses these constraints to better plan joins. For example, if it sees a unique constraint or a foreign key constraint it can produce much better estimates of whether different join algorithms will be more efficient.
  1. It’s rare that only Ruby would be interacting with a database of any size or complexity. Almost certainly you’ll have third party reportng tools and almost certainly you’ll be using database-specific tools for bulk loads and schema migrations. Having the constraints in the database ensures that these guarantees are maintained.

What I’d like to see is for Active Record to inspect the foreign-key constraints in the database and automatically create the “has one”, “has many”, etc. members for me. That would be the ultimate in not repeating myself, IMHO.

- RonM?

Absolutely agree with the last comment. And one more thing: Data in a database typically have typically a longer live than the application that access them. Even if it’s a ruby app ;-). So better keep the constrains with the data, and not with the app…

-??

We stumbled onto this idea as well today talking about Django’s model. I think it’s an okay thing really to make your schema when you’re developing, but the lack of schema for various DBs can hinder deployment. I like the idea of building it in the model (and having the adapter create the actual SQL), since that restores primacy to the model than the database (and allows for iffy cases like booleans to be handled consistently irregardless of the backend DB).

That said, I wonder if it might be easier in some sense to have an external YAML file to represent the tables for the DB rather than embedding the information in the model. This still separates the SQL schema from the model to some degree if that’s helpful.

- Jacob Harris

Isn’t the ActiveRecord::Migration system doing this already? I actually have no schema in my system and use “rake migrate” to create my schema. You can also generate migration script with ruby scripts/generate migration . It even allow me to send specific SQL command based on the adapter in used to create my foreign constraints statements.
—Yannick Koehler

Also rails has a rake task to dump an existing database to a db/schema.rb file that can be portably used against any DB supported by AR. However, the generated schema.rb does not has any constraint from the database.

cd project_dir<br> rake db_schema_dump

The generated (or manually created schema.rb) can then be imported to your database using:
rake db_schema_import
— vic

The ActiveRecord::Migration system is a big piece of the puzzle, although further refinement is possible. The problem is that the model needs to be kept in sync with the migration scripts and often you need to specify the same concept in both places. For example if you reduce a string column’s :limit, you need to remember to update validates_length_of for the same column. That’s un-DRY.

It would be better if the model represented a snapshot of the most current DB state. It would then be possible to start at version 1 and run all the migration scripts (without executing any commands against the DB) to find out the last known DB state. Comparing the two states you could then auto-generate a migration script with the proper up/down methods to move from one state to the other.
— Dan Kubb

In many cases I tend to use the database to enforce constraints. Even when it is just used as a private store for a Rails app, I think the duplication of constraints in the db is worth it because it gives you a safety net for any code bugs.

To eliminate duplication, one option could be to let the Rails developer pick the strategy explicitly. For example:

class Customer inherits_constraints_from_database end

This would automatically have the object inherit the constraints to enforce correct data types, not nulls, value lengths etc.

Hopefully this would keep both camps happy and reduce the duplication for those that do chose to enforce business rules in the database.

Note that this subject has also been mentioned on the DatabaseConstraintReflection page.

TobinHarris

Have you read this thread?
http://lists.rubyonrails.org/pipermail/rails/2006-January/009640.html

It looks like what you are talking about.

I also fell in love with migrations and then directly thought: “heay, im slightly repeating myself for up() and down() —> this can be generated —> this belongs in the Model file!”

Have a look at the thread it contains a beautifull example of what could be future Rails code:

class Author < ActiveRecord::Base
  has_many :posts

  attribute :name, :string

  attribute :email, :string do |e|
    e.validate_format :with => /regex/
  end
end

# [...]

class Post < ActiveRecord::Base
  belongs_to :author # automatically add author_id to the table

  attribute :title, :string do |t|
    t.validate_presence
    t.validate_length :in => 3..100
  end

  attribute :summary, :text
  attribute :content, :text
end

I’d love to work this way… Rails can then (in development mode off course) automagically:
1) propagate changes in the model files to migrations,
2) run those migrations,
3) maybe even copy the relevant excerpt of the generated schema.rb file – as a comment – back into the model file for debugging purposes.

Look at this code: this is really elegant — everything where it logically belongs, no repetition: HARDCORE D.R.Y

— Cies Breijs

I’d like to jump back in and mention that I started this page before migrations existed (in 2004 or early 2005). I was very happy to see them added, but as others have said on this page, I’d love to see this DRYed up to something even better. —RyanPlatte