[Schevo-devel] Schema evolution

Patrick K. O'Brien pobrien at orbtech.com
Wed Dec 7 17:41:51 EST 2005


Tom Locke wrote:
> OK - what puts the evo in schevo?

Because it's you, Tom, I'll put the marketing hype machine on hold and
give you the real answer.  The truth is that while evo is the guiding
light in everything that we do, actual evo features are pretty minimal
at this point.  In fact, we have fewer evo features now than we did a
year ago.  How is that possible?  Well, we rewrote Schevo from the
ground up for the third or fourth time.  And while we've added a lot of
new features, and improved the syntax tremendously, we are still
catching up on some of the evo features we used to have.

> Easy one first - how would I rename a field?

Right now, you can't.  Pretty pathetic, huh?  So how do I survive
without this feature?  Well, I make really heavy use of initial and
sample data until my schema is pretty solid.  Then I leverage the
sql/xml/csv import/export capabilities that are partially built into
schevo itself and partially built into the various apps I'm working on.

We used to have this feature, and it worked like this.  Once you had a
schema in production you stopped making changes to it.  Then you copied
the 'schema_001.py' file to 'schema_002.py' and made your changes there.
To rename a field you would do something like this:

schema_001.py
=============

class Foo(E.Entity):

  old_name = f.unicode()


schema_002.py
=============

class Foo(E.Entity):

  new_name = f.unicode(was='old_name')


Renaming an Entity class worked like this:

schema_001.py
=============

class Foo(E.Entity):

  name = f.unicode()


schema_002.py
=============

class Bar(E.Entity):

  _was = 'Foo'

  name = f.unicode()


At least that's how I think it worked, and it still looks like a nice
enough syntax to me.  Suggestions are welcome, of course.  The way Matt
and I tend to work is that we start with the syntax and ask ourselves
"how would we like to type these things in the schema" and then we try
everything in our bag of Python tricks to make it work.
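
As a rough illustration of how 'was' hints like the ones above could be
read back out of a schema, here is a minimal sketch.  The Field and
Entity classes and the rename_hints() helper are toy stand-ins invented
for this example, not Schevo's real classes or API.

```python
# Toy stand-ins (assumptions), not Schevo's real classes.
class Field:
    def __init__(self, was=None):
        self.was = was  # previous field name, if the field was renamed


class Entity:
    _was = None  # previous class name, if the class was renamed


class Bar(Entity):
    _was = 'Foo'
    new_name = Field(was='old_name')
    other = Field()


def rename_hints(entity_class):
    """Return (class_rename, field_renames) declared on an entity class."""
    class_rename = None
    if entity_class._was is not None:
        class_rename = (entity_class._was, entity_class.__name__)
    # Map old field name -> new field name for every field with a hint.
    field_renames = {
        attr.was: name
        for name, attr in vars(entity_class).items()
        if isinstance(attr, Field) and attr.was is not None
    }
    return class_rename, field_renames


class_rename, field_renames = rename_hints(Bar)
print(class_rename)   # ('Foo', 'Bar')
print(field_renames)  # {'old_name': 'new_name'}
```

Because the hints are plain attributes, they could just as well be
assigned after the class body, which is what keeping them in a separate
section of the schema file would amount to.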

The one rule I have about schema migration is that I don't want to have
to carry a lot of baggage from one version of the schema to the next.
So as much as possible I'd like the migratory stuff to be in a separate
section that can be easily cut out when you copy, for example,
schema_002.py to schema_003.py.  So the examples above might be better
if they were at the bottom of the schema and looked like this:

E.Bar._was = 'Foo'

and from the first example:

E.Foo.f.new_name.was = 'old_name'  #Or something like that.

However we do it I want the result to be a schema that you can open and
look at and not be overwhelmed by migratory cruft.

> Then a harder one - I have a db with People who have an address field
> and 0-or-more PhoneNumbers (which have a name field - work, home etc.).
> 
> Now I want to separate Homes from People:
> 
> A Person has a Home and 0-or-more PhoneNumbers (cell phones, work
> phones, i.e. not related to the home)
> 
> A Home has an address and 0-or-more PhoneNumbers.
> 
> So some of the PhoneNumbers got moved out to the Home along with the
> address (those named "Home"), and the others stayed with the Person -
> make sense?
> 
> The reason for the change is that the database has a lot of families, so
> there's a ton of redundant info (same home phone number and address for
> all members of the family).
> 
> Doable?

Sure.  But for this kind of thing (rapid, experimental development) I
think you'll really be happier maxing out the sample data capabilities
while you mess around with one schema than having to create a bunch of
schema versions and migration code (even if the migrations are mostly
declarative).  With the sample data you just create a new database over
and over again, populating with sample data each time, until you get to
the point that you feel like you've got something good enough for
version 1 of your schema.  Then you start on version 2 and have to worry
about evolution and migration issues.

I guess I should point out that you *can* add and drop fields and entity
classes now just by running 'evo db create' against an existing database
file and that database will be evolved to the new schema.  Kind of a
hack, but it can be useful for a limited subset of schema changes.
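
To show why add/drop is the easy subset, here is a minimal sketch of
comparing two schema versions by name alone.  The dict-of-sets schema
representation and the diff_fields() helper are assumptions made for
this example; they say nothing about how 'evo db create' is actually
implemented.

```python
def diff_fields(old_schema, new_schema):
    """Compare two {entity_name: set_of_field_names} mappings.

    Returns (added, dropped), each mapping entity name -> field names.
    """
    added, dropped = {}, {}
    for entity in old_schema.keys() | new_schema.keys():
        old_fields = old_schema.get(entity, set())
        new_fields = new_schema.get(entity, set())
        if new_fields - old_fields:
            added[entity] = new_fields - old_fields
        if old_fields - new_fields:
            dropped[entity] = old_fields - new_fields
    return added, dropped


old = {'Person': {'name', 'address'}}
new = {'Person': {'name', 'home'}, 'Home': {'address'}}

added, dropped = diff_fields(old, new)
print(added)    # Person gains 'home'; Home is new with 'address'
print(dropped)  # Person loses 'address'
```

In this sketch a rename is indistinguishable from a drop followed by an
add, so the old values would be lost.  That is the gap an explicit
'was' hint is meant to fill.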

So, now that I've bared my soul and confessed that the evo part of our
product is rather weak, I feel that I need to point out a few things
that we have done.  One, for example, is this - we don't store any
entity or field name strings in Durus.  Instead, we store integer values
and we map the names to those integers.  So once we have a syntax and
mechanism in place to rename these things the operation in Durus will be
to simply change the mapping value.  One change in one BTree (or
persistent dict, I can't remember which).
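
A minimal sketch of the idea, using plain Python dicts in place of
whatever Durus structure is actually used.  The field_ids mapping, the
record layout, and rename_field() are hypothetical names for this
example only.

```python
# Hypothetical layout: names live in one mapping, data is keyed by id.
field_ids = {'old_name': 1}            # field name -> integer id
records = [{1: 'Alice'}, {1: 'Bob'}]   # stored rows mention only the id


def rename_field(field_ids, old, new):
    """Rename by re-pointing the name; no stored record is touched."""
    if new in field_ids:
        raise ValueError(f'{new!r} is already in use')
    field_ids[new] = field_ids.pop(old)


rename_field(field_ids, 'old_name', 'new_name')
print([row[field_ids['new_name']] for row in records])  # ['Alice', 'Bob']
```

The cost of the rename does not depend on how many records exist, since
only the name mapping changes.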

In addition to renames, we used to support a migrate() function in the
schema that would get passed the database and was a place where you
could make more extensive changes that were not easily handled
declaratively.  Since schema migration is definitely an important topic
it would be great to get some feedback and see if we can't move forward
on some of these ideas.
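
For a change like Tom's Person/Home split, a migrate() function might
look roughly like the sketch below.  It is only an illustration: the
database is modeled as a plain dict of lists, the record layout is
invented, and two people are assumed to share a home only when their
address strings match exactly.  None of this reflects the old Schevo
migrate() signature beyond "it gets passed the database".

```python
def migrate(db):
    """Move each person's address and "Home" numbers into shared homes."""
    homes = {}  # address -> home record, so a family shares one home
    for person in db['people']:
        address = person.pop('address')
        home = homes.setdefault(address, {'address': address, 'phones': []})
        staying = []
        for phone in person['phones']:
            if phone['name'] == 'Home':
                # Family members repeat the same home number; keep one copy.
                if phone not in home['phones']:
                    home['phones'].append(phone)
            else:
                staying.append(phone)
        person['phones'] = staying
        person['home'] = home
    db['homes'] = list(homes.values())


db = {'people': [
    {'name': 'Ann', 'address': '1 Elm St', 'phones': [
        {'name': 'Home', 'number': '555-0100'},
        {'name': 'Work', 'number': '555-0111'}]},
    {'name': 'Ben', 'address': '1 Elm St', 'phones': [
        {'name': 'Home', 'number': '555-0100'}]},
]}

migrate(db)
print(len(db['homes']))                                  # 1
print(db['people'][0]['home'] is db['people'][1]['home'])  # True
```

The per-record decisions here (which numbers move, how homes are
matched) are the kind of logic that is hard to express as a declarative
hint.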

Thanks for your patience during our remodeling.  ;-)

-- 
Patrick K. O'Brien
Orbtech       http://www.orbtech.com
Schevo        http://www.schevo.org



