Thread: preserving data during cosmo 0.3 upgrade




preserving data during cosmo 0.3 upgrade
user name
2006-02-07 22:39:55
as part of the proposed cosmo 0.3 release, we'd like to guarantee that
upgrading from 0.2 will not require the existing data to be wiped.

since the repository schema has changed greatly in 0.3, we'll need to
provide a tool that can convert an existing repository from the 0.2
format to the 0.3 format. such a tool could be run while the server
is offline or as a component of the application server that executes
at startup time. i argue that the offline approach is the right
choice, since the tool would be more easily scriptable by deployers
and won't require any extra configuration of the server environment.

complicating factors include:

 * user account and security info are stored in an embedded hsql db in
0.2 but are stored in the repository in 0.3

 * the repository schema has changed significantly in 0.3 - the dav
and caldav node types have become mixins, and some property
definitions have changed - jackrabbit doesn't have a good mechanism
for changing existing schema definitions like sql's ALTER TABLE
command

here is my proposal for the offline upgrade tool:

 * it will be written in java since it needs to use the jcr api to
access the repositories

 * it will be packaged together with all dependencies and run via
shell script

 * it will not attempt to convert a 0.2 repository "in place" but
rather will copy the data out of the 0.2 repository and user db into a
new, blank 0.3 repository

 * it will accept as input (via command line options) the locations of
the data directories and config files for the source (0.2) repository
and destination (0.3) repository

 * at the end of the process, the 0.2 repository and user db will have
exactly the same data as they did at the beginning
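
as a minimal sketch of the command line input described in the list
above (not the actual tool, which is proposed in java): the tool name
and the flag names --old-data, --old-config, --new-data and
--new-config are assumptions made up for illustration, and python is
used only to keep the example short.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build the option parser; the tool name and every flag are hypothetical."""
    parser = argparse.ArgumentParser(
        prog="cosmo-upgrade",
        description="copy a 0.2 repository and user db into a blank 0.3 repository",
    )
    parser.add_argument("--old-data", type=Path, required=True,
                        help="data directory of the source (0.2) repository")
    parser.add_argument("--old-config", type=Path, required=True,
                        help="config file of the source (0.2) repository")
    parser.add_argument("--new-data", type=Path, required=True,
                        help="data directory of the destination (0.3) repository")
    parser.add_argument("--new-config", type=Path, required=True,
                        help="config file of the destination (0.3) repository")
    return parser


def validate(args: argparse.Namespace) -> list:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    for name in ("old_data", "new_data"):
        if not getattr(args, name).is_dir():
            problems.append(f"{name}: not a directory: {getattr(args, name)}")
    for name in ("old_config", "new_config"):
        if not getattr(args, name).is_file():
            problems.append(f"{name}: not a file: {getattr(args, name)}")
    # Copying a repository onto itself would defeat the "source stays intact" goal.
    if args.old_data.resolve() == args.new_data.resolve():
        problems.append("source and destination data directories must differ")
    return problems
```

the check that source and destination differ is an added safeguard,
in line with the goal of leaving the 0.2 data untouched.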

the upgrade process will be broken into the following tasks:

 1) get the root user's details from the 0.2 user db and update the
out-of-the-box root user's details in the 0.3 repository

 2) get the rest of the user details from the 0.2 user db and add a
user+homedir node for each into the 0.3 repository

 3) iterate through the homedir nodes in the 0.2 repository, copying
all content for each (calendars, events, collections, resources,
tickets) to the corresponding homedir node in the 0.3 repository

since passwords are stored in an encrypted format in the 0.2 user db,
we'll have to extend the 0.3 internal apis to allow the creation of a
user with an encrypted password (the existing api only allows a user
to be created with a plaintext password).
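
to make the order of the three tasks and the password handling
concrete, here is a toy sketch that uses in-memory stand-ins rather
than the jcr api or the hsql db. the User and Repo classes, the
put_user_with_encrypted_password method and the handling of homedirs
that have no matching user are all assumptions for illustration, not
the real 0.3 api.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class User:
    username: str
    password_hash: str  # stored encrypted in the 0.2 user db; never decrypted here
    email: str = ""


@dataclass
class Repo:
    """Toy stand-in for a repository: users plus one content dict per homedir."""
    users: dict = field(default_factory=dict)
    homedirs: dict = field(default_factory=dict)

    def put_user_with_encrypted_password(self, user: User) -> None:
        # Hypothetical form of the proposed api extension: keep the hash as-is.
        self.users[user.username] = copy.copy(user)
        self.homedirs.setdefault(user.username, {})


def upgrade(old_user_db: dict, old_repo: Repo, new_repo: Repo, root: str = "root") -> list:
    """Run the three tasks in order; return old homedirs that had no user."""
    # 1) replace the out-of-the-box root user's details
    new_repo.put_user_with_encrypted_password(old_user_db[root])
    # 2) add a user + homedir for every other account
    for name, user in old_user_db.items():
        if name != root:
            new_repo.put_user_with_encrypted_password(user)
    # 3) copy each homedir's content; the source is only read, never modified
    orphans = []
    for name, content in old_repo.homedirs.items():
        if name not in new_repo.homedirs:
            orphans.append(name)
            continue
        new_repo.homedirs[name].update(copy.deepcopy(content))
    return orphans
```

the deep copy in step 3 keeps the new repository from sharing any
objects with the old one, so the 0.2 data ends exactly as it began.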

in the long term i think we'll want an upgrade tool that can convert
from an arbitrary older version, so that one could upgrade a 0.2
server to 0.6 with the same tool rather than requiring 4 separate
tools. this implies an upgrade framework of some sort. since we really
want to release 0.3 in two weeks or less, i think we should punt on
this and then revisit it as a primary tenet for a later release.

i estimate that building the upgrade tool as proposed, testing on
copies of the foxcloud and/or cosmo-demo repositories, and fixing
found bugs will take 3-4 days.

thoughts on the proposal?
_______________________________________________
Cosmo mailing list
Cosmoosafoundation.org
http://lists.osafoundation.org/mailman/listinfo/cosmo
preserving data during cosmo 0.3 upgrade
user name
2006-02-07 23:42:05
Hi Brian,

Looks like a good plan to me.  Thanks for giving so much detail.

Sincerely,
Jeffrey

preserving data during cosmo 0.3 upgrade
user name
2006-02-09 11:50:27
On Tue, 2006-02-07 at 14:39 -0800, Brian Moseley wrote:
> such a tool could be run while the server is offline...

Offline +1

>  * it will be packaged together with all dependencies and run via
> shell script

Ok.  Details not important; a raw java command line with -D whatever or
command-line arguments would be fine too.  No need for this to be
pretty, though it does set a fine precedent for future needed data
migrations.

>  * it will not attempt to convert a 0.2 repository "in place" but
> rather will copy the data out of the 0.2 repository and user db into a
> new, blank 0.3 repository

If we're talking about whole $OLD/data to a $NEW/data (everything ends
up in a drop-in directory), I'm a happy camper.

>  * it will accept as input (via command line options) the locations of
> the data directories and config files for the source (0.2) repository
> and destination (0.3) repository

What config files are needed?  Some of those may have been mussed with,
so I thought I should ask and coordinate.

>  1) ... 2) ... 3) ...

Should be fine if the semantics haven't changed.

> since passwords are stored in an encrypted format in the 0.2 user db,
> we'll have to extend the 0.3 internal apis...

Good.

> in the long term i think we'll want an upgrade tool that can convert
> from an arbitrary older version, so that one could upgrade a 0.2
> server to 0.6 with the same tool rather than requiring 4 separate
> tools.

+1 on punting.  A pretty robust framework is just a wrapper script that
runs the various steps, as separate scripts, in the right order.  If
you write independent data-migration scripts, then you have a
customizable framework to run detection and migration steps in any
sequence you want.

Thus a pre-dependency feature is the ability to detect what version a
repo is at.  This could be starting a Java process, or as simple as
reading a flat file with a version number.
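
A rough sketch of that idea, assuming the flat-file variant of version
detection.  The file name SCHEMA_VERSION, the function names, and the
shape of the steps table are all hypothetical, and each step callable
stands in for a separate migration script.

```python
from pathlib import Path

VERSION_FILE = "SCHEMA_VERSION"  # hypothetical flat file kept in the data dir


def detect_version(data_dir: Path) -> str:
    """Read the schema version recorded for this repository."""
    return (data_dir / VERSION_FILE).read_text().strip()


def plan(current: str, target: str, steps: dict) -> list:
    """Chain single-version steps; steps maps from-version -> (to-version, callable)."""
    chain, seen = [], set()
    while current != target:
        # Stop on a missing step or a cycle instead of looping forever.
        if current not in steps or current in seen:
            raise ValueError(f"no migration path from {current} to {target}")
        seen.add(current)
        to_version, step = steps[current]
        chain.append((to_version, step))
        current = to_version
    return chain


def migrate(data_dir: Path, target: str, steps: dict) -> None:
    """Run each step in order and record the new version after it succeeds."""
    for to_version, step in plan(detect_version(data_dir), target, steps):
        step(data_dir)
        (data_dir / VERSION_FILE).write_text(to_version + "\n")
```

Because the version is rewritten after every step, a run that fails
partway leaves the repo marked at the last version that completed.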

To update production, I see basically:

- Announce downtime
- Create new 0.3-based production instance
- Take down old production service
- rm -rf $NEW/data
- run-migration-script --old $OLD/data --new $NEW/data
- Bring up new production service

> i estimate that building the upgrade tool as proposed, testing on
> copies of the foxcloud and/or cosmo-demo repositories, and fixing
> found bugs will take 3-4 days.

How are you going to go about finding those bugs?  (Which relates to how
I find bugs in an update)  I'm thinking maybe a small script that just
dumps the CMP users structure out, and do a before-and-after diff.
That, plus doing a Chandler sync for my accounts, will probably confirm
overall data preservation.  Beyond that, bugs found are more likely to
be semantic or protocol changes.
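
A minimal sketch of such a before-and-after check, assuming the user
dump is plain text with one user per line.  The actual CMP output
format isn't specified here, so the dump format, the labels, and the
function names are assumptions.

```python
import difflib


def normalize(dump: str) -> list:
    """Sort non-empty lines so ordering differences don't show up as changes."""
    return sorted(line.strip() for line in dump.splitlines() if line.strip())


def diff_dumps(before: str, after: str) -> list:
    """Return unified-diff lines; an empty list means the same users survived."""
    return list(difflib.unified_diff(
        normalize(before),
        normalize(after),
        fromfile="users-0.2",
        tofile="users-0.3",
        lineterm="",
    ))
```

An empty result from diff_dumps means the two dumps list the same
users; any line starting with a single "-" is a user that went missing.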

Did you have anything particular in mind for testing?

> thoughts on the proposal?

Certainly seems reasonable and in line with what I'd expect for a
good-enough solution.

-- 
Jared Rhine <jaredwordzoo.com>

preserving data during cosmo 0.3 upgrade
user name
2006-02-09 20:10:23
On 2/9/06, Jared Rhine <jaredwordzoo.com> wrote:

> If we're talking about whole $OLD/data to a $NEW/data (everything ends
> up in a drop-in directory), I'm a happy camper.

yep.

> What config files are needed?  Some of those may have been mussed with,
> so I thought I should ask and coordinate.

$OLD/etc/repository.xml and $NEW/etc/repository.xml.

> Thus a pre-dependency feature is the ability to detect what version a
> repo is at.  This could be starting a Java process, or as simple as
> reading a flat file with a version number.

yeah, the current plan is to add a node to the repository with
properties for schema version and timestamp of last update.

> To update production, I see basically:
>
> - Announce downtime
> - Create new 0.3-based production instance
> - Take down old production service
> - rm -rf $NEW/data
> - run-migration-script --old $OLD/data --new $NEW/data
> - Bring up new production service

you won't rm -rf $NEW/data, as we'll be copying stuff into it (the out
of the box repository has no user data but it does have an initialized
schema and seed data), but otherwise, yes.

> How are you going to go about finding those bugs?  (Which relates to how
> I find bugs in an update)  I'm thinking maybe a small script that just
> dumps the CMP users structure out, and do a before-and-after diff.
> That, plus doing a Chandler sync for my accounts, will probably confirm
> overall data preservation.  Beyond that, bugs found are more likely to
> be semantic or protocol changes.

yep, the cmp script and possibly also a similar one for dav.

> Did you have anything particular in mind for testing?

clicking around to verify manually ;) your script idea is a good one.