Protégé in a DTAP environment

ToniVerbeiren
Posts: 5
Joined: 13 Aug 2010, 05:55

Hi,

I'm struggling to set up a Development-Test-Acceptance-Production (DTAP) scheme for a Protégé knowledge base, much as is done in software development projects.

The idea is the following:
- A collaborator makes changes to the knowledgebase (model & data) in the Dev environment. The Dev environment can be a locally hosted copy of Protégé and the model.
- Once these changes are tested (and possibly approved in some way) in the Test environment, they can be moved to the Acceptance environment.
- The acceptance environment can be accessed by multiple people and is used as pre-production stage.
- Once approved in acceptance, the changes are pushed to production.

Some ways to tackle the above scheme:
- Dev and Test are locally stored copies of the model. Every collaborator has their own copy to work on. This copy can easily be restored to its original state. Restoring it to the latest production state is harder (see below).
- Acceptance and Production are two completely separate database instances, possibly (but not necessarily) hosted on different Protégé servers.
- By requiring that modifications to the model are scripted using Python, these changes can nicely be propagated from one environment to the next.

From a model point of view, this means that one can easily start from a baseline configuration by
1) loading the essential_baseline project
2) applying the model changes by executing the correct scripts in the right order
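
As a rough sketch of step 2, the ordering could be enforced by a small runner that sorts the change scripts by a numeric prefix and keeps a per-environment log of what has already been applied. Everything here is an assumption for illustration: the `NNN_description.py` naming convention, the log file, and the `run` hook (which would be whatever actually executes a script against the Protégé project) are hypothetical and not part of Protégé or Essential.

```python
from pathlib import Path
import re

def ordered_scripts(script_dir):
    """Return change scripts sorted by numeric prefix (e.g. 001_add_class.py)."""
    pattern = re.compile(r"^(\d+)_.+\.py$")
    found = []
    for path in Path(script_dir).iterdir():
        match = pattern.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]

def pending_scripts(script_dir, applied_log):
    """Return the scripts not yet recorded in this environment's log, in order."""
    log = Path(applied_log)
    applied = set(log.read_text().splitlines()) if log.exists() else set()
    return [p for p in ordered_scripts(script_dir) if p.name not in applied]

def apply_pending(script_dir, applied_log, run):
    """Run each pending script through `run` and record it once it succeeds."""
    done = []
    for path in pending_scripts(script_dir, applied_log):
        run(path)  # hypothetical hook: executes the script against the project
        with open(applied_log, "a") as handle:
            handle.write(path.name + "\n")
        done.append(path.name)
    return done
```

Keeping one log per environment (Dev, Test, Acceptance, Production) means the same script directory can be replayed against each of them, and a script that fails is not recorded, so it is retried on the next run.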

What I'm still missing is a flexible mechanism to move the data from one environment to the next. This is relevant in two directions: from development to production in order to push updates, but also in the reverse direction, in order to start testing on the latest and most accurate version of the production data.

There are some things I've tried:
- Importing all instances by script at all times. This, however, turns out to be impractical.
- Exporting all data in some way and importing it into the target project. This turns out to be impractical too.
- Using the 'Prompt' plugin in order to copy/move frames between ontologies. This seems to work for a relatively simple list of instances, but yields unexpected results when applied to a larger and more complex set of instances.
- Exporting a project and using it as the basis of a 'new' project. In practice, this means removing the complete production database in order to proceed from acceptance to production.

Are there other ways to tackle the same issue/situation? How do other users handle the systematic development of the architecture documentation?

Thanks!
Toni
jonathan.carter
Posts: 1087
Joined: 04 Feb 2009, 15:44

Hi Toni

Apologies for the delay in getting back to you.

I completely understand what you're trying to do and I think that everything that you've tried makes a lot of sense.

Generally, Protege works as what I refer to as a 'live repository': it's the only copy of the repository and it reflects the current view. Unfortunately, apart from Prompt (and even then I'm not sure that this is what Prompt is really intended for), Protege doesn't really have support for 'lifecycling' the contents of the repository from one environment to another.

Really, there's only ever a production repository from the Protege point of view.

There are, however, some problems to consider when taking content from one repository to another. Which instances from the repository do you want to promote? How do you easily identify and select those? Maybe by class?

Without wanting to discourage you from the approach that you're taking, I think it might be worth considering who your users are for the 2 main components of Essential (Modeller - Protege and Essential Viewer). As it tends to be more 'technical', we normally suggest that the repository in the Protege environment is for use by architects.

If we need to branch off to try capturing some aspects of the architecture, then we normally recommend taking a copy of the 'live repository', working on that until you are happy, and then importing the relevant instances from this branch into the live repository to merge things back together.
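
To illustrate the merge-back step, here is a minimal sketch of how the instances that differ between the branch and the live repository could be listed before importing anything. It assumes, purely for illustration, that both repositories have already been dumped into plain dictionaries mapping an instance name to its slot values; that dump step and the `diff_instances` name are hypothetical and not something Protege provides.

```python
def diff_instances(live, branch):
    """Compare two dumps of the form {instance name: {slot name: value}}."""
    added = sorted(name for name in branch if name not in live)
    changed = sorted(
        name for name in branch
        if name in live and branch[name] != live[name]
    )
    removed = sorted(name for name in live if name not in branch)
    return {"added": added, "changed": changed, "removed": removed}

# Example dumps (made-up instances, for illustration only).
live = {
    "CRM": {"class": "Application", "owner": "Sales"},
    "Billing": {"class": "Application", "owner": "Finance"},
}
branch = {
    "CRM": {"class": "Application", "owner": "Marketing"},
    "Billing": {"class": "Application", "owner": "Finance"},
    "Portal": {"class": "Application", "owner": "IT"},
}
report = diff_instances(live, branch)
```

The "added" and "changed" lists are the candidates to import into the live repository; "removed" needs a manual decision, since an instance missing from the branch may simply have been created in the live repository after the copy was taken.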

However, on the Viewer side, you can also set up multiple Viewer environments (e.g. Development, Test, Production; Public, Restricted, Architecture Team Only). As the content in the live repository gets to various stages, you can publish it to the selected Viewer environment - each targeted at specific user groups.

Going back to the question of which instances to promote between environments. This is basically the same question as identifying the relevant instances to merge back into the live repository from a branch.
I would tackle this by creating a View (for Essential Viewer) that creates an XML document that contains the relevant instances (including any relationship classes) most likely in the default Essential/Protege XML schema format. Then this can be imported into the live repository.
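
As a rough illustration of the idea (selecting by class, then writing the selection out as XML), something along these lines could work. Note that this is only a sketch: the instances are plain dictionaries standing in for repository content, and the `instances`/`instance`/`slot` element names are made up for the example. They are not the Essential/Protege XML schema, which a real View would need to follow.

```python
import xml.etree.ElementTree as ET

def select_by_class(instances, class_names):
    """Keep only the instances whose class is in class_names."""
    wanted = set(class_names)
    return [inst for inst in instances if inst["class"] in wanted]

def to_xml(instances):
    """Serialise instances to XML (hypothetical element names, not the real schema)."""
    root = ET.Element("instances")
    for inst in instances:
        node = ET.SubElement(
            root, "instance", {"class": inst["class"], "name": inst["name"]}
        )
        for slot, value in sorted(inst.get("slots", {}).items()):
            slot_node = ET.SubElement(node, "slot", {"name": slot})
            slot_node.text = str(value)
    return ET.tostring(root, encoding="unicode")

# Made-up repository content, for illustration only.
repository = [
    {"class": "Application", "name": "CRM", "slots": {"owner": "Sales"}},
    {"class": "Server", "name": "srv01", "slots": {}},
    {"class": "App_To_Server_Relation", "name": "CRM::srv01",
     "slots": {"from": "CRM", "to": "srv01"}},
]
selection = select_by_class(repository, ["Application", "App_To_Server_Relation"])
xml_text = to_xml(selection)
```

The relationship class is listed explicitly in the selection; leaving it out would promote the instances without the links between them.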

Finally, I think it's important to realise that the repository - the ontology - that you're building in Protege provides a knowledge base that is designed to provide decision-support for architects and other users. The model you are building in the ontology is not just for documentation.

Rather, I would suggest that any documentation about your enterprise can be produced by Views from the repository. By controlling the environments in which these Views are available (Development, Test, Production), you could control the lifecycle of the documentation, rather than that of the ontology.

I realise that this doesn't really solve your problems but hopefully gives you some ideas.

Jonathan
Essential Project Team