Integrations

amundin · 30 Jul 2010, 04:14

This posting started on the protege mailing list but got a little off target and on Jonathan's suggestion I am taking the discussion here.

I understand your description of how integrations can be modeled as a variation of application in that they provide behavior, though I am not sure you have sold me on the concept. Most integrations only move data without actually doing anything to it. Integrations typically do not maintain any long-running state of business information. Integrations are typically much smaller than applications. For these reasons I think one should consider treating integrations differently from applications.

~Andreas

================================
Hi Andreas,

To answer your question, in Essential, integration solutions are modelled as Applications, using the Application Service and Application Provider meta classes. So, you would capture your particular ETL script using an Application Provider instance and describe its behaviour in terms of the data that it is moving between other applications. Details of the technology you use, e.g. DataStage, Informatica, Oracle Data Integrator, are then captured in the supporting technology architecture of that application.
We're just working on a really nice extension to the meta model to better manage how Information and Data is managed (created, moved etc.) around the organisation.

As you can see, we consider integration solutions (note: not the technology) as an application because they provide behaviour just like any other application. However, there are ways in our meta model to identify such applications as integrations rather than end-user applications. For these reasons, we have no explicit 'Integration' or 'middleware' meta classes. New types of integration solutions are constantly emerging, data integration, process integration, EAI, B2B integration etc. but we consider these as effectively different reference architectures for providing behaviour to exchange information and data or to invoke other behaviours. Rather than make these sorts of things meta classes, they become Instances in the model. After all, who knows what the next integration approach is going to be?!
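
To make the idea concrete, here is a minimal Python sketch. It is not the Essential meta model itself: the class and field names (`ApplicationProvider`, `InformationFlow`, `supporting_technology`, `moved_by`) and the example instances are assumptions made up for illustration. The only point it tries to show is that the ETL script is an instance of the same class as any other application, with the tool recorded as supporting technology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApplicationProvider:
    """Hypothetical stand-in for an Application Provider instance."""
    name: str
    # Technology the application depends on, e.g. an ETL tool (illustrative).
    supporting_technology: List[str] = field(default_factory=list)


@dataclass
class InformationFlow:
    """Hypothetical record of information moved between two applications."""
    source: ApplicationProvider
    target: ApplicationProvider
    information: str
    moved_by: ApplicationProvider  # the integration is itself an application


# Two ordinary applications and one ETL script: all the same kind of instance.
crm = ApplicationProvider("CRM")
warehouse = ApplicationProvider("Data Warehouse")
etl = ApplicationProvider("Nightly customer load",
                          supporting_technology=["Informatica"])

flow = InformationFlow(source=crm, target=warehouse,
                       information="Customer", moved_by=etl)

print(type(etl) is type(crm))                 # same class, no 'Integration' class
print(flow.moved_by.supporting_technology)    # the tool is technology, not the app
```

A new integration style would then show up as new instances and new supporting technology, with no change to the class definitions.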

Apologies to the Protege community for my email being somewhat off-topic for Protege.
Perhaps we could go into more details about this, Andreas, on the Essential Project forum?

Jonathan
_______________________________________

Jonathan Carter
Enterprise Architecture Solutions Ltd
_______________________________________

Proud sponsors of The Essential Project.
The free open-source Enterprise Architecture Management Platform
http://www.enterprise-architecture.org
_______________________________________

On 29 Jul 2010, at 20:31, Andreas wrote:

> Ulf,
>
> Thanks for pointing out Essential. It is something I have looked into, along with IteraPlan, ArchiMate, TOGAF, and DMTF's CIM, as tools or frameworks with existing ontologies or meta-models. The challenge is just the learning curve which comes with a well established meta-model.
>
> In looking at Essential my first reaction was that the concept of integrations was missing. How would, for example, an ETL script moving data from one system to another be represented in the Essential meta-model?
>
> Andreas
>
>
> --- Ulf wrote:
>
> From: Ulf
> To: [email protected]
> Cc: [email protected]
> Subject: Re: [protege-discussion] Question about slots
> Date: Thu, 29 Jul 2010 09:08:54 +0200
>
> Hello Andreas,
>
> instead of answering your questions about slots I would like to ask if you already considered using an existing ontology for your IT systems and dependencies? For example take a look at the meta-model created by the "Essential Project" <http://www.enterprise-architecture.org/>.
>
> "The Essential Architecture Meta Model, is a detailed, extensible, yet easy-to-use meta model for describing an enterprise from top to bottom. Essential Architecture Meta Model is published as open source and is available for use in any modelling toolset."
> <http://www.enterprise-architecture.org/ ... meta-model>
>
> Regards,
> Ulf
>
>
>
> From: Andreas
> To: <[email protected]>
> Date: 29.07.2010 01:34
> Subject: [protege-discussion] Question about slots
> Sent by: [email protected]
>
>
>
> All,
>
> I just started using Protege-Frames. I have already trolled the archives. If someone can point me to an old discussion, or better yet, briefly answer my question I would greatly appreciate it.
>
> I am a little confused when it comes to the use of slots. Let me explain by way of example.
>
> I am trying to model IT systems and their dependencies. To this end I have three classes: System, Path, and Integration. A Path shows where there are paths between systems across which any kind of information can flow. An Integration describes a specific type of information flowing between systems. There will be one or more integrations for a path. Now to my problem.
>
> Both Path and Integration have source systems and target systems. I first created a slot 'source_system' for Path with an inverse-slot='downstream_path'. Now it would just make sense to also have a slot for the Integration class by the name of 'source_system' and this is where I run into trouble. I can't add a new slot using the name 'source_system' because one is already defined. I can't use the one already defined because it has been configured with an inverse-slot='downstream_path' and I need the inverse-slot='downstream_integration' for the Integration.
>
> Now the easy way out is just to name the slots 'interface_source_system' and 'integration_source_system'. In the long run it would seem this would lead to lots of slots with very descriptive names. So am I getting something wrong here? Is there something like a Class bound slot which is not part of the global slots and does not have the namespace problem? Do I have it conceptually backwards and I should be solving the problem in an entirely different way?
>
> Any advice appreciated.
>
> Andreas

ulfl · 30 Jul 2010, 07:40

Jonathan,

can you please elaborate on the "really nice extension to the meta model to better manage how Information and Data is managed (created, moved etc.) around the organisation."?

Thank you,
Ulf

jonathan.carter · 02 Aug 2010, 22:34

Hi Ulf,

We're not quite there with these extensions yet but will let you know when we have it ready to share.
I'll go into proper detail when we've tested it properly, rather than explain it and then find some problems, etc.

Jonathan

jonathan.carter · 02 Aug 2010, 23:03

Hi Andreas,

Apologies for the delay in getting back to you.

I take your point about how some integrations do not appear to do very much, e.g. simple ETLs. However, even these simple integrations are automating the process of moving information from one system to another and in doing so encapsulate that behaviour.
In the Essential Meta Model, the Application layer is all about capturing behaviour. It doesn't matter if that's a lot of behaviour or a simple piece of functionality - which is why in the upper logical view we have Application Functions and Application Services. We do not enforce the use of the Application Services (which provide a logical grouping of Application Functions, e.g. in a services paradigm) and it is perfectly reasonable to just use the Functions.

Interestingly, we were just discussing the other day the distinction between Application Providers (from the lower logical view that represent the actual systems (but not physical instances) in your architecture) that manipulate information as part of an integration and those that do not. E.g. an EAI solution that performs a look-up to add additional items to the information it is passing from A to B as opposed to a simple MQSeries configuration that just transports the information without any changes.

We concluded that these 2 should be treated differently from the point of view of understanding 'where' B gets the information from - is it from A or the EAI solution? - but that both the EAI 'project' and the MQSeries configuration are captured as Application Provider instances in the meta model.

This kind of distinction isn't something that we can encode into the meta model, rather it becomes a modelling heuristic.
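
As a rough illustration of what such a heuristic might look like when applied to a model, here is a small Python sketch. The rule it encodes is one reading of the discussion above and is an assumption: an integration that changes the information counts as where B gets it from, while one that only transports it is looked through. `Hop`, `changes_information` and `apparent_source` are hypothetical names and not part of the meta model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hop:
    """An integration application sitting between the origin and the consumer."""
    name: str
    changes_information: bool  # True if it enriches or transforms what it passes


def apparent_source(origin: str, intermediaries: List[Hop]) -> str:
    """Name the application the consumer effectively gets its information from.

    `intermediaries` is ordered from the origin towards the consumer.
    Walk back from the consumer, skip hops that only transport, and stop
    at the first hop that changes the information. If none does, the
    origin is the source.
    """
    for hop in reversed(intermediaries):
        if hop.changes_information:
            return hop.name
    return origin


mq = Hop("MQSeries configuration", changes_information=False)
eai = Hop("EAI look-up project", changes_information=True)

print(apparent_source("A", [mq]))    # A
print(apparent_source("A", [eai]))   # EAI look-up project
```

Both hops would still be captured as Application Provider instances; the flag only affects how the flow is interpreted.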

The question we asked ourselves when we first tried to capture integration solutions was "what is an integration?". We concluded that it is a configuration (or coding) of the specification of the required integration solution that executes using a specific technology (e.g. from raw Java to Tibco BusinessWorks) and that it operates on information as it does so. And then there are some integrations that are more about integrating functionality than moving data - e.g. Web Services, CORBA components or similar - but again, they encode some behaviour in order to serve that functionality, and they operate on information and execute on technology. From this viewpoint, these are exactly the same as any other application, although they have a systems interface rather than a user interface, and we capture that in the Application Functions that they provide.

I appreciate that there's a leap to make here - that integrations are a class of applications - but we've found that it has worked very nicely and fits smoothly with the rest of the meta model.
To make it easier to categorise these applications, we added the "Purpose" slot to the Application Provider meta class. This allows us to say what the application is for, in coarse-grain terms - what type of application it is, effectively. Of course, the specifics of what the Application does, functionally, are modelled more completely through the relationships to other Application layer classes and the business and information layers. The "purpose" slot enables you to select from an enumeration of Application Purpose instances that are managed in the EA_Support/Utilities/Enumeration part of the meta model, e.g. we have Application Integration, Business Application and Data Integration in there. You can create new purpose instances to extend this without having to extend the meta model and this could help to further categorise the integration solutions.
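
A minimal Python sketch of this pattern follows, assuming a simple in-memory registry. `PurposeEnumeration` and its methods are hypothetical names for illustration, and "B2B Integration" is an invented extra purpose; only the three purposes named above come from the meta model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ApplicationPurpose:
    """An instance in the enumeration, not a new meta class."""
    name: str


class PurposeEnumeration:
    """Hypothetical registry standing in for the enumeration of purposes."""

    def __init__(self) -> None:
        self._purposes: Dict[str, ApplicationPurpose] = {}

    def add(self, name: str) -> ApplicationPurpose:
        # Adding a purpose creates data; no class definition changes.
        return self._purposes.setdefault(name, ApplicationPurpose(name))

    def get(self, name: str) -> ApplicationPurpose:
        return self._purposes[name]


@dataclass
class ApplicationProvider:
    name: str
    purpose: Optional[ApplicationPurpose] = None


purposes = PurposeEnumeration()
for label in ("Application Integration", "Business Application", "Data Integration"):
    purposes.add(label)

etl = ApplicationProvider("Nightly customer load", purposes.get("Data Integration"))
erp = ApplicationProvider("ERP", purposes.get("Business Application"))

# Extending the categorisation later is just another instance (invented example).
gateway = ApplicationProvider("Partner order gateway", purposes.add("B2B Integration"))

integrations = [a.name for a in (etl, erp, gateway)
                if a.purpose and "Integration" in a.purpose.name]
print(integrations)  # ['Nightly customer load', 'Partner order gateway']
```

Filtering on the purpose is what lets a report separate integrations from end-user applications without a dedicated 'Integration' class.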

I'd be really interested to know your thoughts on this, in particular if there are some things that we might have missed.

Jonathan

amundin · 05 Aug 2010, 17:53

Hi Jonathan,

Thanks for the elaboration. To provide you with more feedback I need to understand better how the information meta-model relates to the application meta-model.

I think of systems, any system, in terms of behavior, information, and structure. I find this breakdown to be very useful and I have come across it over and over again. For example, ArchiMate takes a similar view in their modeling notation.

Within IT a system is a piece of software (structure) which maintains some state (information) and provides functionality (behavior) which acts on input (information) and state (information) and creates output (information).

A basic integration that only moves information without changing it is then a piece of software structure with minimal behavior (only moving data) and information limited to input and output with no state.

The statelessness of a basic integration is in my mind the greatest differentiator with the typical interpretation of an application (application provider).
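
A small Python sketch of that contrast, purely as an illustration: `System` and `basic_integration` are made-up names, and the in-memory dictionary merely stands in for whatever state a real system keeps.

```python
from typing import Dict


class System:
    """Structure (the class), state (the dict) and behaviour (the methods)."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}  # information kept between calls

    def write(self, key: str, value: str) -> None:
        self._state[key] = value

    def read(self, key: str) -> str:
        return self._state[key]


def basic_integration(source: System, target: System, key: str) -> None:
    """Move one item unchanged; nothing is remembered once the call returns."""
    target.write(key, source.read(key))


crm, billing = System(), System()
crm.write("customer-42", "Ada Lovelace")
basic_integration(crm, billing, "customer-42")
print(billing.read("customer-42"))  # Ada Lovelace
```

The integration here has input and output but no attribute of its own to hold information, which is the differentiator I mean.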

Basic integrations of this type are typically used to propagate information from one information store to another. The question now becomes, what is an information store? Do you, in Essential, model information stores as just another variation of application providers or is the intent of the Information Layer to capture information stores?

This is really where I have run into roadblocks in the past. Data is often treated separately from Applications. Over time I have come to the conclusion that this is really not a productive way of dividing the IT world. A system is a system is a system. A database that stores information which supports the business is just another system, or application provider to use your terminology. It is a piece of software (structure) that maintains state (information) with basic CRUD functionality (behavior) taking input and providing output.

The point is that an application and a data store are both very similar. Both are systems. One type has a greater focus on the information and the other has a greater focus on the behavior, but both maintain state.

This is relevant because integrations typically transfer information from one system state to another.

This conversation so far has taken the point of view that integrations are data integrations. One can take the same approach to differentiate behavior based integrations, i.e. services. For basic service integrations the intent is not to move data but rather to invoke behavior. What both data and service integrations have in common is that they are stateless and depend on other systems for the information or behavior.

Again, the service integration above takes the purist point of view that a service is stateless. Both service and data integrations can be created to maintain some state in order to provide their behavior. This is really where the boundaries of integrations and systems begin to blur. There really is nothing discrete to differentiate an integration at this point from a system. Subjectively speaking, one would still treat an integration as an integration because it is so small and only provides a single function, whereas applications are large and provide many functions. But once state has been introduced there is nothing to prevent multiple integrations from using the same state, and we end up with a system consisting of loosely coupled components sharing the same state.

One more detail with regard to modeling integrations as separate from systems. It only makes sense if the integration is independent and decoupled from the systems it interacts with. Otherwise it is just another component of a system.

I hope this gives you an understanding of why I need to better understand the intent behind your Information Layer. If it is just another view or perspective for the same set of systems then I might understand how it is intended to be used. However, if it is intended to capture a separate set of systems from those described using the application provider paradigm then I think the separation of Information from Application in the Essential model introduces an artificial barrier. Hopefully you can clarify this for me.

~Andreas

amundin · 24 Aug 2010, 22:44

Jonathan, I really do hope you will have some time to follow up on this thread.

best regards,
~Andreas

jonathan.carter · 02 Sep 2010, 12:49

Hi Andreas,

Sorry for taking so long to get back to you - this is a very interesting and important thread, I think.

I think your definition of a system might be a bit strong in that it has to change state or output information. We've drawn on our practical experience of many different approaches to how to capture and model these sorts of things and have found that placing anything that provides systems behaviour in the Application Layer works very well.

In particular, we drew upon our extensive experience of SOA to provide a potential future application and software architecture to test our approach. If we consider a hypothetical pure-play SOA environment, then arguably there is no integration OR everything is integration. Some services will be providing data, some functionality - and those functional services may not actually output any information other than a confirmation that the function it provides completed successfully. Data services are arguably an approach for providing data integration and some more functionally oriented services (e.g. a credit card payment service) could be implemented using an EAI approach.
If we take the current established and state-of-the-art integration approaches they all provide some behaviour whether that is performing a large piece of functionality or just moving a small piece of data.

One of the key questions we asked ourselves about integrations is whether the use of any particular technology changed this. If I use Tibco, is it still an application? If I write a piece of raw Java code to do the same thing, conceptually what's the difference from a behaviour perspective?

From this, we concluded that, in this example, the Java code I write makes up an application, as does the configuration that I create in the Tibco toolset to perform that behaviour. NOTE: it's the configuration that is the application, not Tibco.

Another important thing to consider is that granularity is not a factor in determining what's an application. It really does not matter whether the behaviour is a simple step of a larger process or whether the application is something that provides all the behaviour that your organisation needs.

Even a 'basic integration' performs some behaviour and whether this behaviour is state-ful or state-less, it's still an application.

A quick word on software. We deliberately do not provide software modelling. This is best handled in CASE tools. We are building architectural models from an EA perspective. However, we provide some simple software constructs to enable us to capture dependencies between applications (including integrations!) and the main software components that are used to deliver these applications. This is important when trying to understand where common software supports multiple applications. However, this is not intended to capture detailed software design, rather 'what are the software components and the key dependencies between them that support this application?'. It also gives us a link to the Technology Products that applications are dependent on.

So, to answer your question about Information Stores, these are not a variation of Application Provider. Rather, these are physical locations where instances of Information Representations can be found. That could be a particular database or it could be a filing cabinet in your office. In this definition, the Information Store is not a system. It is not performing any behaviour on the information in that store.

We have separated Information and Data - and are road-testing some new extensions to provide more capability to manage Data. However, we are taking the approach that Information is Data that is used in a particular context - and so far this seems to be working nicely. Applications and Business Processes operate on and produce Information and the Information has the relationships to the Data that is used to deliver the Information.

I think this fits with your view (even though it seems to contradict it!) about Data and Applications being treated separately. In our meta model, it's the Processes and Applications that perform the CRUD on the data. So, what's operating on the data are systems (in simple terms, processes performed by people, applications automating behaviour through systems).

It is important to separate applications from technology, though - and like integrations, this is something that's changing all the time. What was once part of the applications that we wrote is now part of a technology platform, e.g. web application servers. So, for applications, we focus on the behaviour that is important to supporting a particular process rather than 'technical' behaviour that is provided by our platforms. The actual functions provided by an RDBMS are part of that technology product (e.g. Oracle 10g) and would not be considered part of an Application that requires a technology capability to manage structured data.

You make a point about whether an integration is part of a larger system or not. That's really a modelling question - about the level of detail that you need to go to. Do you need to model all the individual data integrations for your ERP or do you just model that the ERP receives information from the other systems? It's up to you, depending on what you are trying to understand about your architecture. If you need to understand about the technology they use, the problems that these integrations are causing, then you probably do need to model them. If you don't need to answer those kinds of questions now, then you do not need to go to that level of detail. Remember, you can always come back and add the detail to your model later and your ontology will manage how this additional detail fits with everything else that you've captured so far.

To answer your last question, the purpose of the Information Layer is to capture and manage knowledge about the Information that is required, used and produced by your enterprise, regardless of the systems, processes and technologies that are involved in doing so. This will also include knowledge about the Data that is used to deliver that Information. Having said that, there are a number of relationships in the meta model between the Information Layer and the Application, Business and Technology Layers that capture how the information is used, stored, created etc.

In the context of a simple integration, Information is passed from one Application to another. It may be that actually, there's a 3rd application in there (the integration) that mediates this in some way. Maybe that integration application does transforms, cleansing etc. or maybe it just passes the Information from one application to another. Either way, the Information has no behaviour about how it is moved. That's all in the (integration) application.
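
A short Python sketch of that separation, offered only as an illustration with hypothetical names (`Information`, `PassThroughIntegration`, `CleansingIntegration`, `deliver`): the information is a plain record with no methods for moving itself, and all the behaviour sits in the integration application.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Information:
    """Plain record: it carries content but has no behaviour of its own."""
    subject: str
    content: str


class PassThroughIntegration:
    """Integration application that hands the information on unchanged."""

    def deliver(self, info: Information) -> Information:
        return info


class CleansingIntegration:
    """Integration application that tidies the content before handing it on."""

    def deliver(self, info: Information) -> Information:
        return replace(info, content=info.content.strip().title())


record = Information(subject="Customer", content="  ada lovelace ")

print(PassThroughIntegration().deliver(record) == record)  # True
print(CleansingIntegration().deliver(record).content)      # Ada Lovelace
```

Swapping one integration for the other changes what the receiving application sees, while the `Information` definition stays untouched.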

We are trying to separate the concerns of Application, Business, Information and Technology (and I think we are succeeding) whilst at the same time managing how these different concerns are related. Certainly, in my experience, Information architects have a very different set of problems to manage than application architects, and it's important to preserve this separation of concerns if we are to manage the complexity of our systems. Taking this approach enables the Information, Application, Business and Technology Architects to work on their areas whilst using/referring to/etc. elements defined by the other architects. The meta model relationships between these layers and the resulting ontology manage that for us. That's not to say that the Application and Information Architect have to be different people, but the intention in our approach is to provide some structure in which to consider these different concerns separately whilst keeping them connected through the relationships in the meta model.

So, there is a separation between each layer of the enterprise architecture but this is not a barrier. Rather, this separation provides a structure and approach for managing the complexity across the 4 layers of the core meta model, whilst ensuring that the inter-dependencies are captured and managed in a way that hides the resulting complexity but, importantly, without ignoring it.

Hope this helps to explain some more of the background thinking behind the way the meta model is today. There are some new things coming in the Information layer around Data, as I've mentioned above. More about all that when we've completed the road-testing to avoid sending anyone down any blind alleys etc!

Jonathan
