XQuery-based data integration - one step forward, can we do two?


An XQuery engine can be used to query and combine results from multiple data sources. This is a step forward in enterprise data integration. Can we do more?

Of course: we can achieve better integration using Topic Maps.

XQuery-based integration tries to address a real business problem: How does an organization get a consolidated view of its information?

The XQuery-based solution:
- get information from diverse sources as XML
- use XQuery adapters if the data sources do not support XML natively
- create aggregated views using XQuery transformations, joins and filtering
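
The steps above can be sketched in a few lines. This is only an illustration, written in Python with the standard library rather than in XQuery itself; the two sources (a CRM export and a billing export), their element and attribute names, and the function build_view are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML exports from two separate systems.
CRM_XML = """
<customers>
  <customer id="c1"><name>Acme Corp</name></customer>
  <customer id="c2"><name>Globex</name></customer>
</customers>
"""

BILLING_XML = """
<orders>
  <order customer="c1" total="120.00"/>
  <order customer="c1" total="30.50"/>
  <order customer="c2" total="15.00"/>
</orders>
"""


def build_view(crm_xml, billing_xml, min_total=0.0):
    """Join customers to their orders and keep customers whose
    summed order total is at least min_total."""
    customers = ET.fromstring(crm_xml)
    orders = ET.fromstring(billing_xml)

    # Aggregate order totals per customer id (the join key).
    totals = {}
    for order in orders.iter("order"):
        cid = order.get("customer")
        totals[cid] = totals.get(cid, 0.0) + float(order.get("total"))

    # Build the aggregated view, filtering on the summed total.
    view = ET.Element("customer-summary")
    for customer in customers.iter("customer"):
        cid = customer.get("id")
        total = totals.get(cid, 0.0)
        if total >= min_total:
            row = ET.SubElement(view, "customer", id=cid, total=f"{total:.2f}")
            row.text = customer.findtext("name")
    return view
```

Calling build_view(CRM_XML, BILLING_XML, min_total=100.0) and serializing the result with ET.tostring(view, encoding="unicode") gives the consolidated view. Note that the join key and the aggregation rule live inside this one function, so every new view has to restate them.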

Can we do better with Topic Maps?

XQuery handles information integration at the data level nicely, but it cannot handle integration at the knowledge level.

The main problem with pure XML-based information integration is that semantic information is hidden (lost?) in XML. The lack of explicit semantics limits the ability to use general information-merging rules. In effect, each XML-based query implements its own information-merging procedure.

XML-based integration also ignores two other important problems:
- providing a standard mechanism for identity mapping between different information sources
- providing a standard mechanism for checking data integrity

Topic Maps technology addresses the same question, "How does an organization get a consolidated view of its information?", at a different level. Instead of an XQuery view, we introduce a Topic Maps view of the data sources. Topic Maps technology promotes the use of an ontology as the basis of information integration. A Topic Maps view of a specific data source contains not only factual information but is also enriched with information about classes and taxonomies (which is typically lost in relational databases).

Topic Maps technology defines a standard mechanism for merging information from different sources. It takes care of identity management and mapping, and suggests best practices for minimizing problems with different identification schemes.
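
To make the merging idea concrete, here is a minimal sketch in Python. It covers only one rule, that topics sharing a subject identifier represent the same subject and are merged, and it ignores the rest of the Topic Maps model. The Topic class, its fields and merge_topics are hypothetical names invented for this illustration, not the API of any Topic Maps engine.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Simplified stand-in for a topic: identifiers, names, and
    # occurrences stored as (type, value) pairs.
    subject_identifiers: set = field(default_factory=set)
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)


def merge_topics(topics):
    """Merge topics that share at least one subject identifier.

    The rule is generic: it does not depend on which source a topic
    came from or on what kind of thing it describes.
    """
    merged = []
    for topic in topics:
        # Work on a copy so the input topics are left untouched.
        current = Topic(
            set(topic.subject_identifiers),
            set(topic.names),
            set(topic.occurrences),
        )
        remaining = []
        for existing in merged:
            if existing.subject_identifiers & current.subject_identifiers:
                # Same subject: absorb everything known about it.
                current.subject_identifiers |= existing.subject_identifiers
                current.names |= existing.names
                current.occurrences |= existing.occurrences
            else:
                remaining.append(existing)
        remaining.append(current)
        merged = remaining
    return merged
```

Unlike a hand-written join, the same function works for any pair of sources, as long as each source publishes identifiers for its subjects. A topic that carries identifiers from two schemes also acts as a bridge, so topics from a third source that use either identifier end up merged into it.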

Using TMCL, it will be possible to check the integrity of the resulting "virtual topic map". TMCL will also make it possible to monitor business-related constraints using a powerful rule-based language.
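
The kind of check meant here can be illustrated with a small sketch. This is plain Python, not TMCL syntax; the dictionary layout for topics and the function check_cardinality are assumptions made for the example. It expresses one hypothetical rule: every topic of a given type must have a bounded number of occurrences of a given type.

```python
def check_cardinality(topics, topic_type, occurrence_type,
                      min_count=1, max_count=1):
    """Return the ids of topics of topic_type whose number of
    occurrences of occurrence_type falls outside
    [min_count, max_count]."""
    violations = []
    for topic in topics:
        if topic["type"] != topic_type:
            continue  # the constraint applies to one class only
        count = sum(
            1 for occ_type, _ in topic["occurrences"]
            if occ_type == occurrence_type
        )
        if not (min_count <= count <= max_count):
            violations.append(topic["id"])
    return violations
```

Because the constraint refers to classes and occurrence types rather than to the XML layout of a particular source, the same rule can be applied to the merged map no matter where the data came from.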

It will be very easy to create XML views based on the "virtual topic map" using TMQL and/or templates. The difference from a pure XML solution is that we can have semantically enriched XML. For each item we can keep a reference back to the topic map constructs. We know WHAT is represented in the XML.

Creating XML views still makes sense because it allows us to use standard, well-developed technologies (such as XSLT, and later maybe XForms?) at the end of the integration pipeline.

And ... of course, a Topic Maps-based integration hub is not a replacement for an XQuery hub. An XQuery hub nicely implements the concept of "virtual resources". A Topic Maps hub helps organize these and other resources in a semantically enriched "knowledge map".
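
A rough sketch of what "semantically enriched XML" could look like, again in Python rather than TMQL. The attribute names topic-ref and type-ref, the dictionary layout and the function topic_to_xml are hypothetical; the point is only that each generated element carries a pointer back to the topic and the class it was produced from.

```python
import xml.etree.ElementTree as ET


def topic_to_xml(topic):
    """Render one topic as an XML element that keeps references
    back to the topic map (its subject identifier and its type)."""
    elem = ET.Element("item", {
        "topic-ref": topic["subject_identifier"],  # which subject this is
        "type-ref": topic["type"],                 # which class it belongs to
    })
    # Sort names so the output is stable across runs.
    for name in sorted(topic["names"]):
        ET.SubElement(elem, "name").text = name
    return elem
```

A downstream stylesheet or form can then follow topic-ref to find out what the item represents, instead of guessing from element names.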

Posted: Wed - August 11, 2004 at 09:21 AM      

