XQuery-based data integration - one step forward, can we do two?
An XQuery engine can be used to query and combine results from multiple
data sources. This is a step forward in enterprise data integration. Can we
do more?
Of course, we can do better integration using Topic Maps.
XQuery-based integration tries to address a real business problem: how does
an organization get a consolidated view of its information?
An XQuery-based solution:
- get information from diverse sources in XML
- use XQuery adapters if XML is not supported natively by the data sources
- create aggregated views using XQuery transformations, joins and filtering
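
The steps above can be sketched roughly as follows. This is Python standing in for XQuery, only to show the shape of a join plus filter over two XML sources. The two documents, their element names and the `consolidated_view` function are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML exports from two data sources (structure is an assumption).
CRM_XML = """<customers>
  <customer id="c1"><name>Acme</name></customer>
  <customer id="c2"><name>Globex</name></customer>
</customers>"""

ORDERS_XML = """<orders>
  <order customer="c1" total="120.00"/>
  <order customer="c1" total="30.50"/>
  <order customer="c2" total="15.00"/>
</orders>"""


def consolidated_view(crm_xml, orders_xml, min_total=0.0):
    """Join customers with their orders and keep those above min_total."""
    customers = ET.fromstring(crm_xml)
    orders = ET.fromstring(orders_xml)

    # Aggregate order totals per customer id (the "join" key).
    totals = {}
    for order in orders.findall("order"):
        cid = order.get("customer")
        totals[cid] = totals.get(cid, 0.0) + float(order.get("total"))

    # Build the aggregated view, filtering on the summed total.
    view = ET.Element("customer-summary")
    for customer in customers.findall("customer"):
        total = totals.get(customer.get("id"), 0.0)
        if total >= min_total:
            row = ET.SubElement(view, "customer", id=customer.get("id"))
            row.set("name", customer.findtext("name"))
            row.set("total", f"{total:.2f}")
    return view


if __name__ == "__main__":
    result = consolidated_view(CRM_XML, ORDERS_XML, min_total=100.0)
    print(ET.tostring(result, encoding="unicode"))
```

Note that the join key and the merging logic live inside this one function. A second view over the same sources would have to restate them.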
Can we do better with Topic Maps?
XQuery handles information integration at the data level nicely. But it
cannot handle integration at the knowledge level.
The main problem with pure XML-based information integration is that
semantic information is hidden (lost?) in XML. The lack of explicit
semantics limits the ability to use general information merging rules. Each
XML-based query in effect implements its own information merging procedure.
XML-based integration also ignores two other important problems:
- providing a standard mechanism for identity mapping between different
information sources
- providing a standard mechanism for checking data integrity
Topic Maps technology helps to deal with the same problem, "How does an
organization get a consolidated view of its information?", at a different
level. Instead of an XQuery view, we introduce a Topic Maps view of the data
sources. Topic Maps technology promotes the use of an ontology as the basis
of information integration. A Topic Maps view of a specific data source
contains not only factual information but is also enriched with information
about classes and taxonomies (which is typically lost in relational
databases).
Topic Maps technology defines a standard mechanism for merging information
from different sources. It takes care of identity management and mapping,
and suggests best practices to minimize problems with different
identification schemes.
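
To make the merging idea concrete, here is a minimal sketch of one merging rule: topics that share a subject identifier are treated as the same subject and merged. The `Topic` class, the `merge_topics` function and the example identifiers are assumptions for illustration, not the API of any Topic Maps engine, and real engines apply further merging rules.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Simplified topic: identifiers plus names and occurrences (illustrative)."""
    identifiers: set
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)


def merge_topics(topics):
    """Merge topics that share at least one identifier.

    Invariant: topics already in `merged` have pairwise disjoint identifiers.
    """
    merged = []
    for topic in topics:
        current = Topic(set(topic.identifiers), set(topic.names),
                        set(topic.occurrences))
        remaining = []
        for other in merged:
            if current.identifiers & other.identifiers:
                # Same subject: absorb everything known about it.
                current.identifiers |= other.identifiers
                current.names |= other.names
                current.occurrences |= other.occurrences
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


# Hypothetical topics produced from two different data sources.
crm_topic = Topic({"http://example.org/id/acme", "crm:c1"}, {"Acme"})
erp_topic = Topic({"http://example.org/id/acme", "erp:1001"}, {"ACME Corp."})
other_topic = Topic({"http://example.org/id/globex"}, {"Globex"})

if __name__ == "__main__":
    for t in merge_topics([crm_topic, erp_topic, other_topic]):
        print(sorted(t.identifiers), sorted(t.names))
```

The point is that the rule is generic. It is written once and applies to any source that publishes identifiers, instead of being re-implemented in every query.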
Using TMCL, it will be possible to check the integrity of the resulting
"virtual topic map". TMCL will also make it possible to monitor
business-related constraints using a powerful rule-based language.
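
As an illustration of the kind of integrity check meant here, the sketch below tests a simple cardinality constraint over a merged topic map. It is plain Python, not TMCL syntax. The dictionary layout, the function name and the example data are all assumptions.

```python
def check_occurrence_cardinality(topics, topic_type, occurrence_type,
                                 min_count=1, max_count=1):
    """Return (topic id, actual count) for each topic violating the constraint.

    Constraint: every topic of `topic_type` must have between `min_count`
    and `max_count` occurrences of `occurrence_type`.
    """
    violations = []
    for topic in topics:
        if topic["type"] != topic_type:
            continue
        count = sum(1 for occ_type, _value in topic["occurrences"]
                    if occ_type == occurrence_type)
        if not (min_count <= count <= max_count):
            violations.append((topic["id"], count))
    return violations


# Hypothetical "virtual topic map" after merging two sources.
TOPICS = [
    {"id": "acme", "type": "customer",
     "occurrences": [("tax-number", "123"), ("tax-number", "124")]},
    {"id": "globex", "type": "customer",
     "occurrences": [("tax-number", "555")]},
    {"id": "initech", "type": "customer", "occurrences": []},
    {"id": "widget", "type": "product", "occurrences": []},
]

if __name__ == "__main__":
    print(check_occurrence_cardinality(TOPICS, "customer", "tax-number"))
```

A conflict such as two tax numbers for one customer only becomes visible after merging, which is why the check runs against the merged map and not against each source separately.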
It will be very easy to create XML views based on the "virtual topic map"
using TMQL and/or templates. The difference from a pure XML solution is that
we can have semantically enriched XML. For each item we can keep a reference
back to the topic map constructs. We know WHAT is represented in the XML.
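
A small sketch of what "keeping a reference back" could look like: each generated element carries attributes that point at the topic and topic type it was produced from. The attribute names `topic-ref` and `type-ref`, the dictionary layout and the identifiers are invented for this example. A real implementation would more likely use TMQL or templates, as noted above.

```python
import xml.etree.ElementTree as ET


def topic_to_xml(topic):
    """Render a topic as an XML element that points back to the topic map."""
    elem = ET.Element("customer", {
        # Hypothetical attributes carrying the back-references.
        "topic-ref": topic["subject_identifier"],
        "type-ref": topic["type"],
    })
    # Sort names so the output is stable regardless of set ordering.
    for name in sorted(topic["names"]):
        ET.SubElement(elem, "name").text = name
    return elem


# Hypothetical topic taken from the "virtual topic map".
ACME = {
    "subject_identifier": "http://example.org/id/acme",
    "type": "http://example.org/ontology/customer",
    "names": {"Acme", "ACME Corp."},
}

if __name__ == "__main__":
    print(ET.tostring(topic_to_xml(ACME), encoding="unicode"))
```

Downstream tools such as XSLT can ignore these attributes, while anything that wants more context can follow them back into the topic map.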
Creating XML views still makes sense because it lets us use standard,
well-developed technologies (such as XSLT, and later maybe XForms?) at the
end of the integration pipeline.
And ... of course, a Topic Maps-based integration hub is not a replacement
for an XQuery hub. An XQuery hub nicely implements the concept of "virtual
resources". A Topic Maps hub helps organize these and other resources in a
semantically enriched "knowledge map".
Posted: Wed - August 11, 2004 at 09:21 AM