Topic Maps-based information integration
I was recently involved in the implementation of an
intranet portal. Using a Topic Maps engine was out of the question from the
beginning...
The project team had to do a lot of traditional
portal-based information integration. The portal combines information about
topics of several classes, with cross-references between topics. The team
implemented several portlets, each of which presents a particular piece of
information. These portlets are combined into templates (one per class). The
portlets extract information from several databases, a simple document
management system, and several collaborative applications. Documents are tagged
with references to portal topics. An integrated full-text/metadata search engine
makes it possible to find all resources relevant to a specific topic. A typical
portal project.
This project helped me a lot
to understand how Topic Maps-based information integration differs from
traditional portal integration.
I would like to describe an "ideal knowledge integration scenario" that can be
partially implemented with existing Topic Maps software. Some of its features
are in fact on my "wish list" for next-generation Topic Maps engines and
development infrastructure.
Hmmm...
I would like to start a new portal project by designing and implementing an
ontology. I would like to describe the classes of objects the portal deals with.
In the next phase I would like to define the mapping between the existing data
sources and the ontology. I would like to specify where I can get the different
pieces of information and what kind of transformation is required. I should be
able to see "mapping gaps" at any time.
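
As a rough illustration of what "seeing mapping gaps" could mean, here is a minimal Python sketch. It is not based on any existing Topic Maps engine: the ontology, the source paths (`hr_db...`, `pm_tool...`) and the `Mapping` structure are all hypothetical and exist only to show the idea of comparing declared mappings against the ontology.

```python
from dataclasses import dataclass

# Hypothetical ontology: class name -> properties the portal wants to show.
ONTOLOGY = {
    "Person": {"name", "email", "department"},
    "Project": {"title", "owner", "status"},
}


@dataclass(frozen=True)
class Mapping:
    """Declares where one ontology property comes from (illustrative only)."""
    topic_class: str
    prop: str
    source: str      # made-up location in a data source
    transform: str   # name of the transformation to apply


MAPPINGS = [
    Mapping("Person", "name", "hr_db.employees.full_name", "identity"),
    Mapping("Person", "email", "hr_db.employees.mail", "lowercase"),
    Mapping("Project", "title", "pm_tool.projects.name", "identity"),
]


def mapping_gaps(ontology, mappings):
    """Return {class: sorted properties that have no declared mapping}."""
    covered = {(m.topic_class, m.prop) for m in mappings}
    gaps = {}
    for topic_class, props in ontology.items():
        missing = sorted(p for p in props if (topic_class, p) not in covered)
        if missing:
            gaps[topic_class] = missing
    return gaps
```

Calling `mapping_gaps(ONTOLOGY, MAPPINGS)` on this toy data reports that `Person.department`, `Project.owner` and `Project.status` still have no source.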
I prefer a "conceptual language" as the target of these transformations. As a
portal developer I do not think in terms of relational tables, document forms,
or even XML trees. I think in terms of objects, classes, properties, and
relationships. Mapping to a conceptual language automatically adds metadata to
the factual information from the data sources.
I should be able to define
caching/update strategies for data transformations. In some cases I would like
to do "just in time" transformation and integration. In other cases I can use
some kind of caching. I would like to have a smart agent that manages
updates from different data sources based on declared policies. This agent
preferably uses asynchronous communication with the data sources for improved
performance. The agent creates a virtual, ontology-enriched knowledge base.
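
To make the idea of declared caching policies more concrete, here is a small sketch using Python's standard `asyncio`. The `Policy` and `KnowledgeAgent` classes are invented for this illustration; a real agent would also need invalidation, error handling and change notifications from the sources.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class Policy:
    # max_age_seconds == 0 means "just in time": always ask the source.
    max_age_seconds: float


class KnowledgeAgent:
    """Hypothetical agent that refreshes sources according to their policy."""

    def __init__(self):
        self._sources = {}  # name -> (async fetch function, policy)
        self._cache = {}    # name -> (timestamp, value)

    def register(self, name, fetch, policy):
        self._sources[name] = (fetch, policy)

    async def get(self, name):
        fetch, policy = self._sources[name]
        cached = self._cache.get(name)
        now = time.monotonic()
        if cached is not None and now - cached[0] < policy.max_age_seconds:
            return cached[1]          # still fresh according to the policy
        value = await fetch()         # stale or never fetched: go to the source
        self._cache[name] = (time.monotonic(), value)
        return value

    async def get_all(self):
        # Query all sources concurrently instead of one after another.
        names = list(self._sources)
        values = await asyncio.gather(*(self.get(n) for n in names))
        return dict(zip(names, values))
```

A source registered with `Policy(max_age_seconds=0)` is fetched on every request, while one with a large `max_age_seconds` is served from the cache until it expires.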
One of the key issues in knowledge
integration is identity management. Quite often, different data sources use
different identification schemes for the same subjects. An "ideal integration
tool" should help to define, implement, and monitor identity
mappings.
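
A very simple form of identity mapping can be sketched as a registry that ties each source-specific identifier to one canonical subject identifier. The class and the identifiers below are hypothetical; this is not the merging mechanism of any particular Topic Maps engine.

```python
class IdentityRegistry:
    """Maps (source, local_id) pairs to one canonical subject identifier."""

    def __init__(self):
        self._canonical = {}  # (source, local_id) -> canonical id

    def declare_same(self, canonical_id, *local_keys):
        """Declare that all given (source, local_id) keys denote one subject."""
        for key in local_keys:
            existing = self._canonical.get(key)
            if existing is not None and existing != canonical_id:
                # The same local id cannot denote two different subjects.
                raise ValueError(f"{key} is already mapped to {existing}")
            self._canonical[key] = canonical_id

    def resolve(self, source, local_id):
        """Return the canonical id, or None if no mapping was declared."""
        return self._canonical.get((source, local_id))

    def unmapped(self, keys):
        """Monitoring helper: which identifiers have no mapping yet?"""
        return [k for k in keys if k not in self._canonical]


registry = IdentityRegistry()
registry.declare_same("person:ann", ("hr_db", "E-1042"), ("wiki", "AnnSmith"))
```

With this in place, `registry.resolve("wiki", "AnnSmith")` and `registry.resolve("hr_db", "E-1042")` both return `"person:ann"`, and `unmapped` gives a simple way to monitor identifiers that nobody has mapped yet.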
The knowledge base agent also uses the ontology and
rule-based constraints to identify knowledge conflicts. I would like to be
able to check for conflicts at any time. I would also like to have a
conflict notification mechanism and a conflict resolution workflow. Some
conflicts can be resolved automatically based on defined policies. Knowledge
conflicts are a natural feature of any open information system. We are not
"afraid" of conflicts; we should have infrastructure that helps us deal with
them. Conflict identification and resolution mechanisms should improve knowledge
base consistency.
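
One possible shape for constraint-based conflict detection is sketched below. It assumes a single kind of rule (a property may have only one value per subject) and a single kind of resolution policy (prefer the more trusted source); the property names, source names and priorities are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    prop: str
    value: str
    source: str


# Hypothetical constraint: these properties may have only one value per subject.
SINGLE_VALUED = {"email", "manager"}
# Hypothetical policy: a higher number means a more trusted source.
SOURCE_PRIORITY = {"hr_db": 2, "wiki": 1}


def find_conflicts(facts, single_valued=SINGLE_VALUED):
    """Group facts that give different values for a single-valued property."""
    grouped = defaultdict(list)
    for fact in facts:
        if fact.prop in single_valued:
            grouped[(fact.subject, fact.prop)].append(fact)
    return {
        key: group
        for key, group in grouped.items()
        if len({f.value for f in group}) > 1
    }


def resolve(conflicting, priority=SOURCE_PRIORITY):
    """Pick the fact from the most trusted source, or None if undecidable."""
    top = max(priority.get(f.source, 0) for f in conflicting)
    winners = [f for f in conflicting if priority.get(f.source, 0) == top]
    if len({f.value for f in winners}) == 1:
        return winners[0]
    return None  # equally trusted sources disagree: hand over to the workflow
```

Conflicts for which `resolve` returns `None` are the ones that would go to the notification mechanism and the manual resolution workflow.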
It is important to
note that the knowledge base includes not only facts about business objects but
also facts about different resources (documents, reports, diagrams, etc.).
Unstructured and semi-structured information is integrated with the other data
sources.
The knowledge base should support
a query language that makes it possible to query the virtual knowledge base at
the "conceptual level".
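
To show what querying at the conceptual level might feel like, here is a toy sketch in which the "query language" is just a Python function over topics. The `Topic` class and the `query` function are assumptions made for illustration; a real engine would offer a proper query language rather than this.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    id: str
    topic_class: str
    props: dict = field(default_factory=dict)


def query(topics, topic_class, **conditions):
    """Return topics of one class whose properties match all conditions."""
    return [
        t for t in topics
        if t.topic_class == topic_class
        and all(t.props.get(k) == v for k, v in conditions.items())
    ]


# Business objects and resources live in the same knowledge base.
KB = [
    Topic("person:ann", "Person", {"department": "R&D"}),
    Topic("person:bob", "Person", {"department": "Sales"}),
    Topic("doc:spec", "Document", {"about": "person:ann", "format": "pdf"}),
]
```

Here `query(KB, "Person", department="R&D")` finds Ann, and `query(KB, "Document", about="person:ann")` finds the documents tagged with her topic, without any reference to tables or file formats.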
At this stage we in fact already have an
"information portal". It just does not have a visual
representation.
It is much easier to
define and implement the visual part of a portal in this scenario. All knowledge
integration is already done.
What we
need is to define some views and templates. When we define views, we rely on
the knowledge base query language.
View design
can be guided by the portal ontology.
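
A view in this scenario could be little more than a query plus a template. The sketch below uses `string.Template` from the Python standard library; the `View` class, the row format and the sample data are hypothetical.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable


@dataclass
class View:
    """A view is a query over the knowledge base plus a display template."""
    topic_class: str
    query: Callable[[list], list]  # selects rows (dicts) from the knowledge base
    template: Template

    def render(self, knowledge_base):
        rows = self.query(knowledge_base)
        return "\n".join(self.template.substitute(row) for row in rows)


# Toy knowledge base: one dict per topic.
KNOWLEDGE_BASE = [
    {"class": "Person", "name": "Ann", "department": "R&D"},
    {"class": "Person", "name": "Bob", "department": "Sales"},
    {"class": "Project", "name": "Portal", "department": "R&D"},
]

person_view = View(
    topic_class="Person",
    query=lambda kb: [row for row in kb if row["class"] == "Person"],
    template=Template("$name ($department)"),
)
```

Since the placeholders in the template are property names, an ontology that lists the properties of each class could in principle be used to check or even generate such templates.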
Posted: Sat - July 10, 2004 at 02:21 PM