Sun - December 11, 2005
Scalability issues and ... MyTopics
It becomes challenging to work with a topic map
when the number of topics grows.
I am going to experiment with the following approach:
- allow a user to maintain a list of personal roles
- allow a user to mark any topic as being in "MyTopics" for a specific role
- allow switching between roles and navigating to a role-specific list of "MyTopics"
- make "MyTopics" available through RSS
- do data mining on MyTopics between different users
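To make the idea concrete, here is a minimal sketch of the role-based "MyTopics" bookkeeping. It is only an illustration under assumptions: the class `MyTopics`, its methods and the helper `shared_topics` are hypothetical names, topics are plain string identifiers, and the RSS part is left out.

```python
class MyTopics:
    """Per-user topic bookmarks grouped by personal role (hypothetical model)."""

    def __init__(self, user):
        self.user = user
        self.current_role = None
        self._topics_by_role = {}

    def add_role(self, role):
        # Register the role with an empty topic set; the first role becomes current.
        self._topics_by_role.setdefault(role, set())
        if self.current_role is None:
            self.current_role = role

    def mark(self, topic_id, role):
        # Put a topic into "MyTopics" for one specific role.
        if role not in self._topics_by_role:
            raise KeyError(f"unknown role: {role}")
        self._topics_by_role[role].add(topic_id)

    def switch_role(self, role):
        if role not in self._topics_by_role:
            raise KeyError(f"unknown role: {role}")
        self.current_role = role

    def my_topics(self):
        # Role-specific list of "MyTopics", sorted for stable display.
        return sorted(self._topics_by_role.get(self.current_role, ()))

    def all_topics(self):
        return set().union(*self._topics_by_role.values())


def shared_topics(a, b):
    """Topics both users marked under any role: a first step toward mining."""
    return a.all_topics() & b.all_topics()


alice = MyTopics("alice")
alice.add_role("architect")
alice.add_role("researcher")
alice.mark("tmql", "architect")
alice.mark("xtm", "architect")
alice.mark("rdf", "researcher")
print(alice.my_topics())       # topics for the current role ("architect")
alice.switch_role("researcher")
print(alice.my_topics())       # topics for the "researcher" role
```

Keeping the role as the grouping key means switching roles is just a change of the current key; the topic map itself is not touched.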
Posted at 04:21 PM
Sat - October 1, 2005
Agile knowledge management or why I still choose Topic Maps
I was looking around and investigating different
technologies/products which can be used for knowledge management
recently...
Oracle just announced support of RDF in 10g Release
2. I really like the way RDF is implemented in Oracle database: clean and
elegant design, beautiful integration with traditional relational and XML
data.
This announcement made me think
again about RDF and Topic Maps. I was playing with the idea of using RDF
for projects I am involved in...
The first
interesting observation is about modeling dynamic worlds. In my domains objects
change names and properties, they move around, they create and delete
relationships. With Topic Maps I can create a type "TimeInterval" with
occurrences "DateStart" and "DateEnd" and use instances of this type in the scope of
occurrences and associations. Using TMQL I can easily create a projection of a
topic map for a specific moment in time. With RDF... replace simple values with
objects which have DateStart, DateEnd properties?... hmmmm... use reification
for each time-sensitive assertion?...
brrrr....
Scopes in Topic Maps allow us to
represent context-sensitive knowledge very nicely. And contexts are not limited
to time, of course. We can define and use any dimensions which are useful for
modeling.
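The time-scoping idea can be sketched in a few lines. This is not TMQL and not a Topic Maps engine: `TimeInterval`, `Occurrence` and `project` are hypothetical names, and the scope is reduced to a single optional time interval, just to show how a projection for one moment could work.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimeInterval:
    date_start: date
    date_end: Optional[date] = None  # None means "still valid"

    def contains(self, moment: date) -> bool:
        if moment < self.date_start:
            return False
        return self.date_end is None or moment <= self.date_end


@dataclass(frozen=True)
class Occurrence:
    topic: str
    kind: str
    value: str
    scope: Optional[TimeInterval] = None  # unscoped statements always hold


def project(occurrences, moment):
    """Keep only the statements that are valid at the given moment."""
    return [o for o in occurrences
            if o.scope is None or o.scope.contains(moment)]


facts = [
    Occurrence("company-1", "name", "OldCo",
               TimeInterval(date(2000, 1, 1), date(2003, 12, 31))),
    Occurrence("company-1", "name", "NewCo",
               TimeInterval(date(2004, 1, 1))),
    Occurrence("company-1", "country", "US"),
]
for fact in project(facts, date(2002, 6, 1)):
    print(fact.kind, fact.value)
```

The facts themselves are never rewritten when a name changes; a new scoped statement is added and the old one gets an end date.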
The second observation is about
agile ontology development. I found myself refactoring ontologies for topic
maps in "production" again and again. Why Topic Maps in this case? Topic Maps
can work without schemas or additional ontology definitions such as "inverse"
or "symmetric" properties. The basic semantic model of Topic Maps is rich enough to
represent useful information. We can easily modify type hierarchies and add/delete
constraints without changing factual information. The basic semantic
model combined with TMCL, the pattern-based Topic Maps Constraint Language,
supports this agile ontology development style better than heavy OWL, from my
perspective.
Posted at 01:18 PM
Thu - December 2, 2004
Instant messaging, subject centric group chats, topic maps and... goodbye
email (... almost)
I recently saw a presentation of Parlano MindAlign
for the Microsoft Live Communications Server instant messaging
platform.
I have enjoyed using IM, IRC and enterprise group chats for
many years. I use them for person-to-person communication, for getting and
providing quick answers from/to peers, and for notifications about important events.
MindAlign introduces a new trend, I think: real-time subject centric
communication. The MindAlign smart client allows users to effectively manage and
participate in hundreds of subject centric channels at the same time. It also
allows them to see the history of all conversations and search message archives. None of
these features is new. But effective support of hundreds of channels on
the client side changes the rules of the game and moves group chats to a new level.
There is something extremely powerful
in the combination of real-time subject centric communication, the ability to access
message history, and search. I think that this kind of system can replace about
80% of emails in the future.
What is the next step? We can connect channel topics with a topic map and allow users to
reference subtopics in real-time conversation using an analog of WikiWords. In this
case we have a topic map which is modified by users in real time. This topic map
has references to group chat messages. But like any other topic map it can also
have information about associations between topics and references to other
resources.
We can add the ability for users
to provide Wiki-like occurrences in this topic map and the ability to add links to
resources (an analog of the social bookmark manager del.icio.us).
The result is a live topic map which
integrates summary information about subjects, associations between subjects,
real-time messages and links to resources connected with subjects.
Posted at 05:24 PM
Tue - October 12, 2004
Jabber, publish & subscribe architecture and ... Topic Maps
One of the main goals of the Topic Maps is to
facilitate development of a distributed network of providers and consumers of
"subject centric information maps". Existing Topic Maps standards provide basic
support for building this kind of environment: XML-based interchange syntax
(XTM), scalable identification schema based on a concept of Public Subject
Identifiers (PSIs), concept of topic maps merging and leveraging of HTTP as a
standard access protocol.
In a "simplified world" we can create a topic
map, publish it to a public web server, and everyone can view it using
a topic map viewer (such as Omnigator) or reference (and reuse) it in other
topic maps. The problem with this approach is
that it is static. If we later change our topic map, nobody will know about
it. Only when topic map consumers refresh their topic maps will they
reload and remerge our topic map with their own
information.
Building real-life Topic
Maps-based applications, however, often requires more sophisticated protocols
and architectures. I would first like to reference several interesting ideas
and approaches:
- TMShare
- TMRAP and here
- Virtual and Federated Topic Maps
We are now ready to jump into a
discussion of possible uses of the Jabber messaging architecture
for building distributed networks of topic
maps...
What we really need is a near
real-time messaging / notification mechanism which distributes updates to all
interested parties and/or allows running asynchronous queries against multiple
information providers.
Jabber, from my
perspective, is a very good candidate for this kind of infrastructure.
Why?
- Jabber is an XML-based messaging infrastructure with an extensible architecture.
- It allows us to implement query-response scenarios.
- It also supports notification scenarios.
- It supports a store-and-forward mechanism (important for occasionally connected clients).
- It is relatively firewall friendly (it uses outgoing client-to-server connections on a predefined port).
- It is platform/language neutral.
- Jabber's protocol, XMPP, was recently published by the IETF as RFCs.
- It is a proven (and already running) messaging infrastructure for communication between humans and applications in all combinations.
The most interesting part
is probably JEP-0060:
Publish-Subscribe. This specification defines a generic
publish/subscribe framework for use by Jabber entities. I think it can be reused
for building a publish/subscribe infrastructure for topic
maps.
With this approach topic map
providers can announce which topics are available for subscription, and
consumers can subscribe to specific topics. As soon as new information about a
topic is available, it can be distributed to all subscribers. Inside the
messages we can use topic map fragments as defined by TMRAP, for
example.
An interesting example of JEP-0060
usage can be found on PubSub.com
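The subscription flow described above can be sketched without any Jabber library at all. The code below is only an in-memory stand-in: `TopicPubSub` is a hypothetical class, it does not speak XMPP or implement JEP-0060, and the "fragment" is an ordinary string where a real system might carry a topic map fragment.

```python
from collections import defaultdict


class TopicPubSub:
    """In-memory stand-in for a publish/subscribe service keyed by topic PSI."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # PSI -> list of callbacks

    def subscribe(self, psi, callback):
        # A consumer registers interest in one specific topic.
        self._subscribers[psi].append(callback)

    def publish(self, psi, fragment):
        """Deliver a topic map fragment to every subscriber of this topic."""
        callbacks = self._subscribers.get(psi, [])
        for callback in callbacks:
            callback(psi, fragment)
        return len(callbacks)


received = []
hub = TopicPubSub()
hub.subscribe("http://psi.example.org/jabber",
              lambda psi, fragment: received.append((psi, fragment)))

# The provider announces new information about a topic.
hub.publish("http://psi.example.org/jabber", "<topic>...</topic>")
# Nobody subscribed to this one, so nothing is delivered.
hub.publish("http://psi.example.org/other", "<topic>...</topic>")
print(received)
```

Using the PSI as the subscription key is the design choice worth noting: provider and consumer only need to agree on subject identifiers, not on each other's internal topic ids.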
Posted at 12:26 PM
Sat - September 25, 2004
Subject oriented computing, new approaches to user interface and... Topic
Maps
During the last few years we have seen several interesting
attempts to implement a new user interface. The trick here is that just moving
the existing application-centric computing model to a 3D world will not do
it...
It is nice to have shrinking/extending/rotating
application windows or a "3D room paradigm", but the true paradigm shift is a shift
to subject-based computing. I am
thinking about several technologies/ideas which can help in building a new user
interface for subject oriented
computing.
1. Marvin Minsky's concept of K-Lines ("Society Of Mind"). This theory of memory
tries to explain how people can remember and use memories in solving new
problems and addressing new situations.
"...Whenever you "get a good idea",
solve a problem, or have a memorable experience, you activate K-line to
"represent" it. A K-line is a wirelike structure that attaches itself to
whichever mental agents are active when you solve a problem or have a good
idea. When you activate that K-line
later, the agents attached to it are aroused, putting you into a "mental state"
much like the one you were in when you solved that problem or got that idea...."
This theory can provide some insight
into the dynamic nature of subject proxy maps. According to this theory, activation
of one subject proxy leads to activation of other subject proxies based on
connections between proxies. At any moment different proxies can have
different activation levels. When we change our focus from one subject to another,
the activation levels also change.
2. Treemaps
"... Treemap is a
space-constrained visualization of hierarchical structures. It is very effective
in showing attributes of a leaf nodes using size and color coding..." If we
combine K-Lines with treemaps we can get a very interesting result, I think. We
can connect the size of treemap items with the subject proxy activation level and use
color to represent changes in activation level. Treemaps can also be very good
for representing a hierarchy of subject
proxies.
3. Jef Raskin's concept
of an "applicationless" user interface. "...The idea of an
application is an artificial one, convenient to the programmer but not to the
user. From a user's point of view there is a content (a set of objects created
or obtained by the user) and there are commands that can operate on
objects....". The concept of "Zooming"
is also extremely interesting from the perspective of subject oriented
computing.
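The combination of K-Lines and treemaps from points 1 and 2 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the functions `spread_activation` and `treemap_shares`, the decay factor, and the idea of halving activation per hop are not taken from Minsky or from any treemap library.

```python
def spread_activation(links, focus, decay=0.5, depth=2):
    """Return activation levels after focusing on one subject proxy.

    links maps a proxy to the proxies it is connected to. The focused proxy
    gets activation 1.0; each hop away multiplies the level by `decay`.
    """
    activation = {focus: 1.0}
    frontier = [focus]
    for _ in range(depth):
        next_frontier = []
        for proxy in frontier:
            level = activation[proxy] * decay
            for neighbour in links.get(proxy, ()):
                # Keep the strongest activation seen for each proxy.
                if level > activation.get(neighbour, 0.0):
                    activation[neighbour] = level
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return activation


def treemap_shares(activation):
    """Turn activation levels into relative treemap areas that sum to 1."""
    total = sum(activation.values())
    return {proxy: level / total for proxy, level in activation.items()}


links = {
    "Project X": ["Topic Maps", "Alice"],
    "Topic Maps": ["TMQL"],
}
levels = spread_activation(links, "Project X")
for proxy, share in treemap_shares(levels).items():
    print(f"{proxy}: {share:.2f} of the treemap area")
```

Changing focus means calling `spread_activation` again with another proxy; the difference between the old and new levels is what the color coding could represent.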
Posted at 02:02 PM
Wed - September 8, 2004
Managing subject proxies and Topic Maps: Enterprise perspective
If we would like to extend subject centric computing
environment to Enterprise level we need to support multiple levels of relevancy
and security for subject proxies and information resources.
In Enterprise case we deal not only with "personal"
subjects, but with Enterprise, department, team relevant subjects. Some subjects
can be available only for specific groups of users based on security policies.
The same, of course, is true for information
resources.
Applications often have their own
security models. If we want to export and merge subject proxies from different
applications we need a "generalized" security model for the subject proxy
map.
In Enterprise environment
information workers consume and produce information resources. They can wear
different "hats" during a day. It is important to implement concept of subject
relevancy based on context (role,
workflow-based).
I see subject proxy
map as a very active substance. At any time I can work with subject proxies
which I have access to. But my "subject views" are optimized based on current
context. I can manually specify context, for example: "Researching technology",
"Testing application". If I start some application, application can introduce
additional dynamic context. If I open resource (document, report, web page
etc.), resource also creates a new context and activates subject proxies which
are relevant to this resource.
One of
the challenges is that most existing Enterprise applications are not
transparent in terms of subject proxies. Try asking what different
kinds of objects an application deals with. Typically the "main" objects can be found in
application forms and reports. Some secondary objects often do not have explicit
representations. Application design documents (UML diagrams ?) can help at this
step.
Posted at 12:10 AM
Mon - September 6, 2004
Subject Oriented Computing - Topic Maps and management of subject
proxies
It is surprising that modern desktop operating
systems continue to ignore a fundamental aspect of information processing: its
subject orientation.
Computers become better and better in helping people
to create/edit/transmit information resources. Today we can easily manipulate
resources of different types including pictures and music. But we still have
minimum support for managing subjects of our interests. The main computing
paradigm continues to be resource and application-centric.
Let's say that I participate in
projects and I need to keep track of different information resources related to
these projects. How can I do it now? I will probably create a spreadsheet and
list project names with some summary information (start, finish dates, project
manager, team members). I also can create subfolders on my hard drive for each
project and try to keep documents related to each project in corresponding
subfolder. But what if a document is related to several projects? What if I also
would like to keep track of technologies used in each project? And I am also
interested in managing information resources about different technologies (news
items, industry reports, reviews, predictions, rumors etc.). Should I create a
new spreadsheet with a list of technologies and subfolders?
Hmmm....
In the world of resource and
application-centric computer environments it is "expensive" to manage subjects
of our interests. We are forced to use tools which were not designed for this
task.
Now let's switch to subject
oriented computer environment. In this environment subjects (more precisely
subject proxies) are basis of user experience. It is easy to create proxies for
subjects which we are interested in. It is easy to describe relationships
between subjects. It is easy to connect resources and subject
proxies.
For example, in case of
projects, I can type "projects" in a search box and I will get list of projects
which I am interested in. I can also click on a "new" button to create a subject
proxy for a project which is new or I just started monitoring. If I click on a
project name I can get representation of a subject proxy which can include
summary information and relationships with other subject proxies. I can see, for
example, who is a project manager and project members, what technologies are
used, tasks involved etc. I also can see all resources on my hard drive which
are related to this project: documents, plans, emails, schedules, links. I
easily can navigate to other subject proxies or resources.
The most important thing is that when
I create a subject proxy for a project it becomes available for all applications
on my computer. I can connect any resource managed by any application with this
subject proxy. When I am working with resource my computer helps me to identify
relevant subject proxies. I also can manually connect resource and subject proxy
(for example, by dragging and dropping a subject proxy onto a resource or vice versa).
At any time I can jump between resources and subject proxies, create new and
modify existing
connections.
Applications can register
"actions" available for subjects of specific classes. So when I am looking at
project proxy I can easily jump to specific actions which are related to this
project: create a new task, schedule a meeting, prepare a status report etc.
Applications typically do not "own
subjects". They own and manage some partial information about subjects. But all
these different pieces of information are combined together at "desktop
level".
Topic Maps technology provides
basis for building subject oriented environments. Some of the ideas described
above are implementable right now with Topic Maps. Other ideas require deeper
integration with desktop operating systems.
Posted at 05:39 AM
Sun - August 15, 2004
"SOA Challenges: Entity Aggregation" from .Net Architecture Center and ...
Topic Maps
In my journey to understand Topic Maps strength for
enterprise knowledge integration I came across this article at Microsoft .Net
Architecture Center: "SOA
Challenges: Entity Aggregation" .
I love this article! It explains well challenge of
information integration about the same subject (it is called an "entity
aggregation" in the article). It also introduces concept of "Entity Aggregation
Service" which is responsible for presenting unified view on entities across
multiple enterprise applications.
It
also demonstrates that SOA in fact does not provide a solution for entity
aggregation by itself. (That's already my interpretation :-) This is exactly
the place where database replication, queries across SQL servers, XQuery-based
integration, XML schemas etc. should be lifted to the "knowledge level" using
the concept of ontology-based knowledge integration. And ... Topic Maps
is a technology which can help enterprise system/software architects to
implement this concept of "entity aggregation".
Posted at 08:03 PM
Thu - August 12, 2004
How Topic Maps view enriches relational data sources
Relational databases represent important data
sources in enterprise knowledge integration pipeline. How can we improve
knowledge integration by providing Topic Maps view on relational
databases?
Let's take a quick look at a development cycle of a
traditional enterprise
application.
During design phase
development team creates a conceptual model of a future application. UML is used
for this quite often. What is important that model explicitly represents
semantics of the application domain. Typical model includes descriptions of
domain classes, relationships, constraints.
Later this conceptual model is mapped
to relational database. During this transformation process a lot of domain
semantic information is lost or presented in compressed form. On a good side,
this compression helps to build efficient application. On a bad side, it limits
information integration from different applications (the same conceptual
information can be mapped to different relational structures in various
applications). Just looking at tables from two different applications it is
difficult to find if they reference the same subjects and if they contain the
same kind of information.
Service
Oriented Architecture (SOA) promotes concept of information providers and
information consumers. Typical SOA enterprise application is a provider of some
well defined information and can consume information from other data sources.
In SOA world there is a shift from application centric view to service-based
view. Each service is responsible for management of own set of assertions about
subjects. Two service providers should not have the same assertions about the
same subjects except assertions responsible for identifying subjects. In SOA
world we should know service-provider for any assertion. Shift to service-based
architecture helps to minimize data overlapping between different
applications.
How can we create a topic
map view on a relational database (or data service)? First we identify the primary
information of this database. In the SOA world this is already done by definition.
Then we retrieve our UML diagrams (or create new ones ... :) and explicitly
represent the taxonomy of classes and relationships using an "ontology" topic map.
After that we define an export procedure which produces a "factual" topic map based
on assertions from the database/data service. The "factual" topic map is merged with
the "ontology" topic map. There is an extremely important issue regarding subject
identification. We should clarify the subject identification schema used by the
application/service for each basic class and define a procedure for generating
Public Subject Identifiers (PSIs). When we generate the "factual" topic map we use
these PSIs to reference subjects.
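The export step with PSI generation can be sketched as follows. This is only an illustration: the base URL `http://psi.example.org/`, the function names `make_psi` and `export_factual_topics`, and the row layout are all hypothetical, and a real export would also produce associations, not just topics.

```python
from urllib.parse import quote

PSI_BASE = "http://psi.example.org/"  # hypothetical enterprise PSI namespace


def make_psi(class_name, key):
    """Build a PSI from a basic class name and the application's own key."""
    return f"{PSI_BASE}{quote(class_name, safe='')}/{quote(str(key), safe='')}"


def export_factual_topics(class_name, rows, key_column, name_column):
    """Turn database rows into simple "factual" topics that reference PSIs."""
    topics = []
    for row in rows:
        topics.append({
            "psi": make_psi(class_name, row[key_column]),
            # The class itself is identified in the "ontology" topic map.
            "type": make_psi("class", class_name),
            "name": row[name_column],
        })
    return topics


rows = [
    {"emp_id": 42, "full_name": "Alice Smith"},
    {"emp_id": 43, "full_name": "Bob Jones"},
]
for topic in export_factual_topics("employee", rows, "emp_id", "full_name"):
    print(topic["psi"], topic["name"])
```

Because the PSI is derived only from the class and the key, two exports of the same row always produce the same identifier, which is what makes later merging possible.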
In
ideal situation we should have only one identification schema for subjects of
basic classes in enterprise applications. But it does not happen often these
days so we have to build mapping topic maps which define mapping between
different identification
schemas.
Another interesting question
is about enterprise ontology. Quite often UML diagrams are created specifically
for each application. What happens in this case is that application designers
"reinvent" again and again parts of enterprise ontology. With Topic Maps we try
to explicitly define and use enterprise ontology. We also try to reuse existing
standard ontologies.
When we try to
verbalize ontology for application we should attempt to reuse existing parts of
enterprise ontology and extend/refine it as needed. "Enterprise ontology" can
sound scary. But even simple taxonomies help a lot in knowledge integration.
We do not need to export all
information from databases to topic maps. A lot of factual information can be
delivered to users in the form of "reports" or "dynamic resources". Topic Maps are
ideal for representing relationships between subjects, summary facts about
subjects and references to static and dynamic resources. With this approach
users will be able to navigate efficiently between different subjects, and when they
need details they can "jump" to dynamic or static
resources.
As a result we have
"virtual" enterprise-wide topic map with shared ontology and shared (or mapped)
identification schemas. This topic map effectively represents classes of
subjects and resources important for enterprise business processes. It has
summary information about all important subjects, relationships between subjects
and cross-references between subjects and
resources.
Posted at 11:37 PM
Wed - August 11, 2004
XQuery-based data integration - one step forward, can we do two?
XQuery engine can be used to query and combine
results from multiple data sources. This is a step forward in enterprise data
integration. Can we do more?
Of course we can do better integration using Topic
Maps.
XQuery-based integration tries to
address a real business problem: How does an organization get a consolidated
view of its information?
XQuery-based solution:
- get information from diverse sources in XML
- use XQuery adapters if XML is not supported natively by data sources
- create aggregated views using XQuery transformations, joins and filtering
Can we do better with Topic
Maps?
XQuery does information
integration at data level nicely. But it cannot handle integration at knowledge
level.
Main problem with pure XML-based
information integration is that semantic information is hidden (lost?) in XML.
Lack of explicit semantics limits ability to use general information merging
rules. Each XML-based query in fact implements own information merging
procedure.
XML-based integration also ignores two other
important problems:
- providing a standard mechanism for identity mapping between different information sources
- providing a standard mechanism for checking data integrity
Topic Maps
technology helps to deal with the same problem "How does an organization get a
consolidated view of its information?" at a different level. Instead of XQuery
view we introduce Topic Maps view of data sources. Topic Maps technology
promotes usage of ontology as a basis of information integration. Topic Maps
view of specific data source has not only factual information but also is
enriched by information about classes and taxonomies (which is typically lost in
relational databases).
Topic Maps
technology defines standard mechanism for merging information from different
sources. This technology cares about identity management / mapping and suggests
best practices to minimize problems with different identification
schemas.
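A very reduced picture of identity-based merging is shown below. It is a sketch under assumptions, not the merging procedure of the Topic Maps standards: `merge_topics` is a hypothetical function, each topic carries exactly one subject identifier, and topics are plain dictionaries.

```python
def merge_topics(*sources):
    """Merge topics from several sources, keyed by their subject identifier.

    Each source is a (source_name, topics) pair; each topic is a dict with
    "psi", "names" and "occurrences" keys (a simplified, hypothetical shape).
    """
    merged = {}
    for source_name, topics in sources:
        for topic in topics:
            target = merged.setdefault(
                topic["psi"],
                {"names": set(), "occurrences": set(), "sources": set()},
            )
            target["names"].update(topic.get("names", ()))
            target["occurrences"].update(topic.get("occurrences", ()))
            target["sources"].add(source_name)  # remember where facts came from
    return merged


acme = "http://psi.example.org/customer/42"
crm = ("crm", [
    {"psi": acme, "names": ["ACME Corp."], "occurrences": [("phone", "555-0100")]},
])
billing = ("billing", [
    {"psi": acme, "names": ["ACME Corporation"], "occurrences": [("balance", "120.00")]},
    {"psi": "http://psi.example.org/customer/43", "names": ["Initech"]},
])
merged = merge_topics(crm, billing)
print(sorted(merged[acme]["names"]))
print(sorted(merged[acme]["sources"]))
```

The same generic rule works for any pair of sources, which is the contrast with XML queries that each implement their own merging procedure.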
Using TMCL it will be possible
to check the integrity of the resulting "virtual topic map". TMCL will also allow us to
monitor business-related constraints using a powerful rule-based language.
It will be very easy to create XML
views based on "virtual topic map" using TMQL and/or templates. Difference with
pure XML solution is that we can have semantically enriched XML. For each item
we can keep reference back to topic map constructs. We know WHAT is represented
using XML.
Creating XML views still
makes sense because it allows us to use standard, well-developed technologies (such
as XSLT, later maybe XForms?) at the end of the integration
pipeline.
And ... of course, Topic
Maps-based integration hub is not a replacement for XQuery hub. XQuery hub
nicely implements concept of "virtual resources". Topic Maps hub helps organize
these and other resources in semantically enriched "knowledge map".
Posted at 09:21 AM
Sat - July 10, 2004
Topic Maps - based information integration
I was recently involved in the implementation of an
intranet portal. A Topic Maps engine was out of the question from the
beginning...
Project team had to do a lot of traditional
portal-based information integration. The portal combines information about
topics of several classes with cross-references between topics. Project team
implemented several portlets which represent various pieces of information.
These portlets are combined into templates (one per class). Portlets extract
information from several databases, simple document management system and
several collaborative applications. Documents are tagged with references to
portal topics. Integrated full-text/metadata search engine allows to find all
relevant resources to specific topic. Typical portal
project.
This project helped me a lot
to understand how Topic Maps - based information integration is different from
traditional portal integration.
I would
like to describe some "ideal knowledge integration scenario" which can be
partially implemented with existing Topic Maps software. Some features are in
fact in my "wish list" for next generation Topic Maps engines and development
infrastructure.
Hmmm...
I
would like to start a new portal project with designing/implementing ontology. I
would like to describe classes of objects which portal deals with. On next
phase I would like to define mapping between existing data sources and ontology.
I would like to specify where I can get different pieces of information and what
kind of transformation is required. I should be able to see "mapping gaps" at
any time.
I prefer "conceptual
language" as a target of these transformations. As a portal developer I do not
think in terms of relational tables or document forms or even XML trees. I think
in terms of objects, classes, properties and relationships. Mapping to
conceptual language automatically adds metadata to factual information from data
sources.
I should be able to define
caching/update strategies for data transformations. In some cases I would like
to do "just in time" transformation and integration. In some cases I can use
some kind of caching. I would like to have a smart agent which manages
updates from different data sources based on declared policies. This agent
preferably uses asynchronous communication with data sources for improved
performance. The agent creates a virtual, ontology-enriched knowledge base.
One of the key issues in knowledge
integration is identity management. Quite often different data sources use
different identification schemas for the same subjects. "Ideal Integration tool"
should help to define/implement/monitor identity
mappings.
The knowledge base agent also uses ontology and
rule-based constraints to identify knowledge conflicts. I would like to be
able to check for conflicts at any time. I would also like to have a
conflict notification mechanism and a conflict resolution workflow. Some
conflicts can be resolved automatically based on defined policies. Knowledge
conflicts are a natural feature of any open information system. We are not
"afraid" of conflicts. We should have infrastructure which helps to deal with
them. Conflict identification and resolution mechanisms should improve knowledge
base consistency.
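One simple kind of knowledge conflict can be detected mechanically: two sources assert different values for a property that the ontology declares single-valued. The sketch below is an assumption-heavy illustration; `find_conflicts`, the tuple layout of assertions and the notion of a "single-valued" property set are hypothetical, not features of an existing engine.

```python
from collections import defaultdict


def find_conflicts(assertions, single_valued):
    """Report single-valued properties that have more than one asserted value.

    assertions: iterable of (subject, property, value, source) tuples.
    single_valued: set of property names constrained to one value.
    Returns {(subject, property): {value: [sources]}} for each conflict.
    """
    values = defaultdict(dict)
    for subject, prop, value, source in assertions:
        values[(subject, prop)].setdefault(value, []).append(source)
    return {
        key: by_value
        for key, by_value in values.items()
        if key[1] in single_valued and len(by_value) > 1
    }


assertions = [
    ("employee/42", "manager", "employee/7", "hr-db"),
    ("employee/42", "manager", "employee/9", "project-tool"),
    ("employee/42", "skill", "XQuery", "hr-db"),
    ("employee/42", "skill", "Topic Maps", "wiki"),
]
conflicts = find_conflicts(assertions, single_valued={"manager"})
for (subject, prop), by_value in conflicts.items():
    print(subject, prop, by_value)
```

Keeping the source next to each value is what a notification or resolution workflow would need: a policy such as "the HR database wins for manager" can then be applied automatically.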
It is important to
note that knowledge base includes not only facts about business objects but also
facts about different resources (documents, reports, diagrams etc.). Non- and
semi-structural information is integrated with other data
sources.
The knowledge base should support
a query language. It allows us to query the virtual knowledge base at the 'conceptual level'.
At this stage we have already in fact
"information portal". It just does not have visual
representation.
It is much easier to
define and implement visual part of a portal in this scenario. All knowledge
integration is already done.
What we
need is to define some views and templates. When we define views we rely on
knowledge base query language. View design
can be guided by portal ontology.
Posted at 02:21 PM
Sat - May 1, 2004
It is time for "Save as XTM" initiative
More and more applications can produce XML
representation of internal information and save it to shared storage. It helps
users to synchronize information on several computers. XML representation also
helps to create user communities based on sharing of information. Think about
shared calendars, music and picture mixes, blogs, recipes. It's nice, but it can
be much better... with topic maps.
Topic Maps provide "out of the box" support for
information sharing and merging. This support is based on the ability to
explicitly represent subjects and the ability to connect any piece of information
with subjects.
If we have a blog entry, for example, we have a standard
mechanism to express that this entry is related to
specific subjects. And we have a standard way to merge information from several
blogs. As a result we can easily find all blog entries related to the same
subject.
"Pure" XML solutions can
encode relationships between information pieces and subjects. But these
solutions are based on custom schemas. Each time we need to define custom
merging rules which also can include transformations between various XML
schemas.
It is
time... it is time to promote XTM
format as "save as" option for various
applications. Applications can use optimized
internal data models to implement specific set of functions. But applications
can also publish Topic Map - based representations of internal information to
shared storage. Other applications can "subscribe" to external topic maps and
merge external and internal information. Of course, applications remember source
of information so users can keep track of "who said
what".
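As a rough idea of what a "save as XTM" option could emit, here is a sketch that writes subjects and the blog entries about them in a simplified XTM 1.0 style using only the Python standard library. The function `save_as_xtm`, the input dictionary shape and the example URLs are assumptions; the output uses a small subset of XTM 1.0 elements and has not been validated against the DTD.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("", XTM)
ET.register_namespace("xlink", XLINK)


def save_as_xtm(subjects):
    """Serialize subjects and the entries about them as simplified XTM 1.0."""
    root = ET.Element(f"{{{XTM}}}topicMap")
    for number, subject in enumerate(subjects, start=1):
        topic = ET.SubElement(root, f"{{{XTM}}}topic", {"id": f"subject-{number}"})
        # The PSI states which subject this topic represents.
        identity = ET.SubElement(topic, f"{{{XTM}}}subjectIdentity")
        ET.SubElement(identity, f"{{{XTM}}}subjectIndicatorRef",
                      {f"{{{XLINK}}}href": subject["psi"]})
        base_name = ET.SubElement(topic, f"{{{XTM}}}baseName")
        ET.SubElement(base_name, f"{{{XTM}}}baseNameString").text = subject["name"]
        # Each related blog entry becomes an occurrence of the subject.
        for url in subject.get("entries", ()):
            occurrence = ET.SubElement(topic, f"{{{XTM}}}occurrence")
            ET.SubElement(occurrence, f"{{{XTM}}}resourceRef",
                          {f"{{{XLINK}}}href": url})
    return ET.tostring(root, encoding="unicode")


print(save_as_xtm([{
    "psi": "http://psi.example.org/topic-maps",
    "name": "Topic Maps",
    "entries": ["http://blog.example.org/2004/05/01/save-as-xtm"],
}]))
```

An application could keep its optimized internal model and call something like this only when publishing to shared storage.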
With "save as XTM" support it
will be possible to use "universal topic map browsers" to explore information
from different applications. Users also will be able to rely on specific
applications with optimized views.
Posted at 01:18 PM
Published On: Dec 11, 2005 04:22 PM