Sat - Jun 23, 2007

I am re-launching the Topic Map Thoughts blog under a new name: Subject-centric. Many ideas from my old blog will be incorporated into the Ontopedia research project.

Sun - December 11, 2005

Scalability issues and ... MyTopics


It becomes challenging to work with a topic map when the number of topics grows.

I am going to experiment with the following approach:
- allow the user to maintain a list of personal roles
- allow the user to specify, for any topic, that this topic is in "MyTopics" for a specific role
- allow switching between roles and navigating to a role-specific list of "MyTopics"
- make "MyTopics" available through RSS
- do data mining on MyTopics across different users
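
A minimal sketch of the first three points and a very naive version of the last one, in plain Python. The class and function names (MyTopics, shared_topics) are hypothetical and are not part of any topic map engine; the RSS part is left out.

```python
from collections import defaultdict


class MyTopics:
    """Per-user store of role-specific topic lists (hypothetical names)."""

    def __init__(self):
        # role name -> ordered list of topic identifiers
        self._by_role = defaultdict(list)
        self.current_role = None

    def add_role(self, role):
        self._by_role[role]  # touching the key creates an empty list
        if self.current_role is None:
            self.current_role = role

    def mark(self, topic_id, role):
        """Put a topic into "MyTopics" for one specific role."""
        if topic_id not in self._by_role[role]:
            self._by_role[role].append(topic_id)

    def switch_role(self, role):
        if role not in self._by_role:
            raise KeyError(role)
        self.current_role = role

    def topics(self):
        """Role-specific list of "MyTopics" for the current role."""
        return list(self._by_role[self.current_role])

    def all_topics(self):
        return {t for topics in self._by_role.values() for t in topics}


def shared_topics(user_a, user_b):
    """Naive 'data mining': topics that both users marked, in any role."""
    return user_a.all_topics() & user_b.all_topics()
```

Real mining across users would need more than a set intersection, but this shows the data that would be available for it.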

Posted at 04:21 PM


Sat - October 1, 2005

Agile knowledge management or why I still choose Topic Maps


Recently I was looking around and investigating different technologies and products which can be used for knowledge management...

Oracle just announced support for RDF in 10g Release 2. I really like the way RDF is implemented in the Oracle database: clean and elegant design, beautiful integration with traditional relational and XML data.

This announcement made me think again about RDF and Topic Maps. I was playing in my mind with the idea of using RDF for the projects I am involved in...

The first interesting observation is about modeling dynamic worlds. In my domains, objects change names and properties, they move around, and they create and delete relationships. With Topic Maps I can create a type "TimeInterval" with occurrences "DateStart" and "DateEnd" and use instances of this type in the scope of occurrences and associations. Using TMQL I can easily create a projection of a topic map for a specific moment in time. With RDF... replace simple values with objects which have DateStart, DateEnd properties?... hmmmm... use reification for each time-sensitive assertion?... brrrr....
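
The projection idea can be sketched in plain Python. This is not TMQL and not the API of any topic map engine; the Statement class, the project function and the sample company names are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimeInterval:
    date_start: date
    date_end: Optional[date] = None  # None means "still valid"

    def contains(self, moment: date) -> bool:
        if moment < self.date_start:
            return False
        return self.date_end is None or moment <= self.date_end


@dataclass(frozen=True)
class Statement:
    """A name, occurrence or association, scoped by a time interval."""
    topic: str
    kind: str
    value: str
    scope: TimeInterval


def project(statements, moment):
    """Keep only the statements whose scope covers the given moment."""
    return [s for s in statements if s.scope.contains(moment)]


# Hypothetical data: one company that changed its name in mid-2003.
statements = [
    Statement("company-x", "name", "Acme Corp",
              TimeInterval(date(1998, 1, 1), date(2003, 6, 30))),
    Statement("company-x", "name", "Acme Global",
              TimeInterval(date(2003, 7, 1))),
]
```

Calling project(statements, date(2000, 5, 1)) keeps only the "Acme Corp" name, while a date in 2005 keeps only "Acme Global". The factual statements themselves never change; only the scope decides what is visible.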

Scopes in Topic Maps allow us to represent context-sensitive knowledge very nicely. And contexts are not limited to time, of course. We can define and use any dimensions which are useful for modeling.

The second observation is about agile ontology development. I found myself refactoring ontologies for topic maps in "production" again and again. Why Topic Maps in this case? Topic Maps can work without schemas or additional ontology definitions such as "inverse" or "symmetric" properties. The basic semantic model of Topic Maps is rich enough to represent useful information. We can easily modify type hierarchies and add or delete constraints without changing factual information. From my perspective, this basic semantic model combined with TMCL, the pattern-based Topic Maps Constraint Language, supports this agile ontology development style better than heavy OWL.

Posted at 01:18 PM


Fri - June 10, 2005

Yahoo, Mindset ... and subject centric computing


I really like the idea behind the Mindset research project.

It is great that we can easily change the ranking of search results by specifying our intention. I would suggest extending the idea of dimensions and introducing a "Document - Subject" dimension.

I think we should have one "Subject" node on the right side. On the left side we can have different nodes which correspond to different dimensions (Shopping, News, Entertainment etc.). I think that the other side of each dimension is "Subject", so all these dimensions are connected to one subject node.

So if we position our intention close to the subject node we will have something like "Answers.com": the results are subject centric pages. But we can add different "flavors" to a search request by adding more "News" or "Shopping", for example. Search results become less subject centric in this case.
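
One possible reading of this idea, as a small Python sketch. It is not how Mindset actually ranks results; the per-result scores, the dimension names and the linear blend are all assumptions.

```python
def rank(results, intent):
    """Order results by a blend of subject-centricity and "flavors".

    results: list of dicts with a "subject" score and optional scores per
             dimension (for example "news" or "shopping"), each in [0, 1].
    intent:  dict mapping a dimension to its weight; whatever weight is
             left over (1 minus the sum) goes to the "subject" node.
    """
    flavor_total = sum(intent.values())
    if not 0.0 <= flavor_total <= 1.0:
        raise ValueError("flavor weights must sum to a value between 0 and 1")
    subject_weight = 1.0 - flavor_total

    def score(result):
        flavored = sum(w * result.get(dim, 0.0) for dim, w in intent.items())
        return subject_weight * result.get("subject", 0.0) + flavored

    return sorted(results, key=score, reverse=True)
```

With an empty intent the ordering is purely subject centric; adding, say, {"news": 0.8} pulls news-like documents to the top.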

Posted at 11:44 AM


Thu - December 2, 2004

Instant messaging, subject centric group chats, topic maps and... goodbye email (... almost)


I recently saw a presentation of Parlano MindAlign for the Microsoft Live Communication Server instant messaging platform.

I have enjoyed using IM, IRC and enterprise group chats for many years. I use them for person-to-person communication, for getting quick answers from peers and providing them, and for notifications about important events. MindAlign introduces a new trend, I think: real-time subject centric communication. The MindAlign smart client allows users to manage and participate effectively in hundreds of subject centric channels at the same time. It also allows them to see the history of all conversations and to search message archives. None of these features is new. But effective support for hundreds of channels on the client side changes the rules of the game and moves group chats to a new level.

There is something extremely powerful in the combination of real-time subject centric communication, the ability to access message history, and search. I think that this kind of system can replace about 80% of emails in the future.

What is the next step? We can connect channel topics with a topic map and allow users to reference subtopics in a real-time conversation using an analog of WikiWords. In this case we have a topic map which is modified by users in real time. This topic map has references to group chat messages. But like any other topic map it can also hold information about associations between topics and references to other resources.

We can add the ability for users to provide Wiki-like occurrences in this topic map and the ability to add links to resources (an analog of the social bookmark manager del.icio.us).

The result is a live topic map which integrates summary information about subjects, associations between subjects, real-time messages and links to resources connected with subjects.

Posted at 05:24 PM


Wed - October 27, 2004

Resources and Subject pages ... towards better web


One of the most interesting aspects of subject centric computing (and smart search) is the relationship between resources and subjects.

I use several news web sites such as EWeek and ZDNet to check news in the computer industry. Each time I read an article I (automatically) evaluate it from the perspective of subject centric computing. The general practice is to reference other news articles on the same web site and to provide links to company web sites and conference web sites.

That's not how it should be in a subject centric world. In a subject centric world we should have a special kind of resource: subject pages. A subject page provides summary information about a subject with links to other subjects and (regular) resources.

Several examples of subject page collections:

Wikipedia (sample pages: Topic Maps, Treaty establishing a constitution for Europe, Search Engine, WiFi, Microsoft, IBM)
Internet Movie Database (sample pages: Troy, Brad Pitt, Wolfgang Petersen)
Yahoo Finance (sample pages: Microsoft, IBM)
Italian Opera Topic Map (sample pages: Tosca, Giacomo Puccini, Teatro alla Scala)

When I read an article on an IT news web site I would like to have references to subject pages for companies, events (such as company mergers, spin-offs and product announcements!), technologies, people, products etc. In this case I can easily jump from the article to subject pages where I have links to other related subjects and resources.

I think the web can be much more user friendly if we have an additional resource layer: subject pages.

Posted at 11:45 PM


Tue - October 19, 2004

Google desktop search or ... "Where are my subjects?"


I installed Google desktop search (beta) last week. I have mixed feelings about it.

I really like the concept of desktop and Internet search integration. What I do not like is that I cannot organize search results around subjects and subject categories. So now, in addition to all those links on the Internet, I also have thousands of links from my hard drive and web page cache.

I can just repeat my comments and recommendations from "Apple's Spotlight, What do we search for and ... Topic Maps" and "Smart search, blinkx and ... Topic Maps", with probably one addition.

In the short term it would help if Google desktop search allowed users to define virtual folders/categories. When users define a virtual folder/category they can specify which URI patterns should go to this folder. Folder rules can also use metadata. So, for example, if I trust Wikipedia as a source of my "subject centric" pages, I can specify that pages from Wikipedia under a specific category should go to a specific virtual folder in my desktop search. The same is true for "local subjects". I can run a desktop Wiki (and/or topic map engine) and use it to manage my own subjects and produce local subject centric pages. Now, if I have virtual folders/categories defined and I type keywords in a search text field, I should get search results organized by virtual folders.
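
A rough sketch of such folder rules in Python, using shell-style wildcards from the standard fnmatch module. This has nothing to do with the actual Google desktop search; the folder names, the URI patterns and the function names are assumptions, and metadata-based rules are not shown.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: virtual folder name -> URI patterns that belong to it.
FOLDER_RULES = {
    "Subjects (Wikipedia)": ["http://en.wikipedia.org/wiki/*"],
    "Local subjects": ["http://localhost:8080/wiki/*",
                       "file:///home/me/subjects/*"],
}


def folder_for(uri, rules=FOLDER_RULES, default="Other"):
    """Return the first virtual folder whose patterns match the URI."""
    for folder, patterns in rules.items():
        if any(fnmatchcase(uri, pattern) for pattern in patterns):
            return folder
    return default


def group_results(uris, rules=FOLDER_RULES):
    """Organize a flat list of search results by virtual folder."""
    grouped = {}
    for uri in uris:
        grouped.setdefault(folder_for(uri, rules), []).append(uri)
    return grouped
```

The search engine still does the keyword matching; the rules only decide under which folder each hit is shown.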

Posted at 07:33 PM


Tue - October 12, 2004

Jabber, publish & subscribe architecture and ... Topic Maps


One of the main goals of Topic Maps is to facilitate the development of a distributed network of providers and consumers of "subject centric information maps". Existing Topic Maps standards provide basic support for building this kind of environment: an XML-based interchange syntax (XTM), a scalable identification scheme based on the concept of Public Subject Identifiers (PSIs), the concept of topic map merging, and the use of HTTP as a standard access protocol.

In a "simplified world" we can create a topic map and publish it to a public web server, and everyone can view it using a topic map viewer (such as Omnigator) or reference (and reuse) it in other topic maps. The problem with this approach is that it is static. If we later change our topic map, nobody will know about it. Only when topic map consumers refresh their topic maps will they reload our topic map and remerge it with their own information.

Building real-life Topic Maps-based applications, however, often requires more sophisticated protocols and architectures. I would first like to reference several interesting ideas and approaches:
- TMShare
- TMRAP and here
- Virtual and Federated Topic Maps

We are now ready to jump into a discussion about the possible use of the Jabber messaging architecture for building distributed networks of topic maps...

What we really need is a near-real-time messaging/notification mechanism which distributes updates to all interested parties and/or allows running asynchronous queries against multiple information providers.

Jabber, from my perspective, is a very good candidate for this kind of infrastructure. Why?

- Jabber is an XML-based messaging infrastructure with an extensible architecture.
- It allows us to implement query-response scenarios.
- It also supports notification scenarios.
- It supports a store-and-forward mechanism (important for occasionally connected clients).
- It is relatively firewall friendly (it uses outgoing client-to-server connections on a predefined port).
- It is platform and language neutral.
- Jabber's protocol, XMPP, was recently published by the IETF as RFCs.
- It is a proven (and already running) messaging infrastructure for communication between humans and applications in all combinations.

The most interesting part is probably JEP-0060: Publish-Subscribe. This specification defines a generic publish/subscribe framework for use by Jabber entities. I think it can be reused for building publish/subscribe infrastructure for topic maps.

With this approach topic map providers can announce which topics are available for subscription, and consumers can subscribe to specific topics. As soon as new information about a topic is available, it can be distributed to all subscribers. Inside the messages we can use topic map fragments as defined by TMRAP, for example.
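
The announce / subscribe / distribute flow can be sketched as an in-memory Python class. This is only an illustration of the pattern: it does not speak XMPP, it does not implement JEP-0060, and the TopicPubSub class and its method names are hypothetical. The fragment is passed as an opaque string where a real system would carry a topic map fragment.

```python
from collections import defaultdict


class TopicPubSub:
    """In-memory stand-in for a publish/subscribe service for topics."""

    def __init__(self):
        self._announced = set()                # topics open for subscription
        self._subscribers = defaultdict(list)  # topic id -> callbacks

    def announce(self, topic_id):
        """Provider side: declare that a topic accepts subscriptions."""
        self._announced.add(topic_id)

    def subscribe(self, topic_id, callback):
        """Consumer side: register a callback(topic_id, fragment)."""
        if topic_id not in self._announced:
            raise KeyError(f"topic not announced: {topic_id}")
        self._subscribers[topic_id].append(callback)

    def publish(self, topic_id, fragment):
        """Push a new fragment to every subscriber; return how many got it."""
        callbacks = self._subscribers.get(topic_id, [])
        for callback in callbacks:
            callback(topic_id, fragment)
        return len(callbacks)
```

In a Jabber-based version the callbacks would be replaced by messages delivered to subscribers, with store and forward covering the consumers who are offline at publish time.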

An interesting example of JEP-0060 usage can be found on PubSub.com.

Posted at 12:26 PM


Sun - October 10, 2004

Smart search ... and specifics of Topic Maps approach


I read several interesting pieces about search improvement last week: comments about the Microsoft Search Champs conference, notes from the Web 2.0 conference and Thomas B. Passin's book "Explorer's Guide to the Semantic Web". A lot of different ideas and approaches! These thought-provoking readings inspired me to elaborate on the Topic Maps approach to solving the search problem.

I would like to start with a reference to Lars Marius Garshol's work "Metadata? Thesauri? Taxonomies? Topic maps!" and his summary of the benefits of using Topic Maps for search. What is probably not obvious is how Topic Maps can be used by "traditional" search/directory engines such as Google, Yahoo and MSN to implement a new generation of search, and also how Topic Maps can be used for building an Internet-scale search infrastructure.

1. The Topic Maps-based approach shares the understanding that people are searching mostly for information, not for "documents/resources".

2. Information is "hidden" in resources. Resources are optimized for reading by humans, not computers.

3. Automatic extraction of information from resources is an expensive operation with limited reliability.

4. Automatic, reliable matching of queries with information in resources is an expensive operation.

5. Context plays an important role in search. People can play different roles and can switch areas of interest.

6. Personalization is important. Each person can have their own topics of interest and requirements for information retrieval. Personal interests are relatively stable.

Instead of concentrating on advanced general algorithms for 3 and 4 using 5 and 6, the Topic Maps approach breaks with the tradition of working in a "resource world" and suggests shifting efforts to a world of "topics of our interest", or subject proxies.

The Topic Maps approach concentrates more on the question of how to create a distributed network of information providers and consumers based on an interchange standard for managing "maps" of subject proxies linked with resources.

The Topic Maps approach is based on explicit management of subject proxies which represent "topics of our interest". With Topic Maps we also explicitly represent a summary of information about subjects and their relationships.

We also connect the "world of subjects" with the "world of resources" using explicit links. The Topic Maps approach does not really define how these links are created. It can be done manually by a person, by sophisticated linguistic or statistical algorithms, or by a combination of available methods.

The Topic Maps approach is supported by an ISO standard which helps to create, interchange and merge topic maps and, in the future, to query and constrain topic maps.

Any person, organization or company can be a provider of information using the Topic Maps interchange standard.

The Topic Maps standard does not force information providers to use topic maps for internal representation. Information suppliers can use relational, XML or object databases with different schemas to represent information. The only requirement is to provide a "topic map view" using the interchange standard.

If topic map views are available, information from multiple suppliers can be aggregated. This aggregation can be done by aggregators (such as Google, Yahoo, MSN) and/or directly on the desktop. This reminds us of the world of RSS, with the exception that we are interested in distributing and aggregating topic maps instead of RSS feeds. An inspiring preview of desktop topic map aggregation can be found in Steve Pepper's presentation "Seamless Knowledge with TMRAP".
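
Aggregation relies on merging: topics from different suppliers that share a subject identifier are treated as the same subject. A much simplified Python sketch follows. Real merging rules cover more than this, and the dictionary layout, the merge_topics function and the example.org identifiers are assumptions.

```python
def merge_topics(*topic_maps):
    """Merge topic map views from several suppliers.

    Each topic is a dict with "identifiers" (a set of subject identifier
    URIs) and optional "names" and "resources" sets. Topics that share at
    least one identifier are combined into a single topic.
    """
    merged = []
    for topic_map in topic_maps:
        for topic in topic_map:
            ids = set(topic["identifiers"])
            names = set(topic.get("names", ()))
            resources = set(topic.get("resources", ()))
            remaining = []
            for existing in merged:
                if existing["identifiers"] & ids:
                    # Same subject: absorb the already-merged topic.
                    ids |= existing["identifiers"]
                    names |= existing["names"]
                    resources |= existing["resources"]
                else:
                    remaining.append(existing)
            remaining.append(
                {"identifiers": ids, "names": names, "resources": resources}
            )
            merged = remaining
    return merged
```

A supplier can keep its data in any internal form, as long as its "topic map view" exposes identifiers that other suppliers also use.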

The problem of search against a network of resources is replaced by the problem of search against a network of subject proxies and resources. The second approach can provide a better user experience because it effectively bridges the gap between resources, subjects and users.

Posted at 12:52 PM


Fri - October 1, 2004

Smart search, Yahoo, iTunes and... Topic Maps


I was thinking about who could provide the kind of subject proxy service which I described in my previous posting.

The first company which came to my mind was Yahoo. They already manage a huge directory of subject proxies. So, for example, if I am interested in philosophy, there is already a subject proxy for this topic and a subject proxy for the class Philosophers. The web page which represents this subject proxy has a list of philosophers with some comments. Very subject centric! (Almost; there are some links to resources there.) Let's click on the Martin Heidegger link, for example. Hmmm... I see links to several resources related to Martin Heidegger with some comments. In a subject centric environment I expect more. I would like to see a summary of facts about this philosopher with links to other subject proxies. These facts can be aggregated from several different sources. This set of facts is the first thing I would like to know and see. After that I should have links to different resources: original publications, comments, reviews, related works, news, pictures, blogs, RSS feeds etc. And with Yahoo personalization I should be able to specify that Martin Heidegger is a topic of interest to me, so I can easily get access to facts and resources about this topic.

What about Google (directory)? From the perspective of subject centric computing it is very close to the Yahoo directory. For example, there is a page for Philosophers and a page for Martin Heidegger. Unfortunately, again, there is a mixture of "subject" and resource links, and no "facts".

My second thought was about Apple's iTunes music store. True, it has a limited "ontology". But it is very subject centric. We can find "subject pages" for artists, albums and genres. We can get a biography and links to influencers and contemporaries. Search provides results grouped by subject classes: albums, artists, songs.

I would not mind having an iTunes-like subject centric service as part of an extended .Mac. .Mac can also provide some interesting ideas about subject proxy synchronization. Think about iDisk extended to the idea of subject proxies. I can use my "subject proxies" in a disconnected scenario. I can add new proxies, facts and links to resources. They can be synchronized with a "subject proxy server" and with local copies on my different computers. And with Apple's seamless network connectivity I can even share some of my subject proxies, comments and resources with my friends sitting somewhere in Starbucks.

Posted at 08:46 PM


Tue - September 28, 2004

Smart search, blinkx and ... Topic Maps


I was experimenting with blinkx recently and I tried to understand how close to or far from the subject centric computing model it is.

Blinkx "attaches" itself nicely to Internet Explorer, Word and several other programs. When you look at a resource using these programs you can select a phrase and blinkx will try to find available resources which are related to the concept(s) in the selected phrase. Blinkx has several channels for resources: local drive, internet, news, products, video clips and web logs. The list of channels can be extended. That's nice.

What is not so nice, I think, is that the concept of subjects is hidden and not available to users. When I select a phrase blinkx tries to find "ideas not keywords" behind this phrase. But I cannot really see what blinkx's guess is. I can only see resources which are somehow related to blinkx's guess.

I think better results can be achieved if we introduce subject proxies explicitly and allow users to manage the subjects of their interest. I would split channels into two groups. The first group represents subjects which I am interested in: People, Projects, Technologies, Products etc. The second group represents resources such as News, Reports, Video Clips etc.

When I select a phrase on a web page or in a document, entries in all channels (resources and subjects) can be activated with different "relevancy" levels. I personally will go in most cases to "subject" channels and will jump to a "subject page", which is a summary of information about a specific subject combined from different sources. So when I select the word "Troy" on a web page I would like to have activated the subject proxy for the city "Troy" in my City channel, the subject proxy for the movie "Troy" in my Movie channel and, in my DVD channel... well... the DVD "Troy". Relevancy can be assigned based on whatever algorithms are available for web page "subject scanning".
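
A toy Python sketch of activation across subject channels. The channel contents, the activate function and the two relevancy levels (1.0 for an exact name match, 0.5 for a partial one) are assumptions; any real "subject scanning" algorithm could be plugged in instead of this naive name comparison.

```python
# Hypothetical subject channels, each holding a few subject proxies.
CHANNELS = {
    "City": [{"id": "troy-city", "names": ["Troy"]}],
    "Movie": [{"id": "troy-movie", "names": ["Troy"]},
              {"id": "gladiator-movie", "names": ["Gladiator"]}],
    "People": [{"id": "brad-pitt", "names": ["Brad Pitt"]}],
}


def activate(phrase, channels=CHANNELS):
    """Return {channel: [(proxy id, relevancy)]} for a selected phrase."""
    wanted = phrase.lower()
    hits = {}
    for channel, proxies in channels.items():
        for proxy in proxies:
            names = [name.lower() for name in proxy["names"]]
            if wanted in names:
                level = 1.0   # exact name match
            elif any(wanted in name or name in wanted for name in names):
                level = 0.5   # partial match
            else:
                continue
            hits.setdefault(channel, []).append((proxy["id"], level))
    return hits
```

Selecting "Troy" activates both the city and the movie proxy; which one I jump to tells the client what I am really interested in.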

As a next step I can jump, for example, to Troy-movie. My smart search client (with my help) now "knows" better what I am really interested in. It can go to the server(s) and retrieve summary information about Troy-movie. Troy-movie is now in my focus, and this changes the activation level of different resources and subject proxies.

So if I look at the People channel I will probably see several names related to this movie. I can jump to one of these names and the activation level of resources and subject proxies will change again.

Now, let's say I selected the name of a person on a web page and my smart search client cannot give me any reasonable suggestions. Well... I drag and drop this name to the People channel and a new local subject proxy is created. As a next step my smart search client goes to the server environment and tries to find a well-known subject matching my local subject. If there are some suggestions and I agree with one of them, my local proxy becomes connected with the "world wide" subject proxy network.

The server environment can monitor "false" subject requests and create new public subject proxies for subjects which become "popular".

That's how I see smart search...

Posted at 08:25 PM
Subject oriented computing, new approaches to user interface and... Topic Maps
Managing subject proxies and Topic Maps: Enterprise perspective
Subject Oriented Computing - Topic Maps and management of subject proxies
Apple's Spotlight, What do we search for and ... Topic Maps
"SOA Challenges: Entity Aggregation" from .Net Architecture Center and ... Topic Maps
How Topic Maps view enriches relational data sources
XQuery-based data integration - one step forward, can we do two?
Topic Maps - based information integration
Mapping topic map schemas to schema level predicates
Extending "rule" part of TMCL
It is time for "Save as XTM" initiative
Three levels of information integration
Topic Maps in investment industry
TMSchema presentation update
TMSchema - TMCL Lite proposal

