Fri - June 10, 2005
Yahoo, Mindset ... and subject centric computing
I really like the idea behind the Mindset research project.
It is great that we can easily change the ranking of search results by
specifying our intention. I would suggest extending the idea of dimensions and
introducing a "Document - Subject" dimension.
I think we should have one "Subject" node on the right side. On the left side
we can have different nodes which correspond to different dimensions (Shopping,
News, Entertainment etc.). I think that the other side of each dimension is
"Subject", so all these dimensions are connected to one subject node.
So if we position our intention close to the subject node we will have
something like "Answers.com": results are subject centric pages. But we can add
different "flavors" to a search request by adding more "News" or "Shopping",
for example. Search results become less subject centric in this case.
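To make the idea above a bit more concrete, here is a minimal sketch of how such a ranking could work. It is only an illustration, not how Mindset is implemented: the `Result` class, the score fields and the weighting rule (whatever weight is not given to a flavor goes to the "Subject" node) are all my assumptions.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    subject_score: float   # assumed: how much the page reads like a subject page (0..1)
    flavor_scores: dict    # assumed: e.g. {"News": 0.9, "Shopping": 0.1}, each 0..1


def rank(results, flavor_weights):
    """Order results by a blend of subject-centricity and the chosen flavors.

    flavor_weights maps a dimension name to a weight in 0..1. The weight that
    is left over (1 - sum of flavor weights) goes to the "Subject" node.
    """
    total = sum(flavor_weights.values())
    if total > 1:
        raise ValueError("flavor weights must sum to at most 1")
    subject_weight = 1.0 - total

    def score(result):
        flavored = sum(weight * result.flavor_scores.get(name, 0.0)
                       for name, weight in flavor_weights.items())
        return subject_weight * result.subject_score + flavored

    return sorted(results, key=score, reverse=True)


# Made-up sample data for illustration only.
results = [
    Result("Encyclopedia entry: Troy", 0.9, {"News": 0.1, "Shopping": 0.0}),
    Result("Troy DVD on sale", 0.2, {"News": 0.1, "Shopping": 0.9}),
    Result("New dig announced at Troy", 0.3, {"News": 0.9, "Shopping": 0.0}),
]

subject_first = rank(results, {})              # intention sits on the Subject node
news_flavored = rank(results, {"News": 0.8})   # intention moved towards News

print([r.title for r in subject_first])
print([r.title for r in news_flavored])
```

With no flavors the subject centric page comes first; with a strong "News" flavor the news article moves to the top and the results become less subject centric.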
Posted at 11:44 AM
Wed - October 27, 2004
Resources and Subject pages ... towards better web
One of the most interesting aspects of subject centric computing (and smart
search) is the relationship between resources and subjects.
I use several news web sites, such as EWeek and ZDNet, to check news in the
computer industry. Each time I read an article I (automatically) evaluate it
from the perspective of subject centric computing. The general practice is to
reference other news articles on the same web site and to provide links to
company web sites and conference web sites.
That's not how it should be in a subject centric world. In a subject centric
world we should have a special kind of resource - subject pages. A subject page
provides summary information about a subject with links to other subjects and
(regular) resources.
Several examples of subject page collections:
Wikipedia (sample pages: Topic Maps, Treaty establishing a constitution for
Europe, Search Engine, WiFi, Microsoft, IBM)
Internet Movie Database (sample pages: Troy, Brad Pitt, Wolfgang Petersen)
Yahoo Finance (sample pages: Microsoft, IBM)
Italian Opera Topic Map (sample pages: Tosca, Giacomo Puccini, Teatro alla
Scala)
When I read an article on an IT news web site I would like to have references
to subject pages for companies, events (! such as company mergers, spin-offs,
product announcements), technologies, people, products etc. In this case I can
easily jump from an article to subject pages where I have links to other
related subjects and resources.
I think the web can be much more user friendly if we have an additional
resource layer - subject pages.
Posted at 11:45 PM
Tue - October 19, 2004
Google desktop search or ... "Where are my subjects?"
I installed Google desktop search (beta) last week. I have mixed feelings about
it.
I really like the concept of desktop and Internet search integration. What I do
not like is that I cannot organize search results around subjects and subject
categories. So now, in addition to all those links on the Internet, I also have
thousands of links from my hard drive and web page cache.
I can just repeat my comments and recommendations from "Apple's Spotlight, What
do we search for and ... Topic Maps" and "Smart search, blinkx and ... Topic
Maps", with probably one addition.
In the short term it would help if Google desktop search allowed users to
define virtual folders/categories. When users define a virtual folder/category
they can specify which URI patterns should go to this folder. Folder rules can
also use metadata. So, for example, if I trust Wikipedia as a source of my
"subject-centric" pages, I can specify that pages from Wikipedia under a
specific category should go to a specific virtual folder in my desktop search.
The same is true for "local subjects". I can run a desktop Wiki (and/or topic
map engine) and use it to manage my own subjects and produce local subject
centric pages. Now, if I have virtual folders/categories defined and I type
keywords in a search text field, I should get search results organized by
virtual folders.
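The virtual folder rules described above could look roughly like the sketch below. This is not a Google desktop search API (I do not know of one for this). The folder names, the patterns and the local wiki path are hypothetical, and only URI patterns are covered, not metadata rules.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical folder rules: folder name -> URI patterns (shell-style wildcards).
FOLDER_RULES = {
    "Subjects (Wikipedia)": ["http://en.wikipedia.org/wiki/*"],
    "Subjects (local wiki)": ["file:///Users/me/wiki/*"],
}
UNFILED = "Other results"


def folder_for(uri):
    """Return the first virtual folder whose pattern matches the URI."""
    for folder, patterns in FOLDER_RULES.items():
        if any(fnmatchcase(uri, pattern) for pattern in patterns):
            return folder
    return UNFILED


def group_results(uris):
    """Organize a flat list of search hits by virtual folder."""
    grouped = defaultdict(list)
    for uri in uris:
        grouped[folder_for(uri)].append(uri)
    return dict(grouped)


# Made-up search hits for illustration only.
hits = [
    "http://en.wikipedia.org/wiki/Topic_Maps",
    "file:///Users/me/wiki/ProjectX.html",
    "file:///Users/me/Documents/notes.txt",
]
grouped = group_results(hits)
for folder, uris in grouped.items():
    print(folder, uris)
```

A rule based on metadata would need one more kind of matcher next to the URI patterns, but the grouping step would stay the same.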
Posted at 07:33 PM
Sun - October 10, 2004
Smart search ... and specifics of Topic Maps approach
I read several interesting materials regarding search improvement last week:
comments about the Microsoft Search Champs conference, notes from the Web 2.0
conference and Thomas B. Passin's book "Explorer's Guide to the Semantic Web".
A lot of different ideas and approaches! These thought-provoking readings
inspired me to elaborate on the Topic Maps approach to solving the search
problem.
I would like to start with a reference to Lars Marius Garshol's work
"Metadata? Thesauri? Taxonomies? Topic maps!" and his summary of the benefits
of using Topic Maps for search. What is probably not obvious is how Topic Maps
can be used by "traditional" search/directory engines such as Google, Yahoo and
MSN to implement a new generation of search, and also how Topic Maps can be
used for building Internet-scale search infrastructure.
1. The Topic Maps-based approach shares the understanding that people are
searching mostly for information, not for "documents/resources".
2. Information is "hidden" in resources. Resources are optimized for reading by
humans, not computers.
3. Automatic extraction of information from resources is an expensive operation
with limited reliability.
4. Automatic, reliable matching of queries with information in resources is an
expensive operation.
5. Context plays an important role in search. People can play different roles
and can switch areas of interest.
6. Personalization is important. Each person can have their own topics of
interest and requirements for information retrieval. Personal interests are
relatively stable.
Instead of concentrating on advanced general algorithms for 3 and 4 using 5 and
6, the Topic Maps approach breaks the tradition of working in a "resource
world" and suggests shifting efforts to a world of "topics of our interests",
or subject proxies.
The Topic Maps approach concentrates more on the question of how to create a
distributed network of information providers and consumers based on an
interchange standard for managing "maps" of subject proxies linked with
resources.
The Topic Maps approach is based on explicit management of subject proxies
which represent "topics of our interests". With Topic Maps we also explicitly
represent a summary of information about subjects and their relationships.
We also connect the "world of subjects" with the "world of resources" using
explicit links. The Topic Maps approach does not really define how these links
are created. It can be done manually by a person, by sophisticated linguistic
or statistical algorithms, or by a combination of available methods.
The Topic Maps approach is supported by an ISO standard which helps to create,
interchange and merge topic maps and, in the future, to query and constrain
topic maps.
Any person, organization or company can be a provider of information using the
Topic Maps interchange standard.
The Topic Maps standard does not force information providers to use topic maps
for internal representation. Information suppliers can use relational, XML or
object databases with different schemas to represent information. The only
requirement is to provide a "topic map view" using the interchange standard.
If topic map views are available, information from multiple suppliers can be
aggregated. This aggregation can be done by aggregators (such as Google, Yahoo,
MSN) and/or directly on the desktop. This reminds us of the world of RSS, with
the exception that we are interested in distributing and aggregating topic maps
instead of RSS feeds. An inspiring preview of desktop topic map aggregation can
be found in Steve Pepper's presentation "Seamless Knowledge with TMRAP".
The problem of search against a network of resources is replaced by the problem
of search against a network of subject proxies and resources. The second
approach can provide a better user experience because it effectively bridges
the gap between resources, subjects and users.
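The aggregation step described above can be sketched in a few lines. This is a strongly simplified illustration, not the merging rules of the standard: I assume that each provider exposes its "topic map view" as a list of plain dictionaries, and that two proxies stand for the same subject when they share at least one subject identifier. All URIs and names below are made up.

```python
def merge_views(*provider_views):
    """Merge subject proxies that share at least one subject identifier."""
    merged = []  # proxies whose identifier sets never overlap each other
    for view in provider_views:
        for proxy in view:
            incoming = {
                "identifiers": set(proxy["identifiers"]),
                "names": set(proxy["names"]),
                "resources": set(proxy["resources"]),
            }
            keep = []
            for existing in merged:
                if existing["identifiers"] & incoming["identifiers"]:
                    # Same subject: fold the existing proxy into the new one.
                    for key in incoming:
                        incoming[key] |= existing[key]
                else:
                    keep.append(existing)
            keep.append(incoming)
            merged = keep
    return merged


# Hypothetical views from two information providers.
opera_site = [
    {"identifiers": ["http://psi.example.org/puccini"],
     "names": ["Giacomo Puccini"],
     "resources": ["http://opera.example.org/puccini/biography"]},
    {"identifiers": ["http://psi.example.org/tosca"],
     "names": ["Tosca"],
     "resources": ["http://opera.example.org/tosca/synopsis"]},
]
encyclopedia = [
    {"identifiers": ["http://psi.example.org/puccini",
                     "http://encyclopedia.example.org/id/puccini"],
     "names": ["Puccini, Giacomo"],
     "resources": ["http://encyclopedia.example.org/wiki/Puccini"]},
]

aggregated = merge_views(opera_site, encyclopedia)
for proxy in aggregated:
    print(sorted(proxy["names"]), len(proxy["resources"]), "resource(s)")
```

The same function could run inside an aggregator or directly on the desktop. The only thing it needs from the providers is the interchange view, not their internal schemas.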
Posted at 12:52 PM
Fri - October 1, 2004
Smart search, Yahoo, iTunes and... Topic Maps
I was thinking about who could provide the kind of subject proxy service which
I described in my previous posting.
The first company which came to my mind was Yahoo. They already manage a huge
directory of subject proxies. So, for example, if I am interested in
philosophy, there is already a subject proxy for this topic and a subject proxy
for the class Philosophers. The web page which represents this subject proxy
has a list of philosophers with some comments. Very subject centric! (Almost:
there are some links to resources there.) Let's click on the Martin Heidegger
link, for example. Hmmm... I see links to several resources related to Martin
Heidegger with some comments. In a subject centric environment I expect more. I
would like to see a summary of facts about this philosopher with links to other
subject proxies. These facts can be aggregated from several different sources.
This set of facts is the first thing I would like to know and see. After that I
should have links to different resources: original publications, comments,
reviews, related works, news, pictures, blogs, RSS feeds etc. And with Yahoo
personalization I should have the ability to specify that Martin Heidegger is a
topic of my interest so I can easily get access to facts and resources about
this topic.
What about Google (directory)? From the perspective of subject centric
computing it is very close to the Yahoo directory. For example, there is a page
for Philosophers and a page for Martin Heidegger. Unfortunately, again, there
is a mixture of "subject" and resource links, and no "facts".
My second thought was about Apple's iTunes music store. True, it has a limited
"ontology". But it is very subject centric. We can find "subject pages" for
artists, albums, genres. We can get a biography, and links to influencers and
contemporaries. Search provides results grouped by subject classes: albums,
artists, songs.
I would not mind having an iTunes-like subject centric service as part of an
extended .Mac. .Mac can also provide some interesting ideas about subject proxy
synchronization. Think about iDisk extended to the idea of subject proxies. I
can use my "subject proxies" in a disconnected scenario. I can add new proxies,
facts, links to resources. They can be synchronized with a "subject proxy
server" and with local copies on my different computers. And with Apple's
seamless network connectivity I can even share some of my subject proxies,
comments and resources with my friends sitting somewhere in Starbucks.
Posted at 08:46 PM
Tue - September 28, 2004
Smart search, blinkx and ... Topic Maps
I was experimenting with blinkx recently and I tried to understand how
close/far it is from the subject centric computing model.
Blinkx "attaches" itself nicely to Internet Explorer, Word and several other
programs. When you look at a resource using these programs you can select some
phrase and blinkx will try to find available resources which are related to the
concept(s) in the selected phrase. Blinkx has several channels for resources:
local drive, internet, news, products, video clips and web logs. The list of
channels can be extended. That's nice.
What is not so nice, I think, is that the concept of subjects is hidden and not
available to users. When I select some phrase blinkx tries to find "ideas not
keywords" behind this phrase. But I cannot really see what blinkx's guess is. I
can only see resources which are somehow related to blinkx's guess.
I think better results can be achieved if we introduce subject proxies
explicitly and allow the user to manage subjects of his/her interest. I would
split channels into two groups. The first group represents subjects which I am
interested in: People, Projects, Technologies, Products etc. The second group
represents resources such as News, Reports, Video Clips etc.
When I select some phrase on a web page or in a document, entries in all
channels (resources and subjects) can be activated with different "relevancy"
levels. I personally will go in most cases to "subject" channels and will jump
to a "subject page", which is a summary of information about a specific subject
combined from different sources. So when I select the word "Troy" on a web page
I would like to have the subject proxy for the city "Troy" activated in my City
channel, the subject proxy for the movie "Troy" in my Movie channel, and in my
DVD channel... well... the DVD "Troy". Relevancy can be assigned based on
whatever algorithms are available for web page "subject scanning".
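The "Troy" example above can be sketched as follows. This is not how blinkx works, and the relevancy rule is only a placeholder for real "subject scanning": I simply count how many of a proxy's keywords appear on the page. The `SubjectProxy` class, the channels and the keywords are my assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectProxy:
    channel: str          # e.g. "City", "Movie", "DVD"
    name: str
    keywords: frozenset   # assumed: words that tend to appear near this subject


def activate(phrase, page_text, proxies):
    """Return {channel: [(proxy name, relevancy), ...]} for a selected phrase."""
    page_words = set(page_text.lower().split())
    activated = {}
    for proxy in proxies:
        if phrase.lower() not in proxy.name.lower():
            continue
        # Placeholder relevancy: share of the proxy's keywords seen on the page.
        hits = len(proxy.keywords & page_words)
        relevancy = hits / len(proxy.keywords) if proxy.keywords else 0.0
        activated.setdefault(proxy.channel, []).append((proxy.name, relevancy))
    for entries in activated.values():
        entries.sort(key=lambda entry: entry[1], reverse=True)
    return activated


# Made-up subject proxies for illustration only.
my_proxies = [
    SubjectProxy("City", "Troy (city)", frozenset({"ancient", "homer", "archaeology"})),
    SubjectProxy("Movie", "Troy (movie)", frozenset({"film", "pitt", "petersen"})),
    SubjectProxy("DVD", "Troy (DVD)", frozenset({"dvd", "film", "release"})),
    SubjectProxy("People", "Brad Pitt", frozenset({"actor", "film"})),
]

page = "Wolfgang Petersen directed the film Troy with Brad Pitt"
channels = activate("Troy", page, my_proxies)
for channel, entries in channels.items():
    print(channel, entries)
```

All three "Troy" proxies are activated, each in its own channel, but the movie gets the highest relevancy for this page.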
As a next step I can jump, for example, to Troy-movie. My smart search client
(with my help) now "knows" better what I am really interested in. It can go to
the server(s) and retrieve summary information about Troy-movie. Troy-movie is
in my focus now, and this changes the activation level of different resources
and subject proxies.
So if I look at the People channel I will probably see several names related to
this movie. I can jump to one of these names and the activation level of
resources and subject proxies will change again.
Now, let's say I selected the name of a person on a web page and my smart
search client cannot give me any reasonable suggestions. Well... I drag and
drop this name to the People channel and a new local subject proxy is created.
As a next step my smart search client goes to the server environment and tries
to find a well known subject matching my local subject. If there are some
suggestions and I agree with one of them, my local proxy becomes connected with
the "world wide" subject proxy network.
The server environment can monitor "false" subject requests and create new
public subject proxies for subjects which become "popular".
That's how I see smart search...
Posted at 08:25 PM
Wed - September 1, 2004
Apple's Spotlight, What do we search for and ... Topic Maps
I recently enjoyed watching the "Tiger" presentation and specifically the
presentation of Apple's new search technology - "Spotlight".
Like many other people, I would like to have this kind of search now on OS X,
Windows and Linux computers. I would also like to have this kind of search for
enterprise document repositories.
What I cannot find in this demonstration is an explicit concept of "subjects"
or "topics". If I select the name of a person in an email, for example, I can
find all emails, presentations, calendar entries, documents, images etc. which
have a reference to this name in a file name, metatags or document content. But
can I find all projects which I manage? Can I find all applications which I am
responsible for? Can I find all servers which I have to check from time to time
or all technologies which I am interested in? Projects, applications, servers
and technologies are subjects in my area of interest.
When I search, I would like to search not only for resources which reference my
favorite subjects, but also for other subjects which are connected with the
subject in focus.
So I will probably add a topic map engine to Spotlight on my OS X computer as
soon as Tiger is available. How will I use the topic map engine? I will use it
to define subjects which are not covered by standard OS X applications. I will
use it to manage relationships between subjects in my area of interest. I will
also create a script which creates pseudo-documents (in html format?) for each
subject. Each pseudo-document will have all names, inline occurrences and
associations. I can also create document proxies for external resources which
are not located on my hard drive (if Spotlight/Safari do not allow attaching
custom metatags to bookmarked URIs).
It seems that Spotlight allows us to define custom document categories/types.
So I can define pseudo-document types for my subject classes, such as
"projects", "applications", "people", "servers", "companies", "technologies"
etc. Now I can use the standard system-wide Spotlight engine to search subjects
and resources. And I can use Safari to navigate between different subjects.
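The pseudo-document script mentioned above could start as something like this sketch. It does not use any topic map engine or Spotlight API: subjects are plain dictionaries, and the `subject-class` meta tag is my assumption about how a subject class might be exposed to an indexer (I do not know whether Spotlight would pick it up). The demo writes into a temporary directory.

```python
import tempfile
from html import escape
from pathlib import Path


def render_subject(subject):
    """Render one subject as a small HTML pseudo-document."""
    names = ", ".join(escape(name) for name in subject["names"])
    occurrences = "".join(
        f"<li>{escape(kind)}: {escape(value)}</li>"
        for kind, value in subject["occurrences"])
    # Associations become links, so a browser can navigate between subjects.
    associations = "".join(
        f'<li>{escape(role)}: <a href="{escape(target)}.html">{escape(target)}</a></li>'
        for role, target in subject["associations"])
    return (
        "<html><head>"
        f"<title>{escape(subject['names'][0])}</title>"
        f'<meta name="subject-class" content="{escape(subject["class"])}">'
        "</head><body>"
        f"<h1>{names}</h1>"
        f"<ul>{occurrences}</ul>"
        f"<ul>{associations}</ul>"
        "</body></html>"
    )


def write_pseudo_documents(subjects, out_dir):
    """Write one <id>.html file per subject and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for subject in subjects:
        path = out / f"{subject['id']}.html"
        path.write_text(render_subject(subject), encoding="utf-8")
        written.append(path)
    return written


# Made-up subjects for illustration only.
SUBJECTS = [
    {"id": "tm-migration", "class": "projects",
     "names": ["Topic map migration", "TM migration"],
     "occurrences": [("status", "in progress")],
     "associations": [("runs on", "server-01")]},
    {"id": "server-01", "class": "servers",
     "names": ["Server 01"],
     "occurrences": [("location", "rack 3 <east>")],
     "associations": []},
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_pseudo_documents(SUBJECTS, tmp)
    file_names = sorted(path.name for path in written)
    first_page = written[0].read_text(encoding="utf-8")

print(file_names)
```

Because each association is rendered as a link to another pseudo-document, the same files serve both purposes: indexing by the search engine and navigation in the browser.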
Posted at 09:03 PM
Tue - January 27, 2004
(About) IBM's content management strategy
I recently had a chance to look at IBM's direction in content management. There
are several solutions for content management in IBM's portfolio right now. Some
of them can work together. Some of them overlap. And some of them do not have
good integration with the others...
The strategic plan is (as I understood it) to leverage JSR 170 to provide a
general unified interface to all content stores/products. In addition to JSR
170, WebDAV can be used to some extent.
It looks like JSR 170 will address basic features important for content
management in the Java world. I am personally interested in comparing JSR 170
with Microsoft's WinFS (and both with Topic Maps).
Relevancy to Topic Maps?
JSR 170 (and WinFS) allows us to represent metadata and relationships between
various subjects. There is a basic (content) schema and the ability to extend
the basic schema.
What I think is interesting for the Topic Maps community is the ability in the
future to leverage metadata initiatives supported by IBM and Microsoft. I hope
that Topic Map engines will be able to import (maybe virtually) metadata from
both systems. Stay tuned for Topic Map Thoughts about some improvements
required for TMDM to support this kind of import.
I am sure it will also be possible to represent topic maps as a
"content/metadata provider" for JSR 170 (and WinFS) based systems.
Posted at 02:45 PM
Published On: Oct 01, 2005 02:56 PM