Podilicious:  A Prospectus

An Imagined Social Search Engine & Clips Manager for the Podosphere

 

What is Podilicious?

Podilicious is an imagined social search engine and clips manager for the Podosphere. Its design is based on successful social software such as del.icio.us (its namesake), Flickr, and Furl. I present this prospectus in the hope that an actual software developer will find it inspiring. Those of us who are interested in folksonomies, in producing rather than simply consuming media, and in the future of the web as a "Commons" would benefit greatly from the development of such an application. It is with this hope that I offer this prospectus under a Creative Commons license.

[I can be emailed here.]

Why is Podilicious necessary?

Google and others are attempting to convince Big Media to put content online so as to make it available to the public and to various search engines. It is likely that any deal between Google et al. and Big Media will involve complex licensing and DRM schemes that restrict the ability of the public to take full advantage of that content. Leaving aside for the moment the question of "fair use" of copyrighted material (a legal terrain more treacherous now than ever), those of us who desire more control over the media we consume must look to the universe of works in the following categories:

  1. Work in the public domain.
  2. Work with Creative Commons licenses, with varying degrees of restriction.
  3. Work with copyleft licenses, such as the GNU GPL.

Most podcasts fall into one of these three categories, since the ethos of the podcasting community is (mostly) diametrically opposed to the sort of IP restrictions that characterize the business model of the corporate networks. It is therefore sensible to begin building a system of rich media search and distribution (syndication) on the basis of this wealth of relatively unrestricted material, one that will provide both the framework and the tools necessary to further the goals of intellectual, cultural, and artistic production privileged in the US Constitution.

Podilicious is a reasonable first step. It leverages the generous podcasting ethos with respect to intellectual property, and it provides podcast producers with tools for searching, sorting, and acquiring material for use in the production of their own podcasts. As podcasting and other on-demand, time-shifted, open-protocol-based (RSS et al.) rich media delivery systems begin to mature, Podilicious will mature right along with them. Podilicious may be expected to increase its scope to include other media types, including video.

It is important to stress, however, that Podilicious will only ever aggregate and search material that is 1) encoded in open file formats which can be played on various media players, and 2) licensed in such a way as to allow others to redeploy parts of it for their own purposes. What follows in this paper (and in the accompanying illustrations) is a rough outline of how an alpha version of Podilicious might look.

 

 

General Terms

Podosphere: The totality of all MP3 files made available on the internet through enclosure-enabled RSS 2.0 XML feeds.

Podcasts: The "networks" or "programs" that broadcast MP3 content via RSS 2.0 in a serial or episodic fashion.

Entries: The term that aggregators use to refer to individual podcast episodes.
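To make these terms concrete, here is a minimal sketch of how the entries of a podcast might be read out of an RSS 2.0 feed. It uses only Python's standard library. The sample feed, the function name list_entries, and the choice to treat only "audio/mpeg" enclosures as MP3 content are my own assumptions for illustration, not part of any existing system.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, invented for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <guid>ep-1</guid>
      <enclosure url="http://example.com/ep1.mp3" length="1234567" type="audio/mpeg"/>
    </item>
    <item>
      <title>Show notes only</title>
      <guid>notes-1</guid>
    </item>
  </channel>
</rss>"""


def list_entries(feed_xml):
    """Return one dict per feed item that carries an MP3 enclosure."""
    channel = ET.fromstring(feed_xml).find("channel")
    podcast = channel.findtext("title", default="")
    entries = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        # Assumption: items without an audio/mpeg enclosure are skipped.
        if enclosure is None or enclosure.get("type") != "audio/mpeg":
            continue
        entries.append({
            "podcast": podcast,
            "title": item.findtext("title", default=""),
            # Fall back to the enclosure URL when the item has no guid.
            "guid": item.findtext("guid", default=enclosure.get("url")),
            "url": enclosure.get("url"),
        })
    return entries


entries = list_entries(SAMPLE_FEED)
```

Each dictionary returned corresponds to one entry in the sense defined above; the item without an enclosure is ignored.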

Objects

Sources: What aggregators refer to as "entries": single episodes of a podcast.

Clips: For any given source, the sequences identified by users (each defined by an in-point and an out-point), each less than or equal to the length of the source. For any given source there is a large but finite number of possible clips.

Sets: Statistically significant groups of overlapping clips, associated with the same metadata, which are clustered in regions of a single source.

Users: Everyone with a Podilicious account.

Owners: A user who has made a clip is known as the "owner" of that clip.

Borrowers: A user who adds a clip which she has not herself made to a "clip bin" is a "borrower" of that clip.

Rules

  1. Each clip has one and only one owner.
  2. Any user may own one or more clips.
  3. Any user may access every clip in Podilicious with all assigned metadata, regardless of owner, using search.
  4. Any user may add any clip made by any other user to her "clip bin" (this is known as "borrowing").
  5. Any user may view the "clip bins" of any other user.
  6. A clip bin may only contain one instance of a given clip.
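The objects and rules above could be modeled roughly as follows. This is only a sketch: the class names Clip and ClipBin, the field names, and the use of seconds for in-points and out-points are my own assumptions, and a real implementation would sit on top of a database rather than in-memory objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Clip:
    clip_id: str
    source_guid: str
    owner: str                    # Rule 1: a single owner field per clip
    start: float                  # in-point, in seconds (assumed unit)
    end: float                    # out-point, in seconds (assumed unit)
    tags: frozenset = frozenset()


@dataclass
class ClipBin:
    user: str
    clips: dict = field(default_factory=dict)  # clip_id -> Clip

    def add(self, clip):
        """Add a clip; return False if it is already in the bin (Rule 6)."""
        if clip.clip_id in self.clips:
            return False
        self.clips[clip.clip_id] = clip
        return True

    def borrowed(self):
        """Clips in this bin that were made by another user (Rule 4)."""
        return [c for c in self.clips.values() if c.owner != self.user]


def make_clip(clip_id, source_guid, source_length, owner, start, end, tags=()):
    """Create a clip, rejecting in/out points that fall outside the source."""
    if not 0 <= start < end <= source_length:
        raise ValueError("clip must lie within its source")
    return Clip(clip_id, source_guid, owner, start, end, frozenset(tags))
```

Keying the bin by clip ID is what enforces the one-instance rule; rules 3 and 5 concern visibility and would be a matter of access control rather than data structure.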


Main Pages / Functions

[Overview Illustration]

Source Splitters [illustration]: allow users to

  1. monitor a designated list of podcasts for new sources,
  2. store sources for the purpose of clipping,
  3. perform clipping operations on sources,
  4. assign metadata to clips,
  5. add clips to a "clip bin."
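The first of these functions, monitoring podcasts for new sources, might work by remembering which entries have already been seen. The sketch below assumes that each fetched entry is a dictionary with a "guid" key; the function name new_sources and that dictionary shape are hypothetical.

```python
def new_sources(seen_guids, fetched_entries):
    """Return the entries not seen before, and record them as seen.

    seen_guids is a set that is updated in place; fetched_entries is a
    list of dicts, each with a "guid" key (an assumed shape).
    """
    fresh = []
    for entry in fetched_entries:
        if entry["guid"] not in seen_guids:
            seen_guids.add(entry["guid"])
            fresh.append(entry)
    return fresh


# Two successive polls of the same hypothetical podcast.
seen = set()
first_poll = new_sources(seen, [{"guid": "ep-1"}, {"guid": "ep-2"}])
second_poll = new_sources(seen, [{"guid": "ep-2"}, {"guid": "ep-3"}])
```

On the second poll only the entry that was not present before is reported as a new source.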

 

 

Clip Search [illustration]: allows users to

  1. perform searches of the Podilicious database of clips according to metadata in the form of tags,
  2. sort the search results according to chronology, "clipcount," or podcast,
  3. preview the content of clips,
  4. add clips to a "clip bin."
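A tag search with the three sort orders listed above might be sketched like this. The sample clips, the field names, and the decision that a clip must carry every requested tag (rather than any one of them) are assumptions made for the sake of the example.

```python
from datetime import date

# Hypothetical clip records, invented for illustration only.
CLIPS = [
    {"id": "c1", "podcast": "Show B", "created": date(2005, 1, 12),
     "tags": {"music", "jazz"}, "clipcount": 4},
    {"id": "c2", "podcast": "Show A", "created": date(2005, 1, 10),
     "tags": {"music"}, "clipcount": 9},
    {"id": "c3", "podcast": "Show C", "created": date(2005, 1, 5),
     "tags": {"politics"}, "clipcount": 7},
]


def search(clips, tags, sort_by="clipcount"):
    """Return the clips that carry all of the given tags, sorted as requested."""
    wanted = set(tags)
    hits = [c for c in clips if wanted <= c["tags"]]
    if sort_by == "clipcount":
        return sorted(hits, key=lambda c: c["clipcount"], reverse=True)
    if sort_by == "chronology":
        # Assumption: newest clips first.
        return sorted(hits, key=lambda c: c["created"], reverse=True)
    if sort_by == "podcast":
        return sorted(hits, key=lambda c: c["podcast"])
    raise ValueError("unknown sort order: " + sort_by)


by_score = search(CLIPS, ["music"])
by_date = search(CLIPS, ["music"], sort_by="chronology")
by_podcast = search(CLIPS, ["music"], sort_by="podcast")
```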

 

Clip Bins [illustration]: allow users to

  1. store clips for the purpose of reviewing, organizing, and archiving,
  2. download clips via file transfer and / or RSS feeds.
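Delivering a clip bin as an RSS feed could amount to writing each clip as an item with an MP3 enclosure, so that any podcast aggregator can subscribe to the bin. The following sketch uses Python's standard library; the function name bin_to_rss, the clip dictionary shape, and the example.com addresses are all hypothetical.

```python
import xml.etree.ElementTree as ET


def bin_to_rss(owner, clips):
    """Serialize a clip bin as an RSS 2.0 feed with one enclosure per clip.

    clips is a list of dicts with "id", "title", "url" and "bytes" keys
    (an assumed shape).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = owner + "'s clip bin"
    # Placeholder address, not a real service.
    ET.SubElement(channel, "link").text = "http://example.com/bins/" + owner
    ET.SubElement(channel, "description").text = "Clips collected by " + owner
    for clip in clips:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = clip["title"]
        ET.SubElement(item, "guid", isPermaLink="false").text = clip["id"]
        ET.SubElement(item, "enclosure", url=clip["url"],
                      length=str(clip["bytes"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")


feed_text = bin_to_rss("alice", [
    {"id": "c1", "title": "Opening theme",
     "url": "http://example.com/clips/c1.mp3", "bytes": 48000},
])
```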

 

Smart Bins [illustration]: allow users to

  1. set up highly customizable condition lists for the purpose of aggregating relevant clips,
  2. download clips via RSS feeds.
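A smart bin's condition list might be represented as a list of small tests that a clip must pass. In this sketch every condition must hold for a clip to be aggregated; that all-conditions rule, the helper names (has_tag, from_podcast, min_clipcount), and the clip dictionary shape are my own assumptions.

```python
def has_tag(tag):
    """Condition: the clip carries the given tag."""
    return lambda clip: tag in clip["tags"]


def from_podcast(name):
    """Condition: the clip was cut from the named podcast."""
    return lambda clip: clip["podcast"] == name


def min_clipcount(n):
    """Condition: the clip's clipcount is at least n."""
    return lambda clip: clip["clipcount"] >= n


def smart_bin(clips, conditions):
    """Return the clips that satisfy every condition in the list."""
    return [c for c in clips if all(cond(c) for cond in conditions)]


# Hypothetical clip records, invented for illustration only.
sample = [
    {"id": "c1", "podcast": "Show A", "tags": {"music"}, "clipcount": 9},
    {"id": "c2", "podcast": "Show A", "tags": {"music"}, "clipcount": 2},
    {"id": "c3", "podcast": "Show B", "tags": {"music"}, "clipcount": 12},
]
popular_show_a_music = smart_bin(
    sample, [has_tag("music"), from_podcast("Show A"), min_clipcount(5)]
)
```

Because the conditions are ordinary functions, a user-facing editor could build such a list from form fields and re-run it whenever new clips arrive.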


Search Engine Algorithm

How will Podilicious determine how to order search results? It will use an as-yet-unspecified algorithm which will, for each clip relevant to the search, determine a "clipcount" score.

Podilicious may take into consideration the following:

  1. Podilicious may calculate the number of clip bins that contain the clips relevant to the search. Quite simply, a clip which is present in more bins will be given a higher clipcount score.
  2. Podilicious may also consider the value of the source or even the podcast (i.e., the number of clips clipped from a given source, or from sources belonging to a given podcast, over a period of time) in determining the score of a derivative clip, and / or the historical association of the source or podcast with metadata relevant to the search.
  3. Podilicious may look for statistically significant groups of overlapping clips, associated with the same metadata, that are clustered in regions of a source. Having recognized such a cluster, it would calculate the distribution of the clips in terms of frequency, mean, median, and standard deviation, and determine which clip in the cluster is the most "moderate," that is, the best representative of all the others. It would aim to assign a unique value to each clip in a given cluster. It is important to stress that even in this scenario Podilicious would return to the searcher real clips made by real human clippers, not clips produced by the algorithm. It is an underlying premise of such a system that, while computers can help to sort out the collective intelligence of tens, hundreds, or thousands of clippers, the finer distinctions involved in clipping source material (deciding the precise in-points and out-points) are a properly human activity, and so felicitous search results would need to be discerned from among the clips offered by users.
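Since the algorithm itself is unspecified, the following is only one possible reading of the first and third considerations. It counts the bins that contain each clip, and it picks the "most moderate" clip of a cluster as the real clip whose in-point and out-point lie closest to the cluster's medians. The function names, the data shapes, and the distance measure are all my own assumptions.

```python
from collections import Counter
from statistics import median


def clipcounts(bins):
    """Count, for each clip ID, the number of bins that contain it.

    bins maps a user name to the collection of clip IDs in her bin.
    """
    counts = Counter()
    for clip_ids in bins.values():
        counts.update(set(clip_ids))  # a bin counts a given clip only once
    return counts


def most_moderate(cluster):
    """Pick the user-made clip closest to the cluster's median in/out points.

    cluster is a list of (clip_id, start, end) tuples. The result is always
    one of the clips in the cluster, never a synthesized one.
    """
    mid_start = median(start for _, start, _ in cluster)
    mid_end = median(end for _, _, end in cluster)
    return min(
        cluster,
        key=lambda c: abs(c[1] - mid_start) + abs(c[2] - mid_end),
    )


# Hypothetical data, invented for illustration only.
scores = clipcounts({
    "alice": {"c1", "c2"},
    "bob": {"c1"},
    "carol": {"c1", "c3"},
})
representative = most_moderate([
    ("a", 100, 160),
    ("b", 104, 165),
    ("c", 90, 200),
    ("d", 105, 163),
])
```

Choosing the nearest existing clip, rather than constructing a new clip from the median points, keeps faith with the premise that the search engine returns only clips made by human clippers.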


Identification

Sources

  1. MP3 file metadata
  2. RSS XML metadata.

Clips

  1. Source metadata
    1. MP3 file metadata
    2. RSS XML metadata
  2. Owner assigned metadata
  3. Borrower assigned metadata.


Users

  1. Unique account ID with password
  2. Access to her own (proprietary)
    1. Source Splitters
    2. Clip Bins
    3. Smart Bins.
 

 


Last Updated: January 15, 2005