Tool: DEVONThink


A program to organize your PDF files.

If you're anything like me, you probably have a few thousand PDF files living somewhere on your computer (hopefully reasonably well organized). This is very convenient, and it is a revolution in time savings compared to the old cycle of looking up a paper, going to the library to find it, photocopying it, and filing it somewhere (in physical space). Since I like reading papers on the computer screen, I almost never actually print any papers. In fact, I wrote my whole PhD thesis without printing more than a dozen papers. (I realize not everyone has the same approach.)

So here's the problem. What do you do to actually manage all this information? Due to the uncoordinated way in which electronic journals developed, no one ever bothered to include any kind of "metadata" in the files to help us index them at a later date (when we have thousands of them on our disks). Fortunately, I (and no doubt many others) realized that some sort of logical naming system could help a little without being too annoying. What I did was to use a scheme like this:

nameabbrev-JournalAbbrev-vol-year-page.pdf

Then I would file these away in the folder for that author. Now, you might ask, "yeah, but which author?" And you'd have a very good point. I usually went with the big shot (she/he is more constant). What if someone who became a "big shot" later was once in a lab of another big shot? What if two big shots are collaborating? Should I copy papers into more than one folder?
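For concreteness, here is a minimal Python sketch of the naming scheme above. The function names, the `PaperName` type, and the example values are all hypothetical, and the sketch assumes that none of the five fields contains a hyphen and that the year has four digits.

```python
import re
from typing import NamedTuple, Optional


class PaperName(NamedTuple):
    """The five fields of nameabbrev-JournalAbbrev-vol-year-page.pdf."""
    author: str
    journal: str
    volume: str
    year: int
    page: str


# Assumption: fields never contain a hyphen, and the year is four digits.
_PATTERN = re.compile(
    r"^(?P<author>[^-]+)-(?P<journal>[^-]+)-(?P<volume>[^-]+)"
    r"-(?P<year>\d{4})-(?P<page>[^-]+)\.pdf$"
)


def build_name(author: str, journal: str, volume, year: int, page) -> str:
    """Assemble a file name following the scheme."""
    return f"{author}-{journal}-{volume}-{year}-{page}.pdf"


def parse_name(filename: str) -> Optional[PaperName]:
    """Split a file name back into its fields, or return None if it doesn't fit."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return PaperName(
        match["author"],
        match["journal"],
        match["volume"],
        int(match["year"]),
        match["page"],
    )
```

Note that the scheme only has room for one name, which is exactly why the "which author?" question comes up.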

This ridiculous system actually worked pretty well while I was a graduate student and I was deeply focused on one very particular subject. Now, I always tried to read stuff outside of my immediate hyper-specialty, but still, at the time, filing things by name worked pretty well.

Things got out of control as I went along in my postdoc. I started to branch way out, and I was reading not just about my particular area of physical chemistry, but also about proteins and more biochemistry. So I started a separate filing system in parallel with the name-based one, only this one was subject-based. I was less interested in assembling the "collected works" of a particular big shot than I was in learning about new subjects. What I really needed was a program to help sort this mess out.

I am sure there are many programs that are up to the task of organizing and viewing PDF and other format files, but I'll briefly describe one in particular, and you can think of this quasi-review as motivation for you to deal with the mounting information overload problem (you must have one too, no?). This program is only available on the Mac. Sorry. If there is something similar for Windows, good for you. We've been hearing this big lie for years about no software for the Mac; it wasn't true in the past, and it's not true now. Maybe some clever Windows programmer can copy this program.

DEVONThink (DT) can be pretty complicated, but I'll just describe it at the level where I'm using it right now. DT can make a database of your files much the same way that, say, iPhoto or iTunes does, but it does not require (or even encourage) you to copy the files to a special file structure. In fact, your files can stay put right where they are on your disk. Naturally, this means you're in a bit of a hole if you move or delete them on your computer. You can import them simply by dragging individual files, groups of files, or whole folders of files right into the DT window (what else would you expect?). It will respect your folders and treat them as "Groups" in DT (similar to playlists or folders in iTunes or iPhoto). You can have nested Groups (this was critical for me).

As your files are being imported, DT is actually indexing their contents as well. This is where the power really is. Of course, you can click on a file, and it will be displayed right away (no waiting for Preview to load the file), and it's really speedy about it. In addition to this basic functionality, which is almost enough for me already, there are four features that I really like, and I will describe them below.

• Find across all files
• Classify
• See Also
• Wiki

You can type whatever you want in the ubiquitous find box (similar to the Google box in Safari or other browsers, or the one in the Finder). DT will quickly give you a list of all the files matching the criteria, with a kind of ranking. This works very well and is very fast.
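I don't know how DT does this internally, but the general idea of indexing contents on import and then ranking matches can be sketched in a few lines of Python. Everything here (the function names, the in-memory dictionary of texts, and ranking by raw word count) is an assumption made for illustration, not a description of DT.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to a Counter of {file name: occurrences}.

    `documents` is a dict of file name -> already-extracted text.
    """
    index = defaultdict(Counter)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word][name] += 1
    return index


def search(index, query):
    """Return files containing every query word, highest word count first."""
    words = tokenize(query)
    if not words:
        return []
    # Keep only the files that contain all of the query words.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))
    scores = {name: sum(index[w][name] for w in words) for name in candidates}
    # Sort by descending score, then by name so ties come out in a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))
```

Because the index is built once at import time, each search is only a few dictionary lookups, which is one plausible reason this kind of find box feels instant.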

The "Classify" feature is a great surprise. When you are viewing a file, you can click the Classify button and it will show you a list of Groups that it thinks the file might belong in. It typically shows about ten groups. Not all of them are appropriate, but I think as the file collection grows, it will become more accurate. There is also an autoclassify option, but I don't trust it yet. Maybe I'll post an update about this later.

The "See Also" button is also a pleasant surprise. Here you will be presented with not a list of groups, but one of other files that DT thinks are relevant. Both of these features should allow you to download zillions of papers, and loosely file them away and then use DT to help you make connections between the different papers. Since finding connections between information from diverse subjects is the whole point of The Plexus, this program is certainly going to be one of our favorites.
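To give a feel for how a program could guess which files are related, here is a toy Python sketch that compares word counts with cosine similarity. This is a common textbook technique that I am swapping in for illustration; I have no idea whether DT does anything like it, and the function names and sample texts are made up.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count the simple alphanumeric words in a text, ignoring case."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def see_also(target, documents, top=3):
    """Return up to `top` other files that share vocabulary with `target`."""
    vectors = {name: word_counts(text) for name, text in documents.items()}
    scored = [
        (cosine(vectors[target], vectors[name]), name)
        for name in vectors
        if name != target
    ]
    # Drop files with nothing in common, then put the most similar first.
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top]]
```

A Classify-style suggestion could, in principle, run the same comparison against the combined text of each Group instead of against individual files.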

The last feature is one that has been sort of sneaking up on me lately in several different contexts. This is the Wiki paradigm. Now, like you, I quite often find myself at Wikipedia (usually through Google, but sometimes I go right there first). A wiki is a user-modifiable documentation system that is very popular with open source projects, but also in school class projects. I've tried three different systems now: the original MediaWiki, the wiki features in the online learning package called Moodle, and finally DT. DT's implementation is very simple.

To make a wiki in DT, you start with a blank RTF file and just type in it. After you write something (it can be just notes or whatever), you highlight an important word or phrase. You right-click (or control-click) and choose "Make Link". Now it becomes a link. If your selected text is exactly the same as the name of a group you have in your database, clicking on the (now created) link will immediately transport you to that group. If you don't have anything in your database to link to, DT will create a new RTF file when you click on that link. Now you start typing in that file, and so on, and so on. Eventually you'll have a highly linked set of notes.
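The link behavior just described boils down to a very small rule, sketched here in Python. The names `follow_link`, `groups`, and `notes` are hypothetical stand-ins for whatever DT keeps internally; the sketch only mirrors the behavior as I see it from the outside.

```python
def follow_link(link_text, groups, notes):
    """Decide where clicking a wiki link leads.

    groups: set of existing group names
    notes:  dict of note name -> note text (changed in place)
    Returns a (kind, name) pair, where kind is "group" or "note".
    """
    # An exact match with a group name jumps straight to that group.
    if link_text in groups:
        return ("group", link_text)
    # Otherwise the link leads to a note, created empty on the first click.
    if link_text not in notes:
        notes[link_text] = ""
    return ("note", link_text)
```

For example, with `groups = {"Proteins"}` and an empty `notes`, following "Proteins" lands on the group, while following "2D IR" creates a blank note called "2D IR" that you can start typing in.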

This has been a very long post, but I hope that by explaining my information overload problem and pointing out one approach to solving it, I have helped you take heart that there is a path out of this mayhem-rich jungle. DT is not perfect, and things get a little funky once you realize that there is an internal filing system that DT will copy some files into (if you drag them directly in from, say, Preview). But really, it's no big deal. If you're lucky enough to use a Mac, you can download DEVONThink and try it for 15 days (of course, I'm not affiliated with them in any way; I'm just a customer). It's not that expensive, and you'll save so much time. Now we can just start downloading files like crazy and make all kinds of new connections that we (or maybe anyone) never thought of!

(I would also like to point out that it is possible to attach files to bibliographic entries in EndNote and Bookends. I used to do this, and it's too cumbersome. Nirvana would be automatically linking the PDFs in DT to either of these two reference managers. We can dream, can't we?)

Posted: Fri - March 11, 2005 at 09:39 PM

