Organizing Electronic Documents GTD-Style?
Sunday, September 25, 2005 at 9:47PM
Over at Lifehack.Community user anithri asks "How do I organize a large and growing collection of Electronic documents?":
I have a collection of 200+ PDF's, Word docs, text files... It's easy to find one if I know the name of what I'm looking for already, but opening a large number of them looking for what I'm currently interested in is getting very old very quickly.
What I'd ideally like is an application that allows me to "tag" my files ala del.icio.us or flickr.com and then allow me to pull up lists of all files with a particular tag.
This is a big problem that's near and dear to my heart, and one that hasn't been adequately addressed yet. It's a huge topic (the British Computer Society recently called "Memories for life" one of the Grand Challenges in Computing), but I wanted to briefly: a) observe that current techniques are missing the point (relationships), and b) ask whether a GTD-style A-Z reference system can apply to the digital realm.
Current Filing Techniques Aren't Relational
The two suggestions given in response to the Lifehack article ("use Spotlight as the tagging system", and "look into Google Desktop") are based on an IR-style index-and-search approach, also discussed in The Death of Folders? and The File Manager Is Dead. Long Live the Lifeblog. However, I think these approaches are missing one of the fundamental concepts about our information: It is connected. Among other things, documents relate to:
- people (e.g., about them (incl. photos), received from them, or sent to them),
- events (e.g., prepared for, or received during), or
- projects (e.g., supporting information or output artifact).
Related projects and reading:
- AutoFocus
- Stuff I've Seen
- Using Properties for Uniform Interaction in the Presto Document System
- 640KB ought to be enough for anyone
- Keeping Found Things Found
- MyLifeBits Project
- Offloading Your Memories
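To make the "documents are connected" point a little more concrete, here is a minimal Python sketch of what explicit, typed connections between a document and a person, event, or project might look like. Everything in it is an assumption for illustration: the `Link` and `DocumentNetwork` names, the file names, and the link kinds are hypothetical, and only the project name `nsf-site-visit-2005` comes from this post. It is not a description of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A typed connection from a document to a person, event, or project."""
    document: str   # hypothetical file name
    target: str     # the person, event, or project it relates to
    kind: str       # e.g. "supporting-information", "prepared-for"
    note: str = ""  # an attribute stored on the connection itself


class DocumentNetwork:
    """Hypothetical in-memory store of document relationships."""

    def __init__(self):
        self.links = []

    def connect(self, document, target, kind, note=""):
        self.links.append(Link(document, target, kind, note))

    def related_to(self, target, kind=None):
        """Documents linked to `target`, optionally via one kind of link only."""
        return sorted({
            link.document
            for link in self.links
            if link.target == target and (kind is None or link.kind == kind)
        })


net = DocumentNetwork()
net.connect("budget.xls", "nsf-site-visit-2005", "supporting-information")
net.connect("slides.ppt", "nsf-site-visit-2005", "output-artifact", note="final version")
net.connect("slides.ppt", "site-visit-meeting", "prepared-for")

print(net.related_to("nsf-site-visit-2005"))
print(net.related_to("nsf-site-visit-2005", kind="output-artifact"))
```

The difference from an index-and-search tool is that the relationship itself carries information (its kind and a note), so a question like "which files were outputs of this project?" can be answered without guessing at keywords.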
A simple alpha filing system for electronic documents?
The other idea this question stimulated is applying David Allen's GTD filing system to the digital realm. I'm currently testing this for email, and it has worked pretty well so far. Briefly, in addition to @action and @waiting-for, I have a top-level email directory for each letter of the alphabet, each of which contains one email archive file (an mbox file on my Unix machine) per project (e.g., n/nsf-site-visit-2005, p/personal-information-web). Finally, each of those archive files contains the relevant messages. Here's the conceptual map (vertical dimension is 'containment', with the outer-most container at the top):
paper | email |
---|---|
filing cabinet | email system |
A-Z divider | a-z top-level directory |
file folder | mbox email file |
piece of paper | email message |
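The mapping in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `alpha_path` function and the `mail` root directory are hypothetical names, and the rule of lowercasing and hyphenating the project name is a guess based on the examples above (n/nsf-site-visit-2005, p/personal-information-web).

```python
import string
from pathlib import PurePosixPath


def alpha_path(project_name, root="mail"):
    """Return root/<first letter>/<project name> for an A-Z filing scheme."""
    # Assumed naming convention: lowercase, words joined by hyphens.
    name = "-".join(project_name.strip().lower().split())
    if not name or name[0] not in string.ascii_lowercase:
        raise ValueError(f"project name must start with a letter: {project_name!r}")
    return PurePosixPath(root) / name[0] / name


print(alpha_path("nsf-site-visit-2005"))
print(alpha_path("Personal Information Web"))
```

The same function would work for documents instead of email by pointing `root` at a documents directory, which is the open question of this post.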
This works OK - filing is pretty fast, for the same reasons as with the analog GTD version: it's quick to dream up a name, there are only a few places I might have put it, etc. However, due to the email client I use (pine), textual search is pretty difficult. (Side note: When will someone write a simple Lucene Java app to index mbox files? JavaMail has been around forever!) What I'd like to know is how well an analogous system would apply to documents. Maybe I'll give it a try, at least for new ones. However, compared to most people my electronic document needs are pretty basic - I seem to rely mostly on email, printed documents, Manila folders, and letter-size paper. (Yes, it's about as low tech as possible.)
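On the side note about searching mbox files: the post asks for a Lucene/JavaMail indexer, which is not what follows. As a much simpler stand-in, here is a rough Python sketch that does a plain substring scan over the a-z directories using the standard library's `mailbox` module. The `search_mboxes` and `_plain_text` names and the directory layout (`<root>/<letter>/<project>`) are assumptions for illustration, and a linear scan like this builds no index, so it would be slow on a large archive.

```python
import mailbox
from pathlib import Path


def _plain_text(message):
    """Concatenate the decoded text/plain parts of a message."""
    parts = []
    for part in message.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset name in the message
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def search_mboxes(root, term):
    """Yield (mbox path, subject) for messages whose subject or body contains term."""
    term = term.lower()
    # Assumed layout: one directory per letter, one mbox file per project.
    for path in sorted(Path(root).glob("[a-z]/*")):
        if not path.is_file():
            continue
        box = mailbox.mbox(str(path), create=False)
        try:
            for message in box:
                subject = str(message.get("Subject", ""))
                if term in subject.lower() or term in _plain_text(message).lower():
                    yield path, subject
        finally:
            box.close()
```

A call such as `list(search_mboxes("mail", "budget"))` would return the matching project files and subjects, which at least narrows down which mbox to open in the mail client.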
As always, comments are welcome.
Reader Comments (25)
I've struggled with the same problem. I have so many documents and email. Plus, it's never easy to predict what or when I'll need something. For those times when I'll want to browse through documents and for ease in cleaning up old stuff, I use David Allen's simple alpha filing system. To get through so many documents quickly when the need arises, I rely on MSN Desktop Search. It's almost as good as tagging the content.
Thanks for the information about your usage, GadgetComa. It would be great to get some detail about how you adapted the alpha system.
my hobby-project
http://www.livejournal.com/users/cactusinside/40422.html
http://marksearch.narod.ru/index.html
sorry, only Russian descriptions
The screen shots look really interesting, Обзоры софта. Thanks for the pointer. You might be interested in this post, which talks about meta-data and photos:
[ Photo Blogs, Wikis, and Memories for Life | http://www.matthewcornell.org/blog/2005/04/photo-blogs-wikis-and-memories-for.html ]
I've been wondering whether GTD's alpha system would work in digital form, but I was a bit afraid to make the switch. I'm glad to know it worked for you, and I'll be doing it, starting only with my personal email. If the results are good, I'll go for my work email also.
I'd love to hear how your experiment goes, Ricardo.
Unless I am mistaken, Microsoft's new Vista software will allow you to view your files in a tagging-like environment.
http://www.microsoft.com/windowsvista/clear.mspx
An example of the virtual folders is located at http://www.microsoft.com/presspass/presskits/windowsvista/images/image002.jpg
While it may not be perfect, it does appear to be a start.
On a related note, thank you for reminding me that MSN Desktop Search existed, GadgetComa. I'll check it out.
Thanks for the screen shot and link, Joseph - very interesting. I've had WinFS on my mind for a while, and it's nice to see things finally moving ahead. And I'm pleased if I was able to help on GadgetComa.
matt
I just got back to my own comment and saw your question about how I use the alpha filing system in conjunction with MSN Desktop Search. Basically, I just file my documents the same way David Allen suggests filing paper documents. I use the term that means the most and create a folder for it. The desktop search tool lets me find anything I want with ease, so there's really no need to file simply to allow easy retrieval. What the filing does give me is the option to browse the content either to purge old files or to refer to for ideas.
Thanks for the detail, gadgetcoma.
matt
Haven't heard anyone mention Google Desktop yet. I have a document collection that's in the 4000's and an email archive of 3+ GB (the result of working over 5 years at the same company). Google Desktop has more than once allowed me to get that one specific email or document in less than a minute, rendering most file-system or folder-based ordering moot.
And it does index all kinds of files.
Get a mac
It includes Spotlight, searches whole computer
Thanks for the reminder about Spotlight, anonymous. I think search engine-based solutions like Spotlight are useful, but I'm starting to believe that, without explicit connections between information, they're limited for the kinds of uses I need.
There might be company policy or privacy or copyright issues with this, but you could always email all the documents to a Gmail account, with very descriptive subject lines, and create the tagging system of your dreams there. You can tag an email with multiple tags (like del.icio.us).
Thanks for the Google suggestion, Bookworm. That would solve the tagging, and maybe using tags consistently would be a form of linking. I would like explicit linking between concepts, so that I could place documents in a personal information network. This would allow finding information in novel (and hopefully useful) ways.
Love the blog, BTW...
Vista does support OS-wide tagging, see:
http://blogs.msdn.com/pix/archive/2006/06/15/632677.aspx
and from a press release:
"• Tagging Files. Windows Vista’s powerful new search and organization features extensively utilize file properties (metadata) to provide users with an even more dynamic way to interact with their information. Users can tag photos in the Windows Photo Gallery, music in Windows Media Player 11, and documents in the Documents Explorer; it’s simple and provides more flexibility in file organization."
Tagging is a very powerful way of organizing Matt, but you call it "impoverished". Why?
Hi Bob,
Tagging is a very powerful way of organizing Matt, but you call it "impoverished". Why?
It goes back to the (controversial?) idea that "it's the links, silly." In other words, like Google's founders realized, the interrelationships between data items are crucial to representing the kinds of real-world data people generate.
Tags are a kind of weak linking - I can "link" two data items (say two files) by tagging them the same, e.g., 'UMass.proposal.2006-10' (a project). But I can't add attributes to the actual connection between the files, right? Therefore I lose information...
I can say more, but not right now! Maybe it would be best to chat. Thanks for reading, and for your comment, Bob.
Adding to all that's been said, I don't think a-z directories do any good for electronic documents, nor do I use individual folders for documents when I can just rename them as I want (if needed, I keep the original name in parentheses).
I'm now experimenting with a desktop wiki (moinmoin desktop) to help me organize my documents. Moinmoin makes it easy to attach documents to a page (you just have to upload them to the "pagename"\attachments directory), and allows you to comment on each "folder" (really a wiki page). Then you can categorize, tag, or even link these pages. The other advantages of a wiki are that it's platform-independent and even scalable to the web, if you need to share your files. It seems promising but, as with any file organization system, only time and volume will prove it effective.
Hi pgoes,
I don't think a-z directories do any good for electronic documents
Love to hear more about it.
neither I use individual folders for documents when I can just rename them as I want
Are you saying you keep one big "flat" directory? A fine approach on Unix, but Windows suffers...
experimenting with [ moinmoin desktop | http://moinmoin.wikiwikiweb.de/DesktopEdition ] to help organize documents. comment each "folder", categorize, tag, or even link these pages
Neat! I'd love to hear how it works out.
Thanks for the comment.
There's also a great software tool, for Mac, called Together that I use for grouping files in various ways. It's like the iTunes for file organization. It can be found from Reinvented Software: http://reinventedsoftware.com/together/.
Thanks for the tip, Spencer. I have a Mac-based client right now who might be interested. I wonder if there's an equivalent for Mail.app?
I have an a-z file structure for my digital documents and have been using it for about a year now. For starters, I am a Mac user and I use Midnight Inbox as my GTD software app. I use Mail.app for email and I only have action, hold, and archive folders. In my documents folder I created an a-z file structure with Quicksilver triggers to the folders I use the most. Documents are filed in the a-z folders based not on the document name but on the project name. After a year of doing this, I'd say I like it, but some of my recurring project folders are quite big and specific files get harder to find.
Hi Clayton - Thanks for the story. I'm guessing the 'hold' email folder is for Waiting For items? And I've experienced the flat folder growth problem too, at least the "too many to find" part. Having a single huge project folder... I'd suggest breaking the folder down into subfolders, e.g., budget, travel, slides, ...
You would probably get a lot out of Mark Hurst's book "Bit Literacy" (I interviewed him [ here | http://www.matthewcornell.org/blog/2008/01/conversation-with-mark-hurst-web.html ] ). He covers document structuring schemes.
Thanks for the comment!