Sunday, November 7, 2010

Emacs, the Wiki and the Idiot.

Sir C.A.R. (Tony) Hoare once wrote:
Inside every large problem, is a small problem—struggling to get out.
Someone else, stealing from H. L. Mencken, changed it to:
Inside every large problem, is a small problem—struggling to get out. The solution to that problem is simple, elegant and wrong.
I tend to agree with the Thief. The simple reality is that many problems don't have elegant solutions, try as we may to find them. I ran into this when I tried to come up with a simple personal wiki. Enjoy my exploits.

Dale Writes a Wiki

I think the wiki is one of the great software concepts of the last decade or so. That's not hyperbole. I really think that making it easy to link documents together, in a casual and non-invasive way, is a brilliant and under-appreciated concept.

Most wikis are web based. This makes sense if you want the world to have access to the pages. I was looking more for a personal solution.

There are personal wikis that are standalone applications, but most of them use a special format for your data. This strikes me as, well, stupid. Why would I lock up my data in a special format when a directory of text files should easily get the job done?

I do a lot of my text editing in Emacs, so I started looking around for a wiki based in Emacs. There are a few, but most of them are either unsupported, too elaborate or too invasive. I wanted a way to link text from any kind of file, not just wiki files. I wanted to be able to link comments in source code to documentation files to cake recipes if that's what works best. I also wanted it to leave the rest of my Emacs environment alone. I didn't want to enter Wiki-Mode just to put links in a Perl program. I wanted to stay in Perl mode.

The joy of being a programmer is, if you can't find it, you can always write it. That's what I started out to do.

At first I decided to play with [[bracket style links]]. Emacs Org Mode uses them and I like much about Org Mode, so I tried to emulate its style.

The attempts were not too successful. I pulled out some of Org Mode's handler code and started playing with it. It's pretty invasive and doesn't work well with old versions of Emacs. I use an old version of Emacs at work, and I wanted something small and easy to understand.

My next idea was to write my own bracket style linking library. At its heart it would grab the text between [[]], convert it into a file name and then open the resulting file name. 194 lines of Elisp later, I had a working minor mode. It could open files, web pages, internal links and even run commands. I was very happy with it. It was reasonably small and reasonably easy to understand and reasonably noninvasive.

Because I'm a professional, I started to document my results. The problem is, I'm a little bit insane. Sometimes, when I'm working on a project, a little idiot voice calls from the fog of experience and tells me what I'm doing is wrong. The Idiot never tells me what's right, it just picks at the back of my head until I accept, eventually, that my perfectly working code is "wrong". Arg!

The more I documented, the louder the Idiot got. I got so frustrated that I put the code aside. I'm a big believer in documentation by Idiot. You take your code and ignore it for a few weeks or even months. Then you come back and try to read it. Every time you look at a bit of code and say "What idiot wrote this?" you either refactor or document. Maybe the Idiot could figure out what I couldn't.

The Idiot was insidious. It would go away for a week or so and then come out of the shadows to jeer hints. It asked questions that I should have asked. Do I need brackets? Do I need to open web pages and run commands from links? Are my needs the needs of others? What's more important, abstract power or agility? What is the Dao of the problem? Why won't you see it!?

This went on for over a month. Part of me was trying to find the right solution. Part of me wanted to finish what I had and get on with my life.

Then, on a Sunday night, as a long work day loomed ahead, I was lying in my bed with my beloved wife and Charlie the metal-eating dog. From nowhere, it came to me. All I really want is a way to take the word under the cursor and open a file with the same name. Then it came to me. What I really want is a way to grab an arbitrary blob of text under the cursor and then open up a file based on the text. Then it came to me. What I really want is a way to grab an arbitrary blob of text under the cursor and do something with it. The Idiot had spoken!

It's a cliche, but I really wanted to leap out of bed and start hacking away. Alas, the days of 12 hour hackfests followed by 10 hour workdays are a thing of the past. I had to go to sleep or I would die at my desk. I had to go to work or I would be thrown into the street. I had to get this program written or I would crack up.

After work I lit into my task. One of the greatest pleasures in programming is simplifying. The more I wrote, the smaller the program became. Irrelevance fell like rain. I was beginning to understand. By the end, the entire program was 15 lines long. 24 lines if you include the code to hook it up to the key of your choice.

It was simple, elegant and worked like a charm. The Thief was wrong. The Idiot stopped jeering.

Hooking Up the One Function Wiki

The simple solution is to take the text at the end of this article and append it to your .emacs.el file.

To test it, fire up Emacs and type "This is CamelCase text." Move your cursor onto the link text "CamelCase" and type C-c C-o. It should open the file "CamelCase".

If you want to use link styles other than CamelCase, or want to open files in a different way, read the function's documentation string. Doing things like opening files in a specific directory or opening "CamelCase.txt" is trivial to implement. You just need to do a little Elisp programming and away you go!
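As a rough illustration of the "specific directory" case, here is a minimal sketch. It assumes the link-to function from the listing at the end of this article is already loaded. The command name my-wiki-open and the ~/Wiki directory are made up for the example, so adjust them to taste.

```elisp
;; Sketch only: assumes `link-to' (from the listing at the end of this
;; article) is already loaded.  `my-wiki-open' and the ~/Wiki directory
;; are illustrative assumptions, not part of the original code.
(defun my-wiki-open ()
  "Open the CamelCase word under the cursor as a .txt file in ~/Wiki."
  (interactive)
  (link-to "\\<[A-Z][a-z]+\\([A-Z][a-z]+\\)+\\>" "\\<"
           (lambda (name)
             ;; Build ~/Wiki/NAME.txt and visit it.
             (find-file (expand-file-name (concat name ".txt") "~/Wiki/")))))

;; Bind it globally; `local-set-key' in a mode hook works the same way.
(global-set-key (kbd "C-c C-o") 'my-wiki-open)
```

With this in place, typing C-c C-o on "CamelCase" should visit ~/Wiki/CamelCase.txt instead of a file in the current directory.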

The Text to Append.

(defun link-to (link-re link-start-re &optional handle-link)
  "Grab the \"link\" under the cursor and open a file based on that link.

LINK-RE is a regular expression which matches the link text.
LINK-START-RE is a regular expression which matches the beginning, or text
just before the link.
HANDLE-LINK is an optional function which takes the link text and opens the
link.  If not provided, `find-file' is called with the link text.

This function is usually called via local-set-key with mode specific
regular expressions.

This example will grab a CamelCase link and open a file with the same
name.

  (global-set-key (kbd \"C-c C-o\")
    (lambda ()
      (interactive)
      (link-to
        \"<[A-Z][a-z]+\\\\([A-Z][a-z]+\\\\)+>\" \"<\")))

Note: The \"<>\"s above should have \"\\\" in front of them, but emacs
thinks I want to print a key map when I try to include them in the help
text.

This example will grab an alphabetic string and do a google query on it.

  (global-set-key (kbd \"C-c C-o\")
    (lambda ()
      (interactive)
      (link-to
        \"[a-z]+\" \"[^a-z]\"
        (lambda (x)
          (browse-url
            (concat \"https://www.google.com/search?q=\" x))))))"
  (let ((here (point)) link-text
        (case-fold-search nil))
    ;; Search without moving the user's cursor.
    (save-excursion
      ;; Back up to the start of the candidate link, or to the top of
      ;; the buffer if nothing matches.
      (or (re-search-backward link-start-re nil t)
          (goto-char (point-min)))
      (unless (re-search-forward link-re nil t)
        (error "No tag found to end of file"))
      (setq link-text (match-string-no-properties 0))
      ;; Reject a match that ends before, or starts after, the cursor.
      (if (or (< (point) here)
              (> (- (point) (length link-text)) here))
          (error "No tag found under cursor")))
    (if handle-link
        (funcall handle-link link-text)
      (find-file link-text))))

(global-set-key
 (kbd "C-c C-o")
 (lambda ()
   (interactive)
   (link-to
    "\\<[A-Z][a-z]+\\([A-Z][a-z]+\\)+\\>" "\\<"
    ;; This regexp pair matches file names.
    ;;"[a-zA-Z0-9/~_][a-zA-Z0-9/._]*[a-zA-Z0-9]" "[^a-zA-Z0-9/.~_]"
    ;; This will open text files in your ~/Wiki directory.
    ;;(lambda (x) (find-file (concat "~/Wiki/" x ".txt")))
    )))


Atle Iversen said...

Nice solution :-) !

Of course, later on you want to get rid of the CamelCase, then you want to automatically identify which words are linked to files, then you want *multiple* words to link to files...

And 5 years later you have a brilliant personal wiki with automatic linking of multiple words which saves in a simplified markdown format :-)

I would really like your feedback on our product, PpcSoft iKnow for Windows:


Dale Wiles said...

CamelCase keeps things simple. Very few editors will try to word wrap the middle of a CamelCase word. I originally used [[bracket links]] so I could have multiple words, and they can easily be implemented. You just have to handle spaces and characters that offend the underlying operating system.
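For the curious, a bracket-link handler along those lines might look roughly like the sketch below. It assumes link-to from the article is loaded. The name my-bracket-open and the choice to replace every unfriendly character with an underscore are assumptions made for the example, not what the old 194-line version did.

```elisp
;; Sketch only: assumes `link-to' from the article is already loaded.
;; `my-bracket-open' and the "_" substitution rule are illustrative
;; assumptions.
(defun my-bracket-open ()
  "Open the [[bracket link]] under the cursor as a file."
  (interactive)
  (link-to "\\[\\[[^]\n]+\\]\\]" "\\[\\["
           (lambda (link)
             ;; LINK still has its brackets, e.g. "[[my page]]".
             ;; Strip them, then turn spaces and other characters that
             ;; file systems tend to dislike into underscores.
             (let ((name (substring link 2 -2)))
               (find-file
                (replace-regexp-in-string "[^A-Za-z0-9._-]" "_" name))))))
```

Under these assumptions, invoking the command with the cursor inside [[my page]] would visit a file called my_page.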

"Brilliant personal wikis" have their place, but this little function allows you to link things like source code in your text editor with no additional overhead.

I checked out your program and it looks pretty slick. Automated tagging is a potentially very interesting field.

One kibitz (I'm a programmer after all): If I'm understanding your video correctly, you use incremental searching as a way to find things. Traditional incremental searches stumble when you can't figure out the first letter of what you're searching for. "Cat" vs "KittyCat" for example. You may want to try "letter distance" searching. It would include all the words that have "C" somewhere, followed eventually by "a" and later a "t". Words that start with "C" would be listed before words that have "C" as the second letter and so on. If 2 words have "C" in the same position, then it would sub-order them by the distance from the "C" to the "a" (and then "t").

I've had good results with it.
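The "letter distance" ordering described above could be sketched as follows. This is only one reading of the description: it matches each query letter at its leftmost possible position, ignores case, and sorts on the position of the first letter and then on each successive gap. All of the function names are made up for the example.

```elisp
;; Sketch of "letter distance" search.  All names here are hypothetical.

(defun letter-distance-key (query word)
  "Return a sort key for WORD against QUERY, or nil if WORD does not match.
The key is a list of integers: the index in WORD of QUERY's first
letter, then the gap between each later letter and the one before it."
  (let ((case-fold-search t)            ; match letters regardless of case
        (pos 0) (prev 0) (key '()) (ok t))
    (dolist (ch (string-to-list query))
      (when ok
        (let ((found (string-match (regexp-quote (char-to-string ch))
                                   word pos)))
          (if (null found)
              (setq ok nil)             ; a letter is missing: no match
            (push (- found prev) key)
            (setq prev found
                  pos (1+ found))))))
    (and ok (nreverse key))))

(defun letter-distance-key-less-p (a b)
  "Return non-nil if key A sorts before key B, comparing element by element."
  (let ((result nil) (done nil))
    (while (and a b (not done))
      (cond ((< (car a) (car b)) (setq result t done t))
            ((> (car a) (car b)) (setq done t))
            (t (setq a (cdr a) b (cdr b)))))
    result))

(defun letter-distance-search (query words)
  "Return the members of WORDS that match QUERY, closest match first."
  (let (scored)
    (dolist (word words)
      (let ((key (letter-distance-key query word)))
        (when key (push (cons key word) scored))))
    (mapcar #'cdr
            (sort (nreverse scored)
                  (lambda (x y)
                    (letter-distance-key-less-p (car x) (car y)))))))

;; Example call:
;; (letter-distance-search "cat" '("KittyCat" "Concat" "Cat" "Chart" "Dog"))
```

In the example call, "Dog" drops out because it has no "c", and "KittyCat" lands after the words that start with "C", which is the behaviour the plain incremental search misses.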