Sunday, June 22, 2008

Threading in mailnews

I was cleaning up my WHATWG mailing list folder (a task which mostly involves looking at the subject of a message and deciding whether or not I care to keep that piece of correspondence) when I started thinking about how threading interacts with this list. If you haven't subscribed to this mailing list (and I doubt most readers have), the main WHATWG author (Hixie) often writes a single message which is a reply to several messages at once.

A brief aside, if I may: for completely unrelated reasons (responding to a new bug which turned out to be a dupe, stupid me for checking validity before duplication), I was perusing RFC 2822, specifically the In-Reply-To field (§ 3.6.4). Interestingly enough, the case of a message having multiple parents is quite well-defined in the spec (and Hixie violates the spec on this point). I did a brief check of the code on this point, and the code will handle the theoretically correct case fine (using In-Reply-To in lieu of References, which is not quite correct, but works for the purposes of threading).

Anyways, the thing that caught me the most was that I often cared more about Hixie's catch-all reply than the earlier message to which the reply had been attached. In essence, I wished for the ability to reroot threads. I thought a little more, and listed other threading enhancements I wanted. But there already is a mammoth chart of threaded view issues—see bug 236849 for a sublist of many of these.

At the core of threading, one can distinguish several levels of threading. The basest is none at all; this is represented by turning threaded view off. Second is relying on subject: one can only tell that two messages are related by this method, but not which is a reply to the other. Third is typical threading, relying on In-Reply-To and References, which works well. Fourth is what I like to think of as über-threading: parsing the message text to determine the quoted replies and using that to determine the parent of a message. Fifth, and highest, is the ability to redefine threading as the user sees fit. Note that most of these are orthogonal, so that one can have a combination of the inner three to determine a thread's parent.
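To make the layering concrete, here is a rough sketch of how a few of these levels might be combined when picking a parent. It is only an illustration, not what mailnews actually does: the `Message` class, `choose_parent`, and the `overrides` map are all names I made up for the example, and the fourth level (parsing quoted text) is left out entirely. A user override wins, then the headers, and the subject is used only as a "these are related" hint.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    # Hypothetical, minimal stand-in for a parsed message.
    msg_id: str
    subject: str
    in_reply_to: list = field(default_factory=list)
    references: list = field(default_factory=list)


_REPLY_PREFIX = re.compile(r"^\s*(re|fwd?)\s*:\s*", re.IGNORECASE)


def base_subject(subject):
    """Strip any run of leading Re:/Fw:/Fwd: prefixes and normalize case."""
    previous = None
    while subject != previous:
        previous = subject
        subject = _REPLY_PREFIX.sub("", subject)
    return subject.strip().lower()


def choose_parent(msg, known, overrides=None):
    """Return (message id or None, reason).

    known     -- dict of msg_id -> Message, in arrival order
    overrides -- dict of msg_id -> parent id (or None to make it a root),
                 standing in for user-defined threading
    """
    overrides = overrides or {}

    # Fifth level: the user has the final say, including rerooting.
    if msg.msg_id in overrides:
        return overrides[msg.msg_id], "user"

    # Third level: the last References entry is the nearest ancestor;
    # fall back to In-Reply-To if nothing in References is known.
    for candidate in list(reversed(msg.references)) + list(msg.in_reply_to):
        if candidate != msg.msg_id and candidate in known:
            return candidate, "headers"

    # Second level: the subject says two messages are related, but not
    # which one is the parent, so the id returned here is only a hint.
    key = base_subject(msg.subject)
    for other in known.values():
        if other.msg_id != msg.msg_id and base_subject(other.subject) == key:
            return other.msg_id, "subject"

    return None, "none"
```

Passing `overrides={"b": None}` for a message `b` is the rerooting case: whatever the headers say, `b` becomes the top of its own thread.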

The utility of redefining your own threading is hard to overstate. How many of you have received email where people blithely hit "Reply to All" and start a new message like that, but others in the same category legitimately use reply features? I myself have one thread like that comprising 20 different real threads. Other times you hit those cases where someone is on a borked client (*cough*Yahoo!*cough*) and someone changes the thread subject, or a confluence of mailing lists and forwarding and replies (four threads where one is warranted, again in my inbox).

There are touchier areas with respect to threading. For example, the notion of subthreads is powerful (there are RFEs to implement practically every "Apply xxx to whole thread" as also an "Apply xxx to subthread"), but it is a pain in the backend, not least because we have some other bugs inducing loops into the thread hierarchy there. Similarly, the question of what to do with multiple parenting (both how to represent it and how to generate it) can be touchy on the UX end. A final thorn I would like to specifically direct your attention to is the idea of dummy thread headers, as referred to in jwz's algorithm, the seminal work on the matter (ignore his anti-NS 4 rant, however; he lives in the glory days of NS 2).
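For those who haven't read it, the dummy header idea can be sketched in a few lines. This is a loose simplification in the spirit of jwz's algorithm, not a faithful port and not the mailnews code: `Container` and `build_threads` are names invented for the example, the input is assumed to be `(msg_id, references)` pairs with References listed oldest first, and an existing parent link is never replaced. A message id that is referenced but never seen gets an empty container, so its replies still hang together, and a link that would create a loop is simply refused.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare containers by identity, not by contents
class Container:
    msg_id: str
    has_message: bool = False  # False means this is a dummy header
    parent: Optional["Container"] = None
    children: list = field(default_factory=list)


def is_ancestor(maybe_ancestor, node):
    """True if maybe_ancestor is node itself or sits above it."""
    while node is not None:
        if node is maybe_ancestor:
            return True
        node = node.parent
    return False


def link(parent, child):
    """Attach child under parent unless it is already placed or would loop."""
    if child.parent is not None:
        return
    if is_ancestor(child, parent):
        return
    child.parent = parent
    parent.children.append(child)


def build_threads(messages):
    """messages: iterable of (msg_id, references), references oldest first."""
    table = {}

    def get(msg_id):
        if msg_id not in table:
            table[msg_id] = Container(msg_id)
        return table[msg_id]

    for msg_id, references in messages:
        node = get(msg_id)
        node.has_message = True
        chain = [get(ref) for ref in references if ref != msg_id]
        # Link each referenced id to the next, creating dummies as needed.
        for parent, child in zip(chain, chain[1:]):
            link(parent, child)
        if chain:
            link(chain[-1], node)

    # Roots are containers with no parent; some of them may be dummies.
    return [c for c in table.values() if c.parent is None]
```

Feeding it two replies to a message that never arrived yields a single root with `has_message` set to `False` and both replies as children, which is exactly the dummy thread header the UI then has to decide how to display.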

On the other hand, don't expect me to implement any of these improvements soon, nor anyone else for that matter. I merely wanted to express my opinions as Thunderbird drivers debate UI on a higher level, with a tendency that seems to be somewhat towards ignoring some of the finer aspects of good message threading. Ah well....

1 comment:

Unknown said...

*** This post has been marked as a duplicate of bug 36024 ***