[courier-users] Server side sorting

Discussion:

Michelle Konzack

2017-03-22 10:19:09 UTC

Hello *,

I am currently writing forum software that uses courier-mlm as its base.

The forum software has an email address that is subscribed to all
the lists, and an MDA sorts the mail into a structure like

INBOX.Forum_1/
INBOX.Forum_1.Thema_1/
INBOX.Forum_1.Thema_2/
INBOX.Forum_1.Thema_3/
INBOX.Forum_2/
INBOX.Forum_2.Thema_1/
INBOX.Forum_2.Thema_2/
INBOX.Forum_2.Thema_3/

and if you know about forums, this structure should be clear. Currently
I use THIS list for testing...

So every new subject is logically a new Thema_X.

Mostly it works, IF the mailing list users leave the "References:" header
and/or "Subject:" intact.

Now with thread sorting I have a nice problem.

In a view sorted by "Date:" there is no problem, because it is sorted by
the arrival time on my server, not the sender's date, since many people
do not care about the correct time on their computers.

But how can I do thread sorting?

The squirrelmail sourcecode is really weird and I do not understand how
it works.

Question: Can I use the "IMAP CAPABILITY" server-side sorting for this?

I access the IMAP account with php5 and php-imap.

Thanks in advance

--
Michelle Konzack Miila ITSystems @ TDnet
GNU/Linux Developer 00372-54541400

Sam Varshavchik

2017-03-22 10:35:16 UTC

Post by Michelle Konzack
But how can I do thread sorting?
The squirrelmail sourcecode is really weird and I do not understand how
it works.
Question: Can I use for this the "IMAP CAPABILITY" server side sorting?
I access the imap account with php5 and php-imap.

The IMAP command is THREAD REFERENCES. The response consists of message
numbers, with parentheses indicating various threads and subthreads:

a THREAD REFERENCES UTF-8 ALL
* THREAD (1)(2)(5 3 4)(6 7)(8)(9)(10 11 12)(13)(14)(15 16 17)(18)(19)(20 22)
(21)(23 (24)(25))(26)(27)(28)(29)(30)(31)(32)(33)((34)(37)(39))(35)(36)(38)
(40)(41)((42)(43))(44)(45 46)(47 48)(49 50 ((51)(52)))
a OK THREAD done.

The complete specification is somewhat of a big pill to swallow. See
https://tools.ietf.org/html/rfc5256
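
For a client that wants to consume this response, the parenthesized groups can be read into nested lists. The following is a minimal Python sketch, not part of Courier or of the original discussion. The function name parse_thread_response is made up for illustration, and the code only assumes that the payload consists of parentheses and message numbers, as in the example above.

```python
import re


def parse_thread_response(text):
    """Parse the payload of an untagged THREAD response into nested lists.

    Integers are message numbers; each nested list is one parenthesized
    group. Anything that is not a parenthesis or a number is ignored.
    """
    tokens = re.findall(r"\(|\)|\d+", text)
    root = []
    stack = [root]  # stack[-1] is the group currently being filled
    for tok in tokens:
        if tok == "(":
            group = []
            stack[-1].append(group)
            stack.append(group)
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')' in THREAD response")
            stack.pop()
        else:
            stack[-1].append(int(tok))
    if len(stack) != 1:
        raise ValueError("unbalanced '(' in THREAD response")
    return root


print(parse_thread_response("* THREAD (5 3 4)(23 (24)(25))"))
# prints [[5, 3, 4], [23, [24], [25]]]
```

With Python's standard imaplib, the payload would presumably come from IMAP4.thread('REFERENCES', 'UTF-8', 'ALL'), which returns bytes that need decoding before they are passed to a function like this one.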

Michelle Konzack

2017-03-22 12:05:59 UTC

Hello Sam,

Post by Sam Varshavchik
The IMAP command is THREAD REFERENCES.

OK, this is fine since I can issue ANY command.

Post by Sam Varshavchik
The response consists of
message numbers, with parentheses indicating various threads and subthreads:
a THREAD REFERENCES UTF-8 ALL
* THREAD (1)(2)(5 3 4)(6 7)(8)(9)(10 11 12)(13)(14)(15 16 17)(18)(19)
(20 22)(21)(23 (24)(25))(26)(27)(28)(29)(30)(31)(32)(33)((34)(37)(39))
(35)(36)(38)(40)(41)((42)(43))(44)(45 46)(47 48)(49 50 ((51)(52)))
a OK THREAD done.

Hmm, question:

If I understand it right, the

a) (1) means a single message

but what does

b) (5 3 4)

mean? This would look like the message 5 came before 3 and 4 and then

c) (23 (24)(25))
d) ((34)(37)(39))
e) (49 50 ((51)(52)))

which look very curious to me.

Post by Sam Varshavchik
The complete specification is a somewhat of a big pill to swallow.
See https://tools.ietf.org/html/rfc5256

It seems I have to suck it up!

--
Michelle Konzack Miila ITSystems @ TDnet
GNU/Linux Developer 00372-54541400

Sam Varshavchik

2017-03-22 15:55:09 UTC

Post by Michelle Konzack
Hello Sam,

Post by Sam Varshavchik
The IMAP command is THREAD REFERENCES.

OK, this is fine since I can issue ANY command.

If I understand it right, the
a) (1) means a single message
but what does
b) (5 3 4)
mean? This would look like the message 5 came before 3 and 4 and then

Message numbers are assigned to messages the first time they're seen. If the
IMAP client hasn't logged on for a while and the server is now seeing a bunch of
messages for the first time, the order in which the files get read from the
directory may not necessarily match the order in which they were delivered. So
the server may see a reply before the original message, and the REFERENCES
sort will rearrange them in chronological order.
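
As for how to read the groups themselves: per RFC 5256, within one parenthesized group each message is the parent of the message that follows it, and nested groups are sibling branches hanging off the last message before them. A group that starts with nested groups, such as ((34)(37)(39)), holds siblings whose common parent is not in the mailbox. Below is a small Python sketch of that reading. It assumes the response has already been turned into nested lists (integers for messages, lists for groups), and flatten_thread is a hypothetical helper name.

```python
def flatten_thread(group, parent=None, depth=0, rows=None):
    """Walk one parsed thread and collect (message, parent, depth) rows.

    A missing parent is not represented: its children are simply attached
    to the nearest existing ancestor (or to None at the top level).
    """
    if rows is None:
        rows = []
    for item in group:
        if isinstance(item, list):
            # A nested group is a branch below the current parent.
            flatten_thread(item, parent, depth, rows)
        else:
            rows.append((item, parent, depth))
            # The next message in this group is a reply to this one.
            parent, depth = item, depth + 1
    return rows


for msg, parent, depth in flatten_thread([23, [24], [25]]):
    print("  " * depth + str(msg), "(reply to %s)" % parent if parent else "")
```

For a forum view, each top-level group would then map to one topic, and the depth value to the indentation of a reply.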

Post by Michelle Konzack
c) (23 (24)(25))
d) ((34)(37)(39))
e) (49 50 ((51)(52)))
which look very curious to me.

Post by Sam Varshavchik
The complete specification is a somewhat of a big pill to swallow.
See https://tools.ietf.org/html/rfc5256

It seems I have to suck it up!

Yes.

This is one of the more... involved parts of IMAP. There's a lot of history
and legacy involved. It's my understanding that some of the original actors
have suffered health problems in the recent past, so I don't want to say
anything on that account.

But I'll say this. I believe that server-side sorting was a mistake. The
most sensible usage model for IMAP is for the client to sync and cache with
the server. An IMAP client should sort and thread messages using its cached
message metadata and not hassle the server with it.

A server is a shared resource. It never made any sense to me to offload as
much processing as possible to the server. It makes more sense for most of
the processing to be done on the client side, with the server's role limited
to feeding the raw data to the client. There are more clients than there are
servers. Clients, collectively, have more shared processing power. A CPU
currently busy sorting some knucklehead's ten-year mail archive can't do
anything else, for other clients. That never made any sense, but that's how
IMAP is overall designed, to push as much processing to the server.

And, of course, it's much easier for some hacked-together IMAP-over-web
client to send a single command and parse the response, than to do the job
by itself.
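
To illustrate the client-side alternative described above, here is a rough Python sketch of threading from cached header metadata. It is deliberately much simpler than the REFERENCES algorithm in RFC 5256 (no subject grouping, no placeholders for missing messages). The dictionary keys "id", "references" and "in_reply_to" are assumptions about how a client might cache the Message-ID:, References: and In-Reply-To: headers.

```python
def build_parent_map(messages):
    """Map each cached Message-ID to the Message-ID of its parent, or None.

    The last entry of References: is tried first, then earlier entries,
    then In-Reply-To:. Only IDs present in the cache count as parents.
    """
    known = {m["id"] for m in messages}
    parents = {}
    for m in messages:
        candidates = list(reversed(m.get("references") or []))
        if m.get("in_reply_to"):
            candidates.append(m["in_reply_to"])
        parents[m["id"]] = next(
            (c for c in candidates if c in known and c != m["id"]), None
        )
    return parents


cache = [
    {"id": "<a@example>"},
    {"id": "<b@example>", "references": ["<a@example>"]},
    {"id": "<c@example>", "references": ["<a@example>", "<lost@example>"]},
]
print(build_parent_map(cache))
```

In this example the third message falls back to <a@example> because its direct parent is not in the cache.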

Alessandro Vesely

2017-03-22 16:47:00 UTC

Post by Sam Varshavchik
A server is a shared resource. It never made any sense to me to offload as much
processing as possible to the server. It makes more sense for most of the
processing to be done on the client side, with the server's role limited to
feeding the raw data to the client.

Thunderbird client (apparently) features server-side searches. I never found a
check-box to try server-side threading. Anyway, IME, client-side searches
became reliable after the client stuck with maintaining an indexed database
of messages.

Some IMAP servers use indexed files too. Courier does not. What is the
rationale behind that design choice?

Post by Sam Varshavchik
There are more clients than there are servers. Clients, collectively, have
more shared processing power. A CPU currently busy sorting some
knucklehead's ten-year mail archive can't do anything else, for other
clients. That never made any sense, but that's how IMAP is overall designed,
to push as much processing to the server.

Sometimes it makes sense to use disposable clients...

Post by Sam Varshavchik
And, of course, it's much easier for some hacked-together IMAP-over-web client
to send a single command and parse the response, than to do the job by itself.

Yes.

Ale
--

Sam Varshavchik

2017-03-22 17:47:10 UTC

Post by Alessandro Vesely
Some IMAP servers use indexed files too. Courier does not. What is the
rationale behind that design choice?

I expected, as I said, for clients to handle their own caching and
indexing. Indexing adds complexity. More code, more opportunities for bugs.
Furthermore, there is no preset recipe for indexing. IMAP allows the client
to request, and search, on any mail header, and on anything in the body of
the email. There's nothing obvious to index. One could take the approach of
indexing common mail headers; only to discover that one's own mail client
doesn't search or request them. One could take the approach of indexing all
headers only to get a client that caches everything itself, and thus never
requests the same message twice; so now you're doing a lot of work creating
an index that will never be used.

Another factor is the fact that maildirs are open to anyone. Anyone can
come in and add or remove messages from a maildir. Allowing for this
immediately increases the complexity of any indexing solution. It's one
thing for an IMAP server that maintains its own private mail store, and all
access to the mail has to go through the IMAP server. That makes it much
easier to implement some kind of indexing. It's no longer as straightforward
when anyone can come in and simply delete a message that you previously
indexed. This means that even if you have an index, you still have to go and
check that the message still exists before returning search results to the
client. That pretty much takes back a good chunk of what one expected to gain
from indexing.
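
The extra check described above can be pictured with a small Python sketch. It is purely illustrative and does not reflect any Courier code. The index layout (a dictionary from search term to file paths) and the name search_index are assumptions.

```python
import os
import tempfile


def search_index(index, term):
    """Return the indexed paths for `term` that still exist on disk.

    Every hit costs one filesystem check, because the maildir may have
    been changed by something other than the process that built the index.
    """
    return [path for path in index.get(term, []) if os.path.exists(path)]


with tempfile.TemporaryDirectory() as maildir_cur:
    kept = os.path.join(maildir_cur, "msg1")
    gone = os.path.join(maildir_cur, "msg2")
    for path in (kept, gone):
        with open(path, "w") as fh:
            fh.write("Subject: test\n\nbody\n")
    index = {"test": [kept, gone]}
    os.remove(gone)  # someone deletes a message behind the index's back
    print(search_index(index, "test") == [kept])  # prints True
```

A real maildir adds a further wrinkle: files in cur/ are renamed when message flags change, so a check by exact path would also need to cope with renamed files.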

Alessandro Vesely

2017-03-23 09:43:17 UTC

Post by Alessandro Vesely
Some IMAP servers use indexed files too. Courier does not. What is the
rationale behind that design choice?

Yes. AFAIK, databases are not yet so smart as to cache required indexes on
demand. They require well designed schemata to work with.

Post by Sam Varshavchik
Another factor is the fact that maildirs are open to anyone. Anyone can come
in and add or remove messages from a maildir. Allowing for this immediately
increases the complexity of any indexing solution. It's one thing for an IMAP
server that maintains its own private mail store, and all access to the mail
has to go through the IMAP server. That makes it much easier to implement some
kind of indexing. It's no longer as straightforward when anyone can come in and
simply delete a message that you previously indexed. This means that even if
you have an index, you still have to go and check that the message still
exists before returning search results to the client. That pretty much takes
back a good chunk of what one expected to gain from indexing.

That factor sounds questionable considering IMAP keyword implementation. A
full-fledged addition of messages to a maildir had better depend on a proper MDA.

Their deficiencies notwithstanding (e.g. FAMPending: timeout), file systems are
way more mature than DBMS. Even writing in PHP or Python at times requires
considering the brand of the underlying database, let alone C/C++. And DBs make
automating installation even harder than programming, IME. How much does that
state of affairs condition current development?

Ale
--

Sam Varshavchik

2017-03-23 10:35:55 UTC

Post by Alessandro Vesely
Their deficiencies notwithstanding (e.g. FAMPending: timeout), file systems are
way more mature than DBMS. Even writing in PHP or Python at times requires
considering the brand of the underlying database, let alone C/C++. And DBs make
automating installation even harder than programming, IME. How much does that
state of affairs condition current development?

Well, you've made my argument for the IMAP server to be little more than a
translator between IMAP and the underlying filesystem. For that task, the
current state of affairs is that the IMAP server is doing a pretty good job.