M4 parse lines
Email messages were originally expected to be stored in (mbox) files, and hence to be read from files. Nowadays, however, many messages are kept in databases or reach the program some other way. The work-around for these situations, via Mail::Box::FastScalar, is slow. By implementing Mail::Box::Parser::Lines, an in-memory message parser, we avoid the work-around. Will this pay off?
Each figure is the average of 10 runs on a random 62MB mailbox with 4122 messages. The changes are actually not in Mail-Box itself, but in the Mail-Message distribution 3.018.
| Setup | User | System | Change |
|---|---|---|---|
| Mail-Box release 3.011, from file | 9.846s | 0.172s | |
| Mail-Box 3.012 (to be), from file | 9.531s | 0.167s | 3.3% faster |
| Separate messages in-memory, 3.011 | 11.496s | 0.007s | |
| Separate messages in-memory, 3.012 (to be) | 11.088s | 0.010s | 3.7% faster |
| Separate messages in-memory, 3.012, ref str | 11.192s | 0.008s | 2.7% faster |
In the in-memory rows above (the old approach), the messages are parsed via the pseudo-file interface offered by Mail::Box::FastScalar. This object makes a scalar behave like a file, and is therefore slower than a real file.
Not included in the last three numbers are the 0.3 to 0.5 seconds needed to read the mailbox file and split it into separate messages.
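To make the difference between the two approaches concrete, here is a minimal conceptual sketch in Python. It is not the Mail-Message implementation: the function names and the sample message are hypothetical, and header folding is ignored. It only shows the same small parser written once against a file-like wrapper around a string, and once directly against a list of lines.

```python
import io

# Hypothetical sample message: two header lines, a blank line, then the body.
RAW = "From: a@example.com\nSubject: hello\n\nBody line 1\nBody line 2\n"

def parse_via_filehandle(fh):
    """Parse through a file-like object, one readline() call per header line."""
    headers = {}
    while True:
        line = fh.readline()
        if line in ("", "\n"):          # end of input or blank line ends the header
            break
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, fh.read()           # the rest of the input is the body

def parse_lines(lines):
    """Parse the same message directly from a list of lines, no file interface."""
    headers = {}
    i = 0
    while i < len(lines) and lines[i] != "\n":
        name, _, value = lines[i].partition(":")
        headers[name.strip()] = value.strip()
        i += 1
    return headers, "".join(lines[i + 1:])

via_fh = parse_via_filehandle(io.StringIO(RAW))          # string dressed up as a file
via_lines = parse_lines(RAW.splitlines(keepends=True))   # plain list of lines
assert via_fh == via_lines
```

Both routes give the same result; the sketch says nothing about relative speed in Python, it only illustrates where the extra layer of indirection sits.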
With the new parser, which reimplements message parsing without the file interface, we get:

| Setup | User | System | Change |
|---|---|---|---|
| ::Lines with str | 5.859s | 0.007s | 49% faster (2x) |
| ::Lines with ref str | 5.868s | 0.000s | 49% faster (2x) |
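The "49% faster (2x)" figures can be checked with a little arithmetic. The sketch below assumes the baseline is the 3.011 in-memory user time (11.496s); the text does not state the baseline explicitly, but this choice reproduces the reported numbers. The helper names are hypothetical.

```python
# User times in seconds, copied from the measurements above.
OLD_IN_MEMORY = 11.496   # 3.011, parsed via Mail::Box::FastScalar (assumed baseline)
NEW_STR = 5.859          # ::Lines with str
NEW_REF = 5.868          # ::Lines with ref str

def time_saved_pct(old, new):
    """Percentage of the old run time that is no longer needed."""
    return (old - new) / old * 100

def speedup(old, new):
    """How many times faster the new run is."""
    return old / new

for label, new in (("str", NEW_STR), ("ref str", NEW_REF)):
    saved = time_saved_pct(OLD_IN_MEMORY, new)
    factor = speedup(OLD_IN_MEMORY, new)
    print(f"::Lines with {label}: {saved:.0f}% less time, {factor:.2f}x")
# ::Lines with str: 49% less time, 1.96x
# ::Lines with ref str: 49% less time, 1.96x
```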
In conclusion: the file-handle simulation was simple but expensive.
This new implementation is useful!