
M4 charset regex

Mark Overmeer edited this page Jul 2, 2025 · 1 revision

Regex optimization in charset detection

Detecting the charset is surprisingly expensive when the text is not UTF-8: on the order of 1.7 seconds for 20 MB. That may not look painful, but it adds up when you are sending out 100,000 newsletters.

Fixing this minor omission gives a 30% performance gain: https://github.com/markov2/perl5-Mail-Message/issues/22
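This page does not show the detection code or the fix itself. Purely as an illustration of why the order and form of the checks matter, here is a minimal Python sketch of regex-assisted charset guessing. It is not the Mail::Message implementation: the helper name `guess_charset` and the `windows-1252` fallback are assumptions made for this example.

```python
import re

# Compiled once at import time: matches the first byte outside 7-bit ASCII.
NON_ASCII = re.compile(rb"[\x80-\xff]")


def guess_charset(data: bytes, fallback: str = "windows-1252") -> str:
    """Hypothetical helper: guess a charset label for raw body bytes.

    The fallback label is an assumption for this sketch, not something
    taken from Mail::Message.
    """
    # Cheapest test first: a single regex scan that stops at the first
    # high byte. Pure ASCII text never reaches the stricter check below.
    if NON_ASCII.search(data) is None:
        return "us-ascii"

    # Only text with high bytes pays for a full UTF-8 validation.
    try:
        data.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        return fallback


if __name__ == "__main__":
    print(guess_charset(b"plain text"))
    print(guess_charset("caf\u00e9".encode("utf-8")))
    print(guess_charset(b"caf\xe9 au lait"))
```

In a sketch like this, text that is neither ASCII nor valid UTF-8 takes the longest path through the function, which is one plausible reason why non-UTF-8 input is the expensive case.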
