How to crash an email server with a single email
Five of the most popular email parsers for Node.js have recently been found to be susceptible to a trivial denial of service (DoS) vulnerability. The vulnerability can be exploited by packing a few million empty attachments into an email that will bypass typical email size limits (usually 20 MB or less). When the email is sent to a vulnerable email server, it will freeze the Node.js event loop for several seconds due to the sheer number of attachments. Memory usage will explode to 2 GB or more due to the internal objects created for each attachment, which is typically enough to bring down the entire server with an out-of-memory crash. So, does your Node.js server parse email? Do you know which email parser you are using? Before you check, let’s see who this affects.
Before we continue, here’s the obligatory XKCD.
A Denial of Service Shouldn’t be this Easy, Right?
The vulnerability is easy to explain, easy to exploit, and affects thousands of systems. The mailparser library, for example, receives as many as 249,400 monthly downloads and is used as a dependency by 214 other projects including Sendgrid. Haraka is another affected library which has been used by Craigslist, Fort Anti-Spam and ThreatWave.
The fix is an easy one-liner. It’s not like you need Cloudflare. You just need to validate user data. This could be done by counting the number of attachments (including text parts) and reacting if the attachment count is over 1000, or so. When did you last see an email with 10,000 attachments? When did you ever need to send 100,000 attachments? And if you’re still parsing after a million attachments… well, you know you’ve gone too far!
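As a rough illustration of that check, here is a minimal sketch. It is not the patch any of the affected libraries shipped. It counts candidate boundary delimiter lines in the raw message before the message reaches a parser. The names `countBoundaryLines`, `exceedsPartLimit` and `MAX_PARTS` are hypothetical, and the limit of 1000 simply follows the suggestion above.

```javascript
// Hypothetical limit, following the "1000, or so" suggestion in the text.
const MAX_PARTS = 1000;

// Count lines that start with "--" followed by the boundary string.
// This is a pre-parse sanity check, not a MIME parser: it looks at a single
// boundary value, counts the closing delimiter too, and would also match a
// longer boundary that shares the same prefix.
// It stops as soon as the count passes `limit`, so the work stays bounded.
function countBoundaryLines(raw, boundary, limit = MAX_PARTS) {
  const delimiter = Buffer.from('--' + boundary, 'latin1');
  let count = 0;
  let offset = 0;
  while (count <= limit) {
    const index = raw.indexOf(delimiter, offset);
    if (index === -1) break;
    // Only count a delimiter at the start of the message or after a newline.
    if (index === 0 || raw[index - 1] === 0x0a) count++;
    offset = index + delimiter.length;
  }
  return count;
}

// True when the message has more delimiter lines than the policy allows.
function exceedsPartLimit(raw, boundary, limit = MAX_PARTS) {
  return countBoundaryLines(raw, boundary, limit) > limit;
}
```

A server could call `exceedsPartLimit(rawMessage, boundary)` and reject the message during the SMTP transaction when it returns `true`. Because the scan works on the raw buffer, it allocates no per-part objects.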
Hang on, how did we miss this? If it’s as easy to fix as we claim, how did at least five implementations all get it wrong? Additionally, how did we find the vulnerability and what can we do to get better?
Imagine you’re writing an email parser…
You know how many RFCs you need to read (and interpret). You know how many tests you need to write to make sure you’re as compliant with the RFCs as you can be. You’ve heard the software mantra of “make it work, then make it fast”. But email parsers are hard and you’ll end up being happy if it just works. Once you’ve written one, you start to understand why you might not do a quick back-of-the-envelope calculation to estimate how much memory a single multipart object in your design might allocate.
If you perform a complexity analysis:
You’re likely to measure in terms of CPU only, rather than memory.
You will likely not benchmark the typical memory footprint.
You end up being agnostic towards any SMTP environment, so you don’t attempt any fast paths in your parser, even though 90% of the emails parsed will be spam.
You avoid enforcing too many strict policy decisions in your parser.
Your users might not appreciate a limit on the maximum number of attachments per email.
You would rather parse everything and leave the user to reject the email during the SMTP transaction if necessary. In fact, you’re not the email server administrator after all.
Imagine you’re running an email server…
One of the first things you will likely do is decide on your email size limit. The bigger the email, the less chance there is that other servers will accept it. You set your email size limit as low as 20MB, thinking that should also keep your parsing time within reasonable bounds. You decide to use a popular, battle-tested email parser. You trust your email parser’s complexity to be linear or O(N) in the size of the email. You benchmark CPU usage on a maximum email size of 20MB and you expect your server to handle thousands of messages per second. You expect all 20MB emails to take the same amount of time, approximately. You figure as little as 8GB RAM should be enough for 200 concurrent 20MB emails a second, as you don’t expect your user base to grow that quickly. If someone were to bet you on whether your server can handle 10 concurrent 20MB emails, you’d be confident. The last thing on your mind is a 0-byte attachment. What harm has an empty file ever done?
Test your applications for vulnerabilities
And then you see this:
MIME-Version: 1.0
From: <email@example.com>
To: <firstname.lastname@example.org>
Subject: MIME Multipart Attack
Date: Sat, 30 Jun 2018 15:51:58 +0000
Message-ID: <email@example.com>
Content-Type: multipart/mixed; boundary="0"

--0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--0
--0
--0
--0
--0
--0
--0
--0
[× 4 million]
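The numbers behind this message can be checked with the kind of back-of-the-envelope calculation the parser author skipped. The sketch below is only an estimate under stated assumptions: the size limit is taken as 20 decimal megabytes, each empty part is assumed to cost one CRLF-terminated delimiter line, the message headers are ignored, and the 500 bytes of heap per part is not a measured figure but roughly what 2 GB spread over 4 million parts works out to. The function name `estimateMultipartCost` is hypothetical.

```javascript
// Back-of-the-envelope estimate: how many empty parts fit under a size limit,
// and how much heap might they cost once parsed?
function estimateMultipartCost(sizeLimitBytes, boundary, bytesPerPart) {
  // Assumption: each empty part costs one line, "--" + boundary + CRLF.
  const delimiterLineBytes = 2 + Buffer.byteLength(boundary) + 2;
  const maxParts = Math.floor(sizeLimitBytes / delimiterLineBytes);
  return {
    delimiterLineBytes,
    maxParts,
    estimatedHeapBytes: maxParts * bytesPerPart
  };
}

// Assumption: about 500 bytes of parser objects per part (not measured).
const estimate = estimateMultipartCost(20 * 1000 * 1000, '0', 500);
console.log(estimate);
```

Under these assumptions a one-character boundary gives a 5-byte delimiter line, so a 20 MB message has room for 4,000,000 parts and an estimated 2,000,000,000 bytes of heap. The point is the ratio: a few bytes on the wire can turn into hundreds of bytes in memory.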
How did we find the vulnerability?
Ronomon is an email startup in private beta. Back when I got started, I had a nasty encounter with incoming emails suddenly causing V8’s garbage collector to block the Node.js event loop for tens of seconds at a time. I spent two days of my vacation flipping feature-flags every few minutes to cut the load while trying to figure out what was happening. Eventually, with the help of Vyacheslav Egorov, I commented out the inner workings of V8’s CollectAllAvailableGarbage function, which was happily doing an arbitrary 7-times-over collection of a massive multi-gigabyte heap. Looking back, this was a great education. I’ve become extremely wary of allocating objects on the heap and of blocking the event loop.
At the beginning of last year, I set about writing a new email parser, which has since been open-sourced as @ronomon/mime. The goal was to be 10× faster than the previous parser, with a minimum number of allocations, operating on raw buffers, and RFC-compliant with 100% test coverage, including fuzz tests. This was tough to achieve, and meant doing things like eliminating buffer-to-string conversions and reducing branch mispredictions caused by 78-character line-wrapped Base64.
Along the way, I learned that some policy decisions might be better made in the email parser, rather than in the email server, and vice-versa. This included policy decisions such as rejecting obviously malicious, corrupted or truncated Base64, Quoted-Printable or character encodings, rejecting duplicates of critical headers, limiting the number of multiparts, and limiting backtracking due to false positive multipart boundaries.
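One of those policies, rejecting duplicates of critical headers, can be sketched in a few lines. This is an illustration only and not how @ronomon/mime implements it. The function name `findDuplicateHeaders` and the `SINGLETON_HEADERS` list are hypothetical, and which headers count as critical is a policy assumption. RFC 5322 allows at most one occurrence of each of the message headers listed here, and Content-Type is added as an assumption.

```javascript
// Hypothetical list of headers that should appear at most once.
const SINGLETON_HEADERS = new Set([
  'from', 'to', 'subject', 'date', 'message-id', 'content-type'
]);

// Return the names of singleton headers that occur more than once.
// Scanning stops at the first empty line, which ends the header block.
// Folded continuation lines (starting with space or tab) belong to the
// previous header and are skipped.
function findDuplicateHeaders(rawHeaders) {
  const seen = new Set();
  const duplicates = new Set();
  for (const line of rawHeaders.split(/\r?\n/)) {
    if (line === '') break;
    if (line[0] === ' ' || line[0] === '\t') continue;
    const colon = line.indexOf(':');
    if (colon <= 0) continue; // not a header field; a strict parser might reject
    const name = line.slice(0, colon).trim().toLowerCase();
    if (!SINGLETON_HEADERS.has(name)) continue;
    if (seen.has(name)) duplicates.add(name);
    seen.add(name);
  }
  return [...duplicates];
}
```

A parser could treat a non-empty result as grounds to reject the message early, before any body parsing takes place. Headers that may legitimately repeat, such as Received, are left out of the list on purpose.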
Earlier this year, I got in touch with Jamie Davis, who wrote an excellent guide to not blocking the Node.js event loop. Jamie was actually doing research on event loop attacks and I suggested that some email parsers out there might be vulnerable to a multipart attack. I was surprised to see the hypothesis confirmed in every Node.js email parser I tested.
Ten Things we Could do Better
Here’s a list of ten tips and good practices you should consider:
- DoS attacks exploit resource scarcity. Show mechanical sympathy and remember you’re writing for a machine. Appreciate the mechanical resources you’ve been entrusted with. Be a good steward. Waste not, want not. Don’t just be O(N) when it comes to CPU usage, but also be O(N) when it comes to memory allocations and every resource you touch. If your code is efficient in terms of CPU, memory, disk and network, you’ll be less vulnerable to resource starvation, and more likely to set reasonable limits on all your system resources.
- Do back-of-the-envelope calculations across all resource dimensions from the outset of any design. This will expose poor designs sooner rather than later and keep you from attempting “impossible” implementations. Performance and security are not things that you can optimize or bolt on later. They need to be designed in from the start. Don’t wait until your module gets popular.
- Balance your resource usage across all dimensions. You may have enough CPU to meet your throughput goals but will you run out of memory before then? Again, you need back-of-the-envelope calculations to keep your usage in proportion across resource dimensions and to avoid bottlenecks in your design.
- Remember that a performance issue is just a DoS waiting to happen. Especially when you’re running an event loop. The next time a user reports a performance issue, see it as a chance to prevent a security issue.
- Validate all user data, not just “how much” but also “how many”. In fact, it’s often the small things that you really need to watch out for, because small things occur more often and leave more room to be multiplied and amplified against you.
- Keep asking yourself, what do I expect to be realistic? Don’t allow anything 10× past your realistic threshold. Embed your expectations into the code you write.
- Mind the gap between module boundaries. Don’t assume “someone else will do it”. Don’t let policy decisions fall through the cracks. You may need to understand your dependencies better.
- Treat obviously bad data as toxic. Don’t touch it with a ten-foot pole. Get rid of it as soon as you can.
- Ask yourself, what will a malicious user do? Don’t just review your code. Actively try to exploit your code. Think through and fix at least three exploits in every module before you publish. Set a goal and find them. They’re always out there. You’ll be surprised.
- Fuzz tests have fantastic imagination. Write your own simple fuzz tests to generate a random spectrum of valid and invalid arguments. Test your function return values against another implementation for correctness where valid, and for exceptions where invalid. Fuzz tests running millions of function argument permutations are like Linus’s Law in the extreme, simulating the bug-catching ability of thousands of eyeballs in just a few seconds.
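The last tip can be sketched as a small differential fuzz test. The example below is a stand-in and has nothing to do with any particular email parser: the function under test is a hand-written hex decoder, and the reference implementation is Node's built-in `Buffer.from(text, 'hex')`. The names `decodeHex`, `hexValue`, `createRandom` and `fuzz` are hypothetical. A seeded generator is used so that any failure can be reproduced.

```javascript
// Function under test: a hand-written hex decoder that throws on bad input.
function decodeHex(text) {
  if (text.length % 2 !== 0) throw new Error('odd length');
  const out = Buffer.alloc(text.length / 2);
  for (let i = 0; i < text.length; i += 2) {
    const hi = hexValue(text.charCodeAt(i));
    const lo = hexValue(text.charCodeAt(i + 1));
    out[i / 2] = (hi << 4) | lo;
  }
  return out;
}

function hexValue(code) {
  if (code >= 48 && code <= 57) return code - 48;  // 0-9
  if (code >= 97 && code <= 102) return code - 87; // a-f
  if (code >= 65 && code <= 70) return code - 55;  // A-F
  throw new Error('invalid hex digit');
}

// Small seeded linear congruential generator, returning values in [0, 1).
function createRandom(seed) {
  let state = seed >>> 0;
  return function random() {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Generate valid and invalid inputs. Valid inputs are compared against the
// reference implementation; invalid inputs must raise an exception.
function fuzz(iterations, seed) {
  const random = createRandom(seed);
  const alphabet = '0123456789abcdefABCDEF';
  const junk = 'gxz- \n';
  for (let i = 0; i < iterations; i++) {
    const valid = random() < 0.5;
    const length = Math.floor(random() * 32) * 2;
    let text = '';
    for (let j = 0; j < length; j++) {
      text += alphabet[Math.floor(random() * alphabet.length)];
    }
    if (valid) {
      const expected = Buffer.from(text, 'hex');
      if (!decodeHex(text).equals(expected)) {
        throw new Error('mismatch for ' + JSON.stringify(text));
      }
      continue;
    }
    if (text.length > 0 && random() < 0.5) {
      // Corrupt one character with a non-hex character.
      const at = Math.floor(random() * text.length);
      const bad = junk[Math.floor(random() * junk.length)];
      text = text.slice(0, at) + bad + text.slice(at + 1);
    } else {
      text += '0'; // make the length odd
    }
    let threw = false;
    try { decodeHex(text); } catch (error) { threw = true; }
    if (!threw) throw new Error('accepted ' + JSON.stringify(text));
  }
  return iterations;
}
```

Calling `fuzz(100000, 1)` runs one hundred thousand cases in well under a second on typical hardware, and changing the seed explores a different part of the input space. The same shape applies to a decoder for Base64 or Quoted-Printable, provided a trusted reference implementation is available.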
Private and public disclosure timeline
The vulnerability was privately disclosed to owners of the affected modules on April 23rd, 2018. A few days before the 90-day public disclosure deadline, owners were provided an opportunity to delay public disclosure for any reason. In addition, the most active dependent modules were contacted (where contact details were available on GitHub) and readied for public disclosure on June 25th, 2018.
Thanks to Karen Yavine, Simon Maple and Danny Grander of Snyk for assistance with public disclosure, suggesting this blog post and conducting further investigation. Thanks also to Matt Sergeant of Haraka in particular for responding promptly.
haraka (versions < 2.8.19)
- April 23rd, 2018 - Initial private disclosure to package owner
- April 24th, 2018 - Initial response from package owner
- June 15th, 2018 - Vulnerability fixed but not yet published to npm
- June 25th, 2018 - Public disclosure
- June 27th, 2018 - Version 2.8.19 published with fix
mailparser (ALL versions)
- April 23rd, 2018 - Initial private disclosure to package owner
- April 24th, 2018 - Initial response from package owner
- June 25th, 2018 - Public disclosure
There is as yet no fix for mailparser.
emailjs-mime-parser (ALL versions)
- April 23rd, 2018 - Initial private disclosure to package owner
- April 24th, 2018 - Initial response from package owner
- June 25th, 2018 - Public disclosure
There is as yet no fix for emailjs-mime-parser.
mailsplit (versions < 4.2.1)
- April 23rd, 2018 - Initial private disclosure to package owner
- April 24th, 2018 - Initial response from package owner
- June 25th, 2018 - Public disclosure
- July 23rd, 2018 - Version 4.2.1 published with fix
mailparser-mit (ALL versions)
- April 23rd, 2018 - Initial private disclosure to package owner
- April 24th, 2018 - Initial response from package owner
- June 25th, 2018 - Public disclosure
There is as yet no fix for mailparser-mit.