Revert recent change regarding 400 Bad Request on non-working email addresses
Mailgun provides an excellent service. However, a recent change was made to the API: 400 Bad Request is returned if the address is not valid.
The validation is a deep check that is not commonly performed (in my experience), including checking the domain for an MX record and checking it against the effective TLD names (using https://publicsuffix.org/list/effective_tld_names.dat).
The issues with this are:
1) There is no clear distinction between an invalid API request and invalid content. E.g. you get a 400 Bad Request both if a required parameter is missing and if the email address is invalid
- Currently the documentation states: "Bad Request - Often missing a required parameter" (https://documentation.mailgun.com/en/latest/api-intro.html#errors)
2) The API message (HTTP body) contains no error codes, which makes it poorly suited to carrying a wide range of loosely related errors
3) The implementation that makes the API request gets mixed up with the implementation that handles bounces and dropped messages
- In my case, we have a stable implementation that uses the API. A bad request is *not* expected and would be a serious indication of a breaking change to the API, since the *usage of the API* is stable
- Now we need to parse strings such as "'to' parameter is not a valid address. please check documentation" to detect problems with the content, even though we also have API hooks established
- The lack of separation of issues is a problem
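To illustrate point 3, this is roughly what a client ends up doing today. It is only a sketch: the function name and the category labels are made up for illustration, and the only text taken from the API is the message quoted above. Whether other content errors use similar wording is unknown, which is exactly the problem.

```python
# Hypothetical client-side helper; names and labels are illustrative only.
KNOWN_CONTENT_ERROR = "'to' parameter is not a valid address"


def classify_bad_request(message: str) -> str:
    """Guess what a 400 response means from its human-readable message."""
    if KNOWN_CONTENT_ERROR in message:
        # Content problem: should be handled like a dropped message.
        return "invalid_recipient"
    # Anything else is assumed to be a mistake in how we call the API.
    return "api_usage_error"
```

Any rewording of the message on the server side silently moves content errors into the "api_usage_error" bucket.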
The suggested solution
a) Accept the message (200 OK from the API)
- Keep the distinction clear: a bad request is only about the API call itself (the request as such, not what it carries) - never about the content
b) Drop the message immediately and invoke API hooks
- Please note that this is not about "drop immediately or wait and retry"
- The only difference is: how is an immediate drop communicated? It should be communicated via the established API hooks (single channel with single purpose) and not via API response codes "just because the API connection is still active"
- Compare this to suppressions: whether an address is suppressed or not is something known when the API is called. But it's *not* communicated via a 400 Bad Request or a 418
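The suggested flow can be sketched as a toy model. This is not Mailgun's implementation: `send`, `hook_events`, `looks_deliverable`, the event fields and the set of required parameters are all assumptions made for the example.

```python
# Toy model of the proposed behaviour, not Mailgun's implementation.
# All names (send, hook_events, the event fields) are hypothetical.
REQUIRED_PARAMS = ("from", "to", "subject")  # assumed set, for illustration

hook_events = []  # stands in for calls to the customer's configured hooks


def looks_deliverable(address: str) -> bool:
    """Placeholder for whatever deep validation the service performs."""
    local, sep, domain = address.rpartition("@")
    return bool(sep and local and "." in domain)


def send(params: dict) -> int:
    """Return the HTTP status the API would answer with."""
    missing = [p for p in REQUIRED_PARAMS if p not in params]
    if missing:
        return 400  # the request itself is wrong
    if not looks_deliverable(params["to"]):
        # Content problem: accept the request, report the drop via hooks.
        hook_events.append({"event": "dropped", "recipient": params["to"]})
    return 200
```

With this split, the calling code only ever treats a 400 as a bug in its own request, and everything about the fate of a message arrives through the hook handler.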
Finally, quoting RFC 7231 (https://tools.ietf.org/html/rfc7231#section-6.5.1)
"The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing)"
Please keep the layers separated.
In reply to Stenmark below
> "I do understand that there are issues with trying to deliver obviously invalid emails and I don't really mind that the request returns a 400 error."
I agree that 400 Bad Request can be the response to an email address that is *obviously* invalid. E.g. "@" would be obviously invalid.
The problem here is the depth of the check that Mailgun performs. E.g. checking the DNS record of the domain to verify that it is associated with an email server. In my opinion that is extreme. It is also a check that isn't deterministic, i.e. it suddenly matters *when* you validated the address.
Doing such checks before actually sending is fine, of course.
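For contrast, a check for *obviously* invalid addresses can be purely syntactic and therefore deterministic. A deliberately crude sketch (the function name is made up, and it does not attempt full address parsing):

```python
def is_obviously_invalid(address: str) -> bool:
    """Reject only addresses lacking a non-empty local part and domain."""
    local, sep, domain = address.strip().rpartition("@")
    return not (sep and local and domain)
```

The result depends only on the string itself, never on the state of DNS at the time of the call.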
I completely agree that this breaks HTTP standards. A 400 error should be returned for a badly formed request to the API, not also for a bad email - those are two separate things.
I thought you could perform a separate call to track the email's progress through the system? As outlined here: https://bradgignac.com/2014/05/12/sending-email-with-python-and-the-mailgun-api.html
How am I meant to properly test or log these API calls when I can't distinguish a badly formed request from an invalid email?
Patrik Stenmark commented
I wasn't aware that this was something that changed, but I agree that it is unexpected.
I do understand that there are issues with trying to deliver obviously invalid emails and I don't really mind that the request returns a 400 error. However, I would really appreciate a more machine-oriented reply in the API and documentation of which errors can occur. Now we have to string match for the correct errors and there is no way of knowing which errors can actually occur, which is clunky and unreliable.
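As a rough idea of what a more machine-oriented reply could look like from the client's side: the `code` field and its values below are purely hypothetical and are not something the API currently returns; only the message text is taken from the real response.

```python
import json

# Hypothetical response body: the "code" field and its values are assumptions,
# not something the current API returns.
example_body = json.dumps({
    "code": "invalid_recipient",
    "message": "'to' parameter is not a valid address. please check documentation",
})

CONTENT_ERROR_CODES = {"invalid_recipient"}


def is_content_error(body: str) -> bool:
    """True if a 400 body reports bad content rather than bad API usage."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("code") in CONTENT_ERROR_CODES
```

With a documented list of codes, the client could branch on `code` and leave `message` for humans reading the logs.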