PROXY Protocol Support
When put behind a “proxy” / load balancer, server programs can no longer “see” the original client’s actual IP Address and Port.
This also affects aiosmtpd.
The HAProxy Developers have created a protocol called “PROXY Protocol” designed to solve this issue. You can read the reasoning behind this in their blog.
This initiative has been accepted and supported by many important software and services such as Amazon Web Services, HAProxy, NGINX, stunnel, Varnish, and many others.
aiosmtpd implements the PROXY Protocol as defined in the documentation accompanying HAProxy v2.3.0;
both Version 1 and Version 2 are supported.
Activating
To activate aiosmtpd’s PROXY Protocol support,
set the proxy_protocol_timeout parameter of the SMTP class
to a positive numeric value (int or float).
The PROXY Protocol documentation suggests that the timeout should not be less than 3.0 seconds.
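As a minimal sketch of the activation rule above, the helper below builds the keyword argument and checks it against the constraints stated in this section. The helper name `proxy_smtp_kwargs` and the 5.0 second default are illustrative assumptions; only the `proxy_protocol_timeout` parameter name comes from aiosmtpd.

```python
import warnings

# Lower bound suggested by the PROXY Protocol documentation.
RECOMMENDED_MIN_TIMEOUT = 3.0


def proxy_smtp_kwargs(timeout: float = 5.0) -> dict:
    """Build keyword arguments that switch on PROXY Protocol support.

    Hypothetical helper, not part of aiosmtpd.
    """
    if isinstance(timeout, bool) or not isinstance(timeout, (int, float)):
        raise TypeError("timeout must be an int or a float")
    if timeout <= 0:
        raise ValueError("timeout must be positive to activate PROXY support")
    if timeout < RECOMMENDED_MIN_TIMEOUT:
        warnings.warn("timeouts below 3.0 seconds are discouraged")
    return {"proxy_protocol_timeout": timeout}


kwargs = proxy_smtp_kwargs(5.0)
# With aiosmtpd installed, these would be passed to the SMTP class, e.g.:
#     from aiosmtpd.smtp import SMTP
#     smtp = SMTP(handler, **kwargs)
```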
Important
Once you activate PROXY Protocol support, the standard (E)SMTP handshake is no longer available.
Clients trying to connect to aiosmtpd will be REQUIRED to send the PROXY Protocol Header
before they can continue with the (E)SMTP transaction.
This is as specified in the PROXY Protocol documentation.
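To illustrate what a connecting proxy has to send first, here is a sketch that builds a Version 1 header line in the human-readable format defined by the PROXY Protocol specification. The function name and the example addresses (taken from documentation-reserved ranges) are assumptions for illustration only.

```python
import ipaddress


def build_proxy_v1_header(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Return a PROXY Protocol Version 1 header line for a TCP connection."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if src.version == 4 else "TCP6"
    for port in (src_port, dst_port):
        if not 0 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    line = f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


header = build_proxy_v1_header("192.0.2.10", "198.51.100.5", 56324, 25)
# The proxy sends `header` first; the usual (E)SMTP dialogue follows afterwards.
```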
handle_PROXY Hook
In addition to activating the PROXY protocol support as described above,
you MUST implement the handle_PROXY hook.
If the handler object does not implement handle_PROXY,
then all connection attempts will be rejected.
The signature of handle_PROXY must be as follows:
- handle_PROXY(server, session, envelope, proxy_data)
- Parameters:
server (aiosmtpd.smtp.SMTP) – The SMTP instance invoking the hook.
session (Session) – The Session data so far (see Important note below)
envelope (Envelope) – The Envelope data so far (see Important note below)
proxy_data (ProxyData) – The result of parsing the PROXY Header
- Returns:
Truthy or Falsey, indicating if the connection may continue or not, respectively
Important
The session.peer attribute will contain the IP:port information of the directly adjacent client. In other words, it will contain the endpoint identifier of the proxying entity.
The endpoint identifier of the “original” client will be recorded only in the proxy_data parameter.
The envelope data will usually be empty(ish), because the PROXY handshake takes place before the client can send any transaction data.
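A handler along these lines might look like the sketch below. It assumes that handle_PROXY is a coroutine like aiosmtpd's other handler hooks, and that session.peer is an (ip, port) tuple for TCP connections. The `TRUSTED_PROXIES` allow-list, the `last_client` attribute, and the stand-in objects are hypothetical and exist only so the example runs without a live server.

```python
import asyncio
from types import SimpleNamespace

# Hypothetical allow-list of proxies permitted to connect directly.
TRUSTED_PROXIES = {"10.0.0.5"}


class ProxyAwareHandler:
    async def handle_PROXY(self, server, session, envelope, proxy_data):
        # session.peer identifies the directly adjacent peer, i.e. the proxy.
        proxy_ip = session.peer[0]
        if proxy_ip not in TRUSTED_PROXIES:
            return False
        # version is None when the header could not be parsed.
        if proxy_data.version is None:
            return False
        # The "original" client is only available through proxy_data.
        self.last_client = (proxy_data.src_addr, proxy_data.src_port)
        return True


# Stand-in objects (not real Session/ProxyData instances).
session = SimpleNamespace(peer=("10.0.0.5", 40000))
proxy_data = SimpleNamespace(version=1, src_addr="192.0.2.10", src_port=56324)
handler = ProxyAwareHandler()
accepted = asyncio.run(handler.handle_PROXY(None, session, None, proxy_data))
```

Returning False for an unknown proxy address means the connection is refused before any SMTP command is processed.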
Parsing the Header
You do not have to concern yourself with parsing the PROXY Protocol header;
the aiosmtpd.proxy_protocol module contains the full parsing logic.
All you need to do is to validate the parsed result in the handle_PROXY hook.
Enums
- class aiosmtpd.proxy_protocol.AF
- UNSPEC = 0
- IP4 = 1
- IP6 = 2
- UNIX = 3
For Version 1, UNKNOWN is mapped to UNSPEC.
- class aiosmtpd.proxy_protocol.PROTO
- UNSPEC = 0
- STREAM = 1
- DGRAM = 2
For Version 1, UNKNOWN is mapped to UNSPEC, and TCP is mapped to STREAM.
- class aiosmtpd.proxy_protocol.V2_CMD
- LOCAL = 0
- PROXY = 1
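To show where these values come from on the wire, the following sketch decodes the 16-octet Version 2 preamble as laid out in the PROXY Protocol specification: a 12-octet signature, one octet holding version and command, one octet holding address family and transport protocol, and a 2-octet big-endian length. The enum classes are local replicas of the ones listed above so the example is self-contained, and `decode_v2_preamble` is a hypothetical helper, not aiosmtpd's parser.

```python
import struct
from enum import IntEnum


# Local replicas of the enums listed above, so the sketch runs on its own.
class AF(IntEnum):
    UNSPEC = 0
    IP4 = 1
    IP6 = 2
    UNIX = 3


class PROTO(IntEnum):
    UNSPEC = 0
    STREAM = 1
    DGRAM = 2


class V2_CMD(IntEnum):
    LOCAL = 0
    PROXY = 1


V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def decode_v2_preamble(preamble: bytes):
    """Split the fixed 16-octet preamble into (command, family, protocol, length)."""
    if len(preamble) < 16 or preamble[:12] != V2_SIGNATURE:
        raise ValueError("not a PROXY v2 preamble")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", preamble[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported version")
    # High nibble: version / family; low nibble: command / protocol.
    return V2_CMD(ver_cmd & 0x0F), AF(fam_proto >> 4), PROTO(fam_proto & 0x0F), length


sample = V2_SIGNATURE + bytes([0x21, 0x11, 0x00, 0x0C])
cmd, family, protocol, length = decode_v2_preamble(sample)
```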
ProxyData API
- class aiosmtpd.proxy_protocol.ProxyData(version=None)
- Attributes & Properties
- version: int | None
Contains the version of the PROXY Protocol header.
If None, it indicates that parsing has failed and the header is malformed.
- family: AF
Contains the address family.
Valid values for Version 1 exclude AF.UNIX.
- protocol: PROTO
Contains an integer indicating the transport protocol being proxied.
Valid values for Version 1 exclude PROTO.DGRAM.
- src_addr: IPv4Address | IPv6Address | AnyStr
Contains the source address (i.e., address of the “original” client).
The type of this attribute depends on the address family.
- dst_addr: IPv4Address | IPv6Address | AnyStr
Contains the destination address (i.e., address of the proxying entity to which the “original” client connected).
The type of this attribute depends on the address family.
- src_port: int
Contains the source port (i.e., port of the “original” client).
Valid only for address family of AF.IP4 or AF.IP6 (the AF members that correspond to INET and INET6).
- dst_port: int
Contains the destination port (i.e., port of the proxying entity to which the “original” client connected).
Valid only for address family of AF.IP4 or AF.IP6 (the AF members that correspond to INET and INET6).
- rest: ByteString
The contents depend on the version of the PROXY header and (for version 2) the address family.
For PROXY Header version 1, it contains all the bytes following b"UNKNOWN" up until, but not including, the CRLF terminator.
For PROXY Header version 2:
For address family UNSPEC, it contains all the bytes following the 16-octet header preamble.
For address families AF.IP4, AF.IP6, and AF.UNIX, it contains all the bytes following the address information.
- tlv: aiosmtpd.proxy_protocol.ProxyTLV
This property contains the result of the TLV Parsing attempt of the rest attribute.
If this property returns None, that means either (1) rest is empty, or (2) TLV Parsing is not successful.
- whole_raw: bytearray
This attribute contains the whole, undecoded and unmodified, PROXY Header. For version 1, it contains everything up to and including the terminating \r\n. For version 2, it contains everything up to and including the last TLV Vector.
If you need to verify the CRC32C TLV Vector (PROXYv2), you should run the CRC32C calculation against the contents of this attribute. For more information, see the section Note on CRC32C Calculation below.
- tlv_start: int
This attribute points to the first TLV Vector, if one exists.
If you need to verify the CRC32C TLV Vector, use this offset together with tlv_loc to locate the vector inside whole_raw; the CRC32C calculation itself runs over whole_raw.
The value will be None if PROXY version is 1.
Methods
- with_error(error_msg: str) ProxyData
- Parameters:
error_msg (str) – Error message
- Returns:
self
Sets the instance’s error attribute and returns itself.
- same_attribs(_raises=False, **kwargs) bool
- Parameters:
_raises (bool) – If True, raise an exception if an attribute does not match or is not found, instead of returning a bool. Defaults to False
- Raises:
ValueError – if _raises=True and attribute is found but value is wrong
KeyError – if _raises=True and attribute is not found
A helper method to quickly verify whether an attribute exists and contains the same value as expected.
Example usage:
    proxy_data.same_attribs(
        version=1,
        protocol=b"TCP4",
        unknown_attrib=None
    )
In the above example, same_attribs will check that all attributes version, protocol, and unknown_attrib exist, and contain the values 1, b"TCP4", and None, respectively.
Missing attributes and/or differing values will return False (unless _raises=True).
Note
For other examples, take a look inside the test_proxyprotocol.py file. That file extensively uses same_attribs.
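The contract described for same_attribs can be sketched with a free-standing function. This is a hypothetical stand-in written from the description above, not aiosmtpd's implementation, and the exception messages are made up; it is here only to make the return and raise behaviour concrete.

```python
from types import SimpleNamespace

_MISSING = object()


def same_attribs(obj, _raises: bool = False, **kwargs) -> bool:
    """Stand-in mirroring the documented same_attribs contract."""
    for name, expected in kwargs.items():
        actual = getattr(obj, name, _MISSING)
        if actual is _MISSING:
            # Attribute not found: KeyError when _raises=True, else False.
            if _raises:
                raise KeyError(name)
            return False
        if actual != expected:
            # Attribute found but value differs: ValueError when _raises=True.
            if _raises:
                raise ValueError(f"{name}: got {actual!r}, expected {expected!r}")
            return False
    return True


# Stand-in object (not a real ProxyData instance).
data = SimpleNamespace(version=1, src_port=56324)
matches = same_attribs(data, version=1, src_port=56324)
missing = same_attribs(data, version=1, unknown_attrib=None)
```

Note that a missing attribute is reported even when the expected value is None, because existence is checked before the comparison.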
ProxyTLV API
- class aiosmtpd.proxy_protocol.ProxyTLV
This class parses the TLV portion of the PROXY Header and presents the value in an easy-to-use way: A “TLV Vector” whose “Type” is found in PP2_TYPENAME can be accessed through the .<NAME> attribute.
It is a subclass of dict, so all of dict’s methods are available. It is basically a Dict[str, Any] with additional methods and attributes. The list below only describes methods & attributes added to this class.
- PP2_TYPENAME: Dict[int, str]
A mapping of numeric Type to a human-friendly Name.
The names are identical to the ones listed in the documentation, but with the PP2_TYPE_/PP2_SUBTYPE_ prefixes removed.
Note
The SSL Name is special. Rather than containing the TLV Subvectors as described in the standard, it is a bool value that indicates whether the PP2_SUBTYPE_SSL
- tlv_loc: Dict[str, int]
A mapping to show the start location of certain TLV Vectors.
The keys are the TYPENAME (see PP2_TYPENAME above), and the value is the offset from the start of the TLV Vectors.
- same_attribs(_raises=False, **kwargs) bool
- Parameters:
_raises (bool) – If True, raise an exception if an attribute does not match or is not found, instead of returning a bool. Defaults to False
- Raises:
ValueError – if _raises=True and attribute is found but value is wrong
KeyError – if _raises=True and attribute is not found
A helper method to quickly verify whether an attribute exists and contains the same value as expected.
Example usage:
    assert isinstance(proxy_tlv, ProxyTLV)
    proxy_tlv.same_attribs(
        AUTHORITY=b"some_authority",
        SSL=True,
    )
In the above example, same_attribs will check that the attributes AUTHORITY and SSL exist, and contain the values b"some_authority" and True, respectively.
Missing attributes and/or differing values will return False (unless _raises=True).
Note
For other examples, take a look inside the test_proxyprotocol.py file. That file extensively uses same_attribs.
- classmethod from_raw(raw) ProxyTLV | None
- Parameters:
raw (ByteString) – The raw bytes containing the TLV Vectors
- Returns:
A new instance of ProxyTLV, or None if parsing failed
This triggers the parsing of raw bytes/bytearray into a ProxyTLV instance.
Internally it relies on the parse() classmethod to perform the parsing.
Unlike the default behavior of parse(), from_raw will NOT perform a partial parsing.
- classmethod parse(chunk, partial_ok=True) Dict[str, Any]
- Parameters:
chunk (ByteString) – The bytes to parse into TLV Vectors
partial_ok (bool) – If True, return partially-parsed TLV Vectors as is. If False, (re)raise MalformedTLV
- Returns:
A mapping of typenames and values
This performs a recursive parsing of the bytes. If it encounters a TYPE that ProxyTLV doesn’t recognize, the TLV Vector will be assigned a typename of “xNN”.
Partial parsing is possible when partial_ok=True; if an error happens during parsing, parse will abort and return the TLV Vectors it had successfully decoded.
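The parsing behaviour described above can be illustrated with a simplified, non-recursive sketch. It walks the 1-octet Type, 2-octet big-endian Length, and Value layout from the PROXY Protocol specification, records start offsets in the spirit of tlv_loc, and honours partial_ok. It does not descend into SSL subvectors, the local `MalformedTLV` class is a stand-in, and rendering unknown types as "x" plus two uppercase hex digits is an assumption about the "xNN" format.

```python
import struct

# Type numbers from the PROXY Protocol specification, prefixes removed.
TYPENAMES = {
    0x01: "ALPN", 0x02: "AUTHORITY", 0x03: "CRC32C", 0x04: "NOOP",
    0x05: "UNIQUE_ID", 0x20: "SSL", 0x30: "NETNS",
}


class MalformedTLV(ValueError):
    """Local stand-in for the MalformedTLV exception mentioned above."""


def parse_tlv(chunk: bytes, partial_ok: bool = True):
    """Return (values, locations) for the top-level TLV Vectors in chunk."""
    values, locations = {}, {}
    offset = 0
    try:
        while offset < len(chunk):
            if len(chunk) - offset < 3:
                raise MalformedTLV("truncated Type/Length")
            typ, length = struct.unpack_from("!BH", chunk, offset)
            value = chunk[offset + 3:offset + 3 + length]
            if len(value) != length:
                raise MalformedTLV("truncated Value")
            name = TYPENAMES.get(typ, f"x{typ:02X}")  # assumed "xNN" format
            values[name] = value
            locations[name] = offset
            offset += 3 + length
    except MalformedTLV:
        if not partial_ok:
            raise
    return values, locations


# AUTHORITY vector, an unknown type 0xE0, then a CRC32C vector cut short.
raw = b"\x02\x00\x04mail" + b"\xE0\x00\x01\x7f" + b"\x03\x00\x04"
values, locations = parse_tlv(raw)
```

With partial_ok left at True, the truncated last vector is dropped and the two complete vectors are returned; with partial_ok=False the same input raises.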
Note on CRC32C Calculation
Neither the ProxyData nor ProxyTLV classes implement PROXYv2 CRC32C validation;
the main reason being that Python has no built-in module for calculating CRC32C.
To perform CRC32C, third-party modules need to be installed,
but we are uncomfortable doing that for the following reasons:
There is more than one third-party module providing CRC32C, e.g., crcmod, crc32c, google-crc32c, etc. The problem is that there is no known clear comparison between them, so we cannot easily tell which one is ‘best’.
Some of these third-party modules seem to be no longer maintained.
Most of the available third-party modules are binary distribution. This potentially causes problems with existing binaries/libraries, not to mention possible (albeit unlikely) vector for malware.
We really don’t like adding dependencies outside those that are really needed.
In short, we have strong reasons to NOT implement PROXYv2 CRC32C validation, and we have plans to NEVER implement it.
If you absolutely need PROXYv2 CRC32C validation,
you should perform it yourself in the handle_PROXY() hook.
To assist you, we have provided the whole_raw, tlv_start, and tlv_loc attributes.
You should do the following:
Choose a CRC32C module of your liking, install that, and import it.
Find the “CRC32C” TLV Vector in whole_raw; it starts at byte tlv_start + tlv_loc["CRC32C"]
Zero out the 4-octet Value part of the “CRC32C” TLV Vector
Perform CRC32C calculation over the modified whole_raw
Convert the result to big-endian bytes, and compare with the .CRC32C attribute of the ProxyTLV instance
Example:
# The + 3 at the end skips over the 1-octet "T" and 2-octet "L" parts
offset = proxy_data.tlv_start + proxy_data.tlv.tlv_loc["CRC32C"] + 3
# Since whole_raw is a bytearray, we can do slice replacement;
# the replacement must be bytes, not str
proxy_data.whole_raw[offset:offset + 4] = b"\x00\x00\x00\x00"
# Actual syntax will depend on the module you use
calculated: int = crc32c(proxy_data.whole_raw)
# Adjust first part as necessary if calculated is not int
validated = calculated.to_bytes(4, "big") == proxy_data.tlv.CRC32C
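For experimenting without any third-party module, the steps can also be tried end to end with a pure-Python CRC-32C. The sketch below is an illustration under stated assumptions: the bitwise `crc32c` function is a slow reference implementation of the Castagnoli polynomial, `verify_crc32c` is a hypothetical helper, and the header is hand-built test data whose `tlv_start` and location values stand in for proxy_data.tlv_start and tlv_loc["CRC32C"]. Unlike the snippet above, it zeroes the checksum in a copy rather than in place.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli); slow, but dependency-free."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def verify_crc32c(whole_raw: bytearray, tlv_start: int, crc_loc: int) -> bool:
    """Check the CRC32C TLV Vector that starts at tlv_start + crc_loc."""
    offset = tlv_start + crc_loc + 3  # skip 1-octet Type and 2-octet Length
    received = bytes(whole_raw[offset:offset + 4])
    scratch = bytearray(whole_raw)  # work on a copy; leave the original intact
    scratch[offset:offset + 4] = b"\x00\x00\x00\x00"
    return crc32c(scratch).to_bytes(4, "big") == received


# Hand-built PROXYv2 header: preamble, IPv4 address block, one CRC32C vector.
SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"
addresses = bytes([192, 0, 2, 10, 198, 51, 100, 5]) + struct.pack("!HH", 56324, 25)
tlv = b"\x03\x00\x04" + b"\x00" * 4
length = struct.pack("!H", len(addresses) + len(tlv))
header = bytearray(SIGNATURE + b"\x21\x11" + length + addresses + tlv)

tlv_start = 16 + len(addresses)  # stands in for proxy_data.tlv_start
# The sender computes the checksum with the Value zeroed, then fills it in.
header[tlv_start + 3:tlv_start + 7] = crc32c(header).to_bytes(4, "big")

is_valid = verify_crc32c(header, tlv_start, 0)
```

A pure-Python loop like this is far slower than a compiled module, so it is better suited to tests and experiments than to busy servers.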
Good luck!