On Sun, Aug 29, 2010 at 3:12 PM, Thomas Mangin <thomas.mangin@exa-networks.co.uk> wrote:
However, to make sense you would need to find a resynchronisation point to only exclude the one faulty message. Initially I thought that the last received KEEPALIVE (for the receiver of the error message) could do - but you find yourself with race conditions - so perhaps two KEEPALIVEs back? Each TCP packet can contain multiple messages, so the messages would then have to be split and ACKed individually to find the faulty one. EOR could be used for that purpose.
Every BGP message header starts with a marker of 16 all-bits-1 octets, kept for compatibility. This is distinctive enough that an implementation can guess where the next message starts.

However, suppose you have an attacker. Say a buggy BGP speaker passes on too short a length value for an attribute, and the attacker knows what length will be sent instead of the right one. The attacker places an entry into the data portion of the attribute that will appear to the other peer to be "the rest" of the malformed update. Result: the "malformed" update is received and appears to be perfectly valid. The next thing the attacker inserts into the data portion of the attribute is the 16 all-bits-1 octets and the rest of a BGP header for an update message, followed by their malicious update. This will appear properly formed when the buggy BGP speaker sends it.

As far as the buggy BGP speaker is concerned, it has propagated 1 route update. As far as the buggy BGP speaker's other peers are concerned, they have received 3 messages from the buggy speaker:

* The update "completed" in the attribute data section. (This is "malformed", but intentionally not detectable as malformed.)
* The maliciously injected route. (This isn't supposed to exist. The buggy speaker is unaware of its existence; the peers disagree about how the message is interpreted.)
* A malformed message that does not make any sense.

If the injection were perfect, nothing would be detectable as malformed. But alas, the attacker does not know exactly what other attributes or prepending the buggy router will add to the message before passing it on. They could work this out through trial and error; however, some admin will hopefully notice all the CEASEs before the attacker achieves complete success.

In this case, by the time the other speakers detect something as malformed, the two preceding updates are already in the table, and possibly even propagated further. A "CEASE" rolls this back by rolling back the entire session.
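To make the framing problem concrete, here is a rough Python sketch of how a receiver that trusts the marker and length fields would split such a stream. It is only an illustration under simplifying assumptions: the too-short length is modelled on the message length field rather than on an individual attribute, the payloads are opaque placeholder strings rather than real UPDATE contents, and the names build_message and split_stream are made up for this example.

```python
import struct

MARKER = b"\xff" * 16   # 16 all-bits-1 octets at the start of every header
HEADER_LEN = 19         # marker (16) + length (2) + type (1)
TYPE_UPDATE = 2
TYPE_KEEPALIVE = 4


def build_message(msg_type, body=b"", declared_len=None):
    """Frame a body as marker + length + type + body.

    declared_len lets us imitate a buggy speaker that writes a length
    shorter than the data it actually sends (hypothetical knob).
    """
    length = HEADER_LEN + len(body) if declared_len is None else declared_len
    return MARKER + struct.pack("!HB", length, msg_type) + body


def split_stream(stream):
    """Split bytes into (type, body) pairs until the first framing error.

    Returns (accepted, error_offset); error_offset is None if the whole
    stream framed cleanly.
    """
    accepted = []
    pos = 0
    while pos < len(stream):
        header = stream[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[:16] != MARKER:
            return accepted, pos
        length, msg_type = struct.unpack("!HB", header[16:])
        if length < HEADER_LEN or pos + length > len(stream):
            return accepted, pos
        accepted.append((msg_type, stream[pos + HEADER_LEN:pos + length]))
        pos += length
    return accepted, None


# Attacker-chosen data carried inside what the sender thinks is one update.
filler = b"rest-of-update"                        # "completes" the short update
injected = build_message(TYPE_UPDATE, b"attacker-route")
trailer = b"attrs-added-by-router"                # part the attacker cannot predict

# The buggy speaker declares a length that covers only the filler.
wire = build_message(TYPE_UPDATE, filler + injected + trailer,
                     declared_len=HEADER_LEN + len(filler))

accepted, error_at = split_stream(wire)
print(accepted)   # two updates, the second one is the injected one
print(error_at)   # offset where the receiver finally sees garbage
```

The sender wrote one message, but the receiver has already accepted two by the time it reaches the trailing bytes and reports an error, which is the ordering problem described above.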
Peers could (perhaps) safely re-synchronize in this case if there were an extension to partially roll back some of the updates in a session and request that a portion of the messages be resent. Or if an extension such as authentication were used to make it impossible to inject BGP messages within the value of an attribute. Or through data quarantine: requiring all BGP speakers to disallow the all-bits-1 sequence in any attribute value. Or through peer-specific authentication mechanisms, or checksums and digital signatures, in the message header portion of each BGP message.

--
-J