I did read your comment on BGP lending itself to SMP. Can you elaborate on where you might have seen this? It has been a pretty monolithic implementation for as long as I can remember. In fact, that was why I asked the question, to see if anyone had actually observed a functioning multi-processor implementation of the BGP process.
I can make the SMP statement with some authority, as I did the internal design of the OpenBGPd RDE and my co-worker Claudio implemented it. Given proper locking of the RIB, a number of CPUs can crunch on it and handle neighbor communication independently of each other. If you look at Oracle databases, they manage to scale performance by a factor of 1.9 to 1.97 per CPU. There is no reason to believe we can't do this with the BGP 'database'.
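To make the idea concrete, here is a minimal sketch of a shared RIB behind a single lock with one worker thread per neighbor. It is not OpenBGPd code: the names `Route`, `Rib`, `neighbor_session`, and `run_sessions` are hypothetical, and the best-path rule (highest local preference, then shortest AS path, then neighbor name as a tie-break) is a deliberate simplification of the real BGP decision process. In CPython the global interpreter lock means this shows the locking structure rather than a real multi-CPU speedup.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical, heavily simplified route record."""
    prefix: str
    neighbor: str
    local_pref: int
    as_path_len: int


class Rib:
    """Shared routing table guarded by one coarse lock (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._candidates = {}  # prefix -> {neighbor: Route}
        self._best = {}        # prefix -> currently selected Route

    def update(self, route):
        # Only the table mutation and reselection happen under the lock;
        # session I/O and message parsing would stay outside it.
        with self._lock:
            paths = self._candidates.setdefault(route.prefix, {})
            paths[route.neighbor] = route
            # Simplified selection: higher local_pref, then shorter AS path,
            # then neighbor name so the result does not depend on timing.
            self._best[route.prefix] = max(
                paths.values(),
                key=lambda r: (r.local_pref, -r.as_path_len, r.neighbor),
            )

    def best(self, prefix):
        with self._lock:
            return self._best.get(prefix)


def neighbor_session(rib, routes):
    """Stand-in for one neighbor's session: feed its routes into the RIB."""
    for route in routes:
        rib.update(route)


def run_sessions(rib, feeds):
    """Run one thread per neighbor and wait for all of them to finish."""
    threads = [
        threading.Thread(target=neighbor_session, args=(rib, routes))
        for routes in feeds.values()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With a single lock like this, every update serializes on the table, so it only pays off if most of the per-neighbor work (parsing, filtering, keepalives) happens outside the critical section.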
Neat! So you were thinking you would leave the actual route selection process monolithic and create separate processes per peer? I have seen folks doing something similar with separate MBGP routing instances. Had not heard of anyone attempting this for a "global" routing table with separate threads per neighbor (as opposed to per table). What do you do if you have one neighbor who wants to send you all 2M routes though? I am thinking of route reflectors specifically but also confederation EIBGP sessions. I think you hit the nail on the head regarding record locking. This is the thing that is going to bite you if anything will. I have heard none of the usual suspects speak up so I suspect that either this thread is now being ignored or no one has heard of an implementation like the one you just described.
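On the record-locking concern, one common way to keep a single chatty neighbor from stalling everyone else is lock striping: hash each prefix to one of several locks so that updates to unrelated prefixes do not contend. The sketch below is an assumption for illustration only, not something described in this thread or taken from any BGP implementation; `StripedRib` and its methods are hypothetical names, and selection is reduced to "highest local preference wins".

```python
import threading
import zlib


class StripedRib:
    """Prefix table split across several locks (lock striping), illustrative."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._tables = [{} for _ in range(stripes)]  # prefix -> {neighbor: pref}

    def _stripe(self, prefix):
        # crc32 gives a stable hash, so a prefix always maps to the same stripe.
        return zlib.crc32(prefix.encode()) % len(self._locks)

    def announce(self, prefix, neighbor, local_pref):
        i = self._stripe(prefix)
        with self._locks[i]:
            self._tables[i].setdefault(prefix, {})[neighbor] = local_pref

    def withdraw(self, prefix, neighbor):
        i = self._stripe(prefix)
        with self._locks[i]:
            paths = self._tables[i].get(prefix, {})
            paths.pop(neighbor, None)
            if not paths:
                # Drop the prefix entirely once no neighbor advertises it.
                self._tables[i].pop(prefix, None)

    def best_neighbor(self, prefix):
        i = self._stripe(prefix)
        with self._locks[i]:
            paths = self._tables[i].get(prefix)
            if not paths:
                return None
            # Highest local_pref wins; neighbor name breaks ties deterministically.
            return max(paths, key=lambda n: (paths[n], n))
```

A neighbor sending a full table still touches every stripe eventually, but it holds only one stripe lock at a time, so other sessions wait only when they hit the same stripe at the same moment.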