Every tool has its use. Also, Vyatta has appliances in several different sizes. How much CPU load you see depends on how many cores you throw at the problem. They can use multiple cores/processors. The result from one test might not match someone else's test if they have higher-end hardware, maybe better than the appliances Vyatta ships.
It's actually rather hard with current PC hardware to get multiple cores engaged in parallel per input interface. While you can plan for various cases, the one to account for is making sure small-packet traffic does not overwhelm the capabilities of a single CPU core.
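To put a rough number on the small-packet case, here is a minimal sketch of the worst-case packet rate on a link. It assumes standard Ethernet framing (64-byte minimum frames plus 20 bytes of preamble, start delimiter and inter-frame gap per frame); the function name is just for illustration.

```python
# Worst-case packet rate for minimum-size Ethernet frames.
# Assumption: 64-byte frames plus 20 bytes of per-frame wire overhead
# (7-byte preamble + 1-byte start delimiter + 12-byte inter-frame gap).
MIN_FRAME_BYTES = 64
WIRE_OVERHEAD_BYTES = 20


def max_packets_per_second(link_bits_per_second, frame_bytes=MIN_FRAME_BYTES):
    """Return how many whole frames of the given size fit on the link per second."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bits_per_second // bits_on_wire


for name, rate in [
    ("10 Mbit/s", 10_000_000),
    ("100 Mbit/s", 100_000_000),
    ("10 Gbit/s", 10_000_000_000),
]:
    print(f"{name:>10}: {max_packets_per_second(rate):>12,} packets/s")
```

This is the rate a single core would have to keep up with if all of the traffic on one interface landed on it.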
Not anymore. Linux will steer each flow to a processor, remember which processor handed it the outgoing traffic, and try to route the reply back to the same CPU so you reduce cache misses. If you have multiple queues on the NICs, multiple CPUs can be operating on the NIC at the same time. The current servers we are using in production have eight queues; the older ones had four. So I can have eight different cores handing traffic to the NIC, and the driver remembers which CPU it was; when a packet is received on a flow, it sends the interrupt to the CPU that started it. But again, if you have a 10 or a 100 meg link into an office, I don't care how small the packets are, a Linux box will handle the traffic just fine. Sure, it isn't going to saturate a 10G interface while doing firewalling and VPN and NAT, but that isn't what we are talking about here. We are talking about average office connectivity: the firewall to the WAN. REF: http://lwn.net/Articles/382428/ (though it has come a long way in the past year).
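The per-flow idea can be illustrated with a toy model. This is only a sketch and not what the kernel or any NIC driver actually does: the CRC32 hash, the function names and the addresses are all assumptions made up for the example. The point is just that hashing a flow's endpoints in a direction-independent way sends both the outgoing packet and its reply to the same queue, and so to the same CPU.

```python
import zlib

NUM_QUEUES = 8  # assumption: one queue per core, as on the eight-queue NICs above


def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Build a key that is identical for both directions of a flow."""
    # Sorting the two endpoints removes the notion of "source" and "destination".
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{proto}|{a[0]}:{a[1]}|{b[0]}:{b[1]}".encode()


def cpu_for_packet(src_ip, src_port, dst_ip, dst_port, proto="tcp",
                   num_queues=NUM_QUEUES):
    """Pick a queue (and so a CPU) for a packet; CRC32 is a stand-in hash."""
    key = flow_key(src_ip, src_port, dst_ip, dst_port, proto)
    return zlib.crc32(key) % num_queues


# Hypothetical office client talking to a server on the WAN side.
outgoing = cpu_for_packet("10.0.0.5", 40000, "192.0.2.10", 443)
reply = cpu_for_packet("192.0.2.10", 443, "10.0.0.5", 40000)
print(outgoing == reply)  # True: both directions produce the same key
```

Different flows spread across the queues while any one flow stays put, which is what keeps the cache warm.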