It's not just AS_PATH. A lot of the reason so many duplicate updates occur (nearly 50% of all updates at times, and often more during the busiest times) is that implementations on the other end don't keep egress advertisement state per attribute. For example, if a change in cluster_list length just triggered an internal transition, a new update is sent to external peers with no new information, because the determining internal attributes are stripped before the update is transmitted. Yet those *prefixes* might well be suppressed as a result of the implementation and/or network architecture on the other end of the BGP connection.
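To make the duplicate-update mechanism concrete, here is a minimal sketch of what per-peer egress state could look like. It is not any vendor's implementation: the class `AdjRibOut`, the helper `egress_view`, and the `INTERNAL_ONLY` set are hypothetical names, and the model assumes attributes are held in a plain dict with hashable values. The idea is simply to compare what the external peer would actually see, after internal-only attributes are stripped, against what was last sent.

```python
# Attributes assumed (for this sketch) to never reach an external peer.
INTERNAL_ONLY = {"cluster_list", "originator_id", "local_pref"}


def egress_view(attrs):
    """Return a hashable view of the attributes an eBGP peer would see."""
    return tuple(sorted((k, v) for k, v in attrs.items() if k not in INTERNAL_ONLY))


class AdjRibOut:
    """Hypothetical per-peer egress state: prefix -> last advertised view."""

    def __init__(self):
        self.sent = {}

    def advertise(self, prefix, attrs):
        """Return True if an update should go out, False if it is a duplicate."""
        view = egress_view(attrs)
        if self.sent.get(prefix) == view:
            return False  # nothing the peer can see has changed
        self.sent[prefix] = view
        return True


rib = AdjRibOut()
base = {"as_path": (64500, 64501), "next_hop": "192.0.2.1",
        "cluster_list": ("10.0.0.1",)}
rib.advertise("203.0.113.0/24", base)             # first announcement is sent
longer = dict(base, cluster_list=("10.0.0.1", "10.0.0.2"))
rib.advertise("203.0.113.0/24", longer)           # internal-only change, suppressed
```

Without the comparison in `advertise`, the second call would put an update on the wire that is identical to the first from the peer's point of view, and that is the kind of update a damping peer may count as a flap.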
Then couple that with what Joe was pointing out, where intermediate nodes with consistently unstable links or "paths" result in an entire prefix being penalized, not just the unstable paths, and route flap damping makes for more brokenness than benefit when it is employed.
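A rough sketch of the per-prefix versus per-path difference, under stated assumptions: the penalty decays exponentially with a half-life, each flap adds a fixed amount, and a route is suppressed while the penalty is above a limit. The class `Damper` and the constants below are illustrative values chosen for the example, not taken from this thread, and the reuse-threshold hysteresis of real damping is left out.

```python
HALF_LIFE = 900.0          # seconds; illustrative value
PENALTY_PER_FLAP = 1000.0  # illustrative value
SUPPRESS_LIMIT = 2000.0    # illustrative value


class Damper:
    """Hypothetical damping state keyed by whatever the caller passes in."""

    def __init__(self):
        self.state = {}  # key -> (penalty, time of last update)

    def penalty(self, key, now):
        """Current penalty for key after exponential decay."""
        p, t = self.state.get(key, (0.0, now))
        return p * 0.5 ** ((now - t) / HALF_LIFE)

    def flap(self, key, now):
        """Record one flap at time now."""
        self.state[key] = (self.penalty(key, now) + PENALTY_PER_FLAP, now)

    def suppressed(self, key, now):
        return self.penalty(key, now) > SUPPRESS_LIMIT


prefix = "203.0.113.0/24"
per_prefix, per_path = Damper(), Damper()
for t in (0, 10, 20):                      # one unstable path flaps three times
    per_prefix.flap(prefix, t)
    per_path.flap((prefix, "unstable"), t)

per_prefix.suppressed(prefix, 30)          # True: the stable path is lost as well
per_path.suppressed((prefix, "stable"), 30)  # False: only the noisy path is damped
```

Keyed by prefix alone, the stable path shares the penalty earned by the unstable one, which is the collateral damage described above.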
It's not that people haven't studied this or don't understand why it occurs; the issue is that implementation optimizations seem to always win out today over systemic state effects (i.e., that "be conservative in what you send" thing doesn't seem to apply in practice, unfortunately).
might some of this be that the implementations use router-id to fill in an unconfigured rr cluster-id?

randy