Re: new BGP hijack & visibility tool “BGPalerter”

16 Aug 2019

On Fri, Aug 16, 2019 at 5:02 AM Robert Kisteleki <robert@ripe.net> wrote:
...
Hi,
On 2019-08-15 17:38, Christopher Morrow wrote:
...
This looks like fun!
(a few questions for the RIPE folk, I think though below)
What is the expected load of streaming clients on the RIPE service? (I
wonder because I was/am messing about with something similar, though
less node and js... not that that's relevant here).
One of the (IMO) most useful features is that you can filter what you
want to receive. In fact this makes the service useful :-) So unless you
want to tune in to a significant portion of BGP chatter, the load should
not be substantial.
yup, I can see a clear use case for: "This is my prefix set, and my
transit-as-set, tell me when there are deviations" (which is probably
2 different connections with 2 different filters to the not-fire-hose
feed - oh the docs say you can provide more than one filter, ok...
cool)
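
A minimal sketch of the "my prefix set" case: one subscribe message per prefix, all sent over a single connection. The message shape (`ris_subscribe`, `prefix`, `moreSpecific`, `type`) follows my reading of the RIS Live documentation and should be checked against it before use. The prefixes and the helper name `build_subscriptions` are hypothetical.

```python
import json

# Hypothetical prefix set to watch (documentation prefixes, not real assignments).
MY_PREFIXES = ["192.0.2.0/24", "2001:db8::/32"]


def build_subscriptions(prefixes, more_specific=True):
    """Return one JSON subscribe message per prefix.

    Assumed message layout: {"type": "ris_subscribe", "data": {...}}.
    """
    messages = []
    for prefix in prefixes:
        data = {
            "prefix": prefix,
            "moreSpecific": more_specific,  # also match more-specific announcements
            "type": "UPDATE",               # only BGP UPDATE messages
        }
        messages.append(json.dumps({"type": "ris_subscribe", "data": data}))
    return messages


if __name__ == "__main__":
    for message in build_subscriptions(MY_PREFIXES):
        print(message)
```

Each string would be sent as a text frame on the open WebSocket. A transit-AS filter would be a second set of messages built the same way.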

The firehose is perhaps more friendly for folk like an ISP that could
offer some form of monitoring for their customers' prefixes?
It's also useful (to me anyway) to tell me: "I see prefix-A picked up
a new Origin? odd?" or "Wow, someone 7007'd themselves!"

which isn't clearly (to me anyway) simple to do in the 'not firehose'
version of the stream/service...
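
The "prefix-A picked up a new Origin" check could be sketched roughly as below, assuming each update arrives as a dict with a `path` list and an `announcements` list of `{"prefixes": [...]}` entries (my understanding of the RIS Live message layout, not verified here). `OriginTracker` and `origin_of` are hypothetical names.

```python
from collections import defaultdict


def origin_of(path):
    """Return the origin ASN(s) of an AS path as a set (the last element).

    The last element may itself be a list when the path ends in an AS_SET.
    """
    if not path:
        return set()
    last = path[-1]
    return set(last) if isinstance(last, list) else {last}


class OriginTracker:
    """Remember which origins have announced each prefix; report new ones."""

    def __init__(self):
        self.seen = defaultdict(set)  # prefix -> set of origin ASNs seen so far

    def observe(self, update):
        """Feed one update dict; return a list of (prefix, new_origin) alerts."""
        alerts = []
        origins = origin_of(update.get("path", []))
        for announcement in update.get("announcements", []):
            for prefix in announcement.get("prefixes", []):
                known = self.seen[prefix]
                new = origins - known
                if known:
                    # The first origin ever seen is the baseline, not an alert.
                    alerts.extend((prefix, asn) for asn in sorted(new))
                known |= new
        return alerts
```

A 7007-style event would then show up as a burst of alerts that all name the same new origin, so counting alerts per origin over a short window is one possible next step.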

The firehose also looks like a great feed to add to my other internal
route monitoring things:
  1) get bgp data from my firewall's upstream devices
  2) get bgp from my internal network
  3) eat bmp from my PE/CE device set
  4) add rislive-firehose
  5) add routeviews/ris update data when available (poll every 15 min,
process mrt && ingest data)

determine what patterns/filters/things I want to monitor: "did prefixX
just change upstream ASN and I should bias traffic differently toward
that prefix?" etc...
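
The "did prefixX just change upstream ASN" pattern could look something like this sketch. It assumes a flat AS path (a list of integers, no AS_SETs) and tracks state per peer, since each vantage point sees its own best path. `UpstreamTracker` and `origin_and_upstream` are hypothetical names.

```python
def origin_and_upstream(path):
    """Return (origin, upstream) for a flat AS path, ignoring origin prepending.

    upstream is None when the path contains only the origin ASN.
    """
    if not path:
        return None, None
    origin = path[-1]
    # Walk backwards past any repeated (prepended) copies of the origin.
    for asn in reversed(path):
        if asn != origin:
            return origin, asn
    return origin, None


class UpstreamTracker:
    """Track the upstream of the origin per (peer, prefix) and report changes."""

    def __init__(self):
        self.last = {}  # (peer, prefix) -> upstream ASN last seen

    def observe(self, peer, prefix, path):
        """Return (old, new) if the upstream changed for this peer/prefix, else None."""
        _, upstream = origin_and_upstream(path)
        key = (peer, prefix)
        previous = self.last.get(key)
        self.last[key] = upstream
        if previous is not None and upstream != previous:
            return previous, upstream
        return None
```

The first path seen for a given peer and prefix only sets the baseline; a change is reported from the second observation onward.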
...
I hadn't seen the ripe folk pipe up anywhere with what their SLO/etc
is for the ris-live service? (except their quip about: "used to run in
a tmux session I had to occasionally ssh into <foo> and restart when
<foo> rebooted" I believe the end of that quip in Iceland was: "and
now it's running as a real service")
It's in between those. We now have a conscious setup which should also
be able to scale up, but bits and pieces (like full monitoring of the
service) are still being developed.
ok cool! as with my question to John Curran about ARIN service SLOs
I'm really asking:
  "Hey, if I'm inputting this data into my business process I want to
know what to expect from a performance/scalability/outage/reliability
perspective"

if that's not written down and published then some folks MAY choose to
believe: "Well, it's available now, and now and now.. so 'always,
100%!!' seems sane!"
or others may choose to believe: "Well, nice toy you have there... let
me know when it's ready for me to ingest into my production
monitoring/etc systems" <toddle off to the corner to play ball with
cartman...>
...
Also, one of the strengths of the 'monitoring as a service' folks is
their number of collection points and the breadth of ASNs to which they
interconnect those points. RIS Live, I think, reports out from ~37 or
so RIPE probes; how do we (the internet) get more deployed (or better
interconnection to the current sets)? And maybe even more
importantly... what's the right spread/location/interconnectivity map
for these probes?
RIS Live provides data from RIS, which has a bunch of collectors around
the world (see
https://www.ripe.net/analyse/internet-measurements/routing-information-servi...)
with many hundreds of peering sessions. But it is by no means complete
in terms of coverage.
If and how the community (NANOG or RIPE or else) should work on optimal
data collection is indeed a useful discussion to have.
ok, cool! :)
...
Cheers,
Robert
...
thanks for showing what's possible with tooling being developed by
like-minded individuals :)
-chris