Discover topology using LLDP
by Sylvain Baubeau, 09/10/2018
We recently added support for automatic topology discovery using Ansible. During the last hackathon, we discussed doing this discovery in a dynamic way by creating an LLDP probe in Skydive.
Let’s see how it works.
How to use it
Enabling LLDP in Skydive is very easy: you only need to enable the lldp probe in the agent configuration file (section agent -> topology -> probes). By default, Skydive will listen for LLDP packets on - almost - all interfaces. You can explicitly specify the list of interfaces using the interfaces attribute in section agent -> topology -> lldp.
Once started, the probe listens for LLDP traffic on its list of interfaces.
When an LLDP packet is received, Skydive will parse the packet and retrieve the LLDP information. It will then create one node for the chassis and another node for the port. It will attach the LLDP information as metadata to both nodes.
Why not use an external daemon?
- The self-contained aspect of Skydive is very important for us. We want to keep the ability to just copy the binary and have everything working as much as possible.
- Capturing and analyzing traffic is something Skydive already does obviously :-)
- gopacket, the library we use for flow dissection, has pretty good LLDP support
The probe is pretty small (around 500 lines of code) and simple. That makes it a nice example of how to add a Skydive topology probe.
A minimal probe
A Skydive probe only has to implement the Probe interface, which defines only 2 methods: Start and Stop. So the very first version of our probe could be:
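A minimal sketch of such an empty probe, written so that it compiles on its own. The Probe interface is redeclared locally as a stand-in for Skydive's, and the names LLDPProbe, NewLLDPProbe and the running field are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// Probe is a local stand-in for the two-method interface described above.
// It is redeclared here only so that the sketch is self-contained.
type Probe interface {
	Start()
	Stop()
}

// LLDPProbe is a hypothetical empty probe: it only remembers whether it runs.
type LLDPProbe struct {
	running bool
}

// Start is where the probe would begin listening; here it only sets a flag.
func (p *LLDPProbe) Start() { p.running = true }

// Stop is where captures would be torn down; here it only clears the flag.
func (p *LLDPProbe) Stop() { p.running = false }

// NewLLDPProbe is a hypothetical constructor returning the empty probe.
func NewLLDPProbe() *LLDPProbe { return &LLDPProbe{} }

func main() {
	var p Probe = NewLLDPProbe() // the compiler checks that both methods exist
	p.Start()
	fmt.Println("probe started")
	p.Stop()
	fmt.Println("probe stopped")
}
```

Assigning the value to a variable of type Probe is enough to have the compiler verify that the interface is satisfied.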
We then modify the agent/probes.go file to instantiate our probe:
At this point, Skydive should be able to run with this empty probe.
Probe principles
When starting the probe, some interfaces may not be present yet, or may be in the wrong state.
So we want to start listening for LLDP traffic when they appear or when their state changes.
The Skydive netlink probe will populate the graph with a node when an interface appears and will update its metadata when its state changes.
Our probe will therefore listen for graph events, consider only the nodes that match capable network interfaces, and start capturing LLDP traffic on them.
A graph listener must implement 6 methods:
- OnNodeAdded(n *graph.Node)
- OnNodeUpdated(n *graph.Node)
- OnNodeDeleted(n *graph.Node)
- OnEdgeAdded(e *graph.Edge)
- OnEdgeUpdated(e *graph.Edge)
- OnEdgeDeleted(e *graph.Edge)
We are not interested in all the events. DefaultGraphListener implements an empty version of every one of them. So we can embed DefaultGraphListener in our probe and only implement the ones we are interested in.
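The embedding trick can be sketched as follows. Node, Edge, GraphListener and DefaultGraphListener are local stand-ins for the Skydive graph types, and lldpListener with its seen field is a hypothetical name used only to make the dispatch visible.

```go
package main

import "fmt"

// Node and Edge are stand-ins for graph.Node and graph.Edge.
type Node struct{ ID string }
type Edge struct{ Parent, Child *Node }

// GraphListener is the six-method interface listed above.
type GraphListener interface {
	OnNodeAdded(n *Node)
	OnNodeUpdated(n *Node)
	OnNodeDeleted(n *Node)
	OnEdgeAdded(e *Edge)
	OnEdgeUpdated(e *Edge)
	OnEdgeDeleted(e *Edge)
}

// DefaultGraphListener is a stand-in providing a no-op for every event.
type DefaultGraphListener struct{}

func (DefaultGraphListener) OnNodeAdded(n *Node)   {}
func (DefaultGraphListener) OnNodeUpdated(n *Node) {}
func (DefaultGraphListener) OnNodeDeleted(n *Node) {}
func (DefaultGraphListener) OnEdgeAdded(e *Edge)   {}
func (DefaultGraphListener) OnEdgeUpdated(e *Edge) {}
func (DefaultGraphListener) OnEdgeDeleted(e *Edge) {}

// lldpListener embeds the defaults and overrides only the two events it needs.
type lldpListener struct {
	DefaultGraphListener
	seen []string // hypothetical: records which handlers actually ran
}

func (l *lldpListener) OnEdgeAdded(e *Edge) {
	l.seen = append(l.seen, "edge-added:"+e.Child.ID)
}

func (l *lldpListener) OnNodeUpdated(n *Node) {
	l.seen = append(l.seen, "node-updated:"+n.ID)
}

func main() {
	p := &lldpListener{}
	var l GraphListener = p // satisfied thanks to the embedded no-ops

	eth0 := &Node{ID: "eth0"}
	l.OnNodeAdded(eth0) // falls through to the embedded no-op
	l.OnEdgeAdded(&Edge{Parent: &Node{ID: "host"}, Child: eth0})
	l.OnNodeUpdated(eth0)
	fmt.Println(p.seen)
}
```

Methods declared on the outer type shadow the promoted ones, so the four remaining events stay silent without any extra code.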
The events we care about are:
- OnEdgeAdded: when an interface appears, the netlink probe will create a node for the interface and link it to its owner: the host machine
- OnNodeUpdated: when the state of an interface changes
For both events, we decide whether we should start a capture. We should start one when all of the following hold:
- no LLDP capture is running for this interface
- its first packet layer is Ethernet and it has a MAC address
- the interface is listed in the configuration file or we are in auto discovery mode
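The three conditions above can be sketched as a single predicate. The iface and captureState types, their field names, and the "ether" value used for the encapsulation type are assumptions made for this example, not the probe's actual data model.

```go
package main

import "fmt"

// iface carries the few attributes the decision needs (hypothetical fields).
type iface struct {
	Name      string
	EncapType string // first packet layer; "ether" is assumed to mean Ethernet
	MAC       string
}

// captureState is a hypothetical view of what the probe knows.
type captureState struct {
	running       map[string]bool // interfaces that already have a capture
	configured    map[string]bool // interfaces listed in the configuration
	autoDiscovery bool            // true when no explicit list was given
}

// shouldCapture returns true only when all three conditions hold.
func (s *captureState) shouldCapture(i iface) bool {
	if s.running[i.Name] {
		return false // a capture is already running for this interface
	}
	if i.EncapType != "ether" || i.MAC == "" {
		return false // not an Ethernet interface with a MAC address
	}
	return s.autoDiscovery || s.configured[i.Name]
}

func main() {
	s := &captureState{
		running:    map[string]bool{"eth1": true},
		configured: map[string]bool{"eth0": true, "eth1": true},
	}
	for _, i := range []iface{
		{Name: "eth0", EncapType: "ether", MAC: "52:54:00:12:34:56"},
		{Name: "eth1", EncapType: "ether", MAC: "52:54:00:12:34:57"},
		{Name: "lo", EncapType: "loopback"},
	} {
		fmt.Printf("%s: %v\n", i.Name, s.shouldCapture(i))
	}
}
```

Calling the same predicate from both OnEdgeAdded and OnNodeUpdated keeps the two event handlers consistent.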
Starting LLDP capture
To get the LLDP frames, we capture traffic using AF_PACKET and read at most lldpSnapLen (which is set to 4096) bytes of each packet. We specify a BPF filter to only receive LLDP packets.
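As an illustration of what such a filter can look like, here is a small sketch that builds a pcap-style filter expression. LLDP frames carry the EtherType 0x88cc, which is what the filter matches. The helper name lldpBPFFilter and the optional exclusion of the interface's own MAC address are assumptions of this sketch, not necessarily what the probe does.

```go
package main

import "fmt"

const (
	lldpEtherType = 0x88cc // EtherType assigned to LLDP
	lldpSnapLen   = 4096   // maximum number of bytes read per packet
)

// lldpBPFFilter builds a pcap-style filter expression matching LLDP frames.
// When localMAC is not empty, frames sent by that address are excluded
// (an assumption of this sketch, to ignore frames emitted by the host itself).
func lldpBPFFilter(localMAC string) string {
	filter := fmt.Sprintf("ether proto 0x%04x", lldpEtherType)
	if localMAC != "" {
		filter += " and not ether src " + localMAC
	}
	return filter
}

func main() {
	fmt.Println("snap length:", lldpSnapLen)
	fmt.Println(lldpBPFFilter(""))
	fmt.Println(lldpBPFFilter("52:54:00:12:34:56"))
}
```

The resulting string would still have to be compiled and attached to the capture socket by the capture library.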
We can now receive LLDP packets by calling the Run method with a callback that will receive the packet as argument.
We now need to handle the LLDP packet and enhance the graph with this information. Every LLDP frame is associated with a chassis and a port. So it makes sense to create a node for each. LLDP information will be stored inside the metadata. Let’s prepare the metadata for the chassis:
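To make this step concrete, here is a standard-library-only sketch that extracts the Chassis ID TLV from a raw LLDP payload and turns it into a metadata map. In the probe itself the decoding is done by gopacket, so the hand-written parser is only an illustration of the TLV layout (7-bit type, 9-bit length). The metadata key names (Name, Probe, LLDP, ChassisID, ChassisIDType) are assumptions of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

const (
	tlvEnd            = 0 // End of LLDPDU
	tlvChassisID      = 1 // Chassis ID TLV
	chassisSubtypeMAC = 4 // Chassis ID subtype: MAC address
)

// chassisFromLLDPDU walks the TLVs of an LLDP payload (Ethernet header
// already stripped) and returns the subtype and ID of the Chassis ID TLV.
func chassisFromLLDPDU(pdu []byte) (subtype byte, id []byte, err error) {
	for len(pdu) >= 2 {
		tlvType := pdu[0] >> 1                   // upper 7 bits
		length := int(pdu[0]&1)<<8 | int(pdu[1]) // lower 9 bits
		pdu = pdu[2:]
		if length > len(pdu) {
			return 0, nil, errors.New("truncated TLV")
		}
		value := pdu[:length]
		pdu = pdu[length:]
		switch tlvType {
		case tlvEnd:
			return 0, nil, errors.New("no chassis ID TLV")
		case tlvChassisID:
			if length < 2 {
				return 0, nil, errors.New("chassis ID TLV too short")
			}
			return value[0], value[1:], nil
		}
	}
	return 0, nil, errors.New("no chassis ID TLV")
}

// chassisMetadata builds the metadata map for the chassis node.
// Subtypes other than MAC address are kept as raw text in this sketch.
func chassisMetadata(subtype byte, id []byte) map[string]interface{} {
	idStr := string(id)
	idType := fmt.Sprintf("subtype %d", subtype)
	if subtype == chassisSubtypeMAC && len(id) == 6 {
		idStr = net.HardwareAddr(id).String()
		idType = "MAC Address"
	}
	return map[string]interface{}{
		"Name":  idStr,
		"Probe": "lldp",
		"LLDP": map[string]interface{}{
			"ChassisID":     idStr,
			"ChassisIDType": idType,
		},
	}
}

func main() {
	// Chassis ID TLV (type 1, length 7, subtype 4 + 6-byte MAC), then End TLV.
	pdu := []byte{0x02, 0x07, 0x04, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x00, 0x00}
	subtype, id, err := chassisFromLLDPDU(pdu)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(chassisMetadata(subtype, id))
}
```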
We then retrieve more LLDP info - such as VLANs, Link Aggregation - but we’ll skip this part as it is mostly copying fields from the gopacket structures to the metadata map.
If it’s the first LLDP packet that we received, a new node for this chassis should be created. Otherwise we can just retrieve this existing node and update the metadata with fresh values. To create a node, an ID must be specified along with the metadata. In this case, we need to carefully generate the ID.
The LLDP probe is executed by the Skydive agent. LLDP information is added to the agent graph as described above. The agent will forward its graph to a Skydive analyzer. The analyzer receives the graphs of multiple agents. These agents could be running on machines connected to the same network switch, and therefore receive LLDP from the same chassis. We need to avoid having multiple nodes in the Skydive graph corresponding to the same chassis. One way is to generate the chassis node with an ID that will be the same for all the agents. In our LLDP probe, we use the pair chassis ID/chassis ID type:
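The idea of a deterministic ID can be sketched with a plain hash. SHA-256 is used here as a stand-in for whatever ID generation helper Skydive provides, and chassisNodeID is a hypothetical name. What matters is that two agents seeing the same chassis compute the same ID.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chassisNodeID derives a stable identifier from the chassis ID and its type.
// A zero byte separates the two fields so that ("ab", "c") and ("a", "bc")
// do not hash to the same value.
func chassisNodeID(chassisID, chassisIDType string) string {
	h := sha256.New()
	h.Write([]byte(chassisID))
	h.Write([]byte{0})
	h.Write([]byte(chassisIDType))
	return hex.EncodeToString(h.Sum(nil))[:32] // shortened for readability
}

func main() {
	// Two agents attached to the same switch compute the same ID independently.
	agent1 := chassisNodeID("00:11:22:33:44:55", "MAC Address")
	agent2 := chassisNodeID("00:11:22:33:44:55", "MAC Address")
	fmt.Println(agent1 == agent2)
}
```

Because the ID depends only on data carried in the LLDP frame, no coordination between agents is needed for the analyzer to end up with a single chassis node.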
The node for the port is handled pretty much the same way.
Once we have our two nodes created, we create ownership and L2 links between them:
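A toy version of this linking step, using an in-memory graph instead of the Skydive one. The graph, node and edge types, as well as the relation names "ownership" and "layer2", are assumptions made for this sketch. Linking is idempotent, since LLDP frames are received periodically and the same link would otherwise be added again for every frame.

```go
package main

import "fmt"

// node and edge are minimal stand-ins for graph nodes and edges.
type node struct{ ID string }

type edge struct {
	Parent, Child *node
	RelationType  string
}

// graph is a toy in-memory graph keeping only a list of edges.
type graph struct{ edges []edge }

// areLinked reports whether an edge with this relation already exists.
func (g *graph) areLinked(parent, child *node, relation string) bool {
	for _, e := range g.edges {
		if e.Parent == parent && e.Child == child && e.RelationType == relation {
			return true
		}
	}
	return false
}

// link adds the edge unless it is already present.
func (g *graph) link(parent, child *node, relation string) {
	if !g.areLinked(parent, child, relation) {
		g.edges = append(g.edges, edge{parent, child, relation})
	}
}

func main() {
	g := &graph{}
	chassis := &node{ID: "chassis"}
	port := &node{ID: "port"}

	// Simulate two LLDP frames from the same neighbor.
	for i := 0; i < 2; i++ {
		g.link(chassis, port, "ownership") // the chassis owns the port
		g.link(chassis, port, "layer2")    // and is connected to it at L2
	}
	for _, e := range g.edges {
		fmt.Printf("%s -[%s]-> %s\n", e.Parent.ID, e.RelationType, e.Child.ID)
	}
}
```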
Conclusion
The probe is really new and may contain a few bugs but it already provides useful topology discovery. Please report any issue you may be facing with it.
I hope this post will help you write your own probe for Skydive :-) Contributions are always more than welcome.