On 10 August 2017 at 01:52, Kasper Adel <karim.adel@gmail.com> wrote:
We are pretty new to these new-age network orchestrators and automation,
so I am curious to ask what everyone in the community is doing. Sorry for such a long and broad question.
What is your workflow? What tools are your teams using? What is working and what is not? What do you really like, and what do you need to improve? How mature do you think your process is? Etc., etc.
The wheels here move extremely slowly, so it's slowly, slowly catchy monkey for us. So far we have been using Ansible and GitLab CI, and the current plan is to slowly absorb the existing network, device by device, into the process/toolset.
Wanted to ask and see what approaches the many different teams here are taking!
We are going to start working from a GitLab based workflow.
Projects are created, issues entered and developed with a gitflow branching strategy.
GitLab CI pipelines run package loadings and run tests inside a lab.
Yes, that is the "joy" of GitLab; see below for a more detailed breakdown. We use Docker images to run CI processes, and we can branch and make merge requests which trigger the CI and CD processes. It's not very complicated and it just works. I didn't compare it with the likes of BitBucket; I must admit I just looked at GitLab, saw that it worked, tried it, and stuck with it. No problems so far.
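For anyone who hasn't seen one, a stripped-down .gitlab-ci.yml along these lines gives the shape of such a pipeline. The stage names, image, and script paths here are illustrative, not the actual pipeline from this thread:

```yaml
# Illustrative only: a minimal GitLab CI pipeline of the kind described,
# running checks in a Docker image on every push / merge request.
image: python:3.9

stages:
  - test
  - deploy

syntax-and-semantic-checks:
  stage: test
  script:
    - pip install pyyaml jinja2
    - python scripts/check_yaml.py        # hypothetical custom checks
    - python -m unittest discover tests/

deploy-to-lab:
  stage: deploy
  script:
    - ansible-playbook deploy.yml -l lab  # hypothetical playbook
  when: manual                            # CD gated behind a manual trigger
```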
Tests are usually Python unit tests, run to cover both functional checks and service creation, modification, and removal.
For unit testing we typically use Python libraries to open transactions that perform the service modifications (along with functional tests) against physical lab devices.
Again, see below: physical and virtual devices, plus some custom Python scripts for unit tests, like checking that IPv4/6 addresses are valid (not 999.1.2.3 or AA:BB:HH::1), that AS numbers are valid integers of the right size, etc.
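As a rough sketch of what such validation unit tests can look like (these helper names are made up for illustration, not the actual scripts), the standard library's ipaddress module does most of the work:

```python
# Hedged sketch of input-validation unit tests of the kind described;
# is_valid_ip / is_valid_asn are hypothetical helpers.
import ipaddress
import unittest


def is_valid_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_valid_asn(value):
    """Return True for a valid 32-bit AS number (0..4294967295)."""
    return isinstance(value, int) and 0 <= value <= 4294967295


class TestConfigValues(unittest.TestCase):
    def test_ip_addresses(self):
        self.assertTrue(is_valid_ip("192.0.2.1"))
        self.assertTrue(is_valid_ip("2001:db8::1"))
        self.assertFalse(is_valid_ip("999.1.2.3"))    # octet out of range
        self.assertFalse(is_valid_ip("AA:BB:HH::1"))  # H is not hex

    def test_as_numbers(self):
        self.assertTrue(is_valid_asn(64512))
        self.assertFalse(is_valid_asn(4294967296))    # one past 32 bits


if __name__ == "__main__":
    unittest.main()
```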
For our prod deployment we leverage 'push on green' and gating to push package changes to prod devices.
Thanks
Yeah, that is pretty much my approach too. Device configs are in YAML files (actually multiple files). One git repo stores the constituent YAML files; when you update a file and push to the repo, the CI process starts, which runs syntax checks and semantic checks against the YAML files (some custom Python scripts, basically).

As Saku mentioned, we also follow the "replace entire device config" approach to guarantee the configuration state (or at least "try", when it comes to crazy old IOS). This means we have Jinja2 templates that render YAML files into device-specific CLI config files. They live in a separate repo, and again many constituent Jinja2 files make one entire device template. So any push to this Jinja2 repo triggers a separate CI workflow which performs syntax checking and semantic checking of the Jinja2 templates (again, custom Python scripts).

When one pushes to the YAML repo to update a device config, the syntax and semantic checks are made against the YAML files; they are then "glued" together to make the entire device config in a single file, the Jinja2 repo is checked out, the combined YAML file is used to feed the Jinja2 templates, configs are built, and the vendor-specific config then needs to be syntax checked.

This CD part of the process (to a testing area) is still a WIP: for Junos we can push to a device and use "commit check"; for IOS and others we can't. So right now I'm working on a mixture of pushing the config to virtual IOS devices and to physical kit in the lab, but this causes problems in that interface / line card slot numbers/names will change, so we need to run a few regex statements against the config to jimmy it onto a lab device (pretty ugly, and temporary I hope). When the CD to "testing" passes, the CD to "production" can be manually triggered. Another repo stores the running config of all devices (from the previous push).
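The YAML-feeds-Jinja2 rendering step above can be sketched in a few lines of Python. The template, keys, and values here are invented for illustration; the real templates are obviously far larger:

```python
# Minimal sketch of the YAML -> Jinja2 -> device config step described;
# template text and variable names are made up for illustration.
import yaml                                  # PyYAML
from jinja2 import Environment, StrictUndefined

TEMPLATE = """\
hostname {{ hostname }}
{% for iface in interfaces -%}
interface {{ iface.name }}
 ip address {{ iface.ipv4 }}
{% endfor %}"""

YAML_DATA = """\
hostname: lab-router-1
interfaces:
  - name: GigabitEthernet0/0
    ipv4: 192.0.2.1 255.255.255.0
"""


def render_config(template_text, yaml_text):
    """Glue YAML variables into a Jinja2 template. StrictUndefined makes
    any missing key raise an error, which acts as a cheap semantic check."""
    env = Environment(undefined=StrictUndefined)
    return env.from_string(template_text).render(yaml.safe_load(yaml_text))


print(render_config(TEMPLATE, YAML_DATA))
```

Using StrictUndefined (rather than Jinja2's default of silently rendering missing variables as empty strings) is what lets the CI job fail loudly when the YAML and templates drift apart.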
So we can push the candidate config to a live device (using Ansible with NAPALM [1]) and get a diff against the running config, perform the "config replace" action, then download the running config and put that back into the repo. We thus have a locally stored copy of device configs, so we can see offline the diffs between pushes. It also provides a record that the process of going from YAML > Jinja2 > device produces the config we expected (although prior to this one will have had to make a branch and then a merge request, which is peer reviewed, to get the CD part to run and push to the device, so there shouldn't be any surprises this late in the process!).

Is it foolproof? No. It is a young system, still being designed and developed. Is it better than before? Hell yes.

Cheers,
James.

[1] Ansible and NAPALM here might seem like overkill, but we use Ansible for other stuff like x86 box management, so configuring a server or a router is abstracted through one single tool for the operator (i.e. playbooks are used irrespective of device type, rather than, say, playbooks for servers but Python scripts for firewalls). We also keep YAML config files for the x86 boxes in GitLab with a CI/CD process, so again: one set of tools for all.
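PS: the "diff then config replace" step can also be driven from NAPALM directly, for anyone wanting to experiment outside Ansible. A rough sketch, with hostname, credentials, and driver as placeholder values (this obviously needs a reachable device to actually run):

```python
# Rough sketch of a NAPALM "diff then replace" step; hostname,
# credentials, driver, and filename below are placeholders.
def push_config(hostname, candidate_file, username="admin", password="secret"):
    """Stage a full config replacement, diff it against the running
    config, and commit only when there is something to change."""
    from napalm import get_network_driver  # pip install napalm

    driver = get_network_driver("junos")   # or "ios", "eos", ...
    device = driver(hostname=hostname, username=username, password=password)
    device.open()
    # Stage the rendered candidate config as a full replacement...
    device.load_replace_candidate(filename=candidate_file)
    # ...and inspect the diff against the running config before committing.
    diff = device.compare_config()
    if diff:
        device.commit_config()             # "push on green"
    else:
        device.discard_config()            # nothing to change
    device.close()
    return diff
```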