By Vern Paxson, creator of Zeek (Bro) and co-founder of Corelight
Having worked on Zeek (Bro) for well over two decades now, it’s hugely gratifying – and frankly still somewhat amazing – to see how widely it is used in today’s enterprises. Zeek’s real-time analysis capabilities, extensible scripting, community-contributed packages, and rich, detailed logs clearly provide a great deal of value to sites looking for industrial-strength illumination of their network activity.
While Zeek is fully open-source, running it on your own in a production environment can be very difficult due to performance and management challenges. Indeed, a key value proposition we offer at Corelight is removing those roadblocks for our customers while still providing them with all the richness and power of open-source Zeek. We also know of at least 10 companies that include Zeek in their products, which is terrific – the more companies that use Zeek, the better!
In that light, a recent blog post by Vectra, “Don’t do it: Rolling your own production Zeek deployment,” immediately caught my eye, as it seems to confuse Zeek – a powerful, programmable platform – with a subset of the data fields that Zeek can extract in its default configuration. I found the post to be potentially misleading. It begins:
“In a previous blog, we wrote about the benefits that come with Zeek-formatted metadata. This blog builds on that thread by discussing why our customers come to us as an enterprise solution to support their Zeek deployments.”
Zeek users will heartily agree with the first sentence, that Zeek’s logs provide a great deal of value, both in substance and in how they’re formatted. (The blog post referred to here, presumably “Why network metadata is just right for your data lake,” doesn’t tell you much about the metadata itself. However, you can read about using Zeek logs here, see all of the types of logs supported by open-source Zeek here, and follow a detailed example of how to employ Zeek logs to get insight into encrypted traffic here.)
The second sentence (“… support their Zeek deployments”) intrigued me, because from our past service engagements at Corelight, I know that providing support for enterprise Zeek deployments is itself challenging in many regards. I was interested to learn about how Vectra structures such support engagements.
Reading further, however, it becomes clear that what’s being described is not support for a customer’s existing Zeek deployment, but rather replacing the deployment with Vectra’s solution. That’s certainly a fine business to be in – but one that I know immediately raises a range of technical questions. If I were replacing my existing Zeek deployment, I would want to know:
- What version of Zeek does the vendor provide?
- How deep is their expertise in, and support for, Zeek? Do they contribute to the open-source project?
- Which of the more than 60 log types provided by open-source Zeek are supported? (You can find a subset of Corelight’s list here.)
- Does the vendor support Zeek packages contributed by the community or custom Zeek scripts?
- Do those scripts run using any form of isolation to prevent them from interfering with Zeek’s core operation?
- Which of Zeek’s frameworks – such as those for supporting configuration customizations, extracting files, controlling other network elements, or incorporating threat intelligence – are provided?
The blog post doesn’t discuss any of this information. This had me wondering whether Vectra is truly providing a Zeek-based solution, or employing a different, non-Zeek engine with a much more limited set of functionality. I don’t find any changes attributed to Vectra listed among open-source Zeek contributions, nor any mention of Vectra in the three years of Zeek mailing list archives I checked, nor even “Bro” or “Zeek” topic tags on Vectra’s blog. This leads me to conclude that the alternative explanation of a non-Zeek engine is very likely the reality. On re-reading the blog post, I see what now strikes me as sleight-of-hand in phrasing certain elements to avoid putting this basic fact front and center in the description of their technology. This is not a small distinction: as Corelight’s Greg Bell blogged recently, “Zeek is much more than a data format.”
I’m all for innovators coming up with new approaches and new capabilities. That’s been at the heart of my 20+ years in research. A fundamental rule in doing so is to be clear about what you’re doing, what you’re not doing, and how you are distinct. These are basic requirements when innovating – forms of clarity that, in Vectra’s blog post, I find lacking.
Vern Paxson is the creator of open-source Zeek; professor of Computer Science at UC Berkeley; co-founder and Chief Scientist of Corelight; leader of the Networking and Security Group at the International Computer Science Institute; winner of numerous awards including the IEEE Internet Award, ACM Grace Murray Hopper Award, Facebook Internet Defense Prize, and ACM SIGCOMM Test of Time Award; former Transport Area Director of the IETF; and former chair of the Internet Research Task Force.