Corelight Blog

We make the world's networks safer.

Network Security Monitoring: Your best next move — December 11, 2018

Network Security Monitoring: Your best next move

By Richard Bejtlich, Principal Security Strategist, Corelight

Welcome to the first in a regular series of blog posts on network security monitoring (NSM).

In 2002 Bamm Visscher and I defined NSM as “the collection, analysis, and escalation of indications and warnings to detect and respond to intrusions.” We were inspired by our work in the late 1990s and early 2000s at the Air Force Computer Emergency Response Team (AFCERT), and the operations built on the NSM software written by Todd Heberlein. Although NSM methodology applies to any sort of evidence or environment, these posts will largely describe NSM for network traffic in the enterprise.

As might be appropriate for the first post in a series on NSM, I will explain why I believe NSM is the first step one should take when implementing a security program. This may sound like a bold claim. Shouldn’t one collect logs from all your devices first, or perhaps roll out a shiny new endpoint detection and response (EDR) agent? While those steps may indeed benefit your security posture, they are not the first steps you should take.

In 2001 Bruce Schneier neatly summarized a shared vision for security: “monitor first”. I concur with this strategy, because I advocate basing security decisions on evidence, not faith. In other words, before making changes to one’s security posture, it is more efficient and effective to determine what is happening, and address the resulting discoveries first. My 2005 post Soccer Goal Security expands on this concept.

If one accepts the need to gather evidence, and identify what is happening in one’s environment as a necessary precursor to making changes, then we must determine how best to gather that evidence. Elsewhere I have advocated for four rough categories of intelligence, which I repeat here. They are ordered by increasing difficulty of implementation, but also likely increasing granularity of information.

The first way to identify what is happening in your environment is to rely on third party notification. As Mandiant’s M-Trends reports have been documenting for years, as of 2018, 38% of the firm’s incident response workload began with the victim learning of an intrusion via a third party. This is a cheap way to get insights into your security posture, as law enforcement, or worse, reporter Brian Krebs, is acting as your free threat intelligence provider. However, you are already days or weeks behind the intruder, and you must soon hire a consultancy to instrument and protect your network. It is important to maintain good relations with law enforcement and the media, but you should not rely on them for network intelligence.

The second method, and the focus of this blog series, is network security monitoring. Begin by deploying an NSM sensor collecting, at a minimum, Zeek data at the gateway connecting your environment to the public Internet. This will see so-called “north-south” traffic (visibility for “east-west” traffic will be covered in a later post). By collecting NSM data, one has not interrupted daily IT operations or users, other than perhaps a brief outage to install a network tap. If administrators decide to (temporarily) use a switch SPAN port to see network traffic, users will suffer no interruption of service whatsoever. With a simple deployment, security teams gather a wealth of data about their environment and threat activity. I will address the specific benefits in future posts.

The third method is to collect logs from systems, servers, infrastructure, and other devices throughout the network. This step requires deploying not only a log management platform to collect, store, and present the data, but also reconfiguring each device to send its logs to the log management platform. Unlike the NSM deployment, installing and configuring a log management system is a demanding project. While the benefits are ultimately worthwhile, the project is much more involved, hence its status as the third step one should take.

The fourth way to learn about threat activity in the enterprise is to instrument the endpoints with an EDR agent. This is an even bigger project than the log collection effort, as the EDR agent could interfere with business operations while trying to observe and possibly interdict malicious activity. As with log management, I am not arguing against EDR. EDR is a tool that yields wonderful benefits for visibility and control. EDR is especially attractive the more mobile and distributed one’s workforce is, and the greater the amount of encrypted network traffic one encounters. However, the level of effort and return associated with NSM means I prefer network-centric visibility strategies before installing log management or EDR.

At this point you may ask “isn’t third party visibility the first step when trying to learn about threat activity? You listed NSM as second!” That is true, but I don’t consider third parties a reliable method, or an especially proactive one. When called by the FBI, one should be able to reply “yes, thank you for calling, but I already detected the activity and we are handling it now.”

Some of you may also ask “how can NSM be first, when I already have a security program?” In that case, I suggest you make “NSM next!” In other words, augment your existing environment with NSM, and let the data help guide future security decisions.

Finally, you might ask if this is a workable solution. Has anyone ever done this? I’ve used or recommended the methodology in this blog series at dozens of organizations, from small start-ups of fewer than 100 people to the largest corporate entities, with half a million identities under management and a global presence.

In future posts I will expand upon all things NSM. I look forward to you joining me on this journey.

The last BroCon. It’ll be Zeek in 2019! — November 5, 2018

The last BroCon. It’ll be Zeek in 2019!

By Robin Sommer, CTO at Corelight and member of the Zeek Leadership Team

I’m back in San Francisco after the last ever BroCon! Why the last BroCon? Because the Bro Leadership Team has announced a new name for the project. After two years of discussion, no shortage of suggestions, and a final shortlist going through legal review, it was time to commit: It’ll be Zeek! For an explanation of the rationale & background behind the choice, make sure to read Vern Paxson’s blog post or watch him skillfully reveal the new name at the conference.

By holding BroCon in the Washington DC area this year, we were hoping to broaden participation—and that worked: 260 people attended, up over 35% from last year. We also had the support of eleven corporate sponsors—more than ever!—which we deeply appreciate. These companies offered attendees a chance to learn about a variety of products and services helping people use and implement Zeek, either in its open source form or as part of commercial offerings.

I think BroCon’s program was particularly strong this year. Marcus Ranum kicked it off with an entertaining and provocative keynote. The main technical program then offered a terrific set of presentations covering a variety of organizations and topics. Some of the conference highlights for me were:

  1. The sheer number of use cases. In the sessions, we saw things like:
    1. using weirds to diagnose split routing problems
    2. using the conn_long log to identify exfiltration / C2 / rogue IT activity
    3. using JA3S to extend SSL fingerprinting to the server side
    4. using SMB logs to find named pipes in the Belgacom attack.  
  2. Watching Salesforce and Morgan Stanley stand up and explain how they use Bro to defend themselves was inspirational.
  3. The depth of technical expertise among attendees was really impressive. Folks keep pushing the boundary of how to scale Zeek clusters and come up with clever use cases of its various frameworks.
  4. Selling Bro posters to benefit Girls Who Code was fantastic.
  5. Vern’s “Zeek” name reveal moment and the positive reception of the name change by the broader community.

We received permission to record most of the talks and are currently editing the material to synchronize videos with slide sets. As soon as that’s finished, we’ll upload them to the Bro YouTube channel.

As we look to next year, the Zeek Leadership Team will begin planning the 2019 event soon. If you attended this year, please take a moment to fill out the attendee survey; you should have received a link to provide us with feedback about the program and logistics. In 2019, we’ll also hold another European workshop. Registration details will come soon, but you can save the date already: We’ll be at CERN, Switzerland, from April 9-11.

Lastly, it will take some time to really make the change from Bro to Zeek. The soon-to-be-released version 2.6 will still be “Bro”—from then on it’ll be “Zeek.” Over the coming weeks and months you will start seeing changes, but rest assured we’ll be careful: There’s a lot to update, and we certainly don’t want to break your deployments.

Thanks for attending the last ever BroCon!


Log enrichment with DNS host names — October 25, 2018

Log enrichment with DNS host names

By Christian Kreibich, Senior Engineer, Corelight

One of the first tasks for any incident responder when looking at network logs is to figure out the host names that were associated with an IP address in prior network activity. With Corelight’s 1.15 release we help automate that process, and I would like to explain how this works.

Zeek (formerly known as Bro) provides a logging framework that gives users great control over summarization and reporting of network activity. Equipped with dozens of logs by default, it provides convenient features to extend these logs with additional fields, filter log entries according to user-defined criteria, create new log types, and hook new activity into logging events. Several log types provide identifiers that allow convenient pivoting from one log type to another, such as conn.log’s UID that many other log types use to link app-layer activity to the underlying TCP/IP flows.
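To make that pivoting concrete, here is a small Python sketch (the log contents are hypothetical, and the field names follow Zeek’s JSON log output) that joins http.log transactions to their underlying conn.log flows via the shared uid field:

```python
import json

def load_log(path):
    """Read a Zeek JSON-format log: one JSON object per line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def pivot_http_to_conn(conn_entries, http_entries):
    """Attach each http.log transaction to its underlying conn.log flow
    via the shared 'uid' identifier."""
    conns = {c["uid"]: c for c in conn_entries if "uid" in c}
    joined = []
    for h in http_entries:
        flow = conns.get(h.get("uid"))
        if flow is not None:
            joined.append({
                "uid": h["uid"],
                "host": h.get("host"),
                "uri": h.get("uri"),
                "orig_bytes": flow.get("orig_bytes"),
                "resp_bytes": flow.get("resp_bytes"),
            })
    return joined
```

In practice one would point load_log at the conn.log and http.log files written by Zeek’s JSON writer; the same join works for any log type that carries the uid.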

Other information is only implicitly linked across log types, so analysts need to reveal it in manual SIEM-based post-processing. One example of such implicitly available information is host naming, which lets analysts look past bare IP addresses to the corresponding (and often revealing) DNS names; a recent example came from Spamhaus’s DBL. While Zeek’s dns.log closely tracks address-name associations, other logs do not repeat this information. Manually establishing the cross-log linkage can prove tedious, since offline resolution of those names generally does not provide accurate results. Instead, one needs to identify the historic name lookups that temporally most closely preceded TCP/IP flows to or from the resulting IP addresses. (Other approaches, such as leveraging HTTP Host headers, also exist, but here we were looking for the most generic approach.)
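That temporal matching can be sketched in a few lines of Python. This is an illustration of the approach only, not the Namecache implementation; the field names follow Zeek’s dns.log JSON output:

```python
import bisect
from collections import defaultdict

def build_dns_index(dns_entries):
    """Map each resolved address to a time-sorted list of (ts, query) pairs,
    built from dns.log entries carrying 'ts', 'query', and 'answers' fields."""
    index = defaultdict(list)
    for d in dns_entries:
        for addr in d.get("answers", []):
            index[addr].append((d["ts"], d["query"]))
    for addr in index:
        index[addr].sort()
    return index

def name_for_flow(index, resp_addr, flow_ts):
    """Return the name from the lookup that most closely preceded the flow,
    or None if the address never appeared in a DNS answer before flow_ts."""
    lookups = index.get(resp_addr, [])
    times = [t for t, _ in lookups]
    pos = bisect.bisect_right(times, flow_ts) - 1
    return lookups[pos][1] if pos >= 0 else None
```

The binary search picks the most recent lookup at or before the flow’s timestamp, which is exactly the “temporally closest preceding lookup” criterion described above.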

Zeek’s stateful network-oriented scripting language makes it ideally suited to automate such linkage: we can enrich desired logs with DNS host names in response to network events unfolding in real time. In Corelight’s 1.15 release we provide this ability via the Namecache feature. When enabled, Zeek starts monitoring forward and reverse DNS name lookups and establishes address-name mappings that allow subsequent conn.log entries to include names and the source of the naming (here, DNS A or PTR queries). For analysts requiring immediate access to host names, conn.log now readily provides this information. The following (slightly pruned) log snippet using Zeek’s JSON format shows an example:
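An illustrative reconstruction of such an enriched entry follows; the enrichment field names shown here (resp_hostname and resp_hostname_source) are assumptions and may differ from the shipped feature:

```json
{
  "ts": 1540482120.94,
  "uid": "CHhAvVGS1DHFjwGM9",
  "id.orig_h": "10.1.2.3",
  "id.resp_h": "192.0.2.80",
  "id.resp_p": 443,
  "proto": "tcp",
  "service": "ssl",
  "resp_hostname": "www.example.com",
  "resp_hostname_source": "DNS_A"
}
```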


Our data analysis shows that for the most relevant addresses — those outside of local networks — Namecache can establish names in more than 90% of log entries. In addition to the conn.log enrichment the feature adds a separate log, reporting operational statistics (powered by the SumStats framework) such as the cache hit rate in various contexts. Starting with the 1.16 release, you’ll see local vs non-local hit rates for your network as well.

None of the above required patching the core Zeek distribution. All functionality exists in the form of new event handlers and state managed via the scripting language. Nevertheless, implementing Namecache posed some interesting technical challenges. Most immediately, Bro’s multiprocessing architecture and flow distribution mean that in a cluster setting (which we do use in our Sensors) the Zeek worker observing a DNS lookup most likely is not the one observing the TCP/IP connection to the resulting IP address. Moreover, since their respective processing is fully asynchronous we also cannot guarantee that processing the DNS query finishes prior to that of the subsequent TCP/IP connection. Finally, to approach global visibility of the address–name mappings, we need to communicate the mappings across the cluster via Bro events, raising questions about event communication patterns, sustainable event rates, and processing races.

One key observation immediately simplified the problem: Zeek writes conn.log entries only when it expires its state for a given flow, i.e., at the very end of the flow’s lifetime. This means we have at least several seconds to propagate naming information for this flow across the cluster before needing to access it.

This left the event flow to tackle. In a first iteration we decided to centralize mapping ownership in the manager process: workers communicate new mappings to the manager process, which propagates additions to other workers and tracks mapping size and age. When mapping state needs to get pruned, the manager sends explicit pruning events to the workers. This proved clearly inferior to a distributed approach where the workers manage mappings autonomously, including expirations, and only communicate new mappings to the manager. The manager in turn only relays additions across the workers, saving the memory needed for an extra copy of the mappings. This approach worked quite well but induced a few percent of packet loss on our most heavily loaded AP-1000 appliances. In a final tweak, we tuned the rate at which workers transmit mapping additions. With this change we no longer observed any operational overhead of the activated Namecache feature while preserving its effectiveness.
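The final design can be modeled abstractly. The following Python sketch is a simplified simulation under assumed semantics, not Corelight’s actual Bro-event implementation: workers own and expire their mappings, batch new additions at a capped rate, and the manager merely relays additions to the other workers.

```python
from collections import OrderedDict

class Worker:
    """Owns its address-to-name mappings, prunes them autonomously, and
    batches new additions for the manager at a capped rate."""
    def __init__(self, max_entries=100000):
        self.mappings = OrderedDict()   # addr -> (name, learned_at)
        self.max_entries = max_entries
        self.outbox = []                # additions awaiting transmission

    def learn(self, addr, name, now):
        if addr not in self.mappings:
            self.outbox.append((addr, name))
        self.mappings[addr] = (name, now)
        while len(self.mappings) > self.max_entries:
            self.mappings.popitem(last=False)   # expire the oldest entry locally

    def drain(self, max_per_flush):
        """Rate limit: transmit at most max_per_flush additions per flush."""
        batch, self.outbox = self.outbox[:max_per_flush], self.outbox[max_per_flush:]
        return batch

class Manager:
    """Keeps no copy of the mappings; only relays additions between workers."""
    def relay(self, sender, batch, workers, now):
        for w in workers:
            if w is not sender:
                for addr, name in batch:
                    # Write directly so relayed mappings do not re-enter an
                    # outbox and echo around the cluster.
                    w.mappings[addr] = (name, now)
```

Note how the manager holds no mapping state of its own, mirroring the memory saving described above, and how drain() caps the transmission rate that the final tuning step adjusted.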

The Namecache feature is only one example of a wide range of log enrichments we envision. We’ll soon migrate the cluster communication to the new Broker framework, add improved multicast DNS support, and we’re considering other sources of naming as well as inverse mappings where names get enriched with corresponding IP addresses.

Network security monitoring vs supply chain backdoors — October 4, 2018

Network security monitoring vs supply chain backdoors

By Richard Bejtlich, Principal Security Strategist, Corelight

On October 4, 2018, Bloomberg published a story titled “The Big Hack: How China Used a Tiny Chip to Infiltrate U.S. Companies,” with a subtitle “The attack by Chinese spies reached almost 30 U.S. companies, including Amazon and Apple, by compromising America’s technology supply chain, according to extensive interviews with government and corporate sources.” From the article:

Since the implants were small, the amount of code they contained was small as well. But they were capable of doing two very important things: telling the device to communicate with one of several anonymous computers elsewhere on the internet that were loaded with more complex code; and preparing the device’s operating system to accept this new code. The illicit chips could do all this because they were connected to the baseboard management controller, a kind of superchip that administrators use to remotely log in to problematic servers, giving them access to the most sensitive code even on machines that have crashed or are turned off.

Companies mentioned in the story deny the details, so this post does not debate the merit of the Bloomberg reporters’ claims. Rather, I prefer to discuss how a computer incident response team (CIRT) and a chief information security officer (CISO) should handle such a possibility. What should be done when hardware-level attacks enabling remote access via the network are possible?

This is not a new question. I have addressed the architecture and practices needed to mitigate this attack model in previous writings. This scenario is a driving force behind my recommendation for network security monitoring (NSM) for any organization running a network, of any kind. This does not mean endpoint-centric security, or other security models, should be abandoned. Rather, my argument shows why NSM offers unique benefits when facing hardware supply chain attacks.

The problem is one of trust and detectability: one loses trust in the integrity of a computing platform when one suspects a compromised hardware environment. One way to validate whether a computing platform is trustworthy is to monitor outside of it, at places where the hardware cannot know it is being monitored, and cannot interfere with that monitoring. Software installed on the hardware is by definition untrustworthy, because the hardware backdoor may have the capability to obscure or degrade the visibility and control provided by an endpoint agent.

Network security monitoring applied outside the hardware platform does not suffer this limitation, if certain safeguards are implemented. NSM suffers limitations unique to its deployment, of course, and they will be outlined shortly. By watching traffic to and from a suspected computing platform, CIRTs have a chance to identify suspicious and malicious activity, such as contact with remote command and control (C2) infrastructure. NSM data on this C2 activity can be collected and stored in many forms, such as any of the seven NSM data types: 1) full content; 2) extracted content; 3) session data; 4) transaction data; 5) statistical data; 6) metadata; and 7) alert data.

Session and transaction data would most likely have been the most useful for the case at hand. Once intelligence agencies identified the command and control infrastructure used by the alleged Chinese agents in this example, they could provide that information to the CIRT, which could then query historical NSM data for connectivity between enterprise assets and C2 servers. The results of those queries would help determine if and when an enterprise was victimized by compromised hardware.
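A retrospective query of that kind is conceptually simple. Here is a minimal Python sketch over conn.log entries in Zeek’s JSON format; the indicator addresses are, of course, hypothetical:

```python
import json

# Hypothetical C2 indicators, e.g. as provided by an intelligence agency.
KNOWN_C2 = {"198.51.100.7", "203.0.113.21"}

def find_c2_contacts(conn_lines, indicators=KNOWN_C2):
    """Scan historical conn.log entries (JSON, one per line) for flows from
    enterprise assets to known command-and-control addresses."""
    hits = []
    for line in conn_lines:
        entry = json.loads(line)
        if entry.get("id.resp_h") in indicators:
            hits.append({
                "ts": entry.get("ts"),
                "victim": entry.get("id.orig_h"),
                "c2": entry["id.resp_h"],
                "bytes_out": entry.get("orig_bytes"),
            })
    return hits
```

In a real investigation this scan would run across months of stored session data; the value lies in the retention, not the query.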

The limitations of this approach are worth noting. First, if the intruders never activated their backdoors, then there would be no evidence of communications with C2 servers. Hardware inspection would be the main way to deal with this problem. Second, the intruders may leverage popular Internet services for their C2. Historical examples include command and control via Twitter, domain fronting via Google or other Web sites, and other covert channels. Depending on the nature of the communication, it would be difficult, though not impossible, to deal with this situation, mainly through careful analysis. Third, traditional network-centric monitoring would be challenging if the intruders employed an out-of-band C2 channel, such as a cellular or radio network. This has been seen in the wild but does not appear to be the case in this incident. Technical countermeasures, whereby rooms are swept for unauthorized signals, would have to be employed. Fourth, it’s possible, albeit unlikely, that NSM sensors tasked with watching for suspicious and malicious activity are themselves hosted on compromised hardware, making their reporting also untrustworthy.

The remedy for the last instance is easier than that for the previous three. Proper architecture and deployment can radically improve the trust one can place in NSM sensors. First, the sensors should not be able to connect to arbitrary systems on the Internet. The most security conscious administrators apply patches and modifications using direct access to trusted local sources, and do not allow access for any reason other than data retrieval and system maintenance. In other words, no one browses Web sites or checks their email from NSM sensors! Second, this moratorium on arbitrary connections should be enforced by firewalls outside the NSM sensors, and any connection attempts that violate the firewall policy should generate a high-priority alert. It is again theoretically possible for an extremely advanced intruder to circumvent these controls, but this approach increases the likelihood of an adversary tripping a wire at some point, revealing his or her presence.
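One way to operationalize such a tripwire, sketched here in Python with hypothetical sensor and allow-list addresses, is to audit conn.log for any flow that a sensor itself initiated:

```python
import json

# Hypothetical addresses: the NSM sensors and the only destinations
# they are ever allowed to contact (e.g. a local update server).
SENSOR_ADDRS = {"10.9.9.10", "10.9.9.11"}
ALLOWED_DESTS = {"10.9.9.1"}

def sensor_egress_alerts(conn_lines):
    """Flag any connection a sensor itself initiated to a destination outside
    the allow-list; such flows should trigger a high-priority alert."""
    alerts = []
    for line in conn_lines:
        e = json.loads(line)
        if e.get("id.orig_h") in SENSOR_ADDRS and e.get("id.resp_h") not in ALLOWED_DESTS:
            alerts.append((e.get("ts"), e["id.orig_h"], e["id.resp_h"]))
    return alerts
```

This check complements, rather than replaces, the external firewall policy: the firewall blocks the connection, and the audit reveals that something on the sensor tried to make it.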

The bottom line is that NSM must be a part of the detection and response strategy for any organization that runs a network. Collecting and analyzing the core NSM data types, in concert with host-based security, integration with third party intelligence, and infrastructure logging, provides the best chance for CIRTs to detect and respond to the sorts of adversaries who escalate their activities to the level of hardware hacking via the supply chain. Whether or not the Bloomberg story is true, the investment in NSM merits the peace of mind a CISO will enjoy when his or her CIRT is equipped with robust network visibility.

Corelight: a recipe I couldn’t refuse — September 26, 2018

Corelight: a recipe I couldn’t refuse

By Joy Bonaguro, Head of People, Ops, and Data, Corelight

It’s hard to beat a mission like transforming government for the 21st Century. That’s what I’ve been doing for more or less my entire professional life. From building information systems in New Orleans both before and after Hurricane Katrina in 2005 to my latest role as Chief Data Officer of San Francisco, my professional life has been dedicated to public service.

So why the private sector? Why now? Why Corelight?

I first met Greg Bell during a meeting in 2011 when he was a division director at Lawrence Berkeley National Laboratory (Berkeley Lab). At that meeting, he turned an aimless discussion into a structured troubleshooting session. I gravitated towards him as a mentor.

Once he became CEO of Corelight, I started to watch closely because I knew that this company had three fundamental ingredients for success that made it worth joining:

Ingredient 1: An incredible technology with a mission that matters

Also in 2011, I first heard about open source Bro, the technology that Corelight is built on, when I had to describe how it worked as part of a job interview at Berkeley Lab. My immediate thoughts were a) awesome interview technique, b) this technology sounds magical, and c) why hasn’t someone built a company on top of it?

I spent the next few years working closely with the cyber team at Berkeley Lab, and in that time I learned how real cybersecurity works. I discovered that it extends far beyond compliance, checklists, and appliance management into a living system of dynamic response, continuous evolution, and learned resilience.

Bro empowered all of this. Whenever I try to describe Bro, I draw the following diagram. Bro extends well beyond signature-based detection (SDS) to behavior-based detection and then to proactive response. Bro is adaptive and scalable.

[Diagram: signature-based detection as a subset of behavior-based detection, extending to proactive response]

Signature-based detection (SDS) is a subset of intrusion detection systems (IDS). Bro encapsulates both of these and is truly an intrusion prevention system (IPS).

Cyber threats are a daily news item. Bro, deployed at scale and with the reliability and ease of Corelight’s solution, is uniquely positioned to help our institutions address the ever-mutable cyber threats so prevalent in our world today. It’s a mission with a global scale.

Ingredient 2: A culture worth waking up to

Peter Drucker is often quoted as saying “culture eats strategy for lunch.” When I interviewed at Corelight, it was like a case study in how NOT to be a stereotypical “Silicon Valley” startup (You may have seen the popular HBO show…this isn’t that).

Yes, the Corelight team is insanely smart with world-class engineers and one of the founders is Vern Paxson, the inventor of Bro. But that’s not the whole story. The ethos of Corelight is meaningful collaboration and low ego. This philosophy is set at the top and reinforced throughout the team. Everyone jumps in and helps. Below are just two emblematic images from my first week at the job.


A broken elevator had everyone chipping in to help with deliveries, including our VP of Finance, Chief Products Officer, and UI Engineer.


Our VP of Engineering brought in some bike oil to tackle our squeaky bathroom door. No more squeaks!

When Greg asked me to help ensure this culture stuck at scale, I could hardly resist. Culture and organizational health are key differentiators in our modern world, where talent is both discerning and mobile.

Ingredient 3: It’s about empowerment, not fear-mongering

Corelight’s tagline is “illuminate your network.” Merriam-Webster’s dictionary defines “illuminate” as ‘to supply or brighten with light, to make luminous or shining.’ Fundamentally, Corelight is about offering a set of tools that empower cybersecurity professionals to do their jobs more effectively and efficiently.

So much of cybersecurity marketing and branding is dominated by fear-mongering: “Do this or you will be in TROUBLE. Bad things are lurking EVERYWHERE. You CAN’T FIX this alone – you need us and we will solve this for you.” In contrast, Corelight is about acknowledging the challenge and empowering you to solve it.

Corelight does this by providing our customers with elegant, beautifully structured, comprehensive data for analysis and response (and much more soon). We don’t conceal the data to create a dependence on Corelight for insights. Instead, we expose it to the professionals who need it – reflecting our open source heritage in the very nature of our product.

The above ingredients added up to a recipe that I could not refuse. I am thrilled to be joining the Corelight team – a team with the talent and skills to continue to build a technology that will empower enterprises around the world. So if you want an amazing, challenging mission PLUS a healthy and empowering culture, join us! We’re always hiring! 😉


Twenty years of network security monitoring: from the AFCERT to Corelight — September 11, 2018

Twenty years of network security monitoring: from the AFCERT to Corelight

By Richard Bejtlich, Principal Security Strategist, Corelight

I am really fired up to join Corelight. I’ve had to keep my involvement with the team a secret since officially starting on July 20th. Why was I so excited about this company? Let me step backwards to help explain my present situation, and forecast the future.

Twenty years ago this month I joined the Air Force Computer Emergency Response Team (AFCERT) at then-Kelly Air Force Base, located in hot but lovely San Antonio, Texas. I was a brand new captain who thought he knew about computers and hacking based on experiences from my teenage years and more recent information operations and traditional intelligence work within the Air Intelligence Agency. I was desperate to join any part of the then-five-year-old Air Force Information Warfare Center (AFIWC) because I sensed it was the most exciting unit on “Security Hill.”

I had misjudged my presumed level of “hacking” knowledge, but I was not mistaken about the exciting life of an AFCERT intrusion detector! I quickly learned the tenets of network security monitoring, enabled by the custom software watching and logging network traffic at every Air Force base. I soon heard there were three organizations that intruders knew to be wary of in the late 1990s: the Fort, i.e. the National Security Agency; the Air Force, thanks to our Automated Security Incident Measurement (ASIM) operation; and the University of California, Berkeley, because of a professor named Vern Paxson and his Bro network security monitoring software.

When I wrote my first book in 2003-2004, The Tao of Network Security Monitoring, I enlisted the help of Christopher Jay Manders to write about Bro 0.8. Bro had the reputation of being very powerful but difficult to stand up. In 2007 I decided to try installing Bro myself, thanks to the introduction of the “brolite” scripts shipped with Bro 1.2.1. That made Bro easier to use, but I didn’t do much analysis with it until I attended the 2009 Bro hands-on workshop. There I met Vern, Robin Sommer, Seth Hall, Christian Kreibich, and other Bro users and developers. I was lost for most of the class, saved only by my knowledge of standard Unix command line tools like sed, awk, and grep! I was able to integrate Bro traffic analysis and logs into my TCP/IP Weapons School 2.0 class, and subsequent versions, which I taught mainly to Black Hat students. By the time I wrote my last book, The Practice of Network Security Monitoring, in 2013, I was heavily relying on Bro logs to demonstrate many sorts of network activity, thanks to the high-fidelity nature of Bro data.

In July of this year, Seth Hall emailed to ask if I might be interested in keynoting the upcoming Bro users conference in Washington, D.C., on October 10-12. I was in a bad mood due to being unhappy with the job I had at that time, and I told him I was useless as a keynote speaker. I followed up with another message shortly after, explained my depressed mindset, and asked how he liked working at Corelight. That led to interviews with the Corelight team and a job offer. The opportunity to work with people who really understood the need for network security monitoring, and were writing the world’s most powerful software to generate NSM data, was so appealing! Now that I’m on the team, I can share how I view Corelight’s contribution to the security challenges we face.

For me, Corelight solves the problems I encountered all those years ago when I first looked at Bro. The Corelight embodiment of Bro is ready to go when you deploy it. It’s developed and maintained by the people who write the code. Furthermore, Bro is front and center, not buried behind someone else’s logo. Why buy this amazing capability from another company when you can work with those who actually conceptualize, develop, and publish the code?

It’s also not just Bro: it’s Bro at ridiculous speeds, ingesting and making sense of complex network traffic. We regularly encounter open source Bro users who spend weeks or months struggling to get their open source deployments to run at the speeds they need, typically in the tens or hundreds of Gbps. Corelight’s offering is optimized at the hardware level to deliver the highest performance, and our team works with customers who want to push Bro to even greater levels.

Finally, working at Corelight gives me the chance to take NSM in many exciting new directions. For years we NSM practitioners have worried about challenges to network-centric approaches, such as encryption, cloud environments, and alert fatigue. At Corelight we are working on answers for all of these, beyond the usual approaches — SSL termination, cloud gateways, and SIEM/SOAR solutions. We will have more to say about this in the future, I’m happy to say!

What challenges do you hope Corelight can solve? Leave a comment or let me know via Twitter to @corelight_inc or @taosecurity.

Corelight’s recent contributions to open-source Bro — August 9, 2018

Corelight’s recent contributions to open-source Bro

By Robin Sommer, CTO at Corelight and Bro development lead

When we founded Corelight in 2013, one of our goals was to build an organization that could sustain open-source Bro development long term. At that time, the core team behind Bro was still funded primarily through grants from the National Science Foundation. One of the underlying assumptions coming with that funding was that, with our work, Bro would become self-supporting: production-quality open-source software could generate a revenue stream to support its own development. Today, five years later, with many of the people who created and maintained Bro over the last two decades working at Corelight, that vision is becoming reality. Corelight is bridging the gap between the open-source software and enterprise environments looking for professional, supported products—they get the expertise of Bro’s creators packaged into a high-quality solution. In return, the success of Corelight enables us to invest heavily in advancing open-source Bro. You can rest assured that our team remains as committed to Bro as ever—no change there. In that spirit, I want to take the opportunity here to talk about a few of our more recent contributions to open-source Bro.

Getting Broker Ready for Production

One of Corelight’s main focus areas over recent months has been the Broker transition for Bro 2.6. While a series of Broker prototypes have existed for a while, the current Bro 2.5 still relies on a decade-old communication framework not designed for today’s network loads. Corelight’s Jon Siwek worked hard to tie together all the loose ends of this work, getting Broker ready for 2.6. He moved all of Bro’s standard scripts over to using Broker, and, during that process, improved many aspects of Broker’s API, implementation, documentation, and regression testing. Internally, Corelight is performing a series of tests to ensure that Bro & Broker, and hence the community’s Bro clusters, operate as expected.

Improving CAF

Jon also spent significant time on the innards of CAF, an actor framework library providing the low-level foundation for Broker. He tracked down and fixed a number of issues that were critically affecting the stability & performance of Bro clusters. As Bro is gearing up for 2.6, CAF will soon be releasing a new stable version that incorporates all these changes, so that the trio of Bro, Broker, and CAF will work together smoothly. For easier installation, we also integrated CAF into Bro’s source code distribution, so that users don’t need to worry about getting a non-standard dependency in place before building Bro.

Supporting Dynamic Reconfiguration

Corelight recently contributed Johanna Amann’s Configuration Framework to Bro, which fills a long-standing gap by providing an easy-to-use, unobtrusive infrastructure for changing Bro’s script-level tuning options dynamically at run-time.

Expanding Logging

We’ve made a number of logging improvements: restructuring DHCP logs (based on a community contribution); expanding connection histories, which now capture TCP window closures and also indicate repeats for a number of situations; and adding rate-limiting of “weirds” to address performance issues when your network exhibits a high level of non-conforming protocol usage. For geo-location, Bro now supports MaxMind’s new GeoIP2 databases, replacing their discontinued legacy format. For Bro’s TLS analysis, we reimplemented OCSP support, which was still missing from the recently contributed port of the analyzer to OpenSSL 1.1. We also added support for Cisco’s FabricPath and PPPoE over QinQ.

Improving Bro’s Scripting Language

We have been working on the scripting language as well: regular expressions can now be case-insensitive, and they can be created dynamically at runtime (which meant tracking down several memory leaks that had previously prevented Bro from allowing this). Sets and vectors have gained new operators, there’s support for bitwise operations, and we wrapped up & merged new functionality for type checking and type-safe casting.

Building & Installing Bro

Bro’s build system now knows to install Bro’s header files so that developers can build plugins without needing a copy of a Bro source tree lying around. We also added better support for cross-compiling Bro.

Bro Packages

As part of the broader Bro ecosystem, Corelight continues to maintain the increasingly popular Bro Package Manager.  We have open-sourced a number of new Bro packages as well:

  • QUIC analyzer/detector parses and detects Google’s implementation of QUIC.
  • Community ID provides a standardized way of labeling traffic flows in network monitors—an approach championed by the Bro and Suricata communities to enable correlation of flows across tools.
  • HTTP Stalling Detector finds stalling DoS attacks taking advantage of web servers’ inability to differentiate legitimate clients connecting over slow links from attackers who deliberately send data slowly to cause extra work.
  • JSON Streaming Logs lets Bro write out JSON logs with additional attributes making life easier for external log shippers such as filebeats, logstash, and splunk_forwarder.
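To give a flavor of what Community ID computes, here is a minimal Python sketch of the version-1 scheme for IPv4 TCP/UDP-style 5-tuples. This is a simplified illustration of the published spec (which also covers IPv6 and ICMP); in practice you would use Bro's or Suricata's built-in support or a reference implementation.

```python
import base64
import hashlib
import socket
import struct

def community_id_v1(saddr, daddr, sport, dport, proto, seed=0):
    """Minimal Community ID v1 for IPv4 TCP/UDP-style 5-tuples."""
    src, dst = socket.inet_aton(saddr), socket.inet_aton(daddr)
    sp, dp = struct.pack("!H", sport), struct.pack("!H", dport)
    # Order the endpoints so both directions of a flow hash identically.
    if (src, sp) > (dst, dp):
        src, dst, sp, dp = dst, src, dp, sp
    # seed + addresses + protocol + padding byte + ports, SHA1, base64.
    data = struct.pack("!H", seed) + src + dst + struct.pack("BB", proto, 0) + sp + dp
    return "1:" + base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")

# Both directions of the same flow map to one ID (proto 6 = TCP):
a = community_id_v1("10.0.0.1", "192.168.1.5", 52482, 443, 6)
b = community_id_v1("192.168.1.5", "10.0.0.1", 443, 52482, 6)
assert a == b and a.startswith("1:")
```

Because every tool that emits the same ID produces the same string for a given flow, an analyst can use it to line up, say, a Bro conn.log entry with a Suricata alert for the same connection.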

BroCon 2018 & Project Leadership

Speaking of community work, the Bro Leadership Team asked Corelight to host BroCon 2018, a task which our Events team has been happy to take on. As Bro’s NSF funding winds down, it is becoming quite challenging for the open-source project to organize large events on its own. Corelight employees are also active members of the Bro Leadership Team. In that role, Corelight staff have focused particularly on finding a new name for the project (still ongoing); moving the project back from Software Freedom Conservancy to ICSI; and being a liaison between the open-source project and the BroCon event team, providing history and context.

And all the little things that make Bro great

While not especially visible, Corelight has probably spent the greatest portion of its time on all the routine maintenance work that—while often hard to notice—is critical for any popular open-source project: fixing bugs & security issues; shepherding and merging community contributions; improving documentation & regression testing; and, crucially, providing users & developers with answers to questions. You can track much of this work on Bro’s public channels, such as mailing lists, the issue tracker, and Git repositories. Some work has to remain behind the scenes, however, such as discussion of security issues as well as interactions involving specifics of peoples’ environments.

We have never been more excited about the project, and are continually gratified and amazed at the way it has grown. Bro has always been popular among its fans, and we strongly believe that as it gets more usable and capable, deployment of Bro will continue to accelerate and really become a fundamental part of the modern security stack.

Databricks + Corelight – A powerful combination for cybersecurity, incident response and threat hunting — July 17, 2018

Databricks + Corelight – A powerful combination for cybersecurity, incident response and threat hunting


By Alan Saldich, CMO, Corelight and Brian Dirking, Sr. Director Partner Marketing, Databricks

Incident response, threat hunting, and cybersecurity in general rely on great data. Just like the rest of the world where virtually everything these days is data-driven, from self-driving cars to personalized medicine, effective security strategies also need to be data-driven.

Whatever security solution, service or business process you implement, it probably relies on data from just four sources: logs, networks, hosts and third party intelligence feeds. Some or all of that data is fed into an analytics stack like Apache Spark™ where advanced analytics, artificial intelligence, machine learning and other techniques can be applied.

When it comes to the data about network traffic though, most organizations are stuck between “not enough data” and “way too much.” The former is normally NetFlow data that provides basic information about network traffic, source / destination IPs, time and date, bytes sent / received and a few other important (but sparse) pieces of information. The latter is typically PCAP (packet capture) where every bit on the wire is stored. Since big networks carry a tremendous amount of data, storing it all is non-trivial and can become expensive and cumbersome.

The Alternative

So what’s the alternative? Well for over 20 years, an open source project called Bro has been used in production at some of the world’s largest organizations and biggest networks, namely government agencies, research universities and very large / web-scale companies.

Bro is a network monitoring framework that ingests a copy of all traffic on a network, parses and analyzes it in real time using its event processing engines and scripts, and then outputs data (“Bro logs”) to some external analytics system like Spark. There are dozens of Bro logs for most common protocols like SMTP, SSL, SMB, DHCP, DNS and many others. Each Bro log contains 10 to 40 fields describing that part of the network traffic.
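To make that concrete, Bro's default log format is tab-separated text with a #fields header line naming the columns, which makes the logs easy to consume programmatically. Here is a small Python sketch using an invented record for illustration (real logs carry many more fields, and production pipelines would typically use a log shipper instead):

```python
# A tiny invented excerpt in Bro's TSV log format.
SAMPLE = (
    "#separator \\x09\n"
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tproto\n"
    "#types\ttime\tstring\taddr\taddr\tenum\n"
    "1531779000.1\tCAjk3L2v\t10.0.0.1\t8.8.8.8\tudp\n"
)

def parse_bro_log(text):
    """Parse a TSV Bro log into dicts, keyed by the #fields header line."""
    fields, rows = None, []
    for line in text.splitlines():
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]          # column names
        elif line.startswith("#") or not line:
            continue                               # other metadata lines
        else:
            rows.append(dict(zip(fields, line.split("\t"))))
    return rows

rows = parse_bro_log(SAMPLE)
assert rows[0]["id.resp_h"] == "8.8.8.8" and rows[0]["proto"] == "udp"
```

The same self-describing header is what lets analytics systems like Spark ingest any Bro log without per-log-type schemas hardcoded up front.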


Furthermore, Bro logs include key pivot points in the data that allow incident responders to follow their instincts or observations by taking advantage of unique identifiers for critical aspects of network traffic: all connections (Connection UID) and files (FUID) are uniquely identified. Along with consistent and precise timestamps across all logs, Bro gives incident responders access to everything they’ll need to resolve most security incidents quickly without having to resort to PCAP except in rare instances.
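As an illustration of that pivoting, once log records are parsed, the shared UID lets an analyst join, say, a DNS record back to the connection it traveled over. The records below are invented for the sketch; real logs carry many more fields:

```python
# Hypothetical already-parsed log records (field values invented).
conn_log = [
    {"uid": "CAjk3L", "id.orig_h": "10.0.0.1", "id.resp_h": "8.8.8.8", "service": "dns"},
    {"uid": "CPq9x7", "id.orig_h": "10.0.0.2", "id.resp_h": "1.2.3.4", "service": "http"},
]
dns_log = [
    {"uid": "CAjk3L", "query": "example.com"},
]

# Index connections by UID, then pivot from each DNS record to its connection.
conn_by_uid = {row["uid"]: row for row in conn_log}
pivots = [
    (conn_by_uid[d["uid"]]["id.orig_h"], d["query"])
    for d in dns_log
    if d["uid"] in conn_by_uid
]
assert pivots == [("10.0.0.1", "example.com")]
```

The same join works across any pair of Bro logs that share the connection UID, and FUIDs provide the analogous pivot for extracted files.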

Corelight was founded by the creators of Bro to deliver network security solutions built on this powerful and widely used framework. Since Corelight does not provide an analytics application, we are excited to work with leading companies like Databricks to combine the power of Bro data with the intelligence of Apache Spark and all of its analytic capability.

The challenge of managing threats in a big data world

Staying abreast of the latest threat isn’t the only challenge. The increasing volume and complexity of threats require security teams to capture and mine mountains of data in order to avoid a breach. Yet, the Security Information and Event Management (SIEM) and threat detection tools they’ve come to rely on were not built with big data in mind, resulting in a number of challenges:

  • Inability to scale cost efficiently
    Companies deploy logging and monitoring devices across their networks, end-user devices and production machines to help detect suspicious behavior. These tools produce petabytes of log data that need to be contextualized and analyzed in real-time. Processing petabytes of data takes significant computing power. Unfortunately, most SIEM tools were built for on-premises environments requiring significant build-outs to meet processing demands. Additionally, most SIEM tools charge customers per GB of data ingested. This makes scaling threat detection tools for large volumes of data incredibly cost-prohibitive.
  • Inability to conduct historic reviews in real-time
    Identifying a cybersecurity breach as soon as it happens is critical to minimizing data theft, damages, and the creation of backdoors. As soon as an event occurs, security analysts need to conduct deep historic analyses to fully investigate the validity and breadth of an attack. Without a means to efficiently scale existing tools, most security teams only have access to a few weeks of historical data. This limits the ability of security teams to identify attacks over long time horizons or conduct forensic reviews in real-time.
  • Abundance of false positives
    Another common challenge is the high volume of false positives produced by SIEM tools. The massive amounts of data captured in OS logs, cloud infrastructure logs, intrusion detection systems and other monitoring devices produce events that in isolation or in connection with other events may signify a compromised network. Most events need further investigation to determine if the threat is legitimate. Relying on individuals to review hundreds of alerts including a large number of false positives results in alert fatigue. Eventually, overwhelmed security teams disregard or overlook events that are in actuality legitimate threats.

In order to effectively detect and remediate threats in today’s environment, security teams need to find a better way to process and correlate massive amounts of real-time and historical data, detect patterns that exist outside pre-defined rules and reduce the number of false positives.

Enhancing threat detection with scalable analytics and AI

Databricks offers security teams a new set of tools to combat the growing challenges of big data and sophisticated threats. Where existing tools fall short, the Databricks Unified Analytics Platform fills the void with a platform for data scientists and cybersecurity analysts to easily build, scale, and deploy real-time analytics and machine learning models in minutes, leading to better detection and remediation.

Databricks complements existing threat detection efforts with the following capabilities:

  • Full enterprise visibility
    Native to the cloud and built on Apache Spark by the original creators of Apache Spark, Databricks is optimized to process large volumes of streaming and historical data for real-time threat analysis and review. Security teams can query petabytes of historical data stretching months or years into the past, making it possible to profile long-term threats and conduct deep forensic reviews to uncover backdoors left behind by hackers. Security teams can also integrate all types of enterprise data – SIEM logs, cloud logs, system security logs, threat feeds, etc – for a more complete view of the threat environment.
  • Proactive threat analytics
    Databricks enables security teams to build predictive threat intelligence with a powerful, easy-to-use platform for developing AI and machine learning models. Data scientists can build machine learning models that better score alerts from SIEM tools reducing reviewer fatigue caused by too many false positives. Data scientists can also use Databricks to build machine learning models that detect anomalous behaviors that exist outside pre-defined rules and known threat patterns.
  • Collaborative investigations
    Interactive notebooks and dashboards enable data scientists, analysts and security teams to collaborate in real-time. Multiple users can run queries, share visualizations and make comments within the same workspace to keep investigations moving forward without interruption.
  • Cost efficient scale
    The Databricks platform is fully managed in the cloud with cost-efficient pricing designed for big data processing. Security teams don’t need to absorb the costly burden of building and maintaining a homegrown cybersecurity analytics platform or paying per GB of data ingested and retained.

How a Fortune 100 company uses Databricks and advanced cybersecurity analytics to combat threats

A leading technology company employs a large cybersecurity operations center to monitor, analyze and investigate trillions of threat signals each day. Data flows in from a diverse set of sources including intrusion detection systems, network infrastructure and server logs, application logs and more, totaling petabytes in size.

When a suspicious event is identified, threat response teams need to run queries in real-time against large historical datasets to verify the extent and validity of a potential breach. To keep pace with the threat environment, the team needed a solution that could deliver:

  • Large data volumes at low latency: Analyze billions of records within seconds
  • Correct and consistent data: Partial and failed writes cannot show up in user queries
  • Fast, flexible queries on current and historical data: Security analysts need to explore petabytes of data with multiple languages (e.g. Python, SQL)

As an example of how customers are using advanced cybersecurity analytics, check out this recent video from Apple.

For a more general explanation of how Corelight and Databricks can be deployed together, Databricks produced this video, which includes an explanation of how they ingest Bro logs as part of their security solution.


The Challenge

It took a team of twenty engineers over six months to build their legacy architecture that consisted of various data lakes, data warehouses, and ETL tools to try to meet these requirements. Even then, the team was only able to store two weeks of data in its data warehouses due to cost, limiting its ability to look backward in time. Furthermore, the data warehouses chosen were not able to run machine learning.


The Solution

Using the Databricks Unified Analytics Platform, the company was able to put its new architecture into production in just two weeks with a team of five engineers.

Their new architecture is simple and performant. End-to-end latency is low (seconds to minutes) and the threat response team saw up to 100x query speed improvements over open source Apache Spark on Parquet. Moreover, using Databricks, the team is now able to run interactive queries on all its historical data — not just two weeks’ worth — making it possible to better detect threats over longer time horizons and conduct deep forensic reviews. They also gain the ability to leverage Apache Spark for machine learning and advanced analytics.


Final Thoughts

As cybercriminals continue to evolve their techniques, cybersecurity teams must evolve how they detect and prevent threats. Comprehensive network traffic extracted by Bro is the highest quality data available to incident responders and threat hunters, and an essential ingredient to the Databricks analytics platform for cybersecurity. Big data analytics and AI offer a new hope for organizations looking to improve their security posture, but choosing the right platform is critical to success.

Download our Cybersecurity Analytics Solution Brief or watch the replay of our recent webinar “Enhancing Threat Detection with Big Data and AI” to learn how Databricks can enhance your security posture, including ingestion of Bro logs for network traffic monitoring.


If you’re not familiar at all with Bro, watch Corelight’s two-minute video.


How Bro logs gave one company better DNS traffic visibility than their DNS servers — June 11, 2018

How Bro logs gave one company better DNS traffic visibility than their DNS servers

By Howard Samuels, Director of Sales Engineering at Corelight

Bro provides enriched network visibility for top organizations around the world, and there are many use cases for Bro logs. The security field uses Bro data for incident response and cyber threat hunting. But Bro log use cases don’t always have to involve finding bad actors, identifying breaches, or attack blueprints. The clean, structured, and enriched data from Bro can also be used simply to provide necessary protocol information not otherwise easily obtained from servers.

In this example, we show how a company obtained DNS visibility and extended the lifespan of a large production deployment of DNS servers with a few Corelight sensors that generate Bro DNS logs. This use case features a utility company with over 50 DNS servers worldwide. A federal governing body told them that DNS lookups to known cyber threat sites were originating from their network, and that they therefore weren’t sufficiently self-governing their DNS activity. Given the importance of safeguarding utility companies from cyber attacks, getting in front of this DNS issue was of paramount importance.

The company had limited DNS network visibility because their DNS servers could not effectively log activity. The DNS logging service on their servers didn’t give enough functional information – and therefore visibility – to identify these malicious communications. Moreover, the data their DNS servers did provide was a flat file with too much noise, and the logging mechanism also had a negative impact on the DNS servers’ overall performance.

They considered upgrading all their DNS servers, but given the cost they discarded this option. They determined that a more comprehensive, faster, and cost-effective solution was to deploy Corelight sensors in their main data centers to obtain enriched Bro DNS logs. With Corelight and Bro they could easily capture both DNS requests and the answers to queries, and quickly stream them to a SIEM. The benefit was immediately clear when an analyst who had previously tried to identify non-authoritative DNS lookup records was able to achieve this easily with the logs provided by Corelight sensors.

The utility company used a packet broker to distribute raw traffic to the Corelight sensors, which then do the heavy lifting of extracting the DNS information into a log or data stream for export to their SIEM. The solution architecture is simply network taps sending traffic into a packet broker. Behind the broker is a Corelight sensor. The packet broker filters are configured to send only DNS traffic to the Corelight sensor. The enriched logs are spooled from the Corelight sensor to a datastore and ultimately consumed in a SIEM. Problem solved, customer happy, money saved, and the technology team are now heroes. Their security team is now looking at network logs and will look at expanding their use to the SOC.
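The downstream check that caught the governing body's concern can be sketched in a few lines of Python: match parsed dns.log queries against a threat-intelligence domain list, counting subdomains of listed domains as hits. The domain names and field set below are invented for illustration:

```python
# Hypothetical threat-intel domains (invented for this sketch).
BLOCKLIST = {"evil.example", "malware.test"}

def is_flagged(query, blocklist=BLOCKLIST):
    """True if a DNS query matches a listed domain or any of its subdomains."""
    labels = query.lower().rstrip(".").split(".")
    # Check every suffix: c2.evil.example -> evil.example -> example
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))

# Invented dns.log records; real entries carry answers, rcodes, etc.
dns_log = [
    {"uid": "Cx1", "query": "www.corelight.com"},
    {"uid": "Cx2", "query": "c2.evil.example"},
]
hits = [r for r in dns_log if is_flagged(r["query"])]
assert [r["uid"] for r in hits] == ["Cx2"]
```

Because each hit carries the connection UID, the analyst can pivot straight back to the full connection record for the offending lookup.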

Thinking more broadly beyond DNS, consider other crucial network services: how hard is it to keep accurate, timestamped logs, and can you get enriched data from these services? What if network service logging for DHCP, Kerberos, or RADIUS is set to OFF, FATAL, WARN, or ERROR – i.e., not nearly enough data? Or to ALL, DEBUG, or TRACE – too much noise? What if INFO misses the data you need? What if there is only one logging level, and therefore nothing is configurable? Bro provides better network data that can spare operational cycles and extend the life of network services, thereby saving money and, just as importantly, reducing operational headaches.

Another cool thing about Bro: SMB analysis! — May 29, 2018

Another cool thing about Bro: SMB analysis!

By James Schweitzer, Federal Solution Engineer at Corelight

If you’re reading this blog, you probably know that Bro can uncover indicators of compromise and discover adversary lateral movement by monitoring east-west traffic within the enterprise. But you may not know about one of the best sources of data for this purpose: the Bro server message block (SMB) logs. Bro’s SMB protocol analyzer has undergone several iterations, and it is now a built-in feature that many Bro users might have overlooked. If you are running Bro 2.5, all that is needed is to manually load the SMB policy (for example, by adding @load policy/protocols/smb to your local.bro).


SMB is used for many purposes. Most users of Windows networks rely on SMB every day when accessing files on network drives, and network administrators use the same protocol when they perform remote administration. Unfortunately, the adversary, whether script kiddies or nation-state actors, also uses SMB! By the way, do you know whether SMBv1 is running on your network… and how can you be sure?
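One way to answer that question from log data: once SMB sessions are parsed into records, a quick pass flags the hosts still negotiating the legacy dialect. This sketch is hypothetical; the records and the "version" field name are invented for illustration rather than taken from a specific Bro log:

```python
# Invented session records; the "version" field is illustrative only.
sessions = [
    {"uid": "C1", "id.orig_h": "10.0.0.5", "version": "SMB1"},
    {"uid": "C2", "id.orig_h": "10.0.0.9", "version": "SMB2"},
    {"uid": "C3", "id.orig_h": "10.0.0.5", "version": "SMB1"},
]

# Deduplicated inventory of hosts still speaking SMBv1.
legacy_hosts = sorted({s["id.orig_h"] for s in sessions if s["version"] == "SMB1"})
assert legacy_hosts == ["10.0.0.5"]
```

A list like this turns "are we sure SMBv1 is gone?" into a concrete, checkable report rather than a guess.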

The video that accompanies this blog provides an introduction to the power of Corelight’s advanced filtering and the content contained in Bro’s SMB logs to monitor SMB usage for remote scheduled tasks and file access. If you use Bro to monitor SMB, please share tips here so others can benefit – if you don’t use Bro, would you like to learn how it transforms raw network traffic into comprehensive, organized logs? If you are interested in learning more detail about Bro’s ability to detect malicious activity hidden in SMB, this SANS paper is a great place to start.

I hope you enjoy this short introductory video. Good luck and good hunting!