Reading ZIP (JAR) Files with YARA (part 1) – YARA Post #4

ZIP files have a well-defined structure, which makes it possible to use YARA to match certain characteristics of files stored within a ZIP archive. For example, file name information is stored in both the local file headers and the central directory file headers within the archive. Wikipedia has a decent write-up on the ZIP file format structure.

I first started experimenting with this to examine the contents of Java archives (JAR files, which are essentially ZIP archives) for specific files (in this case bundled log4j jars). To do this I first defined the marker for the local file header $zip_header = {50 4B 03 04} in the strings section, followed by the files I was interested in locating in the ZIP. See the rule below:

rule vuln_log4j_jar_name_injar : log4j_vulnerable {
    strings:
        $zip_header = {50 4B 03 04}
        $a00 = "log4j-core-2.0-alpha1.jar"
        $a01 = "log4j-core-2.0-alpha2.jar"
        // …
        $a41 = "log4j-core-2.14.0.jar"
        $a42 = "log4j-core-2.14.1.jar"
            
    condition:
        // iterate over local file zip headers
        for any i in (1..#zip_header):
        (
            // match any of the file names
            for any of ($a*):
            (
                $ in (@zip_header[i]+30..@zip_header[i]+30+uint16(@zip_header[i]+26))
            )
        )
}

In the condition section we introduced some new capabilities. We iterate over the string matches for the $zip_header string. The variable #zip_header (note the #) gives us the count of the matches; for any i in (1..#zip_header): (…) iterates over the matches (populating i for each match), while the @zip_header[i] syntax (note the @) lets us reference the offset in the file of each match.

From the ZIP format specification we know that the file name begins at offset 30 from the start of the local file header, and that its length is stored in the two bytes at offset 26. Given this information we can read the file name length using uint16(@zip_header[i]+26), and then check whether any of the file names we are looking for (the $a* strings) fall within the range starting at offset 30 and extending for that length.
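To make the offset arithmetic concrete, here is a small Python sketch (not YARA) that walks the local file headers of an in-memory ZIP and extracts each file name using the same offsets the rule relies on: the name length at header offset 26 and the name itself at offset 30. The helper name local_file_names is my own, purely for illustration.

```python
import io
import struct
import zipfile

def local_file_names(data: bytes):
    """Yield file names by walking ZIP local file headers (PK\\x03\\x04),
    mirroring the offsets the YARA rule relies on."""
    offset = 0
    while True:
        offset = data.find(b"PK\x03\x04", offset)
        if offset == -1:
            break
        # file name length: two little-endian bytes at header offset 26
        name_len = struct.unpack_from("<H", data, offset + 26)[0]
        # file name begins at header offset 30
        yield data[offset + 30 : offset + 30 + name_len].decode("utf-8")
        offset += 4

# Build a small in-memory "JAR" to run the sketch against
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("lib/log4j-core-2.14.1.jar", b"dummy")

names = list(local_file_names(buf.getvalue()))
```

Note that a naive scan like this (and the YARA rule itself) will also match the PK\x03\x04 bytes if they happen to occur inside compressed file contents, which is an acceptable trade-off for a detection rule.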

In part 2 I’ll dig into some other interesting things we can look for in the zip headers.

XProtect & YARA – YARA Post #3

In yesterday’s post I covered an example of a YARA rule from AT&T’s Alien Labs used to detect payloads used by BlackCat Ransomware. Today I’ll take a quick look at one of the YARA rules used by Apple in XProtect to help protect macOS devices.

Apple’s XProtect uses YARA rules to deliver “signature-based detection and removal of malware” as described in their security guide Protecting against malware in macOS.

On my version of macOS these signatures are located in /Library/Apple/System/Library/CoreServices/XProtect.bundle/Contents/Resources/XProtect.yara. In this file Apple uses a technique called private rules to define reusable YARA rules that can be referenced in the conditions of other rules.

For example, one of these rules identifies Portable Executables:

private rule PE
{
    meta:
        description = "private rule to match PE binaries"

    condition:
        uint16(0) == 0x5a4d and uint32(uint32(0x3C)) == 0x4550
}

This rule is a little different from the condition used in yesterday’s BlackCat rule, which just used uint16(0) == 0x5A4D to identify the executable. Here Apple’s rule performs a couple of lookups, at specific offsets defined in the PE file format specification, to identify that this is a PE file:

After the MS-DOS stub, at the file offset specified at offset 0x3c, is a 4-byte signature that identifies the file as a PE format image file. This signature is “PE\0\0” (the letters “P” and “E” followed by two null bytes).

PE Format
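That double lookup is easy to sketch in Python. The helper below (looks_like_pe is my own illustrative name, not anything Apple ships) checks for the "MZ" bytes at offset 0, then follows the 4-byte pointer stored at 0x3C to find the "PE\0\0" signature. The synthetic buffer at the end is just enough header to satisfy the checks, not a real executable.

```python
import struct

def looks_like_pe(data: bytes) -> bool:
    """Rough Python equivalent of the private PE rule:
    uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x4550."""
    if len(data) < 0x40 or struct.unpack_from("<H", data, 0)[0] != 0x5A4D:
        return False
    # the 4 bytes at 0x3C hold the file offset of the PE signature
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]
    if pe_offset + 4 > len(data):
        return False
    # "PE\0\0" read as a little-endian uint32 is 0x00004550
    return struct.unpack_from("<I", data, pe_offset)[0] == 0x4550

# Minimal synthetic header for illustration (not a real executable)
stub = bytearray(0x44)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x40)  # pointer at 0x3C -> 0x40
stub[0x40:0x44] = b"PE\x00\x00"
```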

If you are interested in seeing how Apple identifies Mach-O files (the executable format used by macOS and iOS), they have a private rule for that as well:

private rule Macho
{
    meta:
        description = "private rule to match Mach-O binaries"
    condition:
        uint32(0) == 0xfeedface or uint32(0) == 0xcefaedfe or uint32(0) == 0xfeedfacf or uint32(0) == 0xcffaedfe or uint32(0) == 0xcafebabe or uint32(0) == 0xbebafeca

}
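Those six constants cover the 32-bit and 64-bit Mach-O magic numbers in both byte orders, plus the fat (universal) binary magic. Since YARA's uint32(0) is a little-endian read, both byte orders of each magic appear in the rule. A quick Python sketch of the same check (looks_like_macho is my own illustrative name):

```python
import struct

# The six magic values from the XProtect rule, as YARA's little-endian
# uint32(0) would read them: 32/64-bit Mach-O in both byte orders,
# plus the fat (universal) binary magic
MACHO_MAGICS = {
    0xFEEDFACE, 0xCEFAEDFE,  # 32-bit Mach-O
    0xFEEDFACF, 0xCFFAEDFE,  # 64-bit Mach-O
    0xCAFEBABE, 0xBEBAFECA,  # fat / universal binary
}

def looks_like_macho(data: bytes) -> bool:
    if len(data) < 4:
        return False
    return struct.unpack_from("<I", data, 0)[0] in MACHO_MAGICS
```

For example, a 64-bit Mach-O on disk begins with the bytes cf fa ed fe, which a little-endian read decodes to 0xFEEDFACF.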

These private rules can be used within the condition sections of the public rules that XProtect also uses. For example, in the XProtect_MACOS_51f7dde rule we can see that the rule’s condition references the Macho private rule, an upper bound on the file size, and the presence of multiple strings (defined using hexadecimal notation in the strings section):

rule XProtect_MACOS_51f7dde
{
    meta:
        description = "MACOS.51f7dde"
    strings:

        $a = { 63 6F 6D 2E 72 65 66 6F 67 2E 76 69 65 77 65 72 }
        $b = { 53 6D 6F 6B 65 43 6F 6E 74 72 6F 6C 6C 65 72 }
        $c1 = { 75 70 64 61 74 65 53 6D 6F 6B 65 53 74 61 74 75 73 }
        $c2 = { 70 61 75 73 65 53 6D 6F 6B 65 3A }
        $c3 = { 72 65 73 75 6D 65 53 6D 6F 6B 65 3A }
        $c4 = { 73 74 6F 70 53 6D 6F 6B 65 3A }
    condition:
        Macho and filesize < 2MB and all of them
}

This shows how commonly used elements of a rule can be defined separately and then composed with other conditions to simplify the resulting rules.

YARA BlackCat Payload Example – YARA Post #2

In my first post in this series, Day 1 of the 12 Days of YARA, I shared some resources that can help you get started with YARA. YARA is a rules-based language that allows for pattern matching against a file (or a process’s memory). This is a useful tool that can describe a malware sample in a way that matches both a specific sample and (if properly generalized) similar samples. A generalized rule may continue to match related samples even if they change slightly from version to version. This is in contrast to cryptographic hash based approaches (SHA1, SHA256, MD5) to detection, where minor changes in the malware produce an entirely different hash value.

YARA rules are quite simple in terms of structure. A rule typically consists of the strings that will match file contents, and boolean conditions that determine whether the rule matches or not. There’s a good overview of how to write a YARA rule in the documentation that is worth reading if you are unfamiliar with how the rules are structured.

When I learn something I like to learn by reading the documentation as well as examples published by others. This helps me get a better sense of the idiomatic use of a language. Luckily there are lots of examples published that you can use to see different approaches people have taken.

In this post I’ll reference one of the YARA rules provided by AT&T Alien Labs in their post about BlackCat ransomware earlier this year:

rule BlackCat : WindowsMalware {
    meta:
        author = "AlienLabs"
        description = "Detects BlackCat payloads."
        SHA256 = "6660d0e87a142ab1bde4521d9c6f5e148490b05a57c71122e28280b35452e896"
    strings:
        $rust = "/rust/" ascii wide
        $a0 = "vssadmin.exe Delete Shadows /all /quietshadow" ascii
        $a1 = "bcdedit /set {default}bcdedit /set {default} recoveryenabled No" ascii wide
        $a2 = "Services\\LanmanServer\\Parameters /v MaxMpxCt /d 65535" ascii wide
        $a3 = ".onion/?access-key=${ACCESS_KEY}" ascii wide
        $b0 = "config_id" ascii
        $b1 = "public_key" ascii
        $b2 = "extension" ascii
        $b3 = "note_file_name" ascii
        $b4 = "enable_esxi_vm_kill" ascii
        $b5 = "enable_esxi_vm_snapshot_kill" ascii
    condition:
        uint16(0) == 0x5A4D and filesize < 5MB and $rust and 2 of ($a*) and 3 of ($b*)

}

Taking a few moments to examine this, a few things begin to pop out.

First, the meta section provides information about the origin of the rule: in this case the authorship, a description, and the hash of the Windows BlackCat payload. While the hash in the metadata is not used in the rule itself, it is useful for looking up more information about the file in other sources (like VirusTotal).

Second, in the strings section the author has grouped related sets of strings together with the same prefix ($a, $b). When we look in the condition section we can see that YARA allows the use of wildcards to identify these different sets of strings and apply different conditional logic to them.

Third, in the condition section we can see how the author uses both the strings defined earlier, as well as some other conditional statements:

  • uint16(0) == 0x5A4D. The uint16(0) reads the unsigned 16-bit integer at the start of the file (offset 0) and compares it with 0x5A4D, the magic number indicating this is an executable file on Windows. YARA does include a PE module that allows for fine-grained inspection of the attributes of portable executable files, but looking for a specific marker at a known offset (as in this case) may be more efficient from an execution perspective if the rule doesn’t need that level of granularity.
  • filesize < 5MB. Here the authors include a file size limit. This helps optimise scanning by quickly excluding very large files (above the 5MB specified) so that the scan can concentrate on the right set of files.
  • $rust and 2 of ($a*) and 3 of ($b*). Here the author uses the strings defined earlier. These sets of strings relate to characteristics that are less likely to change over time. For example, commands to delete shadow copies are a common characteristic of ransomware, so strings related to that operation are less likely to change than strings related to non-core aspects of the malware.
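As a quick sanity check on the first of these conditions: YARA’s uint16 performs a little-endian read, so 0x5A4D corresponds to the ASCII bytes "MZ" that begin a Windows executable. A one-line Python sketch:

```python
import struct

# YARA's uint16(0) is a little-endian read, so the "MZ" bytes at the
# start of a Windows executable decode to the value 0x5A4D
magic = struct.unpack("<H", b"MZ")[0]
print(hex(magic))  # → 0x5a4d
```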

Stay tuned for Day 3 tomorrow!

Kicking Off the YARA Series – YARA Post #1

For the last few years I have been working in Product Management at Rubrik. One of the offerings I recently launched was the ability to scan backups of different systems looking for Indicators of Compromise (IOCs). These IOCs are intended to help identify systems that have been compromised and are showing signs of malicious activity.

When an IOC is file based, and you have access to the backups of a system, you essentially have a time-series history of that system that you can scan for those IOCs. This can help you identify details about the initial infection, such as when it first landed, without relying on the primary system being available. At Rubrik we introduced support for scanning system backups for IOCs using YARA rules (as well as hashes and file patterns).

You can begin learning more about YARA from the project page and from the documentation. In this series of blog posts I will share a somewhat eclectic collection of tips, tricks, and resources all about YARA, plus a few things I’ve picked up along the way.

Stay tuned and I hope you had a Merry Christmas!

Managing Data at Scale in VMware and Hybrid Cloud Environments

Thanks to the VMware User Group I was recently able to share some of my thoughts on managing data at scale across VMware and Cloud environments. In the session I shared some stories covering how operators were managing data using VMware capabilities like vSphere’s DRS and Storage Policies, as well as concepts like Rubrik’s SLA Domains. I covered some interesting topics and customer stories, including:

  • Imperative and declarative automation approaches
  • Policy driven management
  • Application of machine learning to data security
  • Managing data across edge, core, and cloud

If this sounds like your kind of thing then watch the webinar on Managing Data at Scale in a VMware and Hybrid Cloud Environment on-demand.

VMworld 2018 Session Recommendations

VMworld 2018 is just a few short weeks away at this point. Many of those reading this post will no doubt have already filled out their schedules. For those of you who have procrastinated, however, here are a few sessions that I am looking forward to. To make it interesting I’m limiting my recommendations to one per day; while at the show I fully believe you should also take advantage of mingling with others in the community and browsing the show floor to get a sense of some of the innovation happening around the ecosystem.

Sunday – Demystifying vSAN Management for the Traditional Storage Administrator [HCI1475QU]

As a fan of vSAN, and having listened to Pete Koehler on many topics, I’m sure this will be a great session for anyone looking to get a handle on how vSAN differs from traditional storage.

Monday – Application modernization with VMware Cloud on AWS [HYP2145BUS]

I don’t think I’m going to be able to watch this one live due to other commitments but will be eagerly watching the replay. I’ve presented with Wen before and also watched Aarthi present so I know this will be a great session for anyone attending.

Tuesday – VMware NSX for Service Providers: A Technical View [HYP2406BU]

Service provider networking is an interesting beast. If networking is your thing then this promises to be an interesting session: you can always trust Ray to get into the details, and I expect Tina to bring the service provider perspective into the mix.

Wednesday – Confluent Platform: Introduction and Deployment on PKS [CODE5593U]

There are a lot of excellent sessions happening on Wednesday; one that is a little outside my ordinary area, though, is this one on running Confluent on top of Pivotal Container Service. It should be an interesting change from the usual VMworld topics.

Wednesday Bonus – Ransomware Threat Recovery Using Rubrik Polaris [SAI3712BUS]

I’m going to cheat and share another session on Wednesday, just because I know it’s going to be cool and it covers one of the latest capabilities from Rubrik (my employer), presented by a couple of excellent presenters. Promises to be enlightening!

If you’d like to learn more about Polaris before this session, check out the Polaris announcement blog post.

Thursday – Architecting at the Tactical Edge with VMware vSAN and vRealize [HCI1691BU]

I’ve had a bit of an inside view into what has been happening behind the scenes for this session. It’s going to be interesting to hear about some of the more challenging aspects of this project, and how they were addressed. Promises to be an informative and interesting session with some good presenters!

Other Sessions

If the sessions above aren’t enough to fill your schedule there are several more excellent sessions being presented at VMworld this year. Here are a few of my favorite speakers, any of their sessions should be worth your time if you like to skew a bit more technical in your tastes:

  • Rebecca Fitzhugh – has an awesome array of presentations this year, all of which will no doubt be amazing
  • Duncan Epping – let’s just say he knows how to present and is not shy of addressing both the technical details and high level perspectives
  • Christian Dickmann – enjoy listening to his thoughts on simplifying operational management
  • Cody Hosterman – if vSphere storage is your thing, you’ll be at home
  • Christos Karamanolis – always interesting to listen to his forward looking thoughts

There are of course many other great presenters, but hey this list is getting long already!

If you’re attending VMworld this year, have a great time! If you want to connect with me at the conference, feel free to reach out on Twitter @BenMeadowcroft.

Thoughts on Product Management

On my last day at VMware I was pulled aside by Glenn Sizemore, who interviewed me for the “career day” episode of the vSpeaking podcast. Glenn asked me a few questions about my role, and I thought it would be helpful to write up my responses and add a bit more detail for people who are interested in the Product Management role.

How would I describe Product Management?

Product Management for enterprise software is about building the right products: products that provide real, demonstrable value to the customer. As a Product Manager you have to be able to get to grips with the underlying business challenges faced by customers. Keeping everyone aligned on that north star ensures that, as a team, we spend our effort building something that customers both value and are willing to pay for.

How did I decide to get into Product?

I trace my Product Management roots back to my time at a small startup company in the UK called Mobysoft. When I joined as the first full-time employee the company was much smaller than it is now. Being involved at an early stage gave me the opportunity to take on a lot of responsibility, build the engineering team, and develop a new service called RentSense.

In the early days of developing the new service I got to work closely with customers to understand the challenges they were facing and ensure that the product my team were developing was going to hit the mark. That was my first experience on the Product side of the fence and I definitely wanted more.

I decided at this point that I wanted to transition my career from engineering to Product Management. I made the move to the USA to pursue my MBA and began my transition into Product Management.

3 things I love about Product Management?

First, I love the satisfaction of seeing something through from beginning to end. Being able to work with customers to identify their needs and then work on bringing to market technology solutions to solve those challenges and close that loop is hugely satisfying for me.

Second, the people. I get to work with some incredibly intelligent peers across multiple disciplines. As a former engineer I always appreciate being able to work with high-caliber engineers, and I have been incredibly honored to have worked with some exceptionally talented people during my time at AWS, VMware, and now Rubrik. Sharing the context of customers’ pain points with the engineering teams is one of the things that I think Product Managers absolutely need to do. Ensuring that customer empathy is baked into the product throughout its execution is how good products are forged.

Third, as a self-confessed data geek, I love the opportunity to dive into data. Direct customer interaction is critical to gathering insights, but qualitative insight has to be married to quantitative analysis. Without this combination it’s all too easy to fall into the trap of building a great solution for just one customer and becoming a consultancy rather than a Product company.

Something people don’t tell you before taking the job?

Probably the biggest surprise for me was the opportunity to work collaboratively across many different teams. Not just cross-functionally with the teams that were involved in delivering the same product, but also with teams across the company working on a variety of different initiatives. Ensuring that as a PM you remain focused is critical, but being open to working with adjacent teams (both within and outside the company) can bring a lot of leverage in delivering value to customers.

Some Thoughts on VMware Cloud on AWS Stretched Clusters

Companies are considering a variety of migration strategies as they look to leverage the cloud. For VMware Cloud on AWS (VMC), migration is one of the key use cases that VMware have promoted (alongside Disaster Recovery). A key benefit touted by VMware for their offering is the ability to re-host applications without having to re-platform or re-architect; however, this is not without caveats when it comes to availability and resiliency.

For a customer migrating to the cloud, delivering the right level of resiliency and availability is a key concern. On AWS the Availability Zone is a key building block for designing available architectures. For customers who are willing to re-architect their application, designing the application to ensure resiliency in the face of an AZ loss is critical, as well as ensuring customers are eligible for AWS SLA credits in the event of an EC2 outage! But what options are available for delivering multi-AZ availability when pursuing a re-host migration strategy?

For VMware Cloud on AWS, this re-host capability has also come with one of the offering’s most significant limitations: when customers provision a new SDDC it can only be placed within a single Availability Zone (AZ). The combination of vSphere HA, vSAN’s erasure coding, and VMC’s auto-remediation of failed hosts ensures that failures of individual bare metal EC2 instances are handled well. However, there remains the issue of protecting against the failure of an entire Availability Zone.

With the unveiling of a technology preview of their new stretched clustering capability, VMware is presenting a differentiated offering. Stretched networking from NSX and stretched storage from vSAN combine with vSphere HA to deliver a platform that is resilient against AZ failure, without having to re-architect or re-platform your application to take advantage of multiple Availability Zones. On the vSAN side, the increased cost of mirroring the storage is now offset by the introduction of deduplication and compression support. More details were shared during VMware’s recent Cloud Briefing event, and I also spoke about VMware’s plans during my VMC storage deep-dive session at VMworld.

It will be interesting to see how VMware’s customers evaluate this new offering when it moves out of tech preview status and into General Availability.

VMware Site Recovery VMworld 2017 Session

During VMworld 2017 I shared a tech preview, with GS Khalsa, of the VMware Site Recovery service that’s now available as an add-on to VMware Cloud on AWS. While we’ve already made several enhancements to the service, over and above what you’ll see in the tech preview, I think it still illustrates many of the exciting new options available with VMware Site Recovery today!

Check out the session online.