MITRE ATT&CK: The Good, The Bad and the Ugly

Part 3: The Ugly - and a New Way to Assess Vendor Performance

Nov 19, 2024

Introduction

In the first two parts of our MITRE ATT&CK series, we explored the Good—how it has revolutionized security tool assessments—and the Bad, highlighting its lack of standardized scoring mechanisms. Now, let’s tackle the Ugly: how vendors cherry-pick their performance statistics in MITRE ATT&CK evaluations to inflate their capabilities.

Let’s be clear: MITRE ATT&CK is an incredibly valuable tool, but the way some vendors present their results can create a false sense of security. Many organizations may be unaware of how selective marketing can skew their perception of a vendor’s true capabilities.

This post will:

Demystify vendor practices
Offer real-world examples of cherry-picking
Provide a DIY framework for interpreting results
Introduce a utility function to contextualize performance across the attack lifecycle

The Ugly: Cherry-Picking MITRE ATT&CK Stats

When vendors participate in MITRE ATT&CK evaluations, they gain insights into their detection capabilities across different adversary techniques. However, the absence of a standardized scoring framework allows for selective reporting. Vendors often highlight their strongest performance areas while downplaying or omitting their weaknesses.

This creates a distorted narrative, where organizations are misled into believing a vendor provides comprehensive protection when, in reality, significant gaps remain.

Examples of Cherry-Picking

Example 1: Overemphasizing Initial Access Detection
Vendor A markets its superior detection of initial access techniques, boasting a 95% detection rate. However, its detection declines sharply in subsequent phases like lateral movement, where the rate falls to 50%. By focusing solely on initial access, Vendor A misleads customers into thinking it provides end-to-end protection.

Example 2: Ignoring Post-Compromise Failures
Vendor B highlights strong performance in credential dumping detection, claiming this makes them ideal for preventing account takeovers. But their persistence detection—the ability to identify attackers who maintain a foothold—lags far behind. Without acknowledging this weakness, their marketing creates a false sense of security.

Example 3: Selective Reporting on Lateral Movement
Vendor C focuses its marketing on lateral movement detection, showcasing an 85% success rate. However, the product leans heavily on predefined attack signatures, so its behavioral analytics falter when faced with novel tactics, significantly lowering its real-world effectiveness.

DIY MITRE: A Framework for Self-Service Analysis

Security teams must adopt a hands-on approach to MITRE ATT&CK evaluations to navigate the noise. Fortunately, MITRE provides interactive tools allowing users to independently analyze vendor performance.

Demystifying Vendor Stats: A Guide to Evaluating MITRE ATT&CK Results

Now that we’ve seen how vendors can cherry-pick their results, it’s time to equip yourself with the tools to see through the noise. Here’s a set of evaluation criteria to help you make sense of MITRE ATT&CK results and avoid falling into the marketing trap.

1. Context Is Everything

What was the vendor’s solution configured for? Vendors often tune their solutions specifically for the MITRE evaluation, which may not reflect real-world performance. Ask whether the solution was tested under standard deployment conditions or optimized for the test.

2. Don’t Focus Solely on Detection Rates

Look at detection across the entire kill chain, not just one phase. Many vendors perform well in early stages like initial access or privilege escalation but struggle in critical areas like lateral movement or persistence. Check the vendor’s performance across multiple stages to ensure they offer comprehensive coverage.

3. Beware of “Top-Line” Numbers

Ask for the full report, not just the marketing highlights. Vendors often pick one or two impressive statistics and promote those while ignoring weaker areas. Request the complete MITRE evaluation report and analyze the vendor’s detection capabilities across all techniques.

4. Check for Behavioral Analytics

Did the vendor rely solely on signature-based detection? Signature-based detection is great for known threats, but in today’s world of sophisticated attackers, behavioral detection is crucial. Look for how well the vendor’s solution handles novel or unknown attacks.

5. Understand the Trade-Offs

What did the vendor sacrifice to perform well? Sometimes, vendors tweak their configurations for maximum performance in the MITRE ATT&CK evaluation, but this may not be sustainable in a real-world environment. Ask how the vendor’s solution performs under day-to-day operations with real data and dynamic threats.

6. Look for Gaps in Specific Techniques

Are there any techniques or tactics the vendor struggled with? Vendors will often gloss over areas where their product didn’t perform well. Pay close attention to any gaps in detection for techniques like lateral movement, persistence, or exfiltration, as these are key stages for an attacker to escalate their operation.
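One way to make this check concrete is to scan a vendor’s per-stage detection rates for anything that falls under a floor you choose. The sketch below is only an illustration: the stage names, the rates, the find_gaps helper, and the 0.70 threshold are all hypothetical, not MITRE data or a MITRE API.

```python
def find_gaps(detection_rates, threshold=0.70):
    """Return (stage, rate) pairs below the threshold, weakest stage first."""
    gaps = [(stage, rate) for stage, rate in detection_rates.items() if rate < threshold]
    return sorted(gaps, key=lambda item: item[1])


# Hypothetical per-stage detection rates for a single vendor.
vendor_rates = {
    "initial_access": 0.95,
    "persistence": 0.60,
    "lateral_movement": 0.50,
    "exfiltration": 0.55,
}

for stage, rate in find_gaps(vendor_rates):
    print(f"{stage}: {rate:.0%}")
```

The threshold is a judgment call. Set it to whatever your team considers an unacceptable blind spot for the stages that matter most to you.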

7. Ask About Automation and Manual Intervention

Did the vendor rely on manual tuning during the test? Some vendors rely on heavy manual intervention to perform well in the MITRE evaluations. Ensure that their product offers automated capabilities in a real-world environment, especially if your team lacks the resources to manually fine-tune detections.

DIY MITRE Analysis

Start by accessing MITRE’s evaluation results for various scenarios. For example, the 2023 Turla Evaluation provides insights into vendor performance across multiple attack stages.

Through the interface, you can:

Filter by evaluation and scenario.
Select specific vendors for side-by-side comparisons.
Examine detection capabilities across stages like initial access, persistence, and lateral movement.

Figure 1: Landing Page

Figure 2: Side by Side Vendor Comparison (anonymized)

While it isn’t an exact science, working through the results yourself is the best way to learn which direct questions to ask and what you actually need from your endpoint security solution. I highly recommend trying this out.

An Economical Approach: The Utility Function

Pardon the pun, but the Economics major in me couldn’t help it. There’s another, more structured way to look at MITRE evaluations, which is to assess the value to your organization through a utility function.

Enter the utility function:

U_Vendor = (Σ_{i=1..n} W_i × D_i) / (Σ_{i=1..n} W_i)

Where:
D_i: Detection rate for stage i.
W_i: Weight assigned to stage i, reflecting its criticality.
n: Total number of evaluated stages.

This formula allows organizations to calculate a weighted utility score for vendors, providing a holistic view of their performance.
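For readers who prefer code to notation, here is a minimal sketch of the same weighted average in Python. The function name utility_score and its input checks are my own choices for illustration; only the formula itself comes from the definition above.

```python
def utility_score(weights, detection_rates):
    """Weighted average of per-stage detection rates: sum(W_i * D_i) / sum(W_i)."""
    if len(weights) != len(detection_rates):
        raise ValueError("weights and detection_rates must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(w * d for w, d in zip(weights, detection_rates))
    return weighted_sum / total_weight
```

Because the score divides by the sum of the weights, the weights do not have to add up to 1. Relative importance values such as 1, 2, 3 give the same ranking as their normalized equivalents.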

Utility Function in Action
To illustrate, let’s look at two hypothetical vendors evaluated across five attack stages, using the stage weights and detection rates shown in the calculations below:

Utility Score Calculation:
Vendor A:
U_Vendor A = [(0.15×0.95) + (0.20×0.80) + (0.25×0.50) + (0.20×0.60) + (0.20×0.55)] / 1.00 = 0.6575

Vendor B:
U_Vendor B = [(0.15×0.90) + (0.20×0.70) + (0.25×0.85) + (0.20×0.65) + (0.20×0.75)] / 1.00 = 0.7675

Interpretation:
Vendor B’s higher utility score (0.7675) indicates stronger overall performance across critical attack stages compared to Vendor A (0.6575). By weighting stages based on their importance, the utility function provides a more balanced and realistic evaluation of vendor capabilities.
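The two scores can be reproduced with a few lines of Python. This sketch reuses the weights and detection rates from the calculations above. The last part is my own addition: it puts all of the weight on the first stage listed, which is what a single headline statistic amounts to, to show how the ranking can flip.

```python
# Weights and detection rates copied from the worked calculations above.
WEIGHTS = [0.15, 0.20, 0.25, 0.20, 0.20]
VENDOR_A = [0.95, 0.80, 0.50, 0.60, 0.55]
VENDOR_B = [0.90, 0.70, 0.85, 0.65, 0.75]


def utility_score(weights, rates):
    """Weighted average: sum(W_i * D_i) / sum(W_i)."""
    return sum(w * d for w, d in zip(weights, rates)) / sum(weights)


score_a = utility_score(WEIGHTS, VENDOR_A)
score_b = utility_score(WEIGHTS, VENDOR_B)
print(f"Vendor A: {score_a:.4f}")
print(f"Vendor B: {score_b:.4f}")

# Assumption for illustration: a "headline" view that counts only the first stage.
HEADLINE_WEIGHTS = [1, 0, 0, 0, 0]
headline_a = utility_score(HEADLINE_WEIGHTS, VENDOR_A)
headline_b = utility_score(HEADLINE_WEIGHTS, VENDOR_B)
print(f"First stage only, A vs B: {headline_a:.2f} vs {headline_b:.2f}")
```

With the full set of weights, Vendor B comes out ahead. With all the weight on the first stage, Vendor A does. Same data, different story, which is exactly how cherry-picking works.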

Conclusion: Don’t Let the Numbers Fool You

MITRE ATT&CK evaluations offer invaluable insights, but without proper context, the numbers can be misleading. Vendors often cherry-pick data to highlight their strengths while masking their weaknesses. By leveraging MITRE’s DIY tools and applying the utility function, security teams can bypass marketing spin and evaluate vendors on their true merits.

Stay informed, dig into the data, and always question the numbers. Because in cybersecurity, the Ugly truth often hides in plain sight.

Stay secure, and stay curious!
– Damien

Note: The views expressed here are my own and are not endorsements or critiques of MITRE or any specific vendor.

Damien’s Substack
