SEPTEMBER 2024 | Volume 45, Issue 3
Cobalt Strike: A Cyber Assessment Challenge | ITEA Journal
Senior Operator
Director, Operational Test & Evaluation,
Cyber Assessment Program,
Advanced Cyber Operations Team
Alexandria, Virginia
Assessment of cyber tooling serves a critical role in the procurement process of red team tools; however, once a tool is vetted and approved for use by a red team, it is incorporated into the team's steady state operations. As a result, approved tools may not undergo routine in-depth cyber assessments as newer versions are released. This presents a major concern for the red team community because new versions can change the operational security (OPSEC) of those tools. Similarly, cyber defenders, whether through lack of training or limited resources, have been known to upload red team payloads to commercial malware analysis platforms, which inadvertently releases potentially sensitive information about red team operations. In this paper, we discuss red team cyber tooling, present an in-depth analysis of Cobalt Strike versions 4.8+, and provide recommendations on evaluating cyber tooling.
Keywords: Cobalt Strike; Cyber Assessment; Red Team; Cyber Tooling
DoD Red Teams
As part of the Department of Defense (DoD) red team certification and accreditation process, each red team must adhere to a standard set of qualifications to become and remain a certified red team. This process includes having established policies and procedures in place for assessing cyber tooling. While the teams are certified against the same standards, each team has relative autonomy on how they operate. As a result, these processes can vary drastically from one red team to another. For some red teams, the process is very structured and formal while others have less stringent policies in place. Regardless of formality, the assessment process generally follows these steps:
This assessment process applies to initial tool acquisition as well as the reevaluation of cyber tooling for functionality and OPSEC considerations as new versions become available. Major software releases go through the team’s established review processes, whereas minor revisions will go through a shorter review process. During the tool reevaluation process, a varying level of OPSEC testing is performed. Primarily, this testing is performed to identify OPSEC concerns with tool behavior. For example, the evaluator may identify what artifacts are placed on the target system, information useful for cleanup after utilizing the tool, and other considerations which may impact the decision to utilize the tool during an operation.
Another consideration during the initial tool procurement or reevaluation periods is whether the tool is considered “established.” Established tools, those which are well-known and vetted, are typically purchasable without issue and allowed for use in red team operations. While established tools may still go through variable levels of assessment, depending on the team, the review may not always be the most stringent. This may also be the case as major and minor updated versions of tools are released. For example, Cobalt Strike has been a crucial component in most red teams’ toolkits since its initial release in 2012 (Mudge 2016a). Cobalt Strike is adversary simulation software which allows red teams to generate payloads, called Beacons, and perform post-exploitation activities modeling the behavior of advanced adversaries (Fortra 2023a). Since its initial release, Cobalt Strike has become the de facto command and control (C2) platform for red teams across the DoD.
Cobalt Strike, like most software products, is regularly updated to new versions with new features including quality of life changes, OPSEC changes, and changes in usage. However, as new versions are released, red teams must ensure that Cobalt Strike is assessed for OPSEC changes that could potentially impact operations. Commercial software such as Cobalt Strike is closed source, meaning the source code is not available for analysis. As a result, red teams must rely on the vendor, in this case, Fortra, to identify what has changed between versions. Without adequate testing, the teams may not know the full measure of changes, including pertinent changes to the OPSEC of the tool.
Table 1 includes the following symbols to explain concepts relating to this research.
Symbol | Description |
⊕ | Exclusive-OR (XOR), a bitwise logical operation. If the two input bits are the same, the result is 0 (false). If the two input bits are different, the result is 1 (true). |
0x | Denotes hexadecimal values. |
Table 1: Symbols and their meanings used in this article.
In March 2023, Fortra released a minor version of Cobalt Strike, version 4.8. In this version, Cobalt Strike was updated with pertinent changes to the functionality and OPSEC of the tool. The first update changed how Beacon’s reflective dynamic-link library (DLL) import table is obfuscated. In previous versions of Cobalt Strike, a static one-byte exclusive-or (XOR) key was utilized for obfuscation. However, this static key allowed Beacon to be easily identified and caught by cyber defenders. In version 4.8, Beacon was updated to implement a multi-byte XOR key, specifically, a four-byte key. Unlike the previously static key, version 4.8 improves upon this process by ensuring the four-byte key is random for every Beacon. The second update of note in version 4.8 is the implementation of guardrails. Guardrails allow operators to configure the Beacon to execute and run within specific bounds (Darwin 2023a). If the system does not meet the configured guardrail conditions, the Beacon will not execute. The following guardrails can be configured:
Cobalt Strike version 4.9, a minor release launched in September 2023, expanded on the functionality added in version 4.8. Of the changes in version 4.9, the changes most relevant to this research involved updates to Beacon’s Sleep Mask which handles how the Beacon process is protected while running in memory (Fortra 2023d). Likewise, there were changes to Cobalt Strike’s Malleable C2 profiles which added options to replace strings and perform cleanup of Cobalt Strike payloads (Darwin 2023b). Cobalt Strike’s Malleable C2 is a program that offers operators the ability to configure how Beacon operates in memory, performs process injection, and communicates over Hypertext Transfer Protocol (HTTP)/ Hypertext Transfer Protocol Secure (HTTPS) (Fortra 2023b).
The release of Cobalt Strike versions 4.8 and 4.9 changed how Beacons are structured and obfuscated. These changes rendered existing tooling (see Table 2) useless for analyzing Beacons generated from these versions. To overcome this obstacle, the authors developed a Python script to analyze and extract information from beacons. The extracted information from a beacon is called the Beacon configuration.
The Beacon configuration is a core piece of Beacon which identifies key configurable elements such as the C2 address, process injection techniques, post exploitation options, HTTP communication techniques, and other key pieces of information. Specifically, the Beacon configuration instructs Beacon on where to connect and how to connect to the Cobalt Strike C2 server. This configuration is created when a beacon is generated and is stored obfuscated within all Cobalt Strike Beacons. Referenced in the Introduction, one specific case involved a security researcher that was able to extract the Beacon configuration from the Beacon that was submitted to the malware analysis platform by a DoD red team. Part of what prompted the research contained in this article were the sensitive details contained in the exposed Beacon configuration and the attribution to DoD red teams.
The Beacon configuration includes the following key pieces of OPSEC and Tactics, Techniques, and Procedures (TTPs) related information:
The Beacon configuration is 6,144 bytes in size and always starts with the same sequence of bytes depending on the type of Beacon. Each type of Beacon contains different information relating to its type. Beacons can have the following types: HTTP, HTTPS, Domain Name System (DNS), Server Message Block (SMB), Transmission Control Protocol (TCP), and Secure Shell (SSH). The primary difference among Beacons of different types is that the configuration will contain information related to its corresponding method of communication. For example, a DNS Beacon will contain configuration options and information relating to how Beacon communicates over DNS. The Beacon configuration is an important construct of the Beacon technology. Beacon utilizes this configuration to identify where it needs to connect for communication and how it should execute on a target host.
In Cobalt Strike version 4.8, a new configuration was added to Beacon. This is a separate configuration that is obfuscated differently than the Beacon configuration. ACO’s research found that the Guardrail configuration is instantiated as follows:
0x8A ⊕ (Guardrail configuration ⊕ Reversed Beacon configuration)
In this process, the Beacon configuration is read in reverse (from the last byte to the first byte) and XOR’ed with the Guardrail configuration. The result of this operation is then XOR’ed with a static one-byte key, hex: 0x8A. This obfuscated data is then stored after the Beacon configuration. The total size of the Guardrail configuration is 2,048 bytes. Since the guardrail data itself is relatively small, the rest of the Guardrail configuration space is filled with randomly generated bytes. Beacon’s Guardrail configuration typically contains the type of guardrails that were configured, whether a wildcard character was used, and a checksum of the guardrail data.
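The expression above can be sketched in Python as follows. This is a minimal illustration only, not ACO's EXSCAPE code: the function name `xor_guardrail` is hypothetical, and the text does not say how the 2,048-byte Guardrail configuration is aligned against the 6,144-byte reversed Beacon configuration, so the sketch assumes that the first bytes of the reversed configuration are used.

```python
STATIC_KEY = 0x8A          # static one-byte key given in the text
GUARDRAIL_SIZE = 2048      # Guardrail configuration size in bytes
BEACON_CONFIG_SIZE = 6144  # Beacon configuration size in bytes


def xor_guardrail(data: bytes, beacon_config: bytes) -> bytes:
    """Compute 0x8A XOR (data XOR reversed Beacon configuration).

    XOR is its own inverse, so the same function both obfuscates and
    de-obfuscates. ASSUMPTION: only the first len(data) bytes of the
    reversed configuration are used; the alignment is not specified
    in the article.
    """
    reversed_config = beacon_config[::-1]  # last byte first
    if len(data) > len(reversed_config):
        raise ValueError("data is longer than the Beacon configuration")
    return bytes(d ^ r ^ STATIC_KEY for d, r in zip(data, reversed_config))


if __name__ == "__main__":
    # Toy inputs standing in for real configurations.
    beacon_config = bytes(range(256)) * (BEACON_CONFIG_SIZE // 256)
    guardrail_config = b"labadmin".ljust(GUARDRAIL_SIZE, b"\x00")

    obfuscated = xor_guardrail(guardrail_config, beacon_config)
    restored = xor_guardrail(obfuscated, beacon_config)
    print(restored == guardrail_config)
```

Because the static key and the Beacon configuration are both recoverable from the payload, anyone who has already extracted the Beacon configuration can undo this layer, which is consistent with the OPSEC concern described in the text.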
The Beacon Guardrail configuration added a new layer of complexity to the analysis of a beacon. The Guardrail configuration obfuscation process implements a static XOR key and utilizes the value of the guardrail in the obfuscation process. Pertinent OPSEC information can be gleaned from the Guardrail configuration if guardrails are actively used by a red team. This information can include and reveal targeted user accounts, domains, IP addresses, and hostnames. An adversary could leverage this information to identify sensitive information about DoD networks.
Commercial and DoD cyber defenders have historically relied on malware analysis platforms such as VirusTotal, which are associated with realized OPSEC risks.
In one specific case, a cyber defender uploaded a red team’s Cobalt Strike Beacon to a malware analysis website during an assessment, which led to the extraction of information from the payload. This information was subsequently shared online across multiple malware analysis platforms. A security researcher found the Beacon on a malware analysis platform and analyzed it. The researcher’s analysis attributed the activity to an unspecified DoD red team based solely on the extracted information. This situation brought to light two questions: 1) How are DoD certified and accredited red teams assessing their cyber tooling? and 2) Are red teams doing enough to evaluate their tooling for OPSEC?
The Office of the Director, Operational Test and Evaluation’s (DOT&E) Cyber Assessment Program’s (CAP) Advanced Cyber Operations (ACO) team was tasked with identifying OPSEC concerns surrounding Cobalt Strike’s Beacon after finding and identifying a beacon (B1) that was submitted to a commercial malware analysis platform during a red team assessment. During this process, a separate Beacon (B2), from a different red team assessment, was found on a commercial malware analysis platform. B2 was identified by the security researcher online which resulted in the Beacon configuration being exposed across multiple malware analysis platforms. This specific Beacon, B2, led to research into finding open-source Beacon analysis tools. ACO identified a list of analysis tools useful for analyzing Beacons, shown in Table 2.
Tool Name | Project Webpage |
1768 | https://github.com/DidierStevens/DidierStevensSuite/blob/master/1768.py |
CobaltStrikeParser | https://github.com/Sentinel-One/CobaltStrikeParser |
Cobalt Strike Configuration Extractor | https://github.com/strozfriedberg/cobaltstrike-config-extractor |
dissect.cobaltstrike | https://github.com/fox-it/dissect.cobaltstrike |
Table 2 – A list of open-source Cobalt Strike Beacon analysis tools.
Utilizing these tools, ACO was able to reproduce the results discovered by the security researcher for B2. B1, however, was generated with Cobalt Strike version 4.8 and none of the identified Beacon analysis tools were able to extract the configuration. ACO identified B1 as a stageless Beacon. A stageless payload is simply a payload which contains the Beacon configuration and the Beacon reflective DLL in a single executable file. This is in direct contrast to a staged Beacon which uses a stager, the initial payload, that connects to the Cobalt Strike server, downloads the Beacon payload, and executes it (Mudge 2016b). Further testing allowed ACO to extract the reflective DLL payload from the stageless B1. Using the open-source Beacon analysis tools, ACO was able to retrieve the Beacon configuration from the extracted reflective DLL in B1.
The changes in Cobalt Strike version 4.8 were substantial enough to render open-source Beacon analysis tools unreliable when analyzing stageless Beacons and, in some cases, the reflective DLL payload. The open-source analysis tools primarily include functionality to either brute force a single-byte XOR key or only test for the static, well-known single-byte XOR key utilized in versions prior to 4.8. The hex values of the previously implemented single-byte XOR keys include 0x69 and 0x2E (OA Labs 2022). As of 22 August 2024, these open-source Beacon analysis tools have not been updated to support this change to the random multi-byte XOR key in stageless Beacons. After in-depth analysis of Cobalt Strike Beacons generated with versions 4.8+, ACO identified several critical pieces of information regarding Beacon and the use of XOR keys to obfuscate the reflective DLL in stageless Beacons.
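The practical difference between a one-byte key and a random four-byte key can be illustrated with a short Python sketch. It is not taken from any of the tools in Table 2; the helper names and the `marker` byte string are hypothetical stand-ins for whatever pattern an analyst expects to find in de-obfuscated data.

```python
KNOWN_SINGLE_BYTE_KEYS = (0x69, 0x2E)  # keys cited in the text for versions prior to 4.8


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR data with a key that repeats cyclically."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def brute_force_single_byte(blob: bytes, marker: bytes) -> list:
    """Try all 256 one-byte keys; return those whose output contains marker."""
    return [k for k in range(256) if marker in xor_repeating(blob, bytes([k]))]


if __name__ == "__main__":
    # Hypothetical marker and toy plaintext, for illustration only.
    marker = b"example-config-marker"
    plaintext = b"\x00\x01" + marker + b"\x02"

    one_byte_blob = xor_repeating(plaintext, bytes([0x2E]))
    four_byte_blob = xor_repeating(plaintext, b"\x11\x22\x33\x44")

    # A one-byte search finds the key for the first blob but not the second.
    print([hex(k) for k in brute_force_single_byte(one_byte_blob, marker)])
    print(brute_force_single_byte(four_byte_blob, marker))

    # Key space: 2**8 candidates for one byte versus 2**32 for four bytes.
    print(2 ** 8, 2 ** 32)
```

A tool that only tests the known static keys, or that only iterates over 256 candidates, has no path to a key drawn at random from roughly 4.3 billion possibilities, which is why the change in version 4.8 defeats that approach for stageless Beacons.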
By default, a stageless Beacon generated with Cobalt Strike version 4.8 and 4.9 will use a randomly generated 4-byte XOR key for obfuscation. The obfuscated payload stage, or reflective DLL payload, still implements a static, one-byte XOR key (0x2E). Since this static key still exists, the open-source analysis tools can extract the Beacon configuration from the reflective DLL payload. This changes, however, if guardrails are configured by a red team operator. The Cobalt Strike Beacon function for bitwise XOR operations is the following:
IP Address ⊕ (Domain ⊕ (Computer name ⊕ (Username ⊕ (0x2E ⊕ Beacon configuration))))
If no guardrails are configured, the only operation performed is 0x2E ⊕ Beacon Configuration because the reflective DLL payload is only obfuscated using the 0x2E key. If guardrails are configured, the XOR key of the reflective DLL payload changes. The first operation XORs the Beacon configuration with 0x2E. That result is then XOR’ed with each configured guardrail string. For example, if a single guardrail is configured with the username ‘labadmin’, the operation is (‘labadmin’ ⊕ (0x2E ⊕ Beacon Configuration)). Because XOR is associative, this is equivalent to XOR’ing the Beacon configuration with the combined key ‘labadmin’ ⊕ 0x2E, in which each byte of the string is XOR’ed with 0x2E: 0x424F4C4F4A434740. This value serves as the XOR key within the reflective DLL payload and can be used to de-obfuscate the Beacon configuration. When a single guardrail is configured, the resulting XOR key can also be used to retrieve the original guardrail string: the reverse operation, 0x2E ⊕ 0x424F4C4F4A434740, returns the string ‘labadmin’.
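The ‘labadmin’ arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: `xor_repeating` is a hypothetical helper, the sample configuration bytes are invented, and the code assumes that the guardrail string repeats cyclically across the data, which the text implies but does not state.

```python
def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR data with a key that repeats cyclically."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


STATIC_KEY = bytes([0x2E])  # static one-byte key from the text
guardrail = b"labadmin"     # example username guardrail from the text

# Combined key: every byte of the guardrail string XOR'ed with 0x2E.
combined_key = xor_repeating(guardrail, STATIC_KEY)
print(combined_key.hex().upper())  # 424F4C4F4A434740

# Reverse operation: XOR the key with 0x2E to recover the guardrail string.
print(xor_repeating(combined_key, STATIC_KEY).decode("ascii"))  # labadmin

# Associativity: applying 0x2E and then the guardrail string gives the same
# bytes as applying the combined key once. The sample data is made up.
sample_config = b"hypothetical beacon configuration bytes"
two_steps = xor_repeating(xor_repeating(sample_config, STATIC_KEY), guardrail)
one_step = xor_repeating(sample_config, combined_key)
print(two_steps == one_step)
```

The second print statement is the OPSEC point of the paragraph: with a single guardrail, the key stored in the payload discloses the guardrail value itself.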
When multiple guardrails are configured, the XOR key can grow to be rather large depending on the number of configured guardrails. During testing, ACO found keys of 120 bytes or more when configuring one of each guardrail. Through this research, ACO identified specific starting locations in the reflective DLL where the XOR key is stored. By identifying and searching these locations within the reflective DLL payload, ACO can identify the XOR key, the XOR key length, and ultimately retrieve the Beacon configuration. To do this, the ACO team developed a Python script named EXtraction Script for Cobalt strike Artifact Payload Encryption (EXSCAPE) which can reliably extract the Beacon configuration from Cobalt Strike Beacons generated with versions 4.8+. Unlike the open-source analysis tools, EXSCAPE can reliably identify the random 4-byte XOR key implemented in stageless Beacons, dump the reflective DLL out of the stageless Beacon, and extract the Beacon configuration.
In cases where protections have been implemented, such as the Cobalt Strike Artifact Kit (Fortra 2023c), EXSCAPE may not work as designed. This is due to the varying level of protections that can be enabled which modify how the stageless Beacon is protected. However, in conjunction with manual analysis, it is possible to execute the stageless Beacon and extract the reflective DLL payload from memory. Once obtained, this payload can then be analyzed with EXSCAPE to extract the Beacon configuration. ACO has worked closely with DoD red teams to analyze stageless Beacons that reflect Beacon configurations used in operations. The provided Beacons were generated using varying degrees of payload protections. In each of those cases, ACO was able to obtain the Beacon configuration and shared the process with the red teams. The process to defeat these protections was different in each case and required manual analysis and reverse engineering to retrieve the reflective DLL payload. Once extracted, ACO was able to utilize EXSCAPE to retrieve the Beacon configuration.
EXSCAPE has been updated to support the release of Cobalt Strike version 4.10 which implemented a refactor of Cobalt Strike and added new features (Burgess 2024). This research is still applicable to this newer version released in July 2024. EXSCAPE is currently held under a limited distribution license to United States Government organizations and hosted in the Red Team Foundry (RTF). For access to RTF, the research, and EXSCAPE, please contact Mr. Osvaldo “Ozzie” Perez at the following email address: osvaldo.l.perez.civ@mail.mil.
Research was limited to 64-bit Beacons due to most red teams using Cobalt Strike to perform operations leveraging 64-bit beacons. This research could have been further extended to support 32-bit Beacons; however, this currently remains untested. While the initial research primarily focused on HTTP and HTTPS stageless Beacons, ACO quickly realized that this limitation was unnecessary and could be expanded to additional Beacon types. Table 3 denotes the test cases of this research where ‘X’ identifies a completed test case, and ‘N/A’ identifies a test case which is not possible based on Cobalt Strike’s functionality. In the test cases denoted with an ‘X’, the researchers were able to successfully retrieve the Beacon configuration from the analyzed beacon.
Beacon Type | No Guardrails | Guardrails | Script Analysis |
Stageless | |||
HTTP | X | X | X |
HTTPS | X | X | X |
DNS | X | X | X |
SMB | X | X | X |
SSH | X | N/A | X |
Staged | |||
HTTP | X | X | X |
HTTPS | X | X | X |
DNS | N/A | N/A | N/A |
SMB | N/A | N/A | N/A |
SSH | N/A | N/A | N/A |
Table 3 – A table containing the 64-bit Beacon analysis test cases performed by ACO.
Of the testing performed, ACO identified that SSH Beacons are reflectively loaded into memory through an existing Beacon and do not inherit any of the configured guardrails. In the case of staged Beacons, there is no x64 stager for DNS Beacons. Given the nature of SMB Beacons, there is no staged option for SMB Beacons either. Likewise, SSH Beacons cannot be staged as they are a standalone reflective DLL. In these cases, the SSH Beacon and staged Beacons were manually analyzed by dumping the DLLs from memory prior to analyzing the payloads with the EXSCAPE script.
Over the course of this research, ACO came to suspect that red teams may not be doing enough to adequately protect their Beacons. Each DoD red team operates independently and, therefore, has its own formal and informal processes for evaluating its cyber tooling. This is especially true as it relates to Cobalt Strike. Changes to Cobalt Strike have resulted in net positives for red teams by offering enhanced protection capabilities, but implementing these protections can be cumbersome and challenging to get right. It is up to each red team’s leadership to determine their risk tolerance and to consider the OPSEC risks surrounding Cobalt Strike. It is important to note that for a beacon to function properly, it must have a discernible way to de-obfuscate and read its configuration. Otherwise, a beacon will be unable to run on a target host since it cannot communicate with its C2. At some point in the execution process, the Beacon configuration will be exposed. With enough time and effort, a determined researcher or adversary will always be able to obtain the Beacon configuration. As a result, the red team community should implement the necessary precautions to protect their tooling.
In cases where a beacon is uploaded to a malware analysis web site, the extracted Beacon configurations provide enough information for researchers or threat actors to piece together relevant operational information, such as a timeline of supported assessments or how the Beacons are being utilized. The Beacon configuration also provides a license key identifier which is actively tracked by security researchers and included in threat intelligence (Abuse 2024a). This license key identifier has been used to track red team Beacons, but in those cases the activity was not tied to a specific team. Currently, the license key identifier alone is not enough to directly attribute activity to a specific red team unless the information in the Beacon configuration exposes or points to a specific team (Fortra 2024). However, Fortra, the developer of Cobalt Strike, can use the license key to identify the owner/purchaser of the associated Cobalt Strike license. This information is used by Fortra in cases where there has been a suspected compromise of a licensed or unauthorized copy of Cobalt Strike (Burgess 2024). It should be stated that there is no evidence that Fortra does or would share this information with anyone outside of Fortra. Another potential scenario with real-world ramifications is if a beacon is uploaded to a site like VirusTotal during a longer assessment. Information in the Beacon configuration, such as the C2 address, could expose the infrastructure in use by a specific team or enable the potential for abuse and co-opting of the infrastructure by legitimate adversaries.
As unlikely as these scenarios may appear, it is the DoD red team community’s responsibility to protect tradecraft, infrastructure, and TTPs. The DoD red team community’s infrastructure should always be non-attributable to the specific team or the United States Government. One of our goals is to help red teams make informed decisions and ensure they understand the risks associated with the Beacon configuration. Education is critical to breaking cyber defenders of the habit of uploading payloads to malware analysis websites as a default action. Cyber defenders should follow established deconfliction processes as their primary triage response. Many malware analysis websites allow users to download submitted malware samples from the website with or without being a registered user. VirusTotal requires a paid subscription to download samples, but not every malware analysis website does. An alternative to uploading samples to malware analysis websites is to train cyber defenders to search for the file hash online, instead of simply uploading the samples for public consumption.
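The hash-search alternative requires only that a defender compute the file hash locally. A minimal Python sketch follows; the function name `sha256_of_file` is hypothetical, and SHA-256 is an assumption, since the text does not name a hash algorithm.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file without uploading it anywhere.

    The file is read in chunks so that large samples do not need to fit
    in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting digest can be pasted into a malware analysis platform's search field. A search reveals whether the sample is already known without making the payload, and the Beacon configuration inside it, available for download by others.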
We reviewed how DoD red teams analyze their red team tooling for functionality and OPSEC; assessed the OPSEC considerations surrounding Cobalt Strike versions 4.8 – 4.9; and presented the EXSCAPE Python script which is useful for both red teams and cyber defenders in analyzing Cobalt Strike Beacons. The goal of this research is to shed light on areas surrounding cyber assessments of red team tooling that may often be overlooked – especially with a tool as heavily utilized as Cobalt Strike. While red teams have established testing procedures and perform OPSEC testing on their tools, there are OPSEC considerations that may go untested or are not considered during these cyber assessments. We offer an alternative point of view regarding Cobalt Strike that will assist red teams in making informed risk decisions about their tooling. Lastly, we have made the EXSCAPE script available to the teams to assist in their OPSEC testing and risk decision processes.
We thank Mr. Osvaldo “Ozzie” Perez, the ACO government lead, for approving, reviewing, and sponsoring this research. We thank the entire ACO team for their continued support and interest in this research. Additionally, we thank the DoD red team community for working with us, answering our questions, and supplying Beacons for our analysis. Lastly, we’d like to thank our anonymous reviewers for their feedback and support. Any mention of commercial or open-source products is for explanation purposes only and does not constitute an endorsement by the United States Government. The opinions expressed in this article are solely the opinions of the authors and do not represent the policy or opinion of the United States Government.
OA Labs Research. 2022. Cobalt Strike Analysis. OA Labs. https://research.openanalysis.net (accessed March 13, 2024).
Darwin, Greg. 2023a. Cobalt Strike 4.8: (System) Call Me Maybe. Eden Prairie, Minnesota: Fortra. https://www.cobaltstrike.com (accessed March 11, 2024).
Darwin, Greg. 2023b. Cobalt Strike 4.9: Take Me to Your Loader. Eden Prairie, Minnesota: Fortra. https://www.cobaltstrike.com (accessed March 11, 2024).
Fortra. 2023a. Cobalt Strike Beacon. Eden Prairie, Minnesota: Fortra. https://www.cobaltstrike.com (accessed March 11, 2024).
Fortra. 2023b. Malleable Command and Control. Eden Prairie, Minnesota: Fortra. https://hstechdocs.helpsystems.com (accessed March 13, 2024).
Fortra. 2023c. The Artifact Kit. Eden Prairie, Minnesota: Fortra. https://hstechdocs.helpsystems.com (accessed March 14, 2024).
Fortra. 2023d. The Sleep Mask Kit. Eden Prairie, Minnesota: Fortra. https://hstechdocs.helpsystems.com (accessed March 13, 2024).
Fortra. 2024. License Authorization Files. Eden Prairie, Minnesota: Fortra. https://hstechdocs.helpsystems.com (accessed July 17, 2024).
Burgess, William. 2024. Cobalt Strike 4.10: Through the BeaconGate. Eden Prairie, Minnesota: Fortra. https://www.cobaltstrike.com (accessed July 16, 2024).
Mudge, Raphael. 2016a. A History of Cobalt Strike in Training Courses. Eden Prairie, Minnesota: Fortra. https://www.cobaltstrike.com (accessed March 13, 2024).
Mudge, Raphael. 2016b. What is a stageless payload artifact. Eden Prairie, Minnesota: Fortra. https://www.cobaltstrike.com (accessed March 13, 2024).
Abuse.Ch. 2024a. ThreatFox Database. Bern, Switzerland: Abuse.Ch. https://threatfox.abuse.ch/ (accessed March 15, 2024).
Nathan Wray, Sc.D., works for SIXGEN, Inc. and serves as a technical lead and senior operator on the ACO team under DOT&E CAP. Within his role, over the past eight years, Nathan has performed red teaming, developed offensive cyber operations capabilities, and assisted cyber teams across the DoD. He received a ScD in Cybersecurity from Capitol Technology University in 2018. His previous academic research has focused on ransomware detection using machine learning.
Sean Phipps works for SIXGEN, Inc. and serves as a senior operator on the ACO team under DOT&E CAP. Within his role, over the past seven years, Sean has performed sensitive red teaming of multiple boutique cyber capabilities, assisted red teams across the DoD, and has been an invaluable contributor to ongoing capability development efforts. Sean received a MS in Offensive Computer Security from Eastern Michigan University in 2012.