Systems Programming

ARM64 Add Opcode Fix: Top 3 Preferred Choices for 2025

Discover the top 3 preferred methods for the ARM64 ADD opcode fix in 2025. Compare standard ADD, ADRP+ADD, and SUB for performance and security.

David Chen

Principal Security Researcher specializing in low-level systems, reverse engineering, and ARM architecture.

Introduction: The Humble ADD in a Complex World

In the intricate world of ARM64 (or AArch64) assembly, the ADD instruction is a fundamental building block. It’s the cornerstone of arithmetic, pointer manipulation, and memory addressing. For most developers, using the standard ADD is an unconscious, automatic choice. However, for those working in systems programming, reverse engineering, security research, or high-performance computing, the choice of how to perform a simple addition is far from trivial. The term "opcode fix" doesn't refer to a flaw in the ARM architecture itself, but rather to a strategic solution for a specific problem you're trying to solve.

These problems can range from crafting shellcode that bypasses security filters to writing position-independent code that functions correctly under Address Space Layout Randomization (ASLR). As we head into 2025, with ARM64 dominating mobile, server, and desktop environments, understanding the subtle differences between addition techniques is more critical than ever. This guide explores the top three preferred choices for implementing an "ADD opcode fix," providing the context you need to select the right tool for the job.

Why an "Opcode Fix"? Understanding the Nuances

Before diving into the choices, it's crucial to understand why you might need an alternative to the most straightforward ADD instruction. The motivation typically falls into one of these categories:

  • Shellcode Development: Exploit payloads often have strict constraints. Certain byte values, like the null byte (0x00), can terminate strings and break the payload. A specific encoding of an ADD instruction might contain these "bad bytes," forcing a search for an alternative opcode that achieves the same result with a different byte sequence.
  • Polymorphic Code: To evade signature-based Intrusion Detection Systems (IDS) or antivirus software, malware authors (and security researchers) create polymorphic code. This involves using different instruction sequences to perform the same task, making the code's signature variable and harder to detect. Swapping an ADD for an equivalent SUB is a classic polymorphic technique.
  • Binary Patching and Instrumentation: When patching a compiled binary without access to the source, you may have limited space or need to work around existing instructions. Choosing a more compact or functionally different addition method can be the only way to insert your logic.
  • Position-Independent Code (PIC): Modern operating systems heavily use ASLR, which randomizes the memory locations of programs and libraries. To function correctly, code must not rely on absolute memory addresses. Certain addition techniques are essential for calculating addresses relative to the program counter (PC), making them fundamental to PIC.

Choice 1: The Standard ADD (Immediate/Register) - The Reliable Workhorse

This is the instruction every ARM programmer learns first. It's direct, fast, and optimized for the most common arithmetic operations. It comes in two primary flavors: adding two registers or adding a register and an immediate (a constant value).

How It Works

The syntax is clean and simple. To add the values in registers X1 and X2 and store the result in X0:

ADD X0, X1, X2

To add an immediate value, like 16, to register X1 and store the result in X0:

ADD X0, X1, #16

Both forms are simple integer ALU operations. On typical AArch64 cores they complete with single-cycle latency, and several can usually issue in the same cycle, so they are about as cheap as an instruction gets.
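To make the two forms concrete, here is a minimal Python sketch of how each is packed into a 32-bit instruction word. The helper names (encode_add_imm, encode_add_reg, to_bytes) are illustrative and not part of any toolchain. The sketch covers only the 64-bit, non-flag-setting variants with no register shift.

```python
import struct


def encode_add_imm(rd: int, rn: int, imm12: int, shift12: bool = False) -> int:
    """Encode 64-bit ADD (immediate): ADD Xd, Xn, #imm12{, LSL #12}.

    In this form, register number 31 means SP rather than XZR.
    """
    if not (0 <= rd <= 31 and 0 <= rn <= 31):
        raise ValueError("register numbers must be in 0..31")
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("immediate must fit in 12 unsigned bits")
    return 0x91000000 | (int(shift12) << 22) | (imm12 << 10) | (rn << 5) | rd


def encode_add_reg(rd: int, rn: int, rm: int) -> int:
    """Encode 64-bit ADD (shifted register) with LSL #0: ADD Xd, Xn, Xm."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers must be in 0..31")
    return 0x8B000000 | (rm << 16) | (rn << 5) | rd


def to_bytes(word: int) -> bytes:
    """Return the little-endian byte order in which the word sits in memory."""
    return struct.pack("<I", word)


print(hex(encode_add_reg(0, 1, 2)))    # ADD X0, X1, X2  -> 0x8b020020
print(hex(encode_add_imm(0, 1, 16)))   # ADD X0, X1, #16 -> 0x91004020
print(to_bytes(encode_add_imm(0, 1, 16)).hex())  # 20400091
```

Note that the in-memory bytes of ADD X0, X1, #16 include a 0x00. This is the kind of detail that matters once byte-level constraints come into play.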

When to Use It

This is your default choice for the vast majority of use cases. For general application development, compiler-generated code, and any situation where you are performing straightforward arithmetic within a function, the standard ADD is the optimal choice for its clarity and performance.

Potential Pitfalls

The primary limitation is the immediate value. ADD (immediate) encodes an unsigned 12-bit immediate, optionally shifted left by 12 bits. This means you can add any value from 0 to 4095, or any multiple of 4096 up to 4095 × 4096 = 16,773,120. A constant that mixes high and low bits needs two ADDs (one shifted, one not), and anything of 2^24 or larger must first be loaded into a register (for example with MOVZ/MOVK), which costs at least one extra instruction.
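The immediate rule is easy to check mechanically. The sketch below uses a hypothetical helper, split_add_immediate, which reports how a non-negative constant could be spread across at most two ADD (immediate) instructions. It returns None when the constant has to go through a register instead.

```python
def split_add_immediate(value: int):
    """Return [(imm12, lsl), ...] whose ADDs sum to value, or None.

    Each pair is one ADD Xd, Xn, #imm12, LSL #lsl, with lsl of 0 or 12.
    Values below zero or at/above 2**24 cannot be expressed this way.
    """
    if value < 0 or value >= 1 << 24:
        return None
    low = value & 0xFFF    # bits 0..11, added unshifted
    high = value >> 12     # bits 12..23, added with LSL #12
    parts = []
    if high:
        parts.append((high, 12))
    if low or not parts:
        parts.append((low, 0))
    return parts


print(split_add_immediate(4095))      # [(4095, 0)]
print(split_add_immediate(4096))      # [(1, 12)]
print(split_add_immediate(0x12345))   # [(18, 12), (837, 0)]
print(split_add_immediate(1 << 24))   # None
```

For 0x12345 the result corresponds to ADD X0, X1, #0x12, LSL #12 followed by ADD X0, X0, #0x345.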

Choice 2: ADRP + ADD - The Position-Independent Pro

This two-instruction sequence is the cornerstone of modern, relocatable code on ARM64. It's how you access variables, strings, and other data in a way that isn't dependent on the absolute memory address where your code is loaded.

How It Works

The magic is in the division of labor:

  1. ADRP Xd, label: The "Address of Page" instruction calculates the address of the 4KB memory page containing label, relative to the page of the current instruction. It stores this page address (low 12 bits zero) in the destination register Xd. The reach is roughly ±4GB from the instruction.
  2. ADD Xd, Xd, #:lo12:label: The ADD instruction then adds the lower 12 bits of the address, that is, the offset of label within its 4KB page, to the page address already calculated by ADRP. (The :lo12: operator is GNU/LLVM ELF assembler syntax; Apple toolchains write label@PAGE and label@PAGEOFF.)

Together, they compute the full 64-bit address of label relative to the program counter (PC), making the code inherently position-independent.
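The arithmetic behind the pair can be modeled in a few lines of Python. This is a sketch of the address math only, not of the instruction encodings. The function names and example addresses are made up for illustration, and the load offset ("slide") is assumed to be a multiple of the 4KB page size, as ASLR slides are.

```python
def adrp_add_operands(pc: int, target: int):
    """Return (page_delta, lo12) that a linker would place in ADRP and ADD."""
    page_delta = (target >> 12) - (pc >> 12)   # signed distance in 4KB pages
    if not -(1 << 20) <= page_delta < (1 << 20):
        raise ValueError("target is outside the +/-4 GiB ADRP range")
    return page_delta, target & 0xFFF


def resolve(pc: int, page_delta: int, lo12: int) -> int:
    """Model what the CPU computes when it runs ADRP followed by ADD."""
    page = ((pc >> 12) + page_delta) << 12     # value ADRP writes to Xd
    return page + lo12                         # value after ADD Xd, Xd, #lo12


pc, target = 0x400124, 0x412ABC
ops = adrp_add_operands(pc, target)
print(ops == (0x12, 0xABC))                    # True
print(hex(resolve(pc, *ops)))                  # 0x412abc

# Relocate the whole image by a page-aligned slide: the operands embedded
# in the instructions stay the same, yet the computed address moves too.
slide = 0x1234000
print(adrp_add_operands(pc + slide, target + slide) == ops)    # True
print(resolve(pc + slide, *ops) == target + slide)             # True
```

Because only the page distance and the in-page offset are baked into the instructions, the same bytes work wherever the loader places the image.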

When to Use It

This pattern is non-negotiable for writing shared libraries (.so files on Linux/Android, .dylib on macOS/iOS) or any executable compiled as a Position-Independent Executable (PIE). It is the standard, compiler-preferred method for accessing data in the .data or .rodata sections.

Performance Considerations

While it requires two instructions instead of one, the pair is so common that some ARM cores fuse it into a single operation, and compilers keep the two instructions adjacent to encourage this. Even without fusion the cost is one extra single-cycle instruction, which the flexibility and correctness under ASLR easily justify.

Choice 3: SUB with Negative Immediate - The Stealthy Alternative

This is a clever trick rooted in the nature of two's complement arithmetic, which is used by modern computers to represent negative numbers. Mathematically, adding a number is identical to subtracting its negative counterpart: A + B is the same as A - (-B).

How It Works

The concept is simple. An instruction like:

ADD X0, X1, #8

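It is tempting to replace it with:

SUB X0, X1, #-8

However, this does not change the machine code. SUB (immediate) takes the same unsigned 12-bit immediate as ADD (immediate), so -8 has no SUB encoding. Assemblers such as GNU as and LLVM accept the line as a convenience and emit exactly the bytes of ADD X0, X1, #8. To get a different byte sequence, place the negated constant in a scratch register and use the register form of SUB:

MOV X2, #-8
SUB X0, X1, X2

The result stored in X0 is identical, but the machine code is not: the MOV assembles to MOVN, and SUB (shifted register) belongs to a different encoding class than ADD (immediate). The price is one extra instruction and a scratch register. This provides a different byte signature for the same operation.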
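The encoding details are worth checking by hand, since ADD (immediate) and SUB (immediate) differ in a single bit and both carry an unsigned 12-bit immediate. The Python sketch below uses illustrative helper names and covers 64-bit forms only. It compares ADD X0, X1, #8 with a register-based equivalent (MOV X2, #-8 followed by SUB X0, X1, X2, with X2 as an assumed free scratch register) and confirms the two agree under 64-bit wraparound.

```python
MASK64 = (1 << 64) - 1


def encode_addsub_imm(is_sub: bool, rd: int, rn: int, imm12: int) -> int:
    """ADD/SUB Xd, Xn, #imm12. Bit 30 selects SUB; imm12 is unsigned."""
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("imm12 is unsigned: negative values have no encoding")
    return 0x91000000 | (int(is_sub) << 30) | (imm12 << 10) | (rn << 5) | rd


def encode_movn(rd: int, imm16: int) -> int:
    """MOVN Xd, #imm16 writes NOT(imm16), so MOVN Xd, #7 leaves -8 in Xd."""
    return 0x92800000 | (imm16 << 5) | rd


def encode_sub_reg(rd: int, rn: int, rm: int) -> int:
    """SUB Xd, Xn, Xm (shifted register form, LSL #0)."""
    return 0xCB000000 | (rm << 16) | (rn << 5) | rd


add_word = encode_addsub_imm(False, 0, 1, 8)              # ADD X0, X1, #8
sub_words = [encode_movn(2, 7), encode_sub_reg(0, 1, 2)]  # MOV X2, #-8 ; SUB X0, X1, X2


def run_add(x1: int) -> int:
    return (x1 + 8) & MASK64


def run_sub(x1: int) -> int:
    x2 = ~7 & MASK64            # what MOVN X2, #7 leaves in X2
    return (x1 - x2) & MASK64


print(hex(add_word))                        # 0x91002020
print([hex(w) for w in sub_words])          # ['0x928000e2', '0xcb020020']
print(run_add(100) == run_sub(100))         # True
```

The second sequence shares no opcode bits pattern with the first beyond the register fields, at the cost of four extra bytes.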

When to Use It

This technique is a staple in the security world. Its primary applications are:

  • Bypassing Filters: If a firewall or exploit mitigation blocks shellcode containing the ADD opcode's byte pattern, using SUB can sneak the logic past the filter.
  • Polymorphism: To make static analysis harder, a code generator can randomly choose between an ADD of N and a SUB of a register holding -N to create variation.
  • Avoiding Bad Bytes: In some specific cases, the byte representation of the SUB instruction might avoid a problematic byte (like 0x0A, the line feed character) that the equivalent ADD instruction contains.
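Whether a given encoding contains a forbidden byte is a mechanical check on its little-endian bytes. The sketch below uses a hypothetical helper, bad_byte_offsets. The two instruction words are hand-assembled examples chosen to show a case where the ADD form contains 0x0A and a register-form SUB does not.

```python
import struct


def bad_byte_offsets(word: int, bad: set) -> list:
    """Return the offsets (0..3) of in-memory bytes that appear in `bad`."""
    raw = struct.pack("<I", word)   # AArch64 instructions are little-endian
    return [i for i, b in enumerate(raw) if b in bad]


ADD_X0_X16_2 = 0x91000A00    # ADD X0, X16, #2  -> bytes 00 0A 00 91
SUB_X0_X16_X2 = 0xCB020200   # SUB X0, X16, X2  -> bytes 00 02 02 CB

print(bad_byte_offsets(ADD_X0_X16_2, {0x0A}))    # [1]
print(bad_byte_offsets(SUB_X0_X16_X2, {0x0A}))   # []
print(bad_byte_offsets(0x91004020, {0x00}))      # ADD X0, X1, #16 -> [2]
```

The same check shows the limits of the trick: both example words above still contain 0x00, so a null-byte restriction would need a different rewrite.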

A Word of Caution

While powerful, this method can harm code readability. A human reading the assembly might be momentarily confused by the intent of subtracting a negative number. It should be used deliberately and only when its specific benefits—stealth and signature variation—are required.

Head-to-Head: ARM64 ADD Opcode Choices

Comparison of ARM64 Addition Techniques
Technique | Primary Use Case | Performance | Position Independence | Stealth Factor
Standard ADD | General-purpose arithmetic, fast calculations | Highest | No (by itself) | Low
ADRP + ADD | Accessing data in relocatable code (PIC/PIE) | High (fused on some cores) | Yes (its purpose) | Low
SUB (negative) | Evasion, polymorphism, avoiding bad bytes | High (same cost per instruction as ADD; may need an extra MOV) | No (by itself) | High

Making the Right Choice in 2025

Choosing the correct method for addition on ARM64 is entirely context-dependent. There is no single "best" way, only the most appropriate way for your specific goal. Here’s a simple decision tree for 2025:

  • Are you writing standard application logic? Stick with the Standard ADD. The compiler will use it correctly, and it's the most performant and readable choice.
  • Are you accessing global or static data? Use a PC-relative pattern such as ADRP + ADD so that your code keeps working under ASLR and in shared libraries or PIE binaries. Your compiler will handle this for you in most high-level languages.
  • Are you operating in a constrained security environment? If you're writing shellcode, analyzing malware, or developing exploit mitigations, rewriting an addition as a SUB of the negated value is a useful tool in your arsenal for evasion and polymorphism.

By understanding the strengths and applications of each of these three choices, you can move beyond a superficial understanding of the ARM64 instruction set and start making deliberate, expert-level decisions in your low-level code.