3 Common ARM64 ADD Opcode Mistakes to Avoid in 2025
Master ARM64 assembly in 2025! Uncover 3 common ADD opcode mistakes, from flag updates (ADD vs ADDS) to immediate value limits, and write more efficient code.
Dr. Kenji Tanaka
Principal CPU Architect with 15+ years of experience in RISC instruction set design.
Introduction: Why Mastering ADD is Crucial for ARM64
The ARM64 (or AArch64) architecture is no longer just for mobile phones. It powers everything from Apple's latest silicon to high-performance computing servers at AWS. As its dominance grows, so does the need for developers to understand its instruction set architecture (ISA). Whether you're a compiler engineer, a security researcher, or a performance optimization specialist, a deep understanding of ARM64 assembly is an invaluable skill.
At the heart of any computation is arithmetic, and the most fundamental arithmetic operation is addition, represented by the `ADD` opcode. It seems simple: just add two numbers. However, the nuances of the ARM64 `ADD` instruction and its variants are a common source of subtle, hard-to-debug errors. As we head into 2025, with ARM64's feature set stabilizing and its ecosystem maturing, avoiding these foundational mistakes is more critical than ever.
This post will dissect the three most common mistakes developers make with the `ADD` opcode, providing clear examples and best practices to help you write more robust and efficient ARM64 code.
Mistake 1: Forgetting Flag Updates (ADD vs. ADDS)
Perhaps the most frequent error, especially for those coming from x86, is misunderstanding how ARM64 handles condition flags. This mistake can lead to conditional logic that never executes as intended.
The Silent Bug of Unchecked Conditions
The ARM architecture uses a set of condition flags, held in the processor state (PSTATE) and readable through the NZCV system register, to record the outcome of an operation. The four flags are:
- N (Negative): Set if the result is negative.
- Z (Zero): Set if the result is zero.
- C (Carry): Set if the operation resulted in an unsigned overflow (a carry-out).
- V (oVerflow): Set if the operation resulted in a signed overflow.
These flags are essential for implementing control flow, such as checking if two numbers are equal (by seeing if their difference is zero) or branching if an addition overflows. The mistake lies in assuming that every arithmetic instruction updates these flags.
ADD: The Non-Updating Variant
In ARM64, the standard `ADD` instruction does not modify the condition flags. It performs the addition and stores the result, but leaves the N, Z, C, and V flags untouched. This is a deliberate design choice to improve performance by not calculating flag information when it isn't needed.
Consider this incorrect code snippet, which attempts to add two registers and branch if the result is zero:
; WARNING: This code is buggy!
MOV W0, #0xFFFFFFFF ; Load W0 with -1
MOV W1, #1 ; Load W1 with 1
ADD W0, W0, W1 ; W0 = -1 + 1 = 0. Flags are NOT updated.
B.EQ is_zero ; This branch will NOT be taken based on the ADD result
; ... some other code
is_zero:
; This block may never be reached
The `B.EQ` instruction (Branch if Equal, which checks whether the Z flag is set) behaves according to whichever instruction *before* the `ADD` last set the flags. The result of our `ADD` is ignored by the branch, creating a silent and potentially catastrophic bug.
ADDS: The Solution for Conditionals
To fix this, you must use the "Set Flags" variant of the instruction: `ADDS`. The 'S' suffix tells the processor to perform the addition and update the NZCV flags based on the result.
Here is the corrected, functional version of the code:
; This is the correct way
MOV W0, #0xFFFFFFFF
MOV W1, #1
ADDS W0, W0, W1 ; W0 = 0. Z flag is set to 1.
B.EQ is_zero ; Branch is correctly taken!
; ... some other code
is_zero:
; This code now executes as expected
Rule of thumb for 2025: If your addition is part of a conditional logic sequence, you almost certainly need `ADDS`, not `ADD`.
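To make the flag behavior concrete, here is a minimal Python sketch that models a 32-bit `ADD` and `ADDS`. It is an illustration of the semantics described above, not an emulator. The function names `add32` and `adds32` and the dictionary used for the flags are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def adds32(a, b):
    """Model a 32-bit ADDS: return (result, flags) with freshly computed N, Z, C, V."""
    a &= MASK32
    b &= MASK32
    full = a + b                  # exact sum, not yet truncated
    result = full & MASK32        # the value written to the W register
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    flags = {
        "N": sign_r,                                        # result is negative
        "Z": int(result == 0),                              # result is zero
        "C": int(full > MASK32),                            # unsigned carry-out
        "V": int(sign_a == sign_b and sign_r != sign_a),    # signed overflow
    }
    return result, flags

def add32(a, b, flags):
    """Model a 32-bit ADD: same result, but the incoming flags pass through untouched."""
    return (a + b) & MASK32, flags

# Flags left behind by some earlier, unrelated instruction (hypothetical values).
stale = {"N": 0, "Z": 0, "C": 0, "V": 0}

result, flags_after_add = add32(0xFFFFFFFF, 1, stale)
result_s, flags_after_adds = adds32(0xFFFFFFFF, 1)
print(result, flags_after_add["Z"])      # 0 0  -> B.EQ would not be taken
print(result_s, flags_after_adds["Z"])   # 0 1  -> B.EQ would be taken
```

Both calls produce the same register value. Only the `ADDS` model reports `Z = 1`, which is the bit `B.EQ` actually inspects.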
Mistake 2: Mishandling Immediate Values
Another common tripwire is the limitation on immediate (constant) values that can be used directly within the `ADD` instruction. The A64 instruction encoding is fixed at 32 bits, which means there's a finite amount of space to encode the opcode, registers, and any immediate value.
The Deceptive Simplicity of Immediate Operands
The `ADD (immediate)` instruction has a specific format for its constant operand. It can encode:
- An unsigned 12-bit immediate value (0 to 4095).
- This same 12-bit value, but shifted left by 12 bits.
This means you can add small numbers like `#100` or larger, specific numbers like `#8192` (which is `#2, LSL #12`) in a single instruction. However, you cannot add an arbitrary number like `#5000`.
An assembler will catch this and throw an error, but understanding why it's an error is crucial. Attempting to write this will fail:
; This will fail to assemble
ADD X0, X1, #5000 ; Error: immediate value '5000' cannot be encoded.
The value 5000 is greater than 4095 and cannot be represented as a 12-bit value shifted by 12. Relying on the assembler to catch this is fine, but not knowing the alternative can halt your development.
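The encoding rule can be written as a short check. The Python sketch below is only a model of the two cases listed above (a 12-bit value, optionally shifted left by 12). The helper names `encode_add_imm` and `split_add_imm` are invented for this example. The second helper illustrates an additional option not covered in this post: a constant below 2^24 can be split into two parts that are each encodable, at the cost of a second `ADD`.

```python
def encode_add_imm(value):
    """Return (imm12, shift) if value fits ADD (immediate), otherwise None."""
    if 0 <= value <= 0xFFF:
        return value, 0                     # plain 12-bit immediate
    if value & 0xFFF == 0 and 0 < (value >> 12) <= 0xFFF:
        return value >> 12, 12              # 12-bit immediate, LSL #12
    return None                             # needs another strategy

def split_add_imm(value):
    """Split a constant below 2**24 into (high, low) parts, each encodable alone."""
    if not 0 <= value < (1 << 24):
        return None
    return value & ~0xFFF, value & 0xFFF

print(encode_add_imm(100))    # (100, 0)
print(encode_add_imm(8192))   # (2, 12)
print(encode_add_imm(5000))   # None
print(split_add_imm(5000))    # (4096, 904)
```

Under that split, 5000 could be added with two instructions, for example `ADD X0, X1, #1, LSL #12` followed by `ADD X0, X0, #904`, without needing a scratch register.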
The Correct Approach for Large Constants
When you need to add a large or arbitrary constant, the general approach is to construct it in a temporary register first. There are two primary methods:
- Using MOVZ/MOVK: The preferred method for constructing 32-bit or 64-bit constants. `MOVZ` (Move Wide with Zero) places a 16-bit value into a chosen 16-bit slot of a register and zeroes the other bits. `MOVK` (Move Wide with Keep) places a 16-bit value into a specified position in the register, leaving the other bits untouched.
- Loading from Memory: If the constant is used frequently, it can be more efficient to load it from a literal pool in memory.
; How to correctly add 5000 (0x1388)
MOV X2, #5000 ; Assembler pseudo-instruction, likely expands to MOVZ
ADD X0, X1, X2 ; X0 = X1 + 5000. Correct.

; For a larger constant like 0x123456789ABCDEF0
MOVZ X2, #0xDEF0, LSL #0 ; Load bottom 16 bits
MOVK X2, #0x9ABC, LSL #16 ; Load next 16 bits
MOVK X2, #0x5678, LSL #32 ; Load next 16 bits
MOVK X2, #0x1234, LSL #48 ; Load top 16 bits
ADD X0, X1, X2 ; Perform the addition

; Alternative: load the constant from a literal pool
LDR X2, =5000 ; Pseudo-instruction to load 5000 from a literal pool
ADD X0, X1, X2 ; X0 = X1 + 5000. Correct.
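The MOVZ/MOVK pattern is mechanical enough to generate. The Python sketch below splits a 64-bit constant into 16-bit chunks and prints one possible instruction sequence. The function name `movz_movk_sequence` and its default register are assumptions for this example. Real assemblers and compilers may pick shorter encodings (such as `MOVN` or a bitmask immediate) that this sketch does not attempt.

```python
def movz_movk_sequence(value, reg="X2"):
    """List MOVZ/MOVK instructions that build a 64-bit constant in reg."""
    value &= 0xFFFFFFFFFFFFFFFF
    # One (shift, chunk) pair per 16-bit slot, lowest slot first.
    chunks = [(shift, (value >> shift) & 0xFFFF) for shift in (0, 16, 32, 48)]
    # MOVZ zeroes every other slot, so all-zero chunks can be skipped.
    nonzero = [(s, c) for s, c in chunks if c != 0] or [(0, 0)]
    lines = []
    for i, (shift, chunk) in enumerate(nonzero):
        op = "MOVZ" if i == 0 else "MOVK"   # first write clears, later writes keep
        lines.append(f"{op} {reg}, #0x{chunk:X}, LSL #{shift}")
    return lines

print(movz_movk_sequence(5000))   # ['MOVZ X2, #0x1388, LSL #0']
for line in movz_movk_sequence(0x123456789ABCDEF0):
    print(line)
```

For `0x123456789ABCDEF0` this reproduces the four-instruction sequence shown above. A constant such as `0x10000` needs only a single `MOVZ` with `LSL #16`.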
Mistake 3: Incorrectly Using the Shifted Register Operand
One of ARM's most powerful features is its ability to shift a register's value as part of another instruction, with no separate shift instruction needed. The `ADD (shifted register)` instruction is a prime example, but its syntax can be confusing.
The Power and Peril of In-Instruction Shifting
The instruction `ADD Xd, Xn, Xm, <shift> #<amount>` calculates `Xd = Xn + (Xm shifted by amount)`. This is incredibly efficient for common patterns like array indexing (e.g., `base_address + index * element_size`).
For example, to calculate `X0 = X1 + (X2 * 8)`, you can do it in one instruction instead of two:
; Efficient calculation using a left shift
ADD X0, X1, X2, LSL #3 ; LSL #3 is equivalent to multiplying by 2^3, or 8
The mistake arises from a misunderstanding of the operation's syntax and limitations.
Common Pitfalls with Shifted Operands
- Invalid Shift Amount: The shift amount is not unlimited. For 64-bit registers (X0-X30), the shift amount must be between 0 and 63. For 32-bit registers (W0-W30), it must be between 0 and 31. An assembler will catch an out-of-range static value, but it's a conceptual gap that can cause confusion.
- Misinterpreting the Order of Operations: A developer might mistakenly believe `ADD X0, X1, X2, LSL #3` computes `(X1 + X2) << 3`. This is incorrect. The shift operation only applies to the final register operand (`Xm`), not the result of the addition. The operation is always `Xd = Xn + op2`, where `op2` is the potentially shifted register.
- Using the Wrong Shift Type: ARM64 provides several shift types, three of which `ADD (shifted register)` accepts. Using the wrong one can have dramatic consequences.
  - `LSL` (Logical Shift Left): Fills with zeros. Used for multiplying by powers of 2.
  - `LSR` (Logical Shift Right): Fills with zeros. Used for unsigned division by powers of 2.
  - `ASR` (Arithmetic Shift Right): Fills with the sign bit (bit 63 or 31). Used for signed division by powers of 2, rounding toward negative infinity. Using `LSR` on a negative number will corrupt its value.
  - `ROR` (Rotate Right): The bits shifted out from the right are inserted on the left. Note that `ROR` is only available on the logical instructions (`AND`, `ORR`, `EOR`, and their variants); it is not a valid shift for `ADD`.
Always double-check that you are applying the correct shift type and amount to the intended operand to avoid subtle data corruption bugs.
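The pitfalls above are easy to see in a small model. The Python sketch below imitates a 64-bit `ADD (shifted register)`. The names `shift_operand` and `add_shifted` are assumptions for this example. It rejects `ROR`, because the shifted-register form of `ADD` only encodes `LSL`, `LSR`, and `ASR`.

```python
MASK64 = (1 << 64) - 1

def shift_operand(xm, kind, amount):
    """Apply an ADD (shifted register) shift to a 64-bit register value."""
    if not 0 <= amount <= 63:
        raise ValueError("shift amount must be 0..63 for X registers")
    xm &= MASK64
    if kind == "LSL":
        return (xm << amount) & MASK64
    if kind == "LSR":
        return xm >> amount                              # zero fill from the left
    if kind == "ASR":
        signed = xm - (1 << 64) if xm >> 63 else xm      # reinterpret as signed
        return (signed >> amount) & MASK64               # sign fill from the left
    raise ValueError(f"{kind} is not a valid shift for ADD")

def add_shifted(xn, xm, kind="LSL", amount=0):
    """Xd = Xn + (Xm shifted): the shift touches Xm only, never the sum."""
    return ((xn & MASK64) + shift_operand(xm, kind, amount)) & MASK64

print(hex(add_shifted(0x1000, 5, "LSL", 3)))   # 0x1028, i.e. 0x1000 + 5 * 8
print(hex((0x1000 + 5) << 3))                  # 0x8028, the common misreading

minus_eight = -8 & MASK64
print(hex(shift_operand(minus_eight, "ASR", 1)))   # 0xfffffffffffffffc  (-4)
print(hex(shift_operand(minus_eight, "LSR", 1)))   # 0x7ffffffffffffffc  (huge positive)
```

The last two lines show why the shift type matters for signed data: the same bit pattern yields -4 under `ASR` and a large positive number under `LSR`.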
Instruction | Purpose | Updates Flags? | Typical Use Case |
---|---|---|---|
ADD | Adds two operands without affecting flags. | No | Simple arithmetic where no conditional logic follows (e.g., calculating a pointer offset). |
ADDS | Adds two operands and updates the NZCV flags. | Yes | Arithmetic that is immediately followed by a conditional branch (e.g., `ADDS X0, X1, X2` then `B.EQ ...`). |
ADC | Add with Carry. Adds two operands plus the value of the Carry flag. | No | Multi-word arithmetic for numbers larger than 64 bits. Used in a chain after an initial `ADDS`. |
ADCS | Add with Carry and Set Flags. Adds operands + Carry flag and updates flags. | Yes | The intermediate step in a multi-word addition chain that requires flag setting for the next link. |
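
The multi-word use case in the table can be sketched as a 128-bit addition built from two 64-bit halves. This Python model is an illustration only, and the helper names `adds64`, `adc64`, and `add128` are assumptions for this example. The register assignments in the comments are likewise hypothetical.

```python
MASK64 = (1 << 64) - 1

def adds64(a, b):
    """Model ADDS on 64-bit values: return (result, carry_out)."""
    total = (a & MASK64) + (b & MASK64)
    return total & MASK64, total >> 64

def adc64(a, b, carry_in):
    """Model ADC on 64-bit values: add both operands plus the incoming carry."""
    return ((a & MASK64) + (b & MASK64) + carry_in) & MASK64

def add128(a_lo, a_hi, b_lo, b_hi):
    """Add two 128-bit numbers held as (low, high) 64-bit halves."""
    lo, carry = adds64(a_lo, b_lo)   # ADDS X0, X0, X2  (sets C)
    hi = adc64(a_hi, b_hi, carry)    # ADC  X1, X1, X3  (consumes C)
    return lo, hi

print(add128(MASK64, 0, 1, 0))   # (0, 1): the carry ripples into the high half
```

If the low half were added with plain `ADD` instead of `ADDS`, the carry would never reach the `ADC`, and the high half would be off by one whenever the low half wraps. For sums wider than 128 bits, the middle links would use `ADCS` so the carry keeps propagating.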
Conclusion: Writing Cleaner, More Efficient ARM64 Code
The `ADD` instruction is a building block of ARM64 assembly, but its apparent simplicity hides important details that can trip up even experienced programmers. By internalizing the distinction between `ADD` and `ADDS`, understanding the limitations of immediate operands, and correctly applying shifted registers, you can avoid a significant class of common bugs.
As we move further into 2025, proficiency in low-level ARM64 development will only become more valuable. Mastering these fundamentals is the first step toward writing the clean, efficient, and correct code that modern high-performance systems demand.