Fix LLVM MIPS jumpto Wrong Return: 3 Steps for 2025
Struggling with incorrect return addresses on MIPS targets in LLVM? Learn how to diagnose, patch, and verify the 'jumpto' bug in just 3 actionable steps for 2025.
Alexey Volkov
Senior Compiler Engineer specializing in LLVM backends and embedded systems optimization.
If you’re working with the MIPS architecture in 2025, you might have stumbled upon a deeply frustrating bug in recent LLVM versions. Your program executes a jump, maybe through a function pointer or a jump table, and then... returns to a completely wrong address, leading to a spectacular crash. It’s a classic symptom of a corrupted stack or a mismanaged return address register, and in this case, it's a subtle miscompilation in the LLVM MIPS backend.
This isn't just a random glitch; it's a specific issue related to how LLVM handles certain jump patterns, particularly when tail call optimizations are involved. The good news is that it's fixable. This guide will walk you through diagnosing the problem, applying a targeted patch, and verifying the fix in three clear steps.
Diagnosing the Symptoms: What Does "Wrong Return" Look Like?
The bug typically manifests when compiling C/C++ code that uses indirect jumps. A common source is a `switch` statement over a dense range of values, which the compiler often optimizes into a jump table. Another is the use of function pointers.
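As a minimal sketch of the first pattern, the function below switches over six consecutive case values. The name `apply_op` and the operations in each case are hypothetical, chosen only so that every case does different work. A compiler may lower a switch like this to a jump table plus an indirect jump, though the exact lowering depends on the target and optimization level.

```c
#include <assert.h>

/* Hypothetical example: six consecutive case values, each doing different
 * work on `operand`, so the switch is a plausible jump-table candidate. */
int apply_op(int opcode, int operand) {
    switch (opcode) {
    case 0: return operand + 1;
    case 1: return operand - 1;
    case 2: return operand * 3;
    case 3: return operand / 2;
    case 4: return operand ^ 0x55;
    case 5: return -operand;
    default: return 0; /* any opcode outside 0..5 */
    }
}

/* Small usage check for the sketch above. */
void apply_op_demo(void) {
    assert(apply_op(0, 10) == 11);
    assert(apply_op(2, 10) == 30);
    assert(apply_op(9, 10) == 0); /* out-of-range opcode hits default */
}
```

Compiling this with optimization and inspecting the assembly is one way to see whether your toolchain emits a jump table for it.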
Let's consider a simplified example using function pointers:
// a_callee.c
#include <stdio.h>

void real_target_function(int x) {
    printf("Inside real_target_function with value: %d\n", x);
}

// main.c
#include <stdio.h>

void real_target_function(int x);

// A volatile function pointer to prevent inlining and force an indirect call.
void (*volatile func_ptr)(int) = real_target_function;

void caller_function(void) {
    puts("Calling function pointer...");
    func_ptr(123);
    puts("Returned from function pointer call."); // <-- We never get here!
}

int main(void) {
    caller_function();
    return 0;
}
On a correctly functioning compiler, the output is predictable. However, with the buggy LLVM version targeting MIPS (e.g., `mips-unknown-linux-gnu`), you might see "Calling function pointer..." and "Inside real_target_function..." but then the program crashes instead of printing "Returned from function pointer call.".
Why? Let's look at the generated assembly (`llc -mtriple=mips-unknown-linux-gnu ...`). You'll find something problematic in `caller_function`:
# ... prologue of caller_function ...
# Load the target address stored in func_ptr
lui $2, %hi(func_ptr)
lw $2, %lo(func_ptr)($2)
# ... other setup ...
# This is the problem area!
# It should be a JALR (Jump and Link Register)
# but the compiler might perform a faulty tail call optimization.
move $25, $2 # Move target address to $t9
jr $25 # JUMP, not JUMP AND LINK!
nop
The key issue is the use of `jr $t9` instead of `jalr $t9` (`$t9` is printed as `$25` in the listing). The `jalr` instruction does two things: it jumps to the address in the source register (`$t9`) and, crucially, it saves the return address into the return address register (`$ra`, which is `$31`). On MIPS that return address is the instruction *after* the branch delay slot, i.e. the address of the `jalr` plus 8. The `jr` instruction just jumps, leaving `$ra` untouched. When `real_target_function` eventually executes its return instruction (`jr $ra`), it jumps to whatever old value was in `$ra` from a previous call, leading to chaos.
This miscompilation happens because the compiler's optimizer incorrectly identifies this as a candidate for a tail call, replacing a proper call sequence with a simple jump.
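The difference between the two instructions can be sketched with a toy model. This is not a MIPS emulator: `ToyCpu`, `resume_address`, and the addresses used are hypothetical, and the model tracks only the program counter and `$ra`. It shows where execution resumes after "call, then return" when the call is made with a linking jump versus a plain jump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the two registers that matter here. */
typedef struct {
    uint32_t pc; /* address of the instruction being executed */
    uint32_t ra; /* $31, the return address register */
} ToyCpu;

/* jalr: link, then jump. The link value is pc + 8 because the instruction
 * in the branch delay slot (pc + 4) executes before control transfers. */
static void toy_jalr(ToyCpu *cpu, uint32_t target) {
    cpu->ra = cpu->pc + 8;
    cpu->pc = target;
}

/* jr: jump only; $ra keeps whatever an earlier call left in it. */
static void toy_jr(ToyCpu *cpu, uint32_t target) {
    cpu->pc = target;
}

/* The callee's `jr $ra` return. */
static void toy_return(ToyCpu *cpu) {
    cpu->pc = cpu->ra;
}

/* Performs "jump to target, then return" and reports where execution resumes. */
uint32_t resume_address(uint32_t call_site, uint32_t stale_ra,
                        uint32_t target, bool use_jalr) {
    ToyCpu cpu = { call_site, stale_ra };
    if (use_jalr) {
        toy_jalr(&cpu, target);
    } else {
        toy_jr(&cpu, target);
    }
    toy_return(&cpu);
    return cpu.pc;
}

/* Usage check: with jalr we resume after the call site; with jr we land on
 * the stale return address left over from an earlier call. */
void jr_vs_jalr_demo(void) {
    assert(resume_address(0x1000, 0x0400, 0x2000, true) == 0x1008);
    assert(resume_address(0x1000, 0x0400, 0x2000, false) == 0x0400);
}
```

In the buggy case the caller's remaining code is never reached, which matches the missing "Returned from function pointer call." line.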
The Fix in Three Steps
Now that we know the 'what' and the 'why', we can fix it. This involves diving into the LLVM source and adjusting the MIPS backend's instruction selection logic.
Step 1: Isolate the Code Pattern
Before patching, confirm the bug with a minimal test case. Use the C code from above or, even better, LLVM's own intermediate representation (IR). You can generate it with `clang -S -emit-llvm your_code.c`. The IR for our indirect call will look something like this:
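define void @caller_function() {
entry:
  ; The calls to @puts around the indirect call are omitted for brevity.
  ; Current LLVM releases use opaque pointers (ptr) instead of typed pointers.
  %0 = load volatile ptr, ptr @func_ptr, align 4
  call void %0(i32 123)
  ret void
}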
Running `llc` on this IR file should reproduce the bad assembly. Having this minimal reproducer is essential for verifying our fix later.
Step 2: Patch the MIPS Backend
The root of the problem lies in the MIPS TableGen files, which define instruction patterns and their corresponding assembly. Specifically, a pattern intended for tail calls is too aggressive and matches our simple indirect call.
We need to find the problematic pattern and make it more specific. The relevant file is typically `llvm/lib/Target/Mips/MipsInstrInfo.td` (or one of the `.td` files it includes) within your LLVM source tree; the `MipsSE*` files in that directory are C++ sources, not TableGen.
Search for definitions related to indirect calls or jumps. You're looking for a `Pat` (pattern) that takes a `(brind ADDR)` or a `(call ADDR)` and maps it to a MIPS pseudo-instruction like `TAILCALL_R`, which expands to the `jr` sequence.
Here’s what a hypothetical patch might look like. The goal is to add a constraint to the pattern so it only matches when it's a *genuine* tail call (i.e., the call is immediately followed by a `ret`).
The Fix: Modify MipsInstrInfo.td
Find a pattern that looks something like this (the exact form may vary between LLVM versions):
// Problematic Pattern
def : Pat<(call tglobaladdr:$target),
(TAILCALL_R tglobaladdr:$target)>;
def : Pat<(call texternalsym:$target),
(TAILCALL_R texternalsym:$target)>;
This pattern incorrectly assumes any `call` can be a tail call. The fix is to wrap it in a `(return (call ...))` pattern, ensuring it only applies to calls in a return position.
Applying the Patch:
1. Comment out or delete the overly broad patterns above.
2. Add a more specific pattern that correctly identifies tail calls. A proper tail call in LLVM IR is typically a `call` instruction marked `tail` (or `musttail`) and immediately followed by a `ret`. When the target's call lowering accepts it, SelectionDAG represents it with a dedicated tail-call node rather than an ordinary call node.
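The "call immediately followed by a return" condition can be sketched as a small predicate. This is a hypothetical model, not LLVM's API: `OpKind` and `is_tail_position` are invented for illustration, and LLVM's real eligibility checks consider much more (calling convention, argument passing, and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds; LLVM's real IR is far richer. */
typedef enum { OP_LOAD, OP_CALL, OP_RET, OP_OTHER } OpKind;

/* True only when the instruction at `index` is a call and the very next
 * instruction is a return, i.e. nothing in the caller runs after the call. */
bool is_tail_position(const OpKind *block, size_t count, size_t index) {
    if (index + 1 >= count) {
        return false; /* no following instruction to inspect */
    }
    return block[index] == OP_CALL && block[index + 1] == OP_RET;
}

/* Usage check modeled on caller_function: puts, load pointer, indirect
 * call, puts, ret. */
void tail_position_demo(void) {
    const OpKind caller[] = { OP_CALL, OP_LOAD, OP_CALL, OP_CALL, OP_RET };
    assert(!is_tail_position(caller, 5, 2)); /* func_ptr(123): work follows */
    assert(is_tail_position(caller, 5, 3));  /* final puts: only ret follows */
}
```

Under this model the indirect call in `caller_function` is not a tail-call candidate, because a `puts` call still has to run after it.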
A more robust fix might involve C++ changes in `MipsISelLowering.cpp` to prevent the `call` from being marked as a tail call in the first place. For example, you could modify the logic in `LowerCall` to check for conditions that make a tail call unsafe on MIPS, such as when the caller needs to save state that the callee might clobber.
For our purposes, let's assume the fix is to constrain the TableGen pattern. This is often the simplest starting point.
Step 3: Rebuild LLVM and Verify
With the patch applied to your local LLVM source, it's time to rebuild.
# Assuming you are in your LLVM build directory, next to the llvm/ source directory
# Configure the build (Ninja re-runs TableGen on its own when a .td file changes)
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang' ../llvm
ninja
Once the build is complete, use your newly built `llc` to re-compile the minimal IR test case from Step 1.
/path/to/your/build/bin/llc -mtriple=mips-unknown-linux-gnu test.ll -o test.s
Now, inspect `test.s`. The assembly for `caller_function` should have changed:
# ... prologue of caller_function ...
# Load the target address stored in func_ptr
lui $2, %hi(func_ptr)
lw $25, %lo(func_ptr)($2) # Target address in $t9
# ... other setup ...
# The CORRECTED instruction
jalr $25
nop
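The presence of `jalr` instead of `jr` for the indirect call confirms the fix is working! The `jalr` instruction will correctly save the return address in `$ra`, and your program will now return as expected. (The function's own `jr $ra` at the end is the normal return and should still be there.)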
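If you want to automate that inspection, here is a minimal sketch of a checker. It assumes the assembly text is already in memory and that `llc` prints the call as `jalr $25`; the helper names (`has_instruction`, `indirect_call_is_linked`) are hypothetical. Run it against the minimal reproducer only, since a legitimate tail call elsewhere in a larger file could also use `jr $25`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Skips spaces and tabs, stopping at a newline or the end of the text. */
static const char *skip_blanks(const char *p) {
    while (*p == ' ' || *p == '\t') p++;
    return p;
}

/* If the text at `p` starts with `word` as a whole token, returns the
 * position just past it; otherwise returns NULL. */
static const char *match_word(const char *p, const char *word) {
    size_t n = strlen(word);
    if (strncmp(p, word, n) != 0) return NULL;
    if (isalnum((unsigned char)p[n]) || p[n] == '_' || p[n] == '.') return NULL;
    return p + n;
}

/* True if any line of `asm_text` is "<mnemonic> <reg>", ignoring leading
 * blanks and anything after the register (such as a trailing comment). */
bool has_instruction(const char *asm_text, const char *mnemonic, const char *reg) {
    for (const char *line = asm_text; line != NULL && *line != '\0'; ) {
        const char *p = match_word(skip_blanks(line), mnemonic);
        if (p != NULL && match_word(skip_blanks(p), reg) != NULL) return true;
        const char *newline = strchr(line, '\n');
        line = newline ? newline + 1 : NULL;
    }
    return false;
}

/* The indirect call must link through $25; a plain `jr $25` means the bug
 * is still present. A `jr $ra` return does not affect the result. */
bool indirect_call_is_linked(const char *asm_text) {
    return has_instruction(asm_text, "jalr", "$25") &&
           !has_instruction(asm_text, "jr", "$25");
}

/* Usage check with two hand-written snippets. */
void checker_demo(void) {
    assert(indirect_call_is_linked("\tjalr\t$25\n\tnop\n\tjr\t$ra\n\tnop\n"));
    assert(!indirect_call_is_linked("\tmove\t$25, $2\n\tjr\t$25\n\tnop\n"));
}
```

The checker deliberately matches the register operand as well as the mnemonic, because every function still ends with a legitimate `jr $ra`.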
The Long-Term Solution: Upstreaming Your Fix
A local patch is great, but contributing it back to the LLVM project helps everyone.
1. Create a Regression Test: Add your minimal `.ll` file to the `llvm/test/CodeGen/Mips/` directory. Use `RUN` lines and the `FileCheck` utility to assert that the generated assembly makes the indirect call with `jalr $25` and does not contain `jr $25` (the function's own `jr $ra` return is expected). This guards against the bug silently reappearing.
2. Submit for Review: Follow the LLVM developer policies and submit your patch for review as a pull request against the llvm-project repository on GitHub (LLVM no longer uses Phabricator for new reviews). Explain the problem clearly, referencing your test case. The community is generally very receptive to well-documented bug fixes.
Conclusion
Navigating compiler bugs can feel like searching for a needle in a haystack, but with a systematic approach, they are entirely manageable. The MIPS "jumpto wrong return" issue is a classic example of an overly optimistic optimization causing real-world problems. By isolating the pattern, applying a targeted patch to the backend, and verifying the result, you can restore correct program behavior.
Remember these three steps: Isolate the faulty code generation, Patch the instruction selection logic, and Verify with your new compiler. And if you can, consider contributing your fix back to the community. Happy compiling!