
Get Ada 2022 String Length Right: 5 Fast Fixes (2025)

Struggling with Ada 2022 string length? Learn 5 fast fixes for handling character counts vs. byte lengths, Unicode, bounded strings, and C interop. Updated for 2025.


Dr. Eleanor Vance

High-integrity systems expert specializing in Ada and SPARK for mission-critical software.


Why String Length is Tricky in Ada 2022: Bytes vs. Characters

You’ve written some clean Ada 2022 code. You have a Wide_Wide_String, say "Hello, 世界!". You check its length with the trusty 'Length attribute and get 10. But when you write this string to a file as UTF-8, you find it occupies 14 bytes: eight ASCII characters at one byte each, plus two CJK characters at three bytes each. What gives? Welcome to the most common string-handling confusion in modern Ada: the critical difference between character count and byte count.

Ada has evolved significantly to embrace Unicode. The standard Character type is still an 8-bit Latin-1 type, but since Ada 2005 the language also provides Wide_Wide_Character, capable of representing any character in the Unicode standard, together with Wide_Wide_String. Ada 2012 added Ada.Strings.UTF_Encoding for converting between these types and UTF-8 or UTF-16. This is fantastic for internationalization and robust software. However, it means that the one-character-equals-one-byte assumption from the ASCII world is long gone.

When you use My_String'Length, you are asking, "How many elements are in this array?" For a Wide_Wide_String, each element is one code point, so this is the character length. For a String that holds UTF-8 text (a UTF_8_String), each element is one byte, so 'Length is the byte length. When you talk about file sizes, network packets, or C-style strings, you are almost always dealing with the byte length after encoding (e.g., in UTF-8).
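To make the element-counting rule concrete, here is a minimal sketch using only the standard library. The names Element_Counts, Wide, and Raw are illustrative, and the non-ASCII characters are built with 'Val so the source file stays pure ASCII and does not depend on any compiler source-encoding switch.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Element_Counts is
   -- "café" as code points: U+00E9 is e-acute.
   Wide : constant Wide_Wide_String :=
     "caf" & Wide_Wide_Character'Val (16#E9#);

   -- The same text as raw UTF-8 bytes: e-acute is C3 A9.
   Raw : constant String :=
     "caf" & Character'Val (16#C3#) & Character'Val (16#A9#);
begin
   -- 'Length counts array elements in both cases.
   Put_Line ("Wide_Wide_String elements:" & Integer'Image (Wide'Length)); -- 4
   Put_Line ("UTF-8 String elements:    " & Integer'Image (Raw'Length));  -- 5
end Element_Counts;
```

The attribute is the same in both lines; only the element type decides whether the answer means "characters" or "bytes".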

Understanding this distinction is the key to mastering string manipulation in Ada 2022. Let’s dive into five practical fixes that will help you get the right length, every time.

5 Fast Fixes for Accurate String Length

Here are five targeted solutions for the most common string length scenarios you'll encounter in your Ada 2022 projects.

Fix 1: The Default Case - Use 'Length for Character Count

For most of your internal application logic, the 'Length attribute is exactly what you need. When you're iterating over a string, accessing an element by index, or performing purely Ada-based manipulations, you are working with array elements, not an encoded byte representation. If the text may contain characters outside Latin-1, hold it in a Wide_Wide_String so that each element is one code point.

When to use it: Any time you need to know the number of Unicode code points (characters) in a Wide_Wide_String, or the number of Latin-1 characters in a String.

with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Character_Length is
   -- The literal needs UTF-8 source encoding (for GNAT: compile with -gnatW8).
   My_String : constant Wide_Wide_String := "Hello, 世界!";
begin
   -- Each element is one code point, so this reports the number of characters.
   Put_Line ("Character count: " & Integer'Image (My_String'Length)); -- Outputs 10
end Show_Character_Length;

Stick with 'Length for all your high-level string logic. It's simple, efficient, and semantically correct within the Ada environment.

Fix 2: For Byte Size - Use Ada.Strings.UTF_Encoding

When you need to know how much space a string will occupy on disk, in a database, or over a network, you need its byte length. This depends on the encoding. The standard library Ada.Strings.UTF_Encoding is your best friend here.

When to use it: Before writing to a file, sending data over a socket, or interfacing with any system that expects a byte stream. Most modern systems expect UTF-8.

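with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Show_Byte_Length is
   -- The literal needs UTF-8 source encoding (for GNAT: compile with -gnatW8).
   My_String  : constant Wide_Wide_String := "Hello, 世界!";
   -- Encode returns a UTF_8_String: one Character element per byte.
   Encoded    : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (My_String);
   Byte_Count : constant Natural := Encoded'Length;
begin
   -- The length of the encoded string is the UTF-8 byte count.
   Put_Line ("UTF-8 byte count: " & Integer'Image (Byte_Count)); -- Outputs 14
end Show_Byte_Length;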

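The Encode function is the crucial piece. The standard library has no function that reports the encoded size without encoding, so the portable approach is to call Encode and take 'Length of the result. Every element of a UTF_8_String is one byte, so that length is exactly the number of bytes you need for pre-allocation and size checks.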
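The byte count depends on which encoding you pick, not only on the text. The sketch below, with illustrative names Compare_Encodings, As_UTF_8, and As_UTF_16, encodes the same ten code points both ways using the overloaded Encode functions from Ada.Strings.UTF_Encoding.Wide_Wide_Strings. A UTF_16_Wide_String holds 16-bit code units, so its byte size is taken here as twice its length, with no byte order mark.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Compare_Encodings is
   -- "Hello, " + U+4E16 + U+754C + "!", built with 'Val to keep the source ASCII.
   Text : constant Wide_Wide_String :=
     "Hello, "
     & Wide_Wide_Character'Val (16#4E16#)
     & Wide_Wide_Character'Val (16#754C#)
     & "!";

   -- The result type selects which Encode overload is called.
   As_UTF_8  : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Text);
   As_UTF_16 : constant Ada.Strings.UTF_Encoding.UTF_16_Wide_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Text);
begin
   Put_Line ("Code points: " & Integer'Image (Text'Length));           -- 10
   Put_Line ("UTF-8 bytes: " & Integer'Image (As_UTF_8'Length));       -- 14
   Put_Line ("UTF-16 bytes:" & Integer'Image (2 * As_UTF_16'Length));  -- 20
end Compare_Encodings;
```

For mostly-ASCII text UTF-8 is smaller; for text dominated by CJK characters the balance can tip the other way, which is why the encoding must be fixed before any size is computed.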

Fix 3: For Safety - Master Ada.Strings.Bounded Length

In high-integrity and embedded systems, dynamically allocated strings are often forbidden. Ada.Strings.Bounded provides a safe, fixed-size string type. Getting the length here is straightforward, but the context is all about preventing buffer overflows.

The Length function from the package returns the current length of the content within the bounded container.

When to use it: When working with fixed-size string buffers in safety-critical or resource-constrained environments.

with Ada.Text_IO;                    use Ada.Text_IO;
with Ada.Strings.Bounded;

procedure Show_Bounded_Length is
   Max_Len : constant := 32;
   package Bounded_32 is new Ada.Strings.Bounded.Generic_Bounded_Length (Max_Len);
   use Bounded_32;

   My_Bounded_String : Bounded_String := To_Bounded_String ("Hello");
begin
   Put_Line ("Current length: " & Integer'Image (Length (My_Bounded_String))); -- Outputs 5
   Append (My_Bounded_String, ", World!");
   Put_Line ("New length: " & Integer'Image (Length (My_Bounded_String))); -- Outputs 13
   Put_Line ("Max length: " & Integer'Image (Max_Length)); -- Outputs 32
end Show_Bounded_Length;

Key takeaway: Always check against Max_Length before appending or modifying. With the default Drop => Error, an operation that would exceed the capacity raises Ada.Strings.Length_Error.
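That check can take two forms, sketched below under the assumption of a deliberately small 8-character buffer. The names Safe_Append, Bounded_8, Buffer, and Extra are illustrative; Append, Length, Max_Length, and the Drop parameter come from the standard generic.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.Bounded;

procedure Safe_Append is
   package Bounded_8 is new Ada.Strings.Bounded.Generic_Bounded_Length (8);
   use Bounded_8;

   Buffer : Bounded_String := To_Bounded_String ("Hello");
   Extra  : constant String := ", World!";
begin
   -- Option 1: test the remaining capacity explicitly (5 + 8 > 8 here).
   if Length (Buffer) + Extra'Length <= Max_Length then
      Append (Buffer, Extra);
   else
      Put_Line ("Not enough room; buffer left unchanged.");
   end if;

   -- Option 2: ask the library to truncate on the right instead of
   -- raising Length_Error.
   Append (Buffer, Extra, Drop => Ada.Strings.Right);
   Put_Line (To_String (Buffer));                                -- Hello, W
   Put_Line ("Length:" & Integer'Image (Length (Buffer)));       -- 8
end Safe_Append;
```

Whether silent truncation is acceptable is a design decision; in a high-integrity setting the explicit check, or handling Length_Error, is usually the easier behavior to reason about.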

Fix 4: For Flexibility - Use the Length Function with Unbounded_String

Ada.Strings.Unbounded.Unbounded_String is the go-to for general-purpose programming where string sizes vary unpredictably. Like its bounded cousin, it has a Length function to get the current character count.

When to use it: In general application development where string sizes are dynamic and performance is less critical than flexibility.

with Ada.Text_IO;               use Ada.Text_IO;
with Ada.Strings.Unbounded;     use Ada.Strings.Unbounded;

procedure Show_Unbounded_Length is
   My_Unbounded_String : Unbounded_String := To_Unbounded_String ("Ada 2022");
begin
   Put_Line ("Initial length: " & Integer'Image (Length (My_Unbounded_String))); -- Outputs 8
   My_Unbounded_String := My_Unbounded_String & " is powerful.";
   Put_Line ("Final length: " & Integer'Image (Length (My_Unbounded_String))); -- Outputs 21
end Show_Unbounded_Length;

While Unbounded_String handles memory for you, remember that converting it to a standard String (with To_String) allows you to use 'Length and the `UTF_Encoding` techniques we've already discussed.
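As a rough illustration of that conversion path, the sketch below takes an Unbounded_String containing one Latin-1 character outside ASCII, converts it with To_String, and encodes it with Ada.Strings.UTF_Encoding.Strings.Encode. The names Unbounded_Byte_Length, Name, and Encoded are illustrative.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Strings.UTF_Encoding.Strings;

procedure Unbounded_Byte_Length is
   -- Character'Val (233) is Latin-1 e-acute, so this holds "café".
   Name : constant Unbounded_String :=
     To_Unbounded_String ("caf" & Character'Val (233));

   -- Encode treats the String as Latin-1 text and returns UTF-8 bytes.
   Encoded : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Strings.Encode (To_String (Name));
begin
   Put_Line ("Characters: " & Integer'Image (Length (Name)));     -- 4
   Put_Line ("UTF-8 bytes:" & Integer'Image (Encoded'Length));    -- 5
end Unbounded_Byte_Length;
```

This only applies when the Unbounded_String really holds Latin-1 text. If it already holds UTF-8 bytes, Length is the byte count and encoding it again would double-encode the data.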

Fix 5: For Interoperability - Correctly Size for C

Interfacing with C libraries is a common requirement. C strings are null-terminated arrays of `char` (bytes). Getting the length right is a two-step process: determine the required byte size, then perform the conversion.

When to use it: When preparing a string to be passed to a C function.

  1. Calculate byte length: Use `Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode` to produce a `UTF_8_String`; its `'Length` is the number of bytes your string will become (plus one for the null terminator when sizing a buffer).
  2. Convert: Use `Interfaces.C.Strings.New_String` to copy the encoded `String` into newly allocated, null-terminated C memory. It returns a `chars_ptr`.

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Interfaces.C.Strings; use Interfaces.C.Strings;

procedure C_Interop_Length is
   -- The literal needs UTF-8 source encoding (for GNAT: compile with -gnatW8).
   Ada_Str   : constant Wide_Wide_String := "Test: €"; -- Euro sign is 3 bytes in UTF-8
   -- Step 1: Encode to UTF-8; the result has one Character element per byte.
   UTF8_Str  : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Ada_Str);
   Byte_Size : constant Natural := UTF8_Str'Length; -- excludes the null terminator
   C_Str_Ptr : chars_ptr;
begin
   Put_Line ("Ada character length: " & Integer'Image (Ada_Str'Length));      -- Outputs 7
   Put_Line ("Required C byte length (UTF-8): " & Integer'Image (Byte_Size)); -- Outputs 9

   -- Step 2: Allocate and convert (New_String appends the null terminator)
   C_Str_Ptr := New_String (UTF8_Str);

   -- Now C_Str_Ptr can be passed to a C function.
   -- Don't forget to free the memory!
   Free (C_Str_Ptr);
end C_Interop_Length;

Failing to account for UTF-8 encoding is a common source of bugs in Ada-to-C string interfaces. Always calculate the byte size first!
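The opposite direction, receiving a string from C, follows the same logic in reverse. In the sketch below a pointer created with New_String stands in for one returned by a C function, and the bytes are assumed to be valid UTF-8; the names From_C and C_Str are illustrative. Strlen reports bytes, and decoding the value gives the character count.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Interfaces.C.Strings; use Interfaces.C.Strings;

procedure From_C is
   -- Stand-in for a C result: "Test: " followed by the euro sign,
   -- which is the three bytes E2 82 AC in UTF-8.
   C_Str : chars_ptr :=
     New_String
       ("Test: "
        & Character'Val (16#E2#)
        & Character'Val (16#82#)
        & Character'Val (16#AC#));
begin
   -- Strlen counts bytes up to, but not including, the null terminator.
   Put_Line ("Bytes:     " & Interfaces.C.size_t'Image (Strlen (C_Str)));  -- 9

   -- Value copies the bytes into an Ada String; Decode turns them
   -- into code points.
   Put_Line
     ("Characters:"
      & Integer'Image
          (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode
             (Value (C_Str))'Length));                                     -- 7

   Free (C_Str);
exception
   when Ada.Strings.UTF_Encoding.Encoding_Error =>
      Put_Line ("The C string is not valid UTF-8.");
end From_C;
```

In real code, only call Free on pointers that were allocated by New_String; memory owned by the C library has to be released by whatever routine that library provides.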

Quick Comparison: Ada String Length Methods

Here's a handy table to summarize which length-finding method to use in different situations.

Ada String Length Methodologies (2025)

| Method / Context | How to Get Length | When to Use It | Potential Pitfall |
| --- | --- | --- | --- |
| `Wide_Wide_String` (or Latin-1 `String`) | `My_String'Length` | Internal Ada logic, loops, character-based operations. | Mistaking it for byte length with Unicode strings; on a `String` that holds UTF-8, `'Length` counts bytes. |
| Byte stream (e.g., UTF-8) | `Encode (S)'Length` from `Ada.Strings.UTF_Encoding.Wide_Wide_Strings` | File I/O, network protocols, memory allocation for C interop. | Forgetting to account for the chosen encoding (UTF-8 vs UTF-16). |
| `Bounded_String` | `Length (My_Bounded_Str)` | Safety-critical systems with fixed-size buffers. | Not checking against `Max_Length` before modification, causing `Length_Error`. |
| `Unbounded_String` | `Length (My_Unbounded_Str)` | General-purpose applications with dynamic string sizes. | Potential performance overhead compared to standard `String`. |
| C interop (`chars_ptr`) | `Interfaces.C.Strings.Strlen (C_Str)` | When receiving a string from C. | Forgetting that C strings are null-terminated; incorrect memory management. |

Practical Example: Writing a UTF-8 String to a File

Let's tie it all together. Suppose we want to write a string to a binary file, prefixed by its UTF-8 byte length stored as a 4-byte integer. This is a common serialization technique.

with Ada.Streams.Stream_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;
with Interfaces;

procedure Write_Length_Prefixed_String is
   File      : Ada.Streams.Stream_IO.File_Type;
   Stream    : Ada.Streams.Stream_IO.Stream_Access;
   -- The literal needs UTF-8 source encoding (for GNAT: compile with -gnatW8).
   My_Data   : constant Wide_Wide_String :=
     "Ada 2022 handles Unicode (α, β, γ) perfectly.";
   -- 1. Encode to UTF-8; each element of the result is one byte
   UTF8_Data : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (My_Data);
   -- 2. The encoded length is the byte length
   Byte_Len  : constant Natural := UTF8_Data'Length;
begin
   Ada.Text_IO.Put_Line ("Character count: " & Integer'Image (My_Data'Length));
   Ada.Text_IO.Put_Line ("UTF-8 byte count: " & Integer'Image (Byte_Len));

   -- 3. Create a file and write the length, then the data
   Ada.Streams.Stream_IO.Create (File, Name => "data.bin");
   Stream := Ada.Streams.Stream_IO.Stream (File);

   -- Write the 4-byte length prefix (typically in the machine's native byte order)
   Interfaces.Unsigned_32'Write (Stream, Interfaces.Unsigned_32 (Byte_Len));
   -- Write the UTF-8 bytes; 'Write on an array emits the elements without bounds
   String'Write (Stream, UTF8_Data);

   Ada.Streams.Stream_IO.Close (File);
   Ada.Text_IO.Put_Line ("Wrote " & Integer'Image (4 + Byte_Len) & " bytes to data.bin");

exception
   when others =>
      Ada.Text_IO.Put_Line ("An error occurred.");
      if Ada.Streams.Stream_IO.Is_Open (File) then
         Ada.Streams.Stream_IO.Close (File);
      end if;
end Write_Length_Prefixed_String;

This example demonstrates the complete workflow: work out the UTF-8 byte length, then write the length followed by the encoded data. Because the prefix always describes the bytes that were actually written, this pattern prevents hard-to-find bugs related to character encoding.
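Reading the file back is the mirror image. The sketch below assumes data.bin starts with a 4-byte prefix readable as Interfaces.Unsigned_32 on the same machine, followed by exactly that many UTF-8 bytes; the procedure name Read_Length_Prefixed_String and its local names are illustrative.

```ada
with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;
with Interfaces;

procedure Read_Length_Prefixed_String is
   File     : File_Type;
   Byte_Len : Interfaces.Unsigned_32;
begin
   Open (File, In_File, "data.bin");

   -- Read the 4-byte length prefix first.
   Interfaces.Unsigned_32'Read (Stream (File), Byte_Len);

   declare
      -- Size the buffer from the prefix: one element per byte.
      UTF8_Data : Ada.Strings.UTF_Encoding.UTF_8_String
                    (1 .. Natural (Byte_Len));
   begin
      String'Read (Stream (File), UTF8_Data);
      Ada.Text_IO.Put_Line
        ("Bytes read:" & Integer'Image (UTF8_Data'Length));
      -- Decoding recovers the code points, and with them the character count.
      Ada.Text_IO.Put_Line
        ("Characters:"
         & Integer'Image
             (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode
                (UTF8_Data)'Length));
   end;

   Close (File);
end Read_Length_Prefixed_String;
```

A file meant to move between machines would also need an agreed byte order for the prefix, which this sketch does not address.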

Conclusion: Context is Everything

Getting string length right in Ada 2022 isn't about finding one magic function. It's about understanding the context of your operation. Are you counting characters for an algorithm, or are you measuring bytes for storage or transmission? Once you internalize this question, the solution becomes clear.

By default, use 'Length for character counts, on a Wide_Wide_String when the text goes beyond Latin-1. When bytes matter, turn to Ada.Strings.UTF_Encoding: encode, then take 'Length of the result. For specialized string types like Bounded_String and Unbounded_String, use the provided Length function from their respective packages. Master these five fixes, and you'll eliminate an entire class of string-related errors from your Ada applications.