Understanding ABI Encoding in Solidity

Yong kang Chia
Apr 20, 2024

When interacting with smart contracts on the Ethereum blockchain, it’s essential to understand how data is encoded and passed between the contract and the external world. This is where the Application Binary Interface (ABI) comes into play. In this article, we’ll dive into the details of ABI encoding in Solidity and explore how it works under the hood.

What is ABI?

The ABI is a specification that defines how to interact with a smart contract. It provides information about the contract’s public and external functions, as well as the number and type of parameters each function accepts. When we want to invoke a function in a contract, we write expressions like function(arguments). However, the Ethereum Virtual Machine (EVM) doesn't understand such expressions directly. Instead, we need to send a binary expression encoded according to the contract's ABI.

Function Signature

Every function in Solidity has a signature: its name followed by its parameter types in parentheses, with no spaces and no parameter names. The function selector, which identifies the function being called, is the first 4 bytes of the Keccak256 hash of that signature.

Keccak256 is a cryptographic hash function that takes any input data and returns a fixed-size output (32 bytes). Written in hexadecimal, the hash is a 64-character string, with each byte represented by 2 hexadecimal digits.

For example, the hash of the signature report(address,bytes,bytes[]) is c0965dc3b09190a0490c5d16aaccc5683a8981e70f5627ad4c73c63b5fd798bf. The first 4 bytes of this hash (c0965dc3) serve as the function selector.
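As a minimal sketch, the 4-byte value can be derived on-chain by hashing the signature string and truncating the result. The contract, interface, and parameter names below (SelectorDemo, IReport, who, data, items) are made up for illustration and do not come from any real project.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical interface that declares the function used in the example.
interface IReport {
    function report(address who, bytes calldata data, bytes[] calldata items) external;
}

// Hypothetical demo contract; names are illustrative only.
contract SelectorDemo {
    // Hash the signature string (no spaces, no parameter names) and
    // keep the first 4 bytes of the 32-byte Keccak256 digest.
    function fromHash() external pure returns (bytes4) {
        return bytes4(keccak256("report(address,bytes,bytes[])"));
    }

    // The compiler exposes the same 4 bytes through the `.selector` member.
    function fromCompiler() external pure returns (bytes4) {
        return IReport.report.selector;
    }
}
```

Both functions should return the same four bytes, since the compiler derives `.selector` from the same signature string.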

ABI encoding therefore turns the 4-byte function identifier and the function's parameters into a byte string that can be passed to the EVM.

Consider the payload sent to the EVM to execute the function `report(address,bytes,bytes[])`: `c0965dc3…000007`. The first 4 bytes are exactly the 4-byte prefix of the hash shown above. Everything after those 4 bytes is the ABI-encoded arguments, laid out in 32-byte words.
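The two parts of a payload can be observed from inside a contract. The sketch below uses a hypothetical function with a single uint256 argument rather than report, so that the argument section is exactly one 32-byte word; CalldataDemo and inspect are illustrative names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical demo contract; names are illustrative only.
contract CalldataDemo {
    // Splits the calldata of the current call into its two parts.
    function inspect(uint256 value)
        external
        pure
        returns (bytes4 prefix, uint256 decoded)
    {
        // msg.sig is the first 4 bytes of msg.data.
        prefix = msg.sig;
        // The bytes after the first 4 are the ABI-encoded arguments.
        decoded = abi.decode(msg.data[4:], (uint256));
        // Decoding the argument section gives back the original argument.
        assert(decoded == value);
    }
}
```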

ABI Encoding

ABI encoding is the process of converting function signatures and parameters into a format that can be passed to the EVM.

The complexity of encoding depends on the number and types of parameters. While encoding integers is straightforward, encoding strings and arrays can be more involved.

Solidity provides a global variable called abi with several methods for encoding and decoding functions and arguments. Let's explore some of these methods.

The full ABI specification is published in the official Solidity documentation.

abi.encode

abi.encode encodes its arguments according to the ABI specification, so every static value occupies a full 32-byte word. The side on which the zero padding goes depends on the Solidity type being encoded:

  • address and other static value types shorter than 32 bytes (e.g., uint8) are zero-padded on the left side. For example (address literals in source code must be written in checksummed form):
abi.encode(0xE592427A0AEce92De3Edee1F18E0157C05861564)
= 0x000000000000000000000000e592427a0aece92de3edee1f18e0157c05861564
  • Fixed-size byte values (e.g., bytes4, bytes8) are zero-padded on the right side. The value needs an explicit bytesN type, because a bare hex literal is treated as an integer and padded on the left:
abi.encode(bytes4(0xabcdef12))
= 0xabcdef1200000000000000000000000000000000000000000000000000000000
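The padding rules can be checked with a small contract such as the following sketch. StaticPaddingDemo and its function names are hypothetical, and the values 42 and 0xabcdef12 are arbitrary.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical demo contract showing how static types fill a 32-byte word.
contract StaticPaddingDemo {
    // Addresses are left-padded: 12 zero bytes, then the 20 address bytes.
    function encodeAddress(address a) external pure returns (bytes memory) {
        return abi.encode(a);
    }

    // Small integers are left-padded: 31 zero bytes, then 0x2a.
    function encodeUint8() external pure returns (bytes memory) {
        return abi.encode(uint8(42));
    }

    // Fixed-size byte values are right-padded: 0xabcdef12, then 28 zero bytes.
    function encodeBytes4() external pure returns (bytes memory) {
        return abi.encode(bytes4(0xabcdef12));
    }
}
```

Each function returns exactly 32 bytes; only the position of the meaningful bytes within the word differs.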

Dynamic types like strings, bytes, and arrays require a more nuanced approach due to their variable size. The encoding format for dynamic types includes:

  1. Offset: A 32-byte word that gives the byte position, counted from the start of the encoded arguments, at which the data begins.
  2. Length: A 32-byte word at that position that gives the length of the data: the number of bytes for bytes and string, or the number of elements for a dynamic array.
  3. Data: The actual data, stored in a series of 32-byte words. For bytes and string, the final word is zero-padded on the right.

The following demonstrates the dynamic type encoding of the string "Hello World":

// abi.encode("Hello World") returns the following raw bytes value.
0x0000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000000b48656c6c6f20576f726c64000000000000000000000000000000000000000000

// We can split this into words that are 32 bytes long to get:
0x0000000000000000000000000000000000000000000000000000000000000020 // offset
000000000000000000000000000000000000000000000000000000000000000b // length
48656c6c6f20576f726c64000000000000000000000000000000000000000000 // string
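The three-word layout above can be reproduced and reversed in Solidity. This is a sketch with hypothetical names (DynamicEncodingDemo, encodeHello, roundTrip, encodeArray); the array example is an extra illustration of how the length word counts elements rather than bytes.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical demo contract for dynamic type encoding.
contract DynamicEncodingDemo {
    // Returns 96 bytes: offset (0x20), length (0x0b), then the 11 string
    // bytes right-padded with zeros to a full 32-byte word.
    function encodeHello() external pure returns (bytes memory) {
        return abi.encode("Hello World");
    }

    // abi.decode reverses the encoding when given the expected types.
    function roundTrip() external pure returns (string memory) {
        bytes memory encoded = abi.encode("Hello World");
        return abi.decode(encoded, (string));
    }

    // For a dynamic array the length word holds the element count.
    // Result: offset (0x20), length (2), then the elements 1 and 2,
    // for four 32-byte words in total.
    function encodeArray() external pure returns (bytes memory) {
        uint256[] memory xs = new uint256[](2);
        xs[0] = 1;
        xs[1] = 2;
        return abi.encode(xs);
    }
}
```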

abi.encodeWithSignature

When invoking a function in the EVM, the first four bytes of the payload identify the function to call. The abi.encodeWithSignature method takes the function's signature as a string, derives those four bytes from its Keccak256 hash, and prepends them to the ABI-encoded arguments.
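A minimal sketch of this, reusing the report(address,bytes,bytes[]) signature from earlier in the article, is shown below. The contract and function names are hypothetical, and the manual construction is included only to show what abi.encodeWithSignature does internally.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical demo contract; names are illustrative only.
contract EncodeWithSignatureDemo {
    // Builds the full call payload in one step.
    function build(address who, bytes memory data, bytes[] memory items)
        public
        pure
        returns (bytes memory)
    {
        return abi.encodeWithSignature("report(address,bytes,bytes[])", who, data, items);
    }

    // Same payload built by hand: 4 bytes of the signature hash,
    // followed by abi.encode of the arguments.
    function buildManually(address who, bytes memory data, bytes[] memory items)
        public
        pure
        returns (bytes memory)
    {
        bytes4 prefix = bytes4(keccak256("report(address,bytes,bytes[])"));
        return bytes.concat(prefix, abi.encode(who, data, items));
    }

    // Returns true when both constructions yield identical bytes.
    function sameResult(address who, bytes memory data, bytes[] memory items)
        external
        pure
        returns (bool)
    {
        return keccak256(build(who, data, items)) == keccak256(buildManually(who, data, items));
    }
}
```

Because the signature is passed as a plain string, a typo in it is not caught at compile time; the call simply targets a different 4-byte prefix.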

abi.encodePacked

abi.encodePacked performs packed encoding: each argument is encoded using only as many bytes as its type requires. This encoding is not compatible with the format contracts expect for function calls, but it can be useful in certain scenarios, such as building compact input for hashing.

Packed encoding in Solidity differs from the standard ABI encoding in two ways:

  • Dynamic types (for example, strings, bytes, and so on) are encoded in place, exactly as they are, without any offset or length information. Array elements are still padded to 32 bytes each.
  • Zero padding is not applied to static types that are shorter than 32 bytes (for example, uint8, bytes4, and so on).
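The difference is easiest to see side by side. The sketch below uses a hypothetical contract (PackedDemo) with arbitrary values; the last function illustrates a consequence of dropping the length information.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical demo contract comparing standard and packed encoding.
contract PackedDemo {
    // Standard encoding: each value fills a 32-byte word, 64 bytes in total.
    function standard() external pure returns (bytes memory) {
        return abi.encode(uint8(1), bytes4(0xabcdef12));
    }

    // Packed encoding: 1 byte + 4 bytes = 5 bytes, 0x01abcdef12.
    function packed() external pure returns (bytes memory) {
        return abi.encodePacked(uint8(1), bytes4(0xabcdef12));
    }

    // A string is written as its raw bytes with no offset or length word,
    // so this returns the 11 bytes 0x48656c6c6f20576f726c64.
    function packedString() external pure returns (bytes memory) {
        return abi.encodePacked("Hello World");
    }

    // Without length information, different inputs can produce the same
    // bytes: both sides below pack to "abc", so this returns true.
    function ambiguous() external pure returns (bool) {
        return keccak256(abi.encodePacked("a", "bc")) == keccak256(abi.encodePacked("ab", "c"));
    }
}
```

The ambiguity shown in the last function is worth keeping in mind when packing more than one dynamic value before hashing.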

Further details are in the "Non-standard Packed Mode" section of the Solidity ABI specification.

However, it's worth noting that abi.encodePacked is being considered for deprecation in future versions of Solidity.

Conclusion

Understanding ABI encoding is crucial for interacting with smart contracts in Solidity. By grasping the concepts of function signatures, static and dynamic type encoding, and the various encoding methods provided by the abi variable, you'll be well-equipped to handle data exchange between your contracts and the external world.

Yong kang Chia

Blockchain Developer. Chainlink Ex Spartan Group, Ex Netherminds