From Idea to Implementation: Creating a Solidity Gas Optimizer for Smart Contracts
This article explores how Xian Lin (CHAD dev) and I developed a smart contract optimizer that enhances gas savings by performing optimizations at both the source code and intermediate representation levels.
TLDR
If you are too lazy to read, we built a web gas optimiser app that optimises your Solidity code based on the optimisations you select: Struct Packing, Storage Variable Caching and Call Data Optimisation. Find the code here.
Why Build a Solidity Gas Optimizer?
Gas is a crucial resource on Ethereum Virtual Machine (EVM) based blockchains. The cost of executing smart contracts, measured in gas, directly impacts the efficiency and expense of blockchain operations.
The Solidity compiler translates Solidity code into EVM bytecode, incorporating various optimization techniques to reduce gas consumption during deployment and execution. However, these built-in optimizations have limitations. To achieve the best gas efficiency, developers must write highly optimized code.
Although some linters and optimizers are available, there is a notable shortage of tools that can analyze and suggest improvements at both the source code and intermediate representation levels.
This gap motivated us to create a dev tool that helps Solidity developers write gas-efficient code, ultimately minimizing the execution cost of smart contracts on the blockchain.
Background Information
How is Gas measured in EVM?
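- In the EVM, gas measures the computational cost of each bytecode instruction. The fee for a transaction is the total gas used multiplied by the gas price, paid in ETH. Before the Merge this fee went to miners; today the base-fee portion is burned and the priority fee goes to the validator that proposes the block. See here for more.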
The Role of the Solidity Compiler
- The Solidity compiler is responsible for converting Solidity code into EVM bytecode. This bytecode is what gets deployed and executed on the blockchain.
Intermediate Representation — Abstract Syntax Tree (AST)
- An Abstract Syntax Tree (AST) is a tree-like data structure that represents the hierarchical syntactic structure of a programming language’s source code.
- The AST serves as an intermediate representation that bridges the high-level source code and the lower-level target code. This allows the compiler to perform syntax and semantic analyses more effectively.
1. Overview of the Application
Our Solidity Gas Optimizer offers two deployment options: a standalone command-line interface (CLI) for efficient contract optimization, and a web application providing a user-friendly visual interface for the same purpose.
The web application consists of two main components: a Go backend web server and a Next.js/React frontend.
When users first open the website, they are greeted by a screen where they can enter the contract they want to optimize and select the optimizations to apply. When the Optimize Code button is pressed, the frontend sends the payload to the backend, which calls the Optimizer module to produce the optimized code.
1.1. Displaying Code
The unoptimized and optimized code are displayed side by side, with an option to view the differences between the two versions. When users press the “Toggle Diff” button, the view highlights the changes made by the optimization process.
1.2. Gas Estimation
After optimizing a contract, an “Estimate Gas” window appears, allowing users to write code to invoke their contract functions. This functionality enables the estimation of gas usage for both unoptimized and optimized contracts, allowing for a direct comparison.
The test code is sent to the backend, where it is incorporated into a template to generate the two test files needed for the gas estimation process: Unoptimized.t.sol and Optimized.t.sol.
1.3. Gas Estimation Results
Upon running a gas estimation, a window displays the gas costs of the function call for each contract. For example, the function call for the optimized contract might cost approximately 4% less gas than its unoptimized counterpart.
Find the code here.
Now that you understand the purpose and functionality of our application, let’s move on to the thought process and design behind our Solidity Optimizer.
2. Design of Our Application
The design phase involves planning the architecture and key components of the Solidity optimizer. This phase ensures that the tool can effectively analyze and optimize Solidity smart contracts at both the source code and intermediate representation levels.
2.1. Architecture
Following a simplified version of how compilers are implemented, we came up with the initial architecture below, which shows the high-level workflow of our optimizer application.
2.2. Lexing and Parsing
We used the solgo library for lexing and parsing the Solidity language. Its lexer, parser, and listener are generated from a Solidity grammar file using ANTLR with the Go target.
As such, it enables the syntactic analysis of Solidity code, transforming it into a parse tree that offers a detailed syntactic representation of the code, allowing for intricate navigation and manipulation.
2.3. Gas Optimization
There were many different gas optimisations that we could have implemented, and we wanted to choose the lowest-hanging fruit: the optimisations that bring the most gas savings for the least effort. The section on optimizations is below.
After feeding the Solidity program through the lexer and parser and obtaining an AST, we apply the selected optimizations sequentially to the AST, using techniques such as DFS and visitors to traverse the tree and mutate the AST directly.
These are the general steps of the Optimization process:
- Optional check to see if the construct we are about to modify adheres to the constraints under which our optimization can be applied.
- Use a combination of DFS and visitors to traverse the AST and keep track of the nodes to be modified
- Apply modifications to the AST
The main idea, however, is that optimisations are modular, so further gas optimisations can be added independently of the existing ones.
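The three steps and the modular design can be sketched in Go roughly as follows. This is a toy model under stated assumptions: `Node`, `Optimization`, `runPasses`, and the `retag` pass are hypothetical names invented for this example, and solgo's real AST types and our actual pass interface are richer than this.

```go
package main

import "fmt"

// Node is a toy AST node; solgo's real node types are much richer.
type Node struct {
	Kind     string
	Children []*Node
}

// walk visits every node depth-first (pre-order).
func walk(n *Node, visit func(*Node)) {
	if n == nil {
		return
	}
	visit(n)
	for _, c := range n.Children {
		walk(c, visit)
	}
}

// Optimization is a hypothetical interface: one implementation per pass.
type Optimization interface {
	Name() string
	Applicable(n *Node) bool // step 1: constraint check
	Apply(n *Node)           // step 3: mutate the node in place
}

// runPasses applies each selected pass in order and reports how many nodes
// each pass modified.
func runPasses(root *Node, passes []Optimization) map[string]int {
	modified := make(map[string]int)
	for _, p := range passes {
		var targets []*Node // step 2: collect nodes during the traversal
		walk(root, func(n *Node) {
			if p.Applicable(n) {
				targets = append(targets, n)
			}
		})
		for _, t := range targets {
			p.Apply(t)
			modified[p.Name()]++
		}
	}
	return modified
}

// retag is a stand-in pass that rewrites one node kind into another.
type retag struct{ name, from, to string }

func (r retag) Name() string            { return r.name }
func (r retag) Applicable(n *Node) bool { return n.Kind == r.from }
func (r retag) Apply(n *Node)           { n.Kind = r.to }

func main() {
	root := &Node{Kind: "Function", Children: []*Node{
		{Kind: "MemoryParam"}, {Kind: "MemoryParam"}, {Kind: "Block"},
	}}
	passes := []Optimization{retag{"calldata", "MemoryParam", "CalldataParam"}}
	fmt.Println(runPasses(root, passes))
}
```

Collecting the target nodes first and mutating them afterwards keeps the traversal from seeing a half-modified tree, and adding a new optimisation only requires another type that satisfies the interface.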
2.4. Pretty Printing
After applying the optimizations, we convert the Abstract Syntax Tree (AST) back into its Solidity code representation, a process known as Pretty Printing or Unparsing. Given the extensive grammar of the Solidity language, we have selectively implemented pretty printing for the most commonly used node types.
While we plan to extend support to the remaining node types in the future, our current implementation is sufficient for printing most Solidity contracts.
Below is a list of the node types currently supported for pretty printing:
2.5. Gas Estimator
Foundry
Gas estimation within our tool leverages Foundry, a comprehensive smart contract development framework. Foundry streamlines the development process by managing dependencies, compiling projects, executing tests, facilitating deployments, and enabling interaction with the blockchain through command-line tools and Solidity scripts. For our purposes, we primarily utilize Foundry’s functionalities for test execution and gas reporting, allowing for accurate gas usage estimation.
Usage
Leveraging Foundry’s testing capabilities, we established a Forge project to build and execute Solidity scripts for gas estimation. Unit tests within the Forge project provide valuable insights into gas usage through tracing functionality, enabling us to accurately estimate gas consumption for contracts.
3. Optimizations
3.1. Struct Packing
Storage packing reduces the number of necessary SLOAD (loading data) or SSTORE (storing data) operations, cutting the cost of accessing storage variables by half or more, especially when multiple values in the same storage slot are read or written at once.
For example code showing what Struct Packing looks like, find the code here, or a primer here.
Correctness
The Solidity compiler employs a space-saving strategy for storing a contract’s state variables. Each storage slot holds 32 bytes (256 bits) of data. The compiler packs multiple consecutive variables into one slot, provided their combined size does not exceed 32 bytes. This approach optimizes storage utilization, minimizing unnecessary space allocation.
According to the official Solidity compiler documentation:
- The packing process is primarily driven by variable size. Smaller state variables can be grouped within a single slot to fill the available space left by larger ones.
- The declared order of state variables plays a crucial role. Variables are packed sequentially from top to bottom, ensuring efficient use of storage space.
- Structs and arrays always start a new storage slot, but their internal elements adhere to the same packing principles, maximizing space utilization within each slot.
- The order of variables within structs becomes particularly relevant in the context of packing. When elements are commonly accessed or modified together, strategically arranging them can further enhance storage efficiency.
By reordering struct members so that members smaller than 32 bytes sit next to each other and can share a slot, we can save storage space on the Ethereum Virtual Machine (EVM).
Implementation
Following the above rules, we chose an offline optimal bin-packing algorithm over a greedy algorithm to decide how to pack the struct members. First, we move all struct members that are 32 bytes (a full slot) or larger to the top of the struct. We then use the algorithm to exhaustively search the problem space for an optimal arrangement of the remaining members, giving us a struct that uses the minimum number of slots. Finally, we manipulate the AST and reorder the members in the struct definition to match this layout.
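The following Go sketch shows one way such an exhaustive search could look. It is a simplified stand-in, not our actual implementation: it assumes every member is an elementary value type of 1 to 32 bytes, and the names `Member`, `slotsUsed`, and `packOptimal` are invented for this example. Sorting by size first places full-slot members at the top, and the branch-and-bound search then tries every assignment of members to slots.

```go
package main

import (
	"fmt"
	"sort"
)

const slotSize = 32 // bytes per EVM storage slot

// Member is a struct field with the size of its value type in bytes.
type Member struct {
	Name string
	Size int
}

// slotsUsed mimics sequential packing: a member joins the current slot if
// it still fits, otherwise it starts a new slot.
func slotsUsed(members []Member) int {
	slots, free := 0, 0
	for _, m := range members {
		if m.Size > free {
			slots++
			free = slotSize
		}
		free -= m.Size
	}
	return slots
}

// packOptimal exhaustively assigns members to slots (branch and bound) and
// returns an ordering that needs the minimum number of slots.
func packOptimal(members []Member) []Member {
	items := append([]Member(nil), members...)
	sort.SliceStable(items, func(i, j int) bool { return items[i].Size > items[j].Size })

	var best, bins [][]Member
	var free []int
	var search func(i int)
	search = func(i int) {
		if best != nil && len(bins) >= len(best) {
			return // this branch cannot beat the best layout found so far
		}
		if i == len(items) {
			best = make([][]Member, len(bins))
			for b := range bins {
				best[b] = append([]Member(nil), bins[b]...)
			}
			return
		}
		m := items[i]
		for b := range bins { // try every existing slot with enough room
			if free[b] >= m.Size {
				bins[b] = append(bins[b], m)
				free[b] -= m.Size
				search(i + 1)
				free[b] += m.Size
				bins[b] = bins[b][:len(bins[b])-1]
			}
		}
		bins = append(bins, []Member{m}) // or open a new slot
		free = append(free, slotSize-m.Size)
		search(i + 1)
		bins = bins[:len(bins)-1]
		free = free[:len(free)-1]
	}
	search(0)

	var order []Member
	for _, bin := range best {
		order = append(order, bin...)
	}
	return order
}

func main() {
	s := []Member{{"a", 16}, {"b", 32}, {"c", 8}, {"d", 20}, {"e", 4}, {"f", 1}}
	packed := packOptimal(s)
	fmt.Println("slots before:", slotsUsed(s), "after:", slotsUsed(packed))
	fmt.Println(packed)
}
```

For the sample struct, the declared order needs 4 slots while the reordered one needs 3, which is the minimum possible for 81 bytes of data.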
3.2. Storage Variable Caching
Why is it useful?
Caching a storage variable greatly reduces gas in functions that read a state variable (a variable whose data location is storage) more than once: reading it once, keeping the value in memory, and reusing that copy is more gas-efficient than reading it from storage multiple times.
Correctness
The primary opcodes relevant to caching data are SLOAD and MLOAD. These opcodes handle the loading of data from storage and memory, respectively.
MLOAD consistently incurs a cost of 3 gas, making it relatively inexpensive. On the other hand, SLOAD has a more complex cost structure: it costs 2100 gas to initially access a value during a transaction and only 100 gas for each subsequent access within the same transaction. This pricing structure indicates that accessing data from memory (MLOAD) is significantly cheaper than accessing data from storage (SLOAD), with a cost difference that can be over 97%.
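A small worked calculation makes the difference tangible. The Go sketch below uses the opcode costs quoted above plus 3 gas for the MSTORE that writes the cached copy. It is a deliberately simplified model: it ignores memory expansion and any other instructions around the reads, and the function names are made up for this example.

```go
package main

import "fmt"

// Gas costs per opcode, as quoted in the text (MSTORE is also 3 gas).
const (
	coldSload = 2100 // first access to a slot in a transaction
	warmSload = 100  // each later access to the same slot
	mload     = 3
	mstore    = 3
)

// uncachedCost is the gas spent reading one storage slot `reads` times.
func uncachedCost(reads int) int {
	if reads <= 0 {
		return 0
	}
	return coldSload + (reads-1)*warmSload
}

// cachedCost is the gas spent when the slot is read once, written to
// memory, and every use then reads the memory copy.
func cachedCost(reads int) int {
	if reads <= 0 {
		return 0
	}
	return coldSload + mstore + reads*mload
}

func main() {
	for _, n := range []int{1, 2, 5, 10} {
		fmt.Printf("reads=%d uncached=%d cached=%d\n", n, uncachedCost(n), cachedCost(n))
	}
}
```

Under this model a single read is slightly cheaper without caching (2100 versus 2106 gas), while five reads cost 2500 gas uncached versus 2118 gas cached, which is why caching only pays off for variables that are read more than once.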
The importance of optimizing data access by caching frequently accessed storage variables into memory is supported by several academic sources:
- Gas Analysis and Optimization for Ethereum Smart Contracts (GASOL Paper) discusses the impact of low-level instruction costs on the overall gas consumption of smart contracts. This paper presents methodologies for analyzing and reducing gas costs through strategic data handling and caching.
- Further details on the differential costs of memory versus storage operations and strategies for caching, including caching the length of arrays to avoid repeated SLOAD operations, can be found on page 8 of the IEEE paper available here.
- Additional insights into caching and its implications for gas optimization in Ethereum smart contracts are discussed in the GASPER study (link to GASPER).
Implementation
The key idea is that if a function reads a storage variable more than once, we declare a temporary local variable to hold the cached value. The optimization pass works as follows:
- It walks through the AST nodes, looking for function definitions.
- Within each function, it identifies storage variable declarations and usage.
- It maintains a count of how many times each storage variable is accessed within the function.
- For storage variables that are accessed more than once, it suggests caching them in a memory variable at the start of the function. This is done by generating a new variable declaration and assignment statement.
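The counting step can be sketched in Go as follows. This is a toy model rather than our actual pass: the `Function` type, the `cacheDeclarations` helper, and the `cached_` naming scheme are assumptions for illustration, and the sketch only considers reads. A real pass also has to account for writes to the variable, since a write makes the cached copy stale.

```go
package main

import "fmt"

// Function is a toy stand-in for a function definition: Reads lists the
// identifiers read in the body, in source order.
type Function struct {
	Name  string
	Reads []string
}

// cacheDeclarations returns one declaration per state variable that the
// function reads more than once, in order of first use. stateVars maps
// each state variable name to its Solidity type.
func cacheDeclarations(stateVars map[string]string, fn Function) []string {
	counts := make(map[string]int)
	var order []string
	for _, id := range fn.Reads {
		if _, isState := stateVars[id]; !isState {
			continue // local variable or parameter: nothing to cache
		}
		if counts[id] == 0 {
			order = append(order, id)
		}
		counts[id]++
	}
	var decls []string
	for _, id := range order {
		if counts[id] > 1 {
			// Hypothetical naming scheme for the cached copy.
			decls = append(decls, fmt.Sprintf("%s cached_%s = %s;", stateVars[id], id, id))
		}
	}
	return decls
}

func main() {
	stateVars := map[string]string{"total": "uint256", "owner": "address"}
	fn := Function{Name: "payout", Reads: []string{"total", "amount", "owner", "total", "total"}}
	for _, d := range cacheDeclarations(stateVars, fn) {
		fmt.Println(d)
	}
}
```

Here `total` is read three times and gets a cached declaration, `owner` is read once and is left alone, and `amount` is ignored because it is not a state variable.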
The full implementation can be found here.
3.3. Call Data Optimization
In Solidity smart contracts, optimizing the usage of calldata instead of memory for external function parameters can lead to significant gas savings. Calldata is a special data location that contains the function arguments passed in a transaction. It is read-only and cheaper to access compared to memory.
Gas Savings
Using calldata is generally cheaper than using memory for several reasons:
- Cost Efficiency: The Solidity documentation describes calldata as a special, read-only data location that holds the function’s arguments and otherwise behaves mostly like memory. It was originally available only for parameters of external functions; since Solidity 0.6.9 it can be used for parameters of any function. Reading from calldata (CALLDATALOAD) costs 3 gas per 32-byte word, and the data does not have to be copied into memory first, which makes it the cheaper choice whenever the data does not need to be modified.
- Memory Overhead: Each memory read or write (MLOAD, MSTORE) costs 3 gas, plus a memory expansion cost the first time a word is touched. The expansion cost grows quadratically with the amount of memory in use, so it becomes significant in functions that handle large amounts of data.
Examples of Calldata Efficiency Optimization
Example 1: Simple Parameter Handling
In the first example, using calldata allows the data to be read directly without the need for copying to memory, thus saving gas.
Example 2: Array Handling in Functions
In the second example, using calldata eliminates the overhead of copying array data from calldata to memory, reducing gas costs and computational overhead.
Correctness
The correctness of this optimization hinges on the guarantee that parameters are not modified within the function. This is verified by both manual analysis and the use of pure or view modifiers as follows:
- Manual Analysis: By inspecting each function, we ensure that memory parameters are not assigned new values or passed to other functions that might modify them.
- Compiler Guarantees: The Solidity compiler enforces the constraints of pure and view functions. If a function is pure, the compiler will throw an error if it attempts to read (or write) to the state. If a function is view, the compiler will throw an error if it attempts to write to the state. These guarantees, however, do not extend to modifications of memory arguments, which must be checked manually.
Implementation
- Identify External Functions: Review the contract to find external functions where parameters are declared as memory.
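- Check for Modifications: Analyze whether these memory parameters are modified within the function. Function modifiers like pure or view rule out state modification, but they do not rule out writes to memory parameters, so the function body must also be checked for assignments to each parameter and for calls that pass it on as a mutable memory reference.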
- Optimize Parameter Declaration: For functions where parameters are not modified, change the parameter declaration from memory to calldata to leverage gas savings.
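These three steps can be sketched in Go on a toy function model. The `Param` and `Function` types and the `toCalldata` helper are hypothetical names for this example; in particular, the `Mutated` set is assumed to have been filled in by a separate analysis of the function body, which is the harder part and is not shown here.

```go
package main

import "fmt"

// Param is a toy model of a function parameter.
type Param struct {
	Name     string
	Type     string
	Location string // "memory", "calldata", or "" for value types
}

// Function is a toy model of a function definition.
type Function struct {
	Name       string
	Visibility string // "external", "public", "internal" or "private"
	Params     []Param
	// Mutated holds the names of parameters that the body writes to or
	// passes on as a mutable memory reference (assumed precomputed).
	Mutated map[string]bool
}

// toCalldata rewrites eligible parameters in place and returns how many
// declarations were changed.
func toCalldata(fn *Function) int {
	if fn.Visibility != "external" { // step 1: only external functions
		return 0
	}
	changed := 0
	for i := range fn.Params {
		p := &fn.Params[i]
		// Step 2: skip parameters that are modified in the body.
		if p.Location == "memory" && !fn.Mutated[p.Name] {
			p.Location = "calldata" // step 3: rewrite the declaration
			changed++
		}
	}
	return changed
}

func main() {
	fn := &Function{
		Name:       "sum",
		Visibility: "external",
		Params: []Param{
			{Name: "values", Type: "uint256[]", Location: "memory"},
			{Name: "scratch", Type: "uint256[]", Location: "memory"},
			{Name: "limit", Type: "uint256"},
		},
		Mutated: map[string]bool{"scratch": true},
	}
	fmt.Println("changed:", toCalldata(fn))
	for _, p := range fn.Params {
		fmt.Println(p.Type, p.Location, p.Name)
	}
}
```

In the sample, `values` becomes a calldata parameter, `scratch` stays in memory because the body modifies it, and `limit` is untouched because value types have no data location.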
Usefulness
Utilizing calldata for external function parameters that do not require modification can significantly reduce the gas costs associated with calling these functions. Since calldata is a non-modifiable, temporary location where the input data of the function resides, it avoids the gas costs involved with copying data to memory.
4. Impact and Next Steps
4.1. Open Source Contribution
Our project hinges on the solgo library, a powerful Solidity static analyzer. This library facilitates tokenization, parsing, and our ability to traverse and manipulate the AST, enabling our tool to optimize code. During development, we came to understand the codebase of the library well enough to make contributions of our own, and we are now working closely with the library author on making improvements to solgo.
4.1.1. Pull request
Notably, during development, we identified a gap in solgo’s functionality — the absence of an AST-to-source code printing function. To address this, we implemented the missing functionality and contributed it back to the project as a pull request.
4.1.2. Issues
Our implementation of the AST printing function yielded a valuable secondary benefit. This process revealed several critical bugs within the solgo library that had remained undetected despite prior unit and integration testing. By surfacing these parsing errors through printing the AST, we were able to contribute significantly to the library’s improvement.
4.2. Current Landscape of Gas Optimizers
Our investigation into existing Solidity gas optimizers revealed a surprising lack of open-source solutions in this domain. While research papers exploring gas optimization techniques are readily available, a practical, open-source tool for automated gas optimization remains elusive.
This gap in the landscape motivated the development of our project and we believe our project represents the first open-source offering dedicated to automated Solidity gas optimization. Through this project, we aim to empower developers with a valuable tool to streamline contract efficiency and enhance their workflow.
4.3. Next Steps
4.3.1. In Progress
Our efforts currently center on completing the integration of our ‘pretty printing’ pull request within the solgo library. Additionally, we’re exploring the implementation of state variable packing, a functionality similar to our existing struct variable packing optimization. Our project’s modular architecture facilitates the seamless integration of new optimizations, making this a key focus for ongoing development. Finally, we’re committed to enhancing the user experience by improving the website’s user-friendliness and incorporating features like tooltips.
4.3.2. Near Future
Following further refinement to ensure a robust user experience and improved website security, we plan to launch our web application for public beta testing. This initiative will enable us to gather valuable feedback from the Solidity development community and contribute a practical tool for optimizing smart contract gas usage.
In conjunction with the public beta launch, we plan to actively engage with the Solidity development community through targeted outreach efforts on platforms like LinkedIn and Twitter. This multi-pronged approach will facilitate the collection of valuable user feedback while simultaneously raising awareness of our project within the target audience.
4.3.3. Future Plans
Our vision extends beyond the current offering. We envision the tool evolving into a versatile VS Code extension. Similar to code formatters, this extension would process Solidity source code and either provide an optimized version for review or simply overwrite the code in place. Additionally, we are exploring potential integration with existing static analyzer libraries. This collaboration could unlock valuable synergies and broaden the tool’s functionality within the Solidity development ecosystem.
5. Conclusion
Developing the Solidity Gas Optimizer presented a unique challenge. The scarcity of open-source tools and limited references required a combination of creativity and resourcefulness. Overcoming these hurdles was immensely rewarding, underscoring the value of open-source contributions and community engagement.
By immersing ourselves in the Solidity developer community, we not only provided a practical tool that serves the community but also leveraged the diverse skill set we have honed throughout our software engineering journey. This project was a significant learning experience, pushing us to adapt and innovate, ultimately leading to the creation of a tool that empowers developers to optimize their smart contracts.