Lessons Learned From Fuzzing Centrifuge Protocol Part 2
Separating signal from noise in broken properties
Introduction
In the first post in this series we looked at innovative methods that the Recon team implemented to achieve high code coverage in a fuzzing suite built for the Centrifuge Protocol during our engagement with them.
In this post we’ll be looking at how during the second part of the engagement the Recon team implemented safe-by-design thinking and clamping techniques to reduce the number of false positive property breaks after long duration runs of the Echidna fuzzer.
Reducing False Positives
During the second half of the engagement the goal wasn’t to improve the line coverage of the fuzzer over the system, as this was already achieved in the first half, but instead to separate signal from noise in broken property tests.
The long-duration fuzz run executed on the properties resulted in 15 of them breaking, with each having to be parsed through and its breaking source identified as a real bug or false-positive that isn’t actually reachable (because of calls to privileged functions, misconfiguration, etc.).
For easier sorting of the broken properties Recon’s scrapper tool was used to convert each broken property into a standalone unit test.
Initially much effort was spent in debugging broken properties one-by-one in an attempt to isolate the root cause of each break. However, this proved unfruitful and time consuming as many of the properties were failing for the same reasons indicating a greater issue with the test suite itself.
More code != effective code
Ultimately the solution to resolving the issues related to the numerous broken properties lay not in introducing more code, but in achieving a better understanding of the system. This required taking time to manually review the code and using insights related to the system’s functioning to make the improvements discussed below that ultimately led to a more efficient and effective test suite.
These insights were used to simplify the test suite by implementing a safer design and using clamping to reduce the search space of the fuzzer to more realistic values. Part of the resulting suite was also refactored during this half of the engagement for greater reusability as a generic library for use with implementations of the ERC7540 standard.
Safe-By-Design
The first optimization made with this extra understanding was used to recognize that the handleExecutedCollectInvest
function acts as an asynchronous callback to the requestDeposit
function (used to provide liquidity into the system). In other words, the requestDeposit
function is called, then after some action is executed on the Centrifuge chain it calls back into the system via handleExecutedCollectInvest
.
Below we can see a visual representation of this call sequence:
It was noted that the handleExecutedCollectInvest
function receives an input value of remainingInvestOrder
which defines how much of a user’s order is left after it has been filled (partial filling of orders is possible due to how Centrifuge fulfills orders).
This was then identified as the primary cause of many false positive broken properties, because the fuzzer which executed calls to handleExecutedCollectInvest
using a privileged role via the CryticTester
contract passed in random values for the remainingInvestOrder
that would exceed what had been initially requested.
Normally this wouldn’t be possible as the execution on the Centrifuge chain would not return a value for remainingInvestOrder
that’s larger than the order itself. However in the implementation this wasn’t strictly prohibited and the fuzzed target function didn’t apply any input validation that normally occurs as a result of the operation on the Centrifuge chain, implying that the system is safe as long as the Centrifuge chain always returns valid values.
With the recognition that these properties were breaking due to errors in the test suite configuration we needed a way to filter them out to make identifying the signal (valid broken properties) easier. Typically this could be done using clamping to restrict input values to the called function, however with the recognition that the system should never be able to process values where more of an investment than was requested is filled, we can use this to refactor the logic of the system to reflect this and ensure that any calls made to handleExecutedCollectInvest
(even though they can only be made by privileged users) do not violate this. This refactoring would therefore create a system that is safe-by-design.
This was implemented by changing the remainingInvestOrder
variable (representing the amount left of the order to be fulfilled) to fulfilledInvestOrder
(representing how much of the order has been fulfilled) and used to set the pendingDepositRequest
so that it would revert due to underflow if a value greater than the requested deposit amount was passed in.
This refactored implementation also had to take into account that Centrifuge chain’s order filling mechanism allows orders to be filled in a different order than the one in which they were created. We therefore needed to cap the amount of possible investment fulfilled, so that if the fuzzer attempts to use input values that exceed this investment cap it will revert and not break a property that checks whether the requested amount can be surpassed.
We can see the resulting change to the implementation and how it compares to the original implementation here:
This reduces use of clamping that must be implemented and allows the fuzz runs to execute more efficiently with less wasted computation.
Clamping
Clamping is a common method for reducing the number of possible input values used by the fuzzer in a function call.
There are generally two scenarios where clamping is a useful strategy in fuzzing. The first is to reduce the number of reverts due to things like arithmetic errors that prevent the fuzzer from achieving meaningful line coverage of the targeted contracts. The second ensures that the fuzzer doesn’t explore input values that would cause property breaks due to privileged calls.
In this engagement we were concerned with the second case in reducing the number of calls to callback handler functions by the Centrifuge chain that would never actually be passed in in practice (in our case they only occur because the Centrifuge chain calls were fuzzed after recognizing that fuzzing the centrifuge system added unnecessary complexity).
The changes explained in the Safe-By-Design section helped reduce the need for this type of clamping, but without further changes to the test suite setup some values still needed to be restricted to ensure only realistic scenarios were explored.
Reducing Bias
In any given system if we don’t want to explore paths with the fuzzer that revert for obvious reasons, we need to ensure that we still allow the fuzzer to explore all possible paths that a user could execute by not clamp input values too strictly as it may result in biasing the fuzzer.
This is why a good approach can often be to use state variable values from the system to define clamping boundaries, which increases the likelihood of the fuzzer exploring all valid paths.
The function handleExecutedCollectRedeem
serves as a callback to redemption requests made by users to the system to redeem their tranche tokens (representing their deposited liquidity in the pool) for the underlying asset. Assuming that the execution is performed correctly on the Centrifuge chain we can therefore assume that the amount of trancheTokenPayout
that handleExecutedCollectRedeem
receives as input can’t be greater than the requested amount. We can therefore apply clamping to the function call in our LiquidityPoolCallbacks
target contract by dividing the trancheTokenPayout
value modulo the pendingRedeemRequest
.
The above clamping however doesn’t take into consideration multiple redemption requests made at different prices of the tranche token which might more realistically simulate real-world behavior of the system. This is a potential source of bias in the applied clamping and may leave certain behaviors uncovered, however for this engagement this risk was acknowledged but not factored in due to time constraints.
Reusable Property Library
One of the stretch goals for this engagement that was planned for if time permitted was the implementation of a reusable property library for the ERC7540 vault specification created by the Centrifuge team. This would allow anyone implementing an ERC7540 vault to easily test the properties that should hold on it using functions from this library.
The difficulty in implementing such a library during this engagement was due to the use of ghost variables in the defined properties. This would require protocols implementing the ERC7540 properties to implement the same ghost variables to track changes to state in their system.
Therefore, to make the library more reusable, the target contract on which the properties were to be tested was made into a value passed into the properties which allows accessing the contract’s state variables directly using its interface:
In terms of reusability, in the initial implementation used during the engagement, the properties defined had many that were applicable to implementations of the ERC7540 specification as a whole, while others were specific to the implementation developed by Centrifuge for their use case.
The solution to this was to create a generic library (ERC7540Properties
) and any implementation specific checks were performed in a Centrifuge-specific properties contract (ERC7540CentrifugeProperties
) which overrides the generic properties in ERC7540Properties
and performs the centrifuge specific checks in calls to a _centrifugeSpecificPreChecks
function:
This is then called via a boolean property target function which is what the fuzzer actually calls:
The result: real broken properties
The above simplifications allowed the test suite to be run again for a long-duration job, uncovering properties that were broken because of actual incorrect system behavior instead of their implementation in the test suite (shown in this gist).
Trophy 🏆
One such real property break that was discovered after making the above changes was related to the maxRedeem
variable. In the implementation, the max amount received after a call to deposit
was rounded down in order to prevent scenarios where value can slowly be drained from the system if the rounding occurs in favor of the user instead of the protocol. This also restricted the maximum amount that a user could deposit into the system but the calculation for the maximum amount was performed using the rounded down amount, therefore allowing the user to subvert the functionality of rounding down by depositing through multiple transactions.
Conclusion
We’ve seen how the techniques defined above allowed the Recon team to separate signal from noise in the broken properties that were identified by the fuzzer.
This highlights the importance of not only focusing on achieving high line coverage of the fuzzer over the entire system, but also aiming for coverage that creates meaningful results with proper setup. Further, the ability to process the results of any broken properties to understand if they are due to incorrect operation of the system or indicative of deeper issues in the test suite itself is key to the entire invariant testing process.