Arbiter

Arbiter is a framework for stateful Ethereum smart-contract simulation. The framework features an ethers-rs middleware built on top of revm which allows the end user to interact with a sandboxed revm instance as if it were an Ethereum node. This provides a familiar interface for interacting with the Ethereum Virtual Machine (EVM), but with unrivaled speed. Furthermore, Arbiter provides containment and management for simulations. For a running list of vulnerabilities found with Arbiter, please see the Vulnerability Corpus.

Overview

The Arbiter workspace has three crates:

  • arbiter: The binary crate that exposes a command line interface for initializing simulations via a templated repository and generating contract bindings needed for the simulation.
  • arbiter-core: The lib crate that contains the core logic for the Arbiter framework including the RevmMiddleware discussed before, the Environment which envelopes simulations, and the Manager who controls a collection of environments.
  • arbiter-engine: The lib crate that provides abstractions for building simulations and more.

The purpose of Arbiter is to provide a toolset to construct arbitrary agents (defined in Rust, by smart contracts, or even other foreign function interfaces) and have these agents interact with an Ethereum-like environment of your design. All contract bytecode is run directly using a blazing-fast EVM instance revm (which is used in live RPC nodes such as reth) so that your contracts are tested in the exact same type of environment that they are deployed in.

Motivation

Smart contract engineers need to test their contracts against a wide array of potentially adversarial environments and contract parameters. The static stateless testing of contracts can only take you so far. To truly test the security of a contract, you need to test it against a wide array of dynamic environments that encompass the externalities of Ethereum mainnet. We wanted to do just that with Arbiter.

Both smart contract and financial engineers come together in Decentralized Finance (DeFi) to build and deploy a wide array of complex decentralized applications as well as financial strategies respectively. For the latter, a financial engineer may want to test their strategies against thousands of market conditions, contract settings, shocks, and autonomous or random or even AI agents all while making sure their strategy isn't vulnerable to bytecode-level exploits.

It is also possible to configure such a rich simulation environment on a test or local network with Arbiter by changing the choice of middleware. The most efficient choice for getting robust, yet quick, simulations would bypass any networking and use a low-level language's implementation of the EVM. Furthermore, we can gain control over the EVM worldstate by working directly on revm. We would like the user to have a choice in how they want to simulate their contracts and Arbiter provides that choice.

Sim Driven Development and Strategization

Test driven development is a popular engineering practice to write tests first, which fail, and implement logic to get the test to eventually pass. With simulation driven development, it's possible to build "tests" that can only pass if the incentives actually work. For example, a sim driven test might be is_loan_liquidated, and a simulation must be made for a liquidator agent to do the liquidation. This approach significantly improves the testing of economic systems and other mechanism designs, which is important in the world of networks that are mostly incentive driven.
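A sim driven test such as is_loan_liquidated can be sketched without any EVM at all. The following is a minimal illustration using only the standard library; the Loan and Liquidator types, the 1.5 collateral ratio, and the bounty/cost parameters are all hypothetical and are not part of Arbiter. The point is that the "test" passes only when the incentive (bounty greater than cost) actually makes the liquidator act.

```rust
/// A toy collateralized loan (hypothetical; not part of Arbiter).
#[derive(Debug, Clone)]
pub struct Loan {
    pub collateral: f64, // units of the collateral asset
    pub debt: f64,       // units of the quote asset
    pub liquidated: bool,
}

impl Loan {
    /// The loan is unhealthy once the collateral value falls below
    /// `min_ratio` times the outstanding debt.
    pub fn is_unhealthy(&self, price: f64, min_ratio: f64) -> bool {
        self.collateral * price < self.debt * min_ratio
    }
}

/// A liquidator agent that only acts when the bounty covers its cost.
pub struct Liquidator {
    pub cost: f64,
}

impl Liquidator {
    pub fn step(&self, loan: &mut Loan, price: f64, min_ratio: f64, bounty: f64) {
        if !loan.liquidated && loan.is_unhealthy(price, min_ratio) && bounty > self.cost {
            loan.liquidated = true;
        }
    }
}

/// Runs the liquidator over a price path and reports whether the loan
/// ended up liquidated. This is the "sim driven test".
pub fn is_loan_liquidated(prices: &[f64], bounty: f64, cost: f64) -> bool {
    let mut loan = Loan { collateral: 1.0, debt: 1000.0, liquidated: false };
    let liquidator = Liquidator { cost };
    for &price in prices {
        liquidator.step(&mut loan, price, 1.5, bounty);
    }
    loan.liquidated
}
```

With a price path that crashes below the liquidation threshold, the check succeeds only if the bounty exceeds the liquidator's cost, which is exactly the kind of incentive failure a unit test would not surface.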

The same goes with developing strategies that one would like to deploy on a live Ethereum network. One can use Arbiter to simulate their strategy with an intended goal and see if it actually works. This is especially important in the world of DeFi where strategies are often a mix of on and offchain and are susceptible to exploits.

Anomaly Detection

Anomaly detection in software design systems refers to identifying unusual patterns or behaviors that deviate from the expected or normal functioning of the software. These anomalies can be due to various reasons, such as bugs, performance issues, security vulnerabilities, or design flaws. Arbiter's agent-based modeling and EVM execution parity make it well suited for anomaly detection of greater systemic risk in the Ethereum ecosystem.

In the context of software design, anomaly detection can be used to identify design flaws or inconsistencies in the design of the software. For example, if a particular module or component of the software behaves differently than it was intended, it could indicate a design flaw or security vulnerability.

Agent-Based Modeling

Agent-based simulations for anomaly detection systems involve creating a model of the system using agents, where each agent represents a component or a module of the system. These agents interact with each other and their environment, mimicking the behavior of the actual system. Agent-based simulations can be a powerful tool for anomaly detection as they can model complex systems and their interactions, making it possible to detect anomalies that other methods might miss. However, they also require a good understanding of the system being modeled and what constitutes normal behavior for that system.

Modeling the System

The first and most crucial step is to model the system. A well-modeled system accurately reflects the real-world behavior of the software or system under study. This ensures that the simulation provides meaningful and applicable results. We built the RevmMiddleware to accurately model how users/agents or externally owned accounts (EOAs) interact with the EVM. This means the RevmMiddleware implements the Middleware trait from the Rust Ethereum ecosystem, using the same API that EOAs would use to talk to a node today. This is why having EVM execution parity is so important.

Statistical Methods:

These methods model the system's normal behavior using statistical models and then use these models to detect deviations. Simple summary statistics such as the mean, median, and standard deviation can be used, as can more complex models such as regression. For example, the Poisson distribution gives the probability of an event happening a certain number of times (k) within a given interval of time or space. So, if you can quantify the average number of occurrences of some action (say, to model the behavior of a retail agent or network congestion from certain events), you can model it well with the Poisson distribution.
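As a concrete illustration of the Poisson model, the probability of exactly k events with mean rate lambda is P(k) = lambda^k * exp(-lambda) / k!. A minimal standard-library sketch follows; the function name poisson_pmf is our own and not an Arbiter API.

```rust
/// Probability of observing exactly `k` events in an interval when the
/// mean number of events per interval is `lambda`.
pub fn poisson_pmf(k: u32, lambda: f64) -> f64 {
    // Start from exp(-lambda) and multiply in lambda / i for i = 1..=k,
    // which builds lambda^k / k! without computing a large factorial.
    let mut term = (-lambda).exp();
    for i in 1..=k {
        term *= lambda / i as f64;
    }
    term
}

/// Probability of observing at most `k` events in an interval.
pub fn poisson_cdf(k: u32, lambda: f64) -> f64 {
    (0..=k).map(|i| poisson_pmf(i, lambda)).sum()
}
```

For instance, if a retail agent swaps on average twice per block, poisson_pmf(0, 2.0) gives the chance that a block contains no swaps from that agent.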

Defining Normal Behavior: Agent design

Once the system is modeled, the next step is to define what constitutes normal behavior for the system. This could be based on historical data, expert knowledge, or both. This is not a feature of Arbiter yet (the arbiter-engine crate is a WIP but contains some of our initial work on this). Agent behavior can be incredibly simple (passive behavior) or complex (interactive behavior), but the better the agents model the system, the better the results. For example, you can model LPs as more passive agents that deposit and withdraw liquidity based on some average number of occurrences. In contrast, arbitrageurs can be modeled as more interactive agents that react to certain events or SLOADs on specific storage locations. As the agents start to resemble real-world actors, the results will be more accurate, and the data will be more beneficial for the system designers.

Simulating the System

The system is then simulated over some time. During this simulation, the agents interact with each other and their environment, generating data that reflects the system's behavior. You can decide on specific parameters and configurations for the system. Designing the simulation to be as close to the real-world system as possible is recommended. For example, we can model a sequence of prices for arbitrageurs either from historical data or with price processes. The speed and performance of the simulation make it possible to get more data by doing the latter.
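One common choice of price process is geometric Brownian motion (GBM). The sketch below is an assumption on our part (the text above does not name a specific process) and uses a small hand-rolled generator so that it needs only the standard library; Lcg and gbm_path are hypothetical names, not Arbiter APIs.

```rust
/// Tiny deterministic linear congruential generator, used here only so
/// the sketch needs no external crates. Not suitable for real studies.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    /// Uniform sample strictly inside the open interval (0, 1).
    pub fn next_uniform(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 52 bits; the 0.5 offset avoids returning exactly 0.
        ((self.0 >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    pub fn next_normal(&mut self) -> f64 {
        let (u1, u2) = (self.next_uniform(), self.next_uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// GBM path: S_{t+dt} = S_t * exp((mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z).
pub fn gbm_path(s0: f64, mu: f64, sigma: f64, dt: f64, steps: usize, rng: &mut Lcg) -> Vec<f64> {
    let mut path = Vec::with_capacity(steps + 1);
    let mut s = s0;
    path.push(s);
    for _ in 0..steps {
        let z = rng.next_normal();
        s *= ((mu - 0.5 * sigma * sigma) * dt + sigma * dt.sqrt() * z).exp();
        path.push(s);
    }
    path
}
```

Each generated path can then be fed to an agent (for example, one that updates a reference price each block) so that arbitrageurs have something to react to. Seeding the generator makes runs reproducible.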

Detecting Anomalies

The data generated by the simulation is then analyzed to detect anomalies. This could be done using various statistical methods, machine learning, or rule-based methods. Anomalies are identified as deviations from the defined normal behavior.
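As one simple example of a statistical check on simulation output, observations can be flagged when their z-score exceeds a threshold. This is a generic sketch and not an Arbiter feature; the function name and the choice of the population standard deviation are our own assumptions.

```rust
/// Returns the indices of observations whose absolute z-score exceeds
/// `threshold`. Uses the population standard deviation of `data`.
pub fn zscore_anomalies(data: &[f64], threshold: f64) -> Vec<usize> {
    if data.len() < 2 {
        return Vec::new();
    }
    let n = data.len() as f64;
    let mean = data.iter().sum::<f64>() / n;
    let variance = data.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    let std_dev = variance.sqrt();
    if std_dev == 0.0 {
        // All observations are identical, so nothing deviates.
        return Vec::new();
    }
    data.iter()
        .enumerate()
        .filter(|(_, x)| ((**x - mean) / std_dev).abs() > threshold)
        .map(|(i, _)| i)
        .collect()
}
```

Applied to, say, the gas used per block in a simulation run, the returned indices point at the blocks worth inspecting by hand.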

Machine Learning: Machine Learning techniques can be used to learn the system's normal behavior and then detect anomalies.

Rule-Based Methods: These methods define rules that describe the system's normal behavior. Any behavior that does not conform to these rules is considered an anomaly.
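A rule-based check can be sketched as a list of named predicates evaluated against every recorded state. The PoolState fields and the two rules below are purely illustrative assumptions; real rules would come from the protocol's own invariants.

```rust
/// Snapshot of a toy pool's state (hypothetical fields).
#[derive(Debug, Clone, Copy)]
pub struct PoolState {
    pub reserve_x: f64,
    pub reserve_y: f64,
    pub total_shares: f64,
}

/// A named rule that must hold for every snapshot.
pub struct Rule {
    pub name: &'static str,
    pub check: fn(&PoolState) -> bool,
}

/// Two example rules describing "normal" behavior for the toy pool.
pub fn default_rules() -> Vec<Rule> {
    vec![
        Rule {
            name: "reserves are non-negative",
            check: |s: &PoolState| s.reserve_x >= 0.0 && s.reserve_y >= 0.0,
        },
        Rule {
            name: "shares are backed by reserves",
            check: |s: &PoolState| s.total_shares == 0.0 || s.reserve_x * s.reserve_y > 0.0,
        },
    ]
}

/// Returns (step, rule name) for every rule broken anywhere in `history`.
pub fn violations(rules: &[Rule], history: &[PoolState]) -> Vec<(usize, &'static str)> {
    let mut broken = Vec::new();
    for (step, state) in history.iter().enumerate() {
        for rule in rules {
            if !(rule.check)(state) {
                broken.push((step, rule.name));
            }
        }
    }
    broken
}
```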

Time Series Analysis: In systems where data is collected over time, time series analysis can be used to detect anomalies. This involves looking for patterns or trends in the data over time and identifying any deviations from these patterns.

Log Analysis: Many software systems generate logs that record the system's activity. Analyzing these logs can help detect anomalies. This can be done manually or using automated tools.

Evaluating and Refining the Model: The detected anomalies are evaluated to determine if they are true anomalies or false positives. The model is refined based on these evaluations to improve its accuracy in detecting abnormalities.

Using Insights to Refine the System

Insights gained from the system can be invaluable in refining and improving it. By understanding the anomalies and their causes, we can make necessary adjustments to the system's design or operation. This could involve modifying the system's parameters, updating the agent's behaviors, or even redesigning certain aspects of the system.

However, it's essential to be cautious about overfitting the data. Overfitting occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. An overfitted model has poor predictive performance, as it overreacts to minor fluctuations in the training data.

Developer Documentation

To see the documentation for the Arbiter crates, please visit the following:

You will also find each of these on crates.io.

Getting Started

To use Arbiter, you can use the Arbiter CLI to help you manage your projects or, if you feel you don't need any of the CLI features, you are free to use the arbiter-core, arbiter-engine, and arbiter-bindings crates directly. You can find more information about these crates in the Usage section. The crates (aside from arbiter-engine at the moment) are linked to their crates.io pages so you can add them to your project by:

[dependencies]
arbiter-core = "*" # You can specify a version here if you'd like
arbiter-bindings = "*" # You can specify a version here if you'd like 
arbiter-engine = "*" # You can specify a version here if you'd like

Auditing

The current state of software auditing in the EVM is rapidly evolving. Competitive salaries are attracting top talent to firms like Spearbit, ChainSecurity, and Trail of Bits, while open security bounties and competitions like Code Arena are drawing in the best and brightest from around the world. Moreover, the rise of decentralized finance and the value at stake in these EVM-oriented systems have also caught the attention of a collection of black hats.

As competition in auditing intensifies, auditors will likely need to specialize to stay competitive. With its ability to model the EVM with a high degree of granularity, Arbiter is well-positioned to be leveraged by auditors to develop their tooling and methodologies and stay ahead of the curve.

One such methodology is domain-specific fuzzing. Fuzzing is a testing technique that provides invalid, unexpected, or random data as input to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks. Domain-specific fuzzing in the context of EVM system design involves modeling "normal" system behavior with agents and then playing with different parameters of the system to expose system fragility.
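To make the idea concrete, here is a toy domain-specific fuzzer written against a hypothetical integer constant-product pool rather than real EVM bytecode. Everything in it (Pool, swap, the round_up flag that models a rounding bug, and the trade-size range) is an assumption made for illustration. It varies the trade size and reports the first input for which the product x * y decreases.

```rust
/// Toy constant-product pool with integer reserves (hypothetical).
#[derive(Debug, Clone, Copy)]
pub struct Pool {
    pub x: u128,
    pub y: u128,
}

/// Swaps `dx` of X for Y with a fee in basis points and returns the
/// amount of Y paid out. `round_up` deliberately models a buggy
/// implementation that rounds the output in the trader's favor.
pub fn swap(pool: &mut Pool, dx: u128, fee_bps: u128, round_up: bool) -> u128 {
    let dx_after_fee = dx * (10_000 - fee_bps) / 10_000;
    let numerator = pool.y * dx_after_fee;
    let denominator = pool.x + dx_after_fee;
    let dy = if round_up {
        (numerator + denominator - 1) / denominator
    } else {
        numerator / denominator
    };
    pool.x += dx;
    pool.y -= dy;
    dy
}

/// Tries `trials` pseudo-random trade sizes in 1..=1000 against a fresh
/// pool and returns the first size that makes x * y decrease, if any.
pub fn fuzz_invariant(seed: u64, trials: usize, fee_bps: u128, round_up: bool) -> Option<u128> {
    let mut state = seed;
    for _ in 0..trials {
        // Simple linear congruential step; enough for a sketch.
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let dx = ((state >> 33) % 1_000) as u128 + 1;
        let mut pool = Pool { x: 1_000_000, y: 1_000_000 };
        let k_before = pool.x * pool.y;
        swap(&mut pool, dx, fee_bps, round_up);
        if pool.x * pool.y < k_before {
            return Some(dx);
        }
    }
    None
}
```

In this toy setup the rounding bug is only exposed when the fee parameter is zero, since a nonzero fee can mask it. That is the kind of parameter-dependent fragility that sweeping system parameters is meant to reveal.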

With its high degree of EVM modeling granularity, Arbiter is well-suited to support and enable domain-specific fuzzing. It can accurately simulate the behavior of the EVM under a wide range of conditions and inputs, providing auditors with a powerful tool for identifying and addressing potential vulnerabilities. Moreover, Arbiter is designed to be highly performant and fast, allowing for efficient and timely auditing processes. This speed and performance make it an even more valuable tool in the rapidly evolving world of software auditing.

Examples

We have a few examples to help you get started with Arbiter. These examples are designed to be simple and easy to understand. They are also designed to be easy to run and modify. We hope you find them helpful!

Our examples are in the examples directory. There are two examples: one for building a simulation and one for forking the mainnet state.

Simulation

You can run the simulation example with the following command:

cargo run --example project simulate examples/project/configs/example.toml

This will run the minimal counter simulation. The simulation is very minimal and is designed to be easy to understand. It uses an arbiter main macro to derive the incrementer behavior for a single agent. Our design philosophy is that the users of Arbiter should only need to define behaviors and a configuration toml for the behaviors. You can see how the behaviors were represented in this simulation in the behaviors module. We implement a single behavior for the incrementer struct that deploys the counter on startup and then increments the count on each increment event.

For more information on the behavior trait, please see the section on behaviors.

Forking

You can run the fork example with the following command:

arbiter fork examples/fork/weth_config.toml

This will fork the state specified in the weth_config.toml file. If you would like to fork a different state, you can modify the weth_config.toml file to include additional EOAs or contract storage. Once you have forked the state you want, you can start your simulation with the forked state by loading it into an in-memory revm instance like so:

use arbiter_core::{database::fork::Fork, environment::Environment, middleware::ArbiterMiddleware};

let fork = Fork::from_disk("tests/fork.json").unwrap();

// Get the environment going
let environment = Environment::builder().with_db(fork.db).build();

// Create a client
let client = ArbiterMiddleware::new(&environment, Some("name")).unwrap();

Software Architecture

Arbiter is broken into a number of crates that provide different levels of abstraction for interacting with the Ethereum Virtual Machine (EVM) sandbox.

Arbiter Core

The arbiter-core crate is the core of Arbiter. It contains the Environment struct which acts as an EVM sandbox and the RevmMiddleware which gives a convenient interface for interacting with contracts deployed into the Environment. Direct usage of arbiter-core will be minimized as much as possible, as developers are intended to mostly pull from the arbiter-engine crate in the future. This crate provides the interface for agents to interact with an in-memory EVM.

Arbiter Engine

The arbiter-engine crate is the main interface for running simulations. It is built on top of arbiter-core and provides a more ergonomic interface for designing agents and running them in simulations.

Arbiter CLI (under construction)

The Arbiter CLI is a minimal interface for managing your Arbiter projects. It is built on top of Foundry and aims to provide a similar CLI interface for setting up and interacting with Arbiter projects.

Arbiter Core

The arbiter-core crate is the core of the Arbiter framework. It contains the Environment struct which acts as an EVM sandbox and the RevmMiddleware which gives a convenient interface for interacting with contracts deployed into the Environment. The API provided by RevmMiddleware is that of the Middleware trait in the ethers-rs crate, therefore it looks and feels just like you're interacting with a live network when you work with an Arbiter Environment. The only notable differences are in the control you have over this Environment compared to something like Anvil, a testnet, or a live network.

Environment

The Environment owns a revm instance for processing EVM bytecode. To make the Environment performant and flexible, it runs on its own system thread and receives all communication via Instructions sent to it via a Sender<Instruction>. The Socket is a struct owned by the Environment that manages all inward and outward communication with the Environment's clients, such as the Instruction channel.
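The threading model described above can be sketched with the standard library alone. The following is a simplified stand-in and not the arbiter-core implementation: the Instruction variants, the u64 addresses, and spawn_environment are hypothetical, and a HashMap of balances takes the place of the revm world state. Each instruction carries a sender on which the environment thread reports the outcome.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical, heavily simplified instruction set. Each request carries
/// the sender on which the environment reports its outcome.
#[derive(Debug)]
pub enum Instruction {
    AddAccount { address: u64, outcome: Sender<Result<(), String>> },
    Deal { address: u64, amount: u128, outcome: Sender<Result<(), String>> },
    Balance { address: u64, outcome: Sender<Option<u128>> },
    Stop,
}

/// Spawns the toy environment on its own thread and returns the sender
/// used to pass it instructions, plus the thread's join handle.
pub fn spawn_environment() -> (Sender<Instruction>, JoinHandle<()>) {
    let (tx, rx) = channel::<Instruction>();
    let handle = thread::spawn(move || {
        // The world state is owned by this thread alone, so no locks are needed.
        let mut balances: HashMap<u64, u128> = HashMap::new();
        for instruction in rx {
            match instruction {
                Instruction::AddAccount { address, outcome } => {
                    let result = if balances.contains_key(&address) {
                        Err(format!("account {address} already exists"))
                    } else {
                        balances.insert(address, 0);
                        Ok(())
                    };
                    // A send error only means the client has gone away.
                    let _ = outcome.send(result);
                }
                Instruction::Deal { address, amount, outcome } => {
                    let result = match balances.get_mut(&address) {
                        Some(balance) => {
                            *balance = amount; // set the raw balance
                            Ok(())
                        }
                        None => Err(format!("account {address} does not exist")),
                    };
                    let _ = outcome.send(result);
                }
                Instruction::Balance { address, outcome } => {
                    let _ = outcome.send(balances.get(&address).copied());
                }
                Instruction::Stop => break,
            }
        }
    });
    (tx, handle)
}
```

Because all state lives on one thread and clients only hold a Sender, any number of clients can be cloned and handed to agents without shared-memory synchronization.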

Usage

To create an Environment, we use a builder pattern that allows you to pre-load an Environment with your own database. We can do the following to create a default Environment:

use arbiter_core::environment::Environment;

fn main() {
    let env = Environment::builder().build();
}

Note that the call to .build() will start the Environment's thread and begin processing Instructions.

Inspector Configuration

The Environment also supports the ability to inspect the revm instance's state at any point in time which can be useful for debugging and managing gas. By default, the Environment will not inspect the revm instance's state at all (which should provide the highest speed), but you can enable these features by doing the following:

use arbiter_core::environment::Environment;

fn main() {
    let env = Environment::builder()
        .with_console_logs()
        .with_pay_gas()
        .build();
}

The feature with_console_logs will print out logs generated by console2.log in Solidity so that you can get intermediate state of your contracts. The feature with_pay_gas will pay gas for transactions which is useful for realism.

Fork Configuration

If you have a database that has been forked from a live network, it has likely been serialized to disk, in which case you can do something like this:

use arbiter_core::environment::Environment;
use arbiter_core::database::fork::Fork;

fn main() {
    let path_to_fork = "path/to/fork";
    let fork = Fork::from_disk(path_to_fork).unwrap();
    let env = Environment::builder().with_db(fork).build();
}

This will create an Environment that has been forked from the database at the given path and is ready to receive Instructions.

Environment supports more customization for the gas_limit and contract_size_limit of the revm instance. You can do the following:

use arbiter_core::environment::Environment;

fn main() {
    let env = Environment::builder()
        .with_gas_limit(revm_primitives::U256::from(12_345_678))
        .with_contract_size_limit(111_111)
        .build();
}

Instructions

The set of Instructions has grown over time, but at the moment we allow for the following:

  • Instruction::AddAccount: Add an account to the Environment's world state. This is usually called by the RevmMiddleware when a new client is created.
  • Instruction::BlockUpdate: Update the Environment's block number and block timestamp. This can be handled by an external agent in a simulation, if desired.
  • Instruction::Cheatcode: Execute one of the Cheatcodes on the Environment's world state. The Cheatcodes include:
    • Cheatcodes::Deal: Used to set the raw ETH balance of a user. Useful when you need to pay gas fees in a transaction.
    • Cheatcodes::Load: Gets the value of a storage slot of an account.
    • Cheatcodes::Store: Sets the value of a storage slot of an account.
    • Cheatcodes::Access: Gets the account at an address.
  • Instruction::Query: Allows for querying the Environment's world state and current configuration. Anything in the EnvironmentData enum is accessible via this instruction.
    • EnvironmentData::BlockNumber: Gets the current block number of the Environment.
    • EnvironmentData::BlockTimestamp: Gets the current block timestamp of the Environment.
    • EnvironmentData::GasPrice: Gets the current gas price of the Environment.
    • EnvironmentData::Balance: Gets the current ETH balance of an account.
    • EnvironmentData::TransactionCount: Gets the current nonce of an account.
  • Instruction::Stop: Stops the Environment's thread and echoes out to any listeners to shut down their event streams. This can be used when handling errors or reverts, or just when you're done with the Environment.
  • Instruction::Transaction: Executes a transaction on the Environment's world state. This is usually called by the RevmMiddleware when a client sends an eth_call or state-changing transaction.

The RevmMiddleware provides methods for sending the above instructions to an associated Environment so that you do not have to interact with the Environment directly!

Events

The Environment also emits Ethereum events and errors/reverts to clients who are set to listen to them. To do so, we use a tokio::sync::broadcast channel and the RevmMiddleware manages subscriptions to these events. As for errors or reverts, we are working on making the flow of handling these more graceful so that your own program or agents can decide how to handle them.

Middleware

The ArbiterMiddleware is the main interface for interacting with an Environment. We implement the ethers-rs Middleware trait so that you may work with contract bindings generated by forge or arbiter bind as if you were interacting with a live network. Not all methods are implemented, but the relevant ones are.

ArbiterMiddleware owns a Connection which is the client's interface to the Environment's Socket. This Connection acts much like a WebSocket connection and is used to send Instructions and receive their outcome from the Environment as well as subscribe to events. To make this Connection and ArbiterMiddleware flexible, we also implement (for both) the JsonRpcClient and PubSubClient traits.

We also provide ArbiterMiddleware a wallet so that it can be associated with an account in the Environment's world state. The wallet: EOA field of ArbiterMiddleware is decided upon creation of the ArbiterMiddleware and, if the wallet is generated from calling ArbiterMiddleware::new(), wallet will be of EOA::Wallet(Wallet<SigningKey>) which allows for ArbiterMiddleware to sign transactions if need be. It is possible to create accounts from a forked database, in which case you would call ArbiterMiddleware::new_from_forked_eoa() and the wallet would be of EOA::Forked(Address). This type is unable to sign as it is effectively impossible to recover the signing key from an address. Fortunately, for almost every use case of ArbiterMiddleware, you will not need to sign transactions, so this distinction does not matter.

Usage

To create an ArbiterMiddleware that is associated with an account in the Environment's world state, we can do the following:

use arbiter_core::{middleware::ArbiterMiddleware, environment::Environment};

fn main() {
    let env = Environment::builder().build();

    // Create a client for the above `Environment` with an ID
    let id = "alice";
    let alice = ArbiterMiddleware::new(&env, Some(id)).unwrap();

    // Create a client without an ID
    let client = ArbiterMiddleware::new(&env, None).unwrap();
}

These created clients can then get access to making calls and transactions to contracts deployed into the Environment's world state. We can do the following:

use arbiter_core::{middleware::ArbiterMiddleware, environment::Environment};
use arbiter_bindings::bindings::arbiter_token::ArbiterToken;

#[tokio::main]
async fn main() {
    let env = Environment::builder().build();
    let client = ArbiterMiddleware::new(&env, None).unwrap();

    // Deploy a contract
    let contract = ArbiterToken::deploy(client, ("ARBT".to_owned(), "Arbiter Token".to_owned(), 18u8)).unwrap().send().await.unwrap();
}

Arbiter Engine

arbiter-engine provides the machinery to build agent based / event driven simulations and should be the primary entrypoint for using Arbiter. The goal of this crate is to abstract away the work required to set up agents, their behaviors, and the worlds they live in. At the moment, all interaction of agents is done through the arbiter-core crate and is meant to be for local simulations and it is not yet generalized for the case of live network automation.

Hierarchy

The primary components of arbiter-engine are, from the bottom up:

  • Behavior<E>: This is an event-driven behavior that takes in some item of type E and can act on that. The Behavior<E> has two methods: startup and process.
    • startup is meant to initialize the Behavior<E> and any context around it. An example could be an agent that deploys token contracts on startup.
    • process is meant to be a stage that runs on every event that comes in. An example could be an agent that deployed token contracts on startup, and now wants to process queries about the tokens deployed in the simulation (e.g., what their addresses are).
  • Engine<B,E> and StateMachine: The Engine is a struct that implements the StateMachine trait as an entrypoint to run Behaviors.
    • Engine<B,E> is a struct that owns a B: Behavior<E> and the event stream Stream<Item = E> that the Behavior<E> will use for processing.
    • StateMachine is a trait that reduces the interface to Engine<B,E> to a single method: execute. This trait allows Agents to have multiple behaviors that may not use the same event type.
  • Agent is a struct that contains an ID, a client (Arc<RevmMiddleware>) that provides means to send calls and transactions to an Arbiter Environment, and a Messager.
    • Messager is a struct that owns a Sender and Receiver for sending and receiving messages. This is a way for Agents to communicate with each other. It can also be streamed and used for processing messages in a Behavior<Message>.
    • Agent also owns a Vec<Box<dyn StateMachine>> which is a list of StateMachines that the Agent will run. This is a way for Agents to have multiple Behaviors that may not use the same event type.
  • World is a struct that has an ID, an Arbiter Environment, a mapping of Agents, and a Messager.
    • The World is tasked with letting Agents join in, and when they do so, to connect them to the Environment with a client and Messager with the Agent's ID.
  • Universe is a struct that wraps a mapping of Worlds.
    • The Universe is tasked with letting Worlds join in and running those Worlds in parallel.

Behaviors

The design of arbiter-engine is centered around the concept of Agents and Behaviors. At the core, we place Behaviors as the event-driven machinery that defines the entire simulation. The goal is for your simulation to be defined completely by how your Agents' behaviors are defined. All you should need to think about is how to define your Agents' behaviors and what emergent properties you want to observe.

trait Behavior<E>

To define a Behavior, you need to implement the Behavior trait on a struct of your own design. The Behavior trait is defined as follows:

pub trait Behavior<E> {
    async fn startup(&mut self, client: Arc<RevmMiddleware>, messager: Messager) -> Result<EventStream<E>, ArbiterEngineError>;
    async fn process(&mut self, event: E) -> Result<ControlFlow, ArbiterEngineError>;
}

To outline the design principles here:

  • startup is a method that initializes the Behavior and returns an EventStream that the Behavior will use for processing.
    • This method yields a client and messager from the Agent that owns the Behavior. In this method you should take the client and messager and store them in your struct if you will need them in the processing of events. Note, you may not need them!
  • process is a method that processes an event of type E and returns a ControlFlow.
    • If process returns ControlFlow::Halt, then the Behavior will stop processing events completely. If it returns ControlFlow::Continue, the Behavior keeps processing incoming events.

Summary: A Behavior<E> amounts to the processing of events of type E.

Advice: Behaviors should be limited in scope and should be a simplistic action driven from a single event. Otherwise you risk having a Behavior that is too complex and difficult to understand and maintain.

Example

To see this in use, let's take a look at an example of a Behavior called Replier that replies to a message with a message of its own, and stops once it has replied a certain number of times.

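use std::sync::Arc;

use arbiter_core::middleware::RevmMiddleware;
// Note: the module paths for `ArbiterEngineError` and `Message` are assumed here.
use arbiter_engine::{
    errors::ArbiterEngineError,
    machine::{Behavior, ControlFlow},
    messager::{Message, Messager, To},
    EventStream,
};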

pub struct Replier {
    receive_data: String,
    send_data: String,
    max_count: u64,
    startup_message: Option<String>,
    count: u64,
    messager: Option<Messager>,
}

impl Replier {
    pub fn new(
        receive_data: String,
        send_data: String,
        max_count: u64,
        startup_message: Option<String>,
    ) -> Self {
        Self {
            receive_data,
            send_data,
            startup_message,
            max_count,
            count: 0,
            messager: None,
        }
    }
}

impl Behavior<Message> for Replier {
    async fn startup(
        &mut self,
        client: Arc<RevmMiddleware>,
        messager: Messager,
    ) -> Result<EventStream<Message>, ArbiterEngineError> {
        if let Some(startup_message) = &self.startup_message {
            messager.send(To::All, startup_message).await;
        }
        self.messager = Some(messager.clone());
        messager.stream()
    }

    async fn process(&mut self, event: Message) -> Result<ControlFlow, ArbiterEngineError> {
        if event.data == self.receive_data {
            // Borrow the stored messager (set in `startup`) rather than moving it out of `self`.
            self.messager.as_ref().unwrap().send(To::All, &self.send_data).await;
            self.count += 1;
        }
        if self.count == self.max_count {
            return Ok(ControlFlow::Halt);
        }
        Ok(ControlFlow::Continue)
    }
}

In this example, we have a Behavior that upon startup will see if there is a startup_message assigned and if so, send it to all Agents that are listening to their Messager. Then, it will store the Messager for sending messages later on and start a stream of incoming messages so that we have E = Message in this case. Once these are completed, the Behavior automatically transitions into the processing stage where events are popped from the EventStream<E> and fed to the process method.

As messages come in, if the receive_data matches the incoming message, then the Behavior will send a message with data send_data to all Agents listening to their Messager. Once it has replied max_count times, process returns ControlFlow::Halt and the Behavior stops.

Agents and Engines

Behaviors are the heartbeat of your Agents and they are wrapped by Engines. The main idea here is that you can have an Agent that has as many Behaviors as you like, and each of those behaviors may process different types of events. This gives flexibility in how you want to design your Agents and what emergent properties you want to observe.

Design Principles

We designed the behaviors to be flexible. It is up to you whether your Agents have multiple Behaviors or a single Behavior that processes all events. For the former case, you will build Behavior<E> for different types E and place these inside of an Agent. For the latter, you will create an enum that wraps all the different types of events that you want to process and then implement Behavior on that enum. The latter will also require a stream::select type of operation to merge all the different event streams into one, though this is not difficult to do.
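The single-Behavior approach can be illustrated with plain Rust and no async machinery. In this sketch the Event enum, the Tracker type, and the interleave helper are hypothetical; interleave is only a synchronous stand-in for a stream::select style merge of two event streams, and the bool returned by process stands in for ControlFlow.

```rust
/// Hypothetical wrapper over the event types one behavior wants to see.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Message(String),
    NewBlock(u64),
}

/// A single behavior-like struct that handles every event variant.
#[derive(Debug, Default)]
pub struct Tracker {
    pub last_block: u64,
    pub messages_seen: usize,
}

impl Tracker {
    /// Returns `false` once the tracker wants to halt.
    pub fn process(&mut self, event: Event) -> bool {
        match event {
            Event::Message(data) => {
                self.messages_seen += 1;
                data != "stop"
            }
            Event::NewBlock(number) => {
                self.last_block = number;
                true
            }
        }
    }
}

/// Round-robin merge of two event sources into one sequence.
pub fn interleave<T>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
    let (mut a, mut b) = (a.into_iter(), b.into_iter());
    let mut merged = Vec::new();
    loop {
        match (a.next(), b.next()) {
            (None, None) => break,
            (x, y) => {
                // `Option` is iterable, so this pushes zero or one item each.
                merged.extend(x);
                merged.extend(y);
            }
        }
    }
    merged
}
```

The trade-off is the usual one: a single enum keeps all state in one place, while separate Behavior<E> implementations stay smaller and easier to reason about.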

struct Agent

The Agent struct is the primary struct that you will be working with. It contains an ID, a client (Arc<RevmMiddleware>) that provides means to send calls and transactions to an Arbiter Environment, and a Messager. It looks like this:

pub struct Agent {
    pub id: String,
    pub messager: Messager,
    pub client: Arc<RevmMiddleware>,
    pub(crate) behavior_engines: Vec<Box<dyn StateMachine>>,
}

Your work will only be to define Behaviors and then add them to an Agent with the Agent::with_behavior method.

The Agent is inactive until it is paired with a World and then it is ready to be run. This is handled by creating a world (see: Worlds and Universes) and then adding the Agent to the World with the World::add_agent method. Some of the intermediary representations are below:

struct AgentBuilder

The AgentBuilder struct is a builder pattern for creating Agents. This is essentially invisible for the end-user, but it is used internally so that Agents can be built in a more ergonomic way.

struct Engine<B,E>

Briefly, the Engine<B,E> struct provides the machinery to run a Behavior<E> and it is not necessary for you to handle this directly. The purpose of this design is to encapsulate the Behavior<E> and the event stream Stream<Item = E> that the Behavior<E> will use for processing. This encapsulation also allows the Agent to hold onto Behavior<E> for various different types of E all at once.
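To see why this encapsulation lets one Agent hold Behaviors over different event types, here is a simplified, synchronous sketch. The trait and struct names mirror the ones above, but their bodies are assumptions made for illustration (the real types are async and stream-based); Summer and StopOn are hypothetical behaviors.

```rust
/// Simplified stand-in for a behavior over events of type `E`.
trait Behavior<E> {
    /// Process one event; return `true` to halt.
    fn process(&mut self, event: E) -> bool;
}

/// Object-safe trait with no generic parameter, so engines over
/// different event types can share one collection.
trait StateMachine {
    /// Consume pending events and return how many were processed.
    fn run(&mut self) -> usize;
}

/// Pairs a behavior with the events it will consume.
struct Engine<B, E> {
    behavior: B,
    events: Vec<E>,
}

impl<B: Behavior<E>, E> StateMachine for Engine<B, E> {
    fn run(&mut self) -> usize {
        let mut processed = 0;
        for event in self.events.drain(..) {
            processed += 1;
            if self.behavior.process(event) {
                break;
            }
        }
        processed
    }
}

/// Hypothetical behavior over numbers: halts once the sum reaches `limit`.
struct Summer {
    total: u64,
    limit: u64,
}

impl Behavior<u64> for Summer {
    fn process(&mut self, event: u64) -> bool {
        self.total += event;
        self.total >= self.limit
    }
}

/// Hypothetical behavior over strings: halts when it sees `word`.
struct StopOn {
    word: String,
}

impl Behavior<String> for StopOn {
    fn process(&mut self, event: String) -> bool {
        event == self.word
    }
}

/// Store both engines behind the same trait object and run them.
fn run_all() -> Vec<usize> {
    let mut engines: Vec<Box<dyn StateMachine>> = Vec::new();
    engines.push(Box::new(Engine {
        behavior: Summer { total: 0, limit: 6 },
        events: vec![1u64, 2, 3],
    }));
    engines.push(Box::new(Engine {
        behavior: StopOn { word: "stop".to_string() },
        events: vec!["go".to_string(), "stop".to_string(), "late".to_string()],
    }));
    engines.iter_mut().map(|engine| engine.run()).collect()
}

fn main() {
    println!("{:?}", run_all());
}
```

The event type E disappears behind the StateMachine trait object, which is what allows a single Vec to hold both engines.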

Example

Let's create an Agent that has two Behaviors using the Replier behavior from before.

use arbiter_engine::agent::Agent;
use crate::Replier;

fn setup() {
    let ping_replier = Replier::new("ping", "pong", 5, None);
    let pong_replier = Replier::new("pong", "ping", 5, Some("ping"));
    let agent = Agent::builder("my_agent")
                    .with_behavior(ping_replier)
                    .with_behavior(pong_replier);
}

In this example, we have created an Agent with two Replier behaviors. The ping_replier will reply to a message with "pong" and the pong_replier will reply to a message with "ping". Given that the pong_replier has a startup_message of "ping", it will send a message to everyone (including the "my_agent" itself who holds the ping_replier behavior) when it starts up. This will start a chain of messages that will continue in a "ping" "pong" fashion until the max_count is reached.
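The message chain can be traced with a toy model that uses no Arbiter types. This sketch assumes that every message is broadcast to both behaviors and that a behavior stops replying once it has sent max_count replies; ToyReplier and run_ping_pong are hypothetical names, not part of arbiter-engine.

```rust
use std::collections::VecDeque;

/// Toy stand-in for the Replier behavior (not the arbiter-engine type).
struct ToyReplier {
    receive_data: String,
    send_data: String,
    max_count: u64,
    count: u64,
}

impl ToyReplier {
    fn new(receive_data: &str, send_data: &str, max_count: u64) -> Self {
        Self {
            receive_data: receive_data.to_string(),
            send_data: send_data.to_string(),
            max_count,
            count: 0,
        }
    }

    /// Reply when the message matches and the reply budget is not spent.
    fn process(&mut self, message: &str) -> Option<String> {
        if self.count < self.max_count && message == self.receive_data {
            self.count += 1;
            Some(self.send_data.clone())
        } else {
            None
        }
    }
}

/// Broadcast every message to both repliers until no replies remain.
/// Returns the log of all messages in the order they were delivered.
fn run_ping_pong(max_count: u64) -> Vec<String> {
    let mut repliers = [
        ToyReplier::new("ping", "pong", max_count),
        ToyReplier::new("pong", "ping", max_count),
    ];
    // The startup message of the second replier seeds the exchange.
    let mut queue = VecDeque::from(vec!["ping".to_string()]);
    let mut log = Vec::new();
    while let Some(message) = queue.pop_front() {
        for replier in repliers.iter_mut() {
            if let Some(reply) = replier.process(&message) {
                queue.push_back(reply);
            }
        }
        log.push(message);
    }
    log
}

fn main() {
    println!("{}", run_ping_pong(5).join(" "));
}
```

Under these assumptions, a max_count of 5 yields one startup message plus five replies from each behavior before the exchange stops.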

Worlds and Universes

Universes are the top-level struct that you will be working with in the Arbiter Engine. They are tasked with letting Worlds join in and running those Worlds in parallel. By no means are you required to use Universes, but they will be useful for running multiple simulations at once or, in the future, they will allow for running Worlds that have different internal environments. For instance, one could have a World that consists of Agents acting on the Ethereum mainnet, another World that consists of Agents acting on Optimism, and finally a World that has an Arbiter Environment as the network analogue. Using these in tandem is a long-term goal of the Arbiter project.

Depending on your needs, you will either use the Universe if you want to run multiple Worlds in parallel or you will use the World if you only want to run a single simulation. The choice is yours.

struct Universe

The Universe struct looks like this:

pub struct Universe {
    worlds: Option<HashMap<String, World>>,
    world_tasks: Option<Vec<Result<World, JoinError>>>,
}

The Universe is a struct that wraps a mapping of Worlds where the key of the map is the World's ID. Also, the Universe manages the running of those Worlds in parallel by storing the running Worlds as tasks. In the future, more introspection and control will be added to the Universe to allow for debugging and managing the running Worlds.

Universe::run_worlds currently iterates through the Worlds and starts each one in its own concurrent task.
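The pattern of starting every World and then collecting the results can be sketched with the standard library. This is only an analogy: operating-system threads stand in for the async tasks the Universe uses, and ToyWorld and run_worlds are hypothetical names for this sketch.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical stand-in for a World: it does some work and reports back.
struct ToyWorld {
    id: String,
    steps: u64,
}

impl ToyWorld {
    /// "Run" the world and return its ID together with a result.
    fn run(self) -> (String, u64) {
        let result: u64 = (1..=self.steps).sum();
        (self.id, result)
    }
}

/// Start every world before waiting on any of them, then gather results by ID.
fn run_worlds(worlds: HashMap<String, ToyWorld>) -> HashMap<String, u64> {
    let handles: Vec<thread::JoinHandle<(String, u64)>> = worlds
        .into_values()
        .map(|world| thread::spawn(move || world.run()))
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("world thread panicked"))
        .collect()
}

fn main() {
    let mut worlds = HashMap::new();
    for (id, steps) in [("my_world", 3), ("my_other_world", 4)] {
        worlds.insert(id.to_string(), ToyWorld { id: id.to_string(), steps });
    }
    println!("{:?}", run_worlds(worlds));
}
```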

struct World

The World struct looks like this:

pub struct World {
    pub id: String,
    pub agents: Option<HashMap<String, Agent>>,
    pub environment: Environment,
    pub messager: Messager,
}

The World is a struct that has an ID, an Arbiter Environment, a mapping of Agents, and a Messager. The World is tasked with letting Agents join in. When an Agent joins, the World connects it to the Environment with a client and gives it a Messager tagged with the Agent's ID. The World then stores the Agents in a map where the key is the Agent's ID.

The main methods to use with the World are World::add_agent, which adds an Agent to the World, and World::run, which engages all of the Agent Behaviors.

In future development, the World will be generic over your choice of Provider that encapsulates the Ethereum-like execution environment you want to use (e.g., Ethereum mainnet, Optimism, or an Arbiter Environment).

Example

Let's first do a quick example where we take a World and add an Agent to it.

use arbiter_engine::{agent::Agent, world::World};
use crate::Replier;

fn setup_world(id: &str) -> World {
    let ping_replier = Replier::new("ping", "pong", 5, None);
    let pong_replier = Replier::new("pong", "ping", 5, Some("ping"));
    // Agent::builder is used here to match the earlier Agent example.
    let agent = Agent::builder("my_agent")
                    .with_behavior(ping_replier)
                    .with_behavior(pong_replier);
    let mut world = World::new(id);
    world.add_agent(agent);
    // Return the World so the caller can run it.
    world
}

async fn run() {
    let mut world = setup_world("my_world");
    world.run().await;
}

If you wanted to extend this to use a Universe, you would simply create a Universe and add the World to it.

use arbiter_engine::{agent::Agent, universe::Universe, world::World};
use crate::Replier;

fn setup_world(id: &str) -> World {
    let ping_replier = Replier::new("ping", "pong", 5, None);
    let pong_replier = Replier::new("pong", "ping", 5, Some("ping"));
    // Agent::builder is used here to match the earlier Agent example.
    let agent = Agent::builder("my_agent")
                    .with_behavior(ping_replier)
                    .with_behavior(pong_replier);
    let mut world = World::new(id);
    world.add_agent(agent);
    // Return the World so it can be added to the Universe.
    world
}

// Awaiting requires an async runtime; tokio is assumed here.
#[tokio::main]
async fn main() {
    let mut universe = Universe::new();
    universe.add_world(setup_world("my_world"));
    universe.add_world(setup_world("my_other_world"));
    universe.run_worlds().await;
}

Configuration

To make it so you rarely have to recompile your project, you can use a configuration file to set the parameters of your simulation once your Behaviors have been defined. Let's take a look at how to do this.

Behavior Enum

It is good practice to take your Behaviors and wrap them in an enum so that you can use them in a configuration file. For instance, let's say you have two structs, Maker and Taker, that each implement Behavior<E> for their own E. Then you can make your enum like this:

use arbiter_macros::Behaviors;

#[derive(Behaviors)]
pub enum Behaviors {
    Maker(Maker),
    Taker(Taker),
}

Notice that we used the Behaviors derive macro from the arbiter_macros crate. This macro will generate an implementation of a CreateStateMachine trait for the Behaviors enum and ultimately save you from having to write a lot of boilerplate code. The macro solely requires that the Behaviors you have implement the Behavior trait and that the necessary imports are in scope.

Configuration File

Now that you have your enum of Behaviors, you can configure your World and the Agents inside of it from a configuration file. Since the World and your simulation are completely defined by the Agent Behaviors you make, all you need to do is specify your Agents in the configuration file. For example, let's say we have the Replier behavior from before, so we have:

#[derive(Behaviors)]
pub enum Behaviors {
    Replier(Replier),
}

pub struct Replier {
    receive_data: String,
    send_data: String,
    max_count: u64,
    startup_message: Option<String>,
    count: u64,
    messager: Option<Messager>,
}

Then, we can specify the "ping" and "pong" Behaviors like this:

[[my_agent]]
Replier = { send_data = "ping", receive_data = "pong", max_count = 5, startup_message = "ping" }

[[my_agent]]
Replier = { send_data = "pong", receive_data = "ping", max_count = 5 }

If you instead wanted to specify two Agents "Alice" and "Bob" each with one of the Replier Behaviors, you could do it like this:

[[alice]]
Replier = { send_data = "ping", receive_data = "pong", max_count = 5, startup_message = "ping" }

[[bob]]
Replier = { send_data = "pong", receive_data = "ping", max_count = 5 }

Loading the Configuration

Once you have your configuration file located at ./path/to/config.toml, you can load it and run your simulation like this:

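// Awaiting requires an async runtime; tokio is assumed here.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut world = World::from_config("./path/to/config.toml")?;
    world.run().await;
    Ok(())
}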

At the moment, we do not configure Universes from a configuration file, but this is a feature that is planned for the future.

Arbiter CLI

Arbiter provides a Foundry-like CLI experience. You can initialize new projects, generate bindings and execute simulations using the CLI.

To create a new Arbiter project:

arbiter init your-new-project
cd your-new-project

This initializes a new Arbiter project with a template. You can run arbiter init <simulation_name> --no-git to remove the .git directory from the template upon initialization.

Bindings

You can load or write your own smart contracts in the arbiter-bindings/contracts/ directory and begin writing your own simulations. Arbiter treats Rust smart-contract bindings as first-class citizens. The contract bindings are generated via Foundry's forge command. arbiter bind wraps forge with some convenience features that will generate all your bindings to src/bindings as a rust module. Foundry power-users are welcome to use forge directly. You can generate the bindings again by running:

arbiter bind

arbiter bind wraps forge bind and is configured from your Cargo.toml. There are three optional fields you can add to your Cargo.toml to configure arbiter bind.

[arbiter]
bindings_workspace = "simulation" # must be a valid workspace member
submodules = false # change to true if you want the submodule bindings to be generated
ignore_interfaces = false # change to true if you want to ignore interfaces contracts

The template is executable at this point and you can run it by running:

cargo run

You can load or write your own smart contracts in the template's contracts/ directory and begin writing your own simulations. You can also manage project dependencies using git submodules via forge install. The Foundry book provides further details on managing project dependencies and other features.

Forking

To fork a state of an EVM network, you must first create a fork config file. An example is provided in the example_fork directory. Essentially, you provide the storage location for the data, the network you want, the block number you want, and metadata about the contracts you want to fork.

arbiter fork <fork_config.toml>

This will create a fork of the network you specified in the config file and store it in the location you specified. It can then be loaded into an arbiter-core Environment by using the Fork::from_disk() method.

Forking is done this way to make sure that all emulation done does not require a constant connection to an RPC-endpoint.

Optional Arguments

You can run arbiter fork <fork_config.toml> --overwrite to overwrite the fork if it already exists.

Arbiter macros

arbiter_macros provides a set of macros to help with the use of arbiter-engine and arbiter-core. Macros allow for code generation which enables developers to write code that writes code. We use them here to reduce boilerplate by abstracting repetitive patterns. Macros can be used for tasks like deriving traits automatically or for generating code based on custom attributes.

Procedural Macros

#[derive(Behaviors)] This Rust procedural macro automatically implements the CreateStateMachine trait for an enum, generating a create_state_machine method that matches each enum variant to a new state machine instance. It's designed for enums where each variant contains a single unnamed field representing state data. This macro simplifies the creation of state machines from enums, eliminating repetitive boilerplate code and enhancing code maintainability.

Example

You can use this macro like so:

use arbiter_macros::Behaviors;
use arbiter_engine::machine::Behavior;

struct MyBehavior1 {}
impl Behavior for MyBehavior1 {
    // ...
}
struct MyBehavior2 {}
impl Behavior for MyBehavior2 {
    // ...
}

#[derive(Behaviors)]
enum Behaviors {
    MyBehavior1(MyBehavior1),
    MyBehavior2(MyBehavior2),
}

#[main]. The #[arbiter_macros::main] macro in arbiter-macros/src/lib.rs is designed to simplify the creation of a CLI that will let you run your simulations by automatically generating a main function that sets up command-line parsing, logging, async execution, and world creation. It takes custom attributes to configure the application's metadata such as the project's name, description, and the set of behaviors you want to use. Under the hood, it uses the clap crate for parsing CLI arguments and tracing for logging based on verbosity level. The macro needs a type that implements the CreateStateMachine trait, which can be provided by using the #[derive(Behaviors)] macro.

Usage

You can find an example that uses both of these macros in the arbiter-template repository. Similarly, in the Arbiter repo itself, this exact same collection of code is found in the examples/template/ directory.

If you wanted to use the #[main] macro alongside the #[derive(Behaviors)] macro, you would do so like this:

use arbiter_macros::main;

use Behaviors; // From the Behaviors example above


#[main(
    name = "ExampleArbiterProject",
    about = "Our example to get you started.",
    behaviors = Behaviors
)]
pub async fn main() {}

Techniques

At a high level, when you are designing a simulation, the two things you need to think about are behaviors and one or more random variables. A random variable is what you can perturb over the course of a simulation. For example, almost all economic models have a random variable that represents the price. This allows you to see how the model behaves under different prices or market conditions. Does this system handle price volatility well? Or does it break down?

Anomaly Detection

Anomaly detection is the process of identifying unexpected items or events in data sets, which differ from the norm. Anomaly detection is often applied on unlabeled data which is known as unsupervised anomaly detection.

When you are building your simulation you are trying to discover unknown unknowns and carefully examine design assumptions. This is a difficult task and it is not always clear what you are looking for. As a result, the best place to start is to design a simulation that will validate the existing design assumptions.

Measuring Risk

Quantifying Security Risk

Quantitative security is a field of research that applies mathematical and statistical methods to studying cybersecurity. It aims to quantify and model security risks, vulnerabilities, and impacts providing a more objective and measurable approach to security management. Quantitative security can be used to assess the effectiveness of security controls, identify vulnerabilities, and predict the impact of security incidents. It can also be used to evaluate the effectiveness of security policies and procedures.

In software design, quantitative security can be used to quantify the economic risk of a system. This involves modeling the system's behavior and then using statistical methods to analyze the data generated by the model. The results can be used to identify vulnerabilities and predict the impact of security incidents.

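Risk is understood as

$$ \text{risk} = \text{impact} \times \text{likelihood} $$

where impact is the cost of exploitation and likelihood is the probability of exploitation.

For example, impact can be computed as a weighted sum of the consequences of an exploit: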

pub fn calculate_impact(consequences: Vec<f64>, weights: Option<Vec<f64>>) -> f64 {
    let weights = match weights {
        Some(w) => w,
        None => vec![1.0; consequences.len()],  // If no weights are provided, assume equal importance
    };
    // Weighted sum of the consequences
    consequences.iter().zip(weights.iter()).map(|(c, w)| c * w).sum()
}

fn main() {
    // Example: Data loss (e.g., $5000), downtime (e.g., 10 hours), reputational damage (e.g., 7 on a scale of 10)
    let consequences = vec![5000.0, 10.0, 7.0];
    let weights = Some(vec![0.5, 0.3, 0.2]);  // Weights reflecting the relative importance of each consequence
    let impact = calculate_impact(consequences, weights);
    println!("{:.1}", impact);  // Outputs: 2504.4 (= 2500.0 + 3.0 + 1.4)
}

The likelihood of exploitation is a function of historical frequency, threat capability, control effectiveness, and environmental factors, each of which takes a value between zero and one. Threat capability quantifies the sophistication and resources of the threat actor, control effectiveness quantifies how well the security controls work, and the environment factor quantifies the security of the environment in which the system operates.

fn calculate_likelihood(historical_frequency: f64, threat_capability: f64, control_effectiveness: f64, environment_factor: f64) -> f64 {
    historical_frequency * threat_capability * (1.0 - control_effectiveness) * environment_factor
}

fn main() {
    // Example: High historical frequency (e.g., 0.8), high threat capability (e.g., 0.9), medium control effectiveness (e.g., 0.5), high environment factor (e.g., 1.0)
    let likelihood = calculate_likelihood(0.8, 0.9, 0.5, 1.0);
    println!("{:.2}", likelihood);  // Outputs: 0.36
}
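Putting the two quantities together, risk = impact * likelihood can be used to rank scenarios against each other. The sketch below is illustrative only: the Scenario struct, the scenario names, and the numbers are hypothetical and are not taken from any real assessment.

```rust
/// A hypothetical scenario with an impact (cost) and a likelihood in [0, 1].
struct Scenario {
    name: &'static str,
    impact: f64,
    likelihood: f64,
}

impl Scenario {
    /// risk = impact * likelihood
    fn risk(&self) -> f64 {
        self.impact * self.likelihood
    }
}

/// Sort scenarios from highest to lowest risk.
fn rank_by_risk(mut scenarios: Vec<Scenario>) -> Vec<Scenario> {
    scenarios.sort_by(|a, b| b.risk().total_cmp(&a.risk()));
    scenarios
}

fn main() {
    let ranked = rank_by_risk(vec![
        Scenario { name: "rare but severe", impact: 100_000.0, likelihood: 0.01 },
        Scenario { name: "frequent but mild", impact: 2_000.0, likelihood: 0.9 },
        Scenario { name: "moderate", impact: 10_000.0, likelihood: 0.05 },
    ]);
    for scenario in &ranked {
        println!("{}: {:.1}", scenario.name, scenario.risk());
    }
}
```

In this made-up data the frequent but mild scenario ranks first, which shows why impact alone is not a sufficient measure.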

Economic Risk

Economic risk in the context of finance can be quantified by considering various factors such as:

  • Market Risk: This is the risk of investments declining in value because of economic developments or other events that affect the entire market. For example, the risk of a decline in the stock market.

  • Credit Risk: This is the risk that a borrower will not repay a loan according to the loan terms, resulting in a loss to the lender—for example, the risk of a company defaulting on its bonds.

  • Operational Risk: This is the risk of loss resulting from inadequate or failed internal processes, people, and systems or external events—for example, the risk of a data breach due to insufficient cybersecurity measures.

  • Liquidity Risk: This is the risk that an investor will not be able to sell an investment when they wish because of a lack of buyers in the market—for example, the risk of being unable to sell real estate quickly at a fair price.

These risks can be quantified using various financial models and statistical methods. For example, Value at Risk (VaR) is commonly used to quantify market risk. Given a certain level of confidence and time horizon, it estimates the potential loss that could occur on an investment.
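As a rough illustration of VaR, the historical-simulation variant can be sketched in a few lines. This is one of several ways to estimate VaR (parametric and Monte Carlo estimates are others); the function name historical_var, the nearest-rank quantile rule, and the sample returns are assumptions of this sketch.

```rust
/// Historical-simulation VaR: the loss that past returns did not exceed
/// in a `confidence` fraction of observations (nearest-rank quantile).
/// Returns `None` for empty input or a confidence outside [0, 1).
fn historical_var(returns: &[f64], confidence: f64) -> Option<f64> {
    if returns.is_empty() || !(0.0..1.0).contains(&confidence) {
        return None;
    }
    // A loss is the negative of a return.
    let mut losses: Vec<f64> = returns.iter().map(|r| -r).collect();
    losses.sort_by(|a, b| a.total_cmp(b));
    // Nearest-rank method: take the ceil(confidence * n)-th smallest loss.
    let rank = (confidence * losses.len() as f64).ceil() as usize;
    let index = rank.max(1) - 1;
    Some(losses[index])
}

fn main() {
    // Hypothetical daily returns.
    let returns = [0.01, -0.02, 0.03, -0.05, 0.02, -0.01, 0.015, 0.04, -0.03, 0.01];
    if let Some(var) = historical_var(&returns, 0.9) {
        println!("90% one-day VaR: {:.1}% of portfolio value", var * 100.0);
    }
}
```

With a simulation framework, the returns slice would come from simulated price paths rather than from historical data.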

Credit risk can be quantified using credit scoring models like the Altman Z-score, which predicts the probability of a company going bankrupt. Operational risk can be quantified using methods like the loss distribution approach (LDA), where the frequency and severity of losses are modeled to estimate the total loss. Liquidity risk can be quantified using the bid-ask spread or the liquidity coverage ratio (LCR).
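Of these measures, the bid-ask spread is the simplest to compute. A minimal sketch, assuming the spread is quoted relative to the mid price (other conventions exist); relative_spread is a hypothetical helper name:

```rust
/// Quoted bid-ask spread relative to the mid price.
/// Returns `None` for non-positive or crossed quotes.
fn relative_spread(bid: f64, ask: f64) -> Option<f64> {
    if bid <= 0.0 || ask < bid {
        return None;
    }
    let mid = (bid + ask) / 2.0;
    Some((ask - bid) / mid)
}

fn main() {
    // Hypothetical quotes: bid 99, ask 101.
    if let Some(spread) = relative_spread(99.0, 101.0) {
        println!("relative spread: {:.2}%", spread * 100.0);
    }
}
```

A wider relative spread indicates a less liquid market, since a round trip costs more.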

It's important to note that these are just examples, and quantifying economic risk in finance is a complex process that requires a deep understanding of financial theories and statistical models.

Metrics

Data plays a crucial role in quantifying risk and modeling systems. It provides the foundation for statistical analysis and predictive modeling, enabling us to measure and understand the behavior of systems under various conditions. We can identify patterns, trends, and correlations by analyzing data to help us predict future events or outcomes. This is particularly important in economic risk, where accurate predictions can help mitigate potential losses and optimize returns.

The particular metrics we have been interested in (by no means exhaustive or representative of the entire field) are:

Arbitrage Profit

Arbitrage profit is the profit made by taking advantage of the price differences of a particular asset across different markets or platforms. In DeFi, these opportunities can arise due to inefficiencies in asset pricing. If related to a decentralized exchange, such as an automated market maker (AMM), mathematical metrics can be derived to compute the cost and revenue of these arbitrage opportunities exactly.
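As an example of such an exact computation, consider a constant-product pool (x * y = k) with no fees. This choice of AMM, the zero-fee assumption, and the names Pool and arbitrage_profit are assumptions made for this sketch. The arbitrageur trades until the pool price equals the external price p, which leaves reserves of sqrt(k / p) and sqrt(k * p).

```rust
/// Zero-fee constant-product pool holding `x` units of the risky asset
/// and `y` units of the numeraire.
#[derive(Debug, Clone, Copy)]
struct Pool {
    x: f64,
    y: f64,
}

impl Pool {
    /// Marginal price of the risky asset in units of the numeraire.
    fn price(&self) -> f64 {
        self.y / self.x
    }
}

/// Profit (in the numeraire) from moving the pool price to `external_price`
/// and settling the other leg on the external market at that price.
fn arbitrage_profit(pool: Pool, external_price: f64) -> f64 {
    let k = pool.x * pool.y;
    let x_after = (k / external_price).sqrt();
    let y_after = (k * external_price).sqrt();
    // Risky asset taken out of the pool is valued at the external price;
    // the change in the pool's numeraire reserve is the cost (or proceeds).
    external_price * (pool.x - x_after) + (pool.y - y_after)
}

fn main() {
    let pool = Pool { x: 100.0, y: 100.0 };
    println!("pool price: {}", pool.price());
    println!("profit at external price 4: {:.2}", arbitrage_profit(pool, 4.0));
}
```

Under these assumptions the profit simplifies to (sqrt(p * x) - sqrt(y))^2, so it is zero exactly when the pool price already matches the external price.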

There are generally two types of arbitrage opportunities in DeFi:

Atomic arbitrage opportunities in DeFi are transactions that are either fully executed or not executed at all. This is possible due to the atomicity of the Ethereum Virtual Machine (EVM), which ensures that all operations within a transaction are treated as a single, indivisible unit. The entire transaction is reverted if any operation fails, ensuring no partial state changes occur. This characteristic of the EVM allows for risk-free arbitrage opportunities, as the arbitrageur is not exposed to the risk of one part of the trade executing while the other does not.

Non-atomic arbitrage opportunities in DeFi are those whose legs cannot be bundled into a single transaction, for example a trade on a decentralized exchange paired with a trade on a centralized exchange or on another chain. The EVM still executes each transaction atomically, but atomicity does not extend across separate transactions or venues. If one leg fails or is delayed, the other may still execute, leaving the arbitrageur with an unhedged position. This makes non-atomic arbitrage riskier, as the arbitrageur is exposed to the risk of one part of the trade executing while the other does not.

Non-atomic arbitrage is much more challenging to measure and model, because its outcome depends on timing and on state outside a single transaction. Atomic arbitrage is comparatively easy to measure, since its outcome is fully determined by the state in which the transaction executes.
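The all-or-nothing behavior of a single transaction can be modeled by applying every step to a scratch copy of the state and committing only if all steps succeed. This is a conceptual sketch, not how the EVM or revm is implemented; Balances, Step, and the leg functions are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Balances {
    token_a: i64,
    token_b: i64,
}

/// One leg of a trade; it may fail.
type Step = fn(&mut Balances) -> Result<(), String>;

/// Apply every step to a scratch copy and commit only if all succeed.
/// On failure the original state is left untouched (a "revert").
fn execute_atomically(state: &mut Balances, steps: &[Step]) -> Result<(), String> {
    let mut scratch = state.clone();
    for step in steps {
        step(&mut scratch)?;
    }
    *state = scratch;
    Ok(())
}

/// Spend 100 of token A to buy 10 of token B.
fn buy_leg(balances: &mut Balances) -> Result<(), String> {
    balances.token_a -= 100;
    balances.token_b += 10;
    Ok(())
}

/// Sell 10 of token B for 105 of token A.
fn sell_leg(balances: &mut Balances) -> Result<(), String> {
    balances.token_b -= 10;
    balances.token_a += 105;
    Ok(())
}

/// A selling leg that always fails.
fn failing_sell_leg(_balances: &mut Balances) -> Result<(), String> {
    Err("insufficient liquidity".to_string())
}

fn main() {
    let mut state = Balances { token_a: 1000, token_b: 0 };
    let profitable: [Step; 2] = [buy_leg, sell_leg];
    println!("{:?} -> {:?}", execute_atomically(&mut state, &profitable), state);
    let broken: [Step; 2] = [buy_leg, failing_sell_leg];
    println!("{:?} -> {:?}", execute_atomically(&mut state, &broken), state);
}
```

When the second leg fails, the first leg is undone as well, so the arbitrageur is never left holding only half of the trade.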

Liquidity Provider Portfolio Value

Liquidity Provider Portfolio Value refers to the payoff that an LP assumes when providing liquidity to a pool (see Replicating Market Makers and Replicating Monotonic Payoffs Without Oracles).

This value has been shown to have a path-independent component and a path-dependent component, which have been introduced in this paper as loss vs. holding (LVH) and loss vs. rebalancing (LVR), respectively.

Fee Growth

Fee Growth in Automated Market Makers (AMMs) refers to the fees collected by the liquidity providers over time. These fees are generated from the trading activity in the liquidity pool and are directly proportional to the volume of trades. The more the trading activity (turnover), the higher the fees collected, leading to a growth in the fees. This fee growth can be a significant source of income for liquidity providers, in addition to the potential price appreciation of the assets in the pool.
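The proportionality between volume and fees can be written down directly. A minimal sketch, assuming a flat fee rate charged on every trade's volume; the function name cumulative_fees and the sample numbers are hypothetical:

```rust
/// Cumulative fees earned after each trade, for a pool that charges
/// `fee_rate` (e.g., 0.003 for 30 basis points) on every trade's volume.
fn cumulative_fees(trade_volumes: &[f64], fee_rate: f64) -> Vec<f64> {
    trade_volumes
        .iter()
        .scan(0.0, |total, volume| {
            *total += volume * fee_rate;
            Some(*total)
        })
        .collect()
}

fn main() {
    // Hypothetical trade volumes in the numeraire.
    let growth = cumulative_fees(&[1000.0, 2000.0, 500.0], 0.003);
    println!("{:?}", growth);
}
```

Tracking this series over a simulation gives the fee growth curve that can be compared against the other LP metrics above.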

Model Parameters

Geometric Brownian Motion (GBM)

Geometric Brownian Motion (GBM) is a standard method to model price paths in financial markets. Two parameters characterize it:

  1. Drift (μ): This represents the asset's expected return. It is the direction that we expect our asset to move in the future.

  2. Volatility (σ): This represents the standard deviation of the asset's returns. It is a measure of the asset's risk or uncertainty.

The GBM model assumes that the logarithmic returns of the asset prices are normally distributed and that the following stochastic differential equation can model them:

$$ dS_t = μS_t dt + σS_t dW_t $$

Where:

  • $S_t$ is the asset price at time t
  • $μ$ is the drift
  • $σ$ is the volatility
  • $W_t$ is a Wiener process

This equation describes the change in the asset price over an infinitesimally small period. The first term on the right-hand side represents the deterministic trend (drift), and the second term represents the random fluctuation (volatility).
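A price path following this equation can be simulated with the exact log-normal step S_{t+dt} = S_t * exp((μ - σ²/2) dt + σ sqrt(dt) Z), where Z is a standard normal sample. The sketch below keeps to the standard library, so it uses a tiny hand-rolled generator (Lcg) with the Box-Muller transform; this generator is an assumption made for self-containment and is not suitable for real simulations, where a dedicated random number crate would normally be used.

```rust
/// Tiny linear congruential generator so the sketch needs no external crates.
struct Lcg(u64);

impl Lcg {
    /// Uniform sample in (0, 1].
    fn next_uniform(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the top 53 bits; the 0.5 offset keeps the value strictly positive.
        ((self.0 >> 11) as f64 + 0.5) / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn next_normal(&mut self) -> f64 {
        let u1 = self.next_uniform();
        let u2 = self.next_uniform();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Simulate one GBM path of `steps` steps of size `dt`, starting at `s0`.
/// The returned vector has `steps + 1` prices, including the start.
fn simulate_gbm(s0: f64, mu: f64, sigma: f64, dt: f64, steps: usize, seed: u64) -> Vec<f64> {
    let mut rng = Lcg(seed);
    let mut path = Vec::with_capacity(steps + 1);
    let mut price = s0;
    path.push(price);
    for _ in 0..steps {
        let z = rng.next_normal();
        // Exact solution of the SDE over one step.
        price *= ((mu - 0.5 * sigma * sigma) * dt + sigma * dt.sqrt() * z).exp();
        path.push(price);
    }
    path
}

fn main() {
    // One year of daily steps with 5% drift and 20% volatility.
    let path = simulate_gbm(100.0, 0.05, 0.2, 1.0 / 252.0, 252, 42);
    println!("final price: {:.2}", path[path.len() - 1]);
}
```

Setting σ to zero removes the random term, so the path reduces to deterministic growth at rate μ, which is a convenient sanity check.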

Contributing

Feedback is the number one way you can help us improve Arbiter, and we want to hear from you! A worthy contribution to the repo is opening an issue or a discussion on the GitHub issues page. Similarly, you can feel free to reach out to us on Telegram. Any and all questions are welcome.

Open Source Community

Arbiter is an open-source project and we welcome contributions from the community. We keep track of all issues and feature requests on our GitHub issues page. Issues that are approachable for newcomers are tagged with the good first issue label, so be on the lookout for those!

See our Contributing Guidelines

Vulnerability Corpus

If you have found a vulnerability in a smart contract using Arbiter, please report it to us by opening an issue on our GitHub issues page or consider adding it yourself to our Vulnerability Corpus. This can help the Ethereum developer community know how to test their own smart contracts and avoid similar vulnerabilities.

Vulnerability Corpus

Here is a running list of vulnerabilities that have been found with Arbiter. This list is not exhaustive, but it is a good starting point for understanding how to use Arbiter to find vulnerabilities. Arbiter has a unique ability to detect anomalous behavior in a production-like environment. This can be used to audit mechanism design in smart contract systems as well as detect vulnerabilities in smart contracts.


Vulnerabilities

Portfolio Rebalancing: Severity - High

This was a critical vulnerability discovered in the Portfolio Contracts that we were auditing internally. The bug is described in this PR. To reproduce the vulnerability you can run the following command:

git clone https://github.com/primitivefinance/portfolio_simulations.git
cd portfolio_simulations
git checkout (bug-found)-invariant-pre-post-swap
cargo run --release

The bug was not caught by our prior audits and extensive test suite. The simulation ran an arbitrageur against the Portfolio AMM and a stochastic price path. The bug was identified after 18,000 swaps. It turns out that Portfolio pools can reach an edge case where the pool reaches one of the tails of its liquidity distribution and causes the invariant to jump, affecting the price of the trade. This would allow a swapper to take advantage of the mispricing and take funds from LPs. With Arbiter we were able to run ~20,000 swaps with this emulated protocol state in parallel with other parameters in <30s, allowing us to discover this anomaly.


Rating System

Low: Includes both Non-critical (code style, clarity, syntax, versioning, off-chain monitoring (events, etc.)) and Low risk (e.g. assets are not at risk: state handling, function incorrect as to spec, issues with comments).

Med: Assets not at direct risk, but the function of the protocol or its availability could be impacted, or leak value with a hypothetical attack path with stated assumptions, but external requirements.

High: Assets can be stolen/lost/compromised directly (or indirectly if there is a valid attack path that does not have hand-wavy hypotheticals). These are considered critical issues that should be addressed immediately.

These criteria are based on the Code4rena judging criteria.

Resources for Classifying Vulnerabilities

Contributing to the Corpus

If you find any vulnerabilities with Arbiter, please submit a pull request to this file with a description of the vulnerability, a link to the arbiter repo and post mortem, and steps to reproduce. If the vulnerability is in the wild and has not yet been patched, please do your best to work with the responsible team to resolve it before disclosing it publicly.