
CovaCore: COVA Gossip Network and CovaVM

Technical Overview

CovaCore is our implementation of the core features of the COVA protocol. It contains two key components: a) CovaVM with Centrifuge, and b) the COVA Gossip Network.

CovaVM can be thought of as the operating system (the "software") that runs Python programs and enforces data usage policies, while the Gossip Network (consisting of routing node and compute node software) is a distributed computing network, i.e. the supercomputer ("hardware") that provides enough computing power to run the system.

Check out our CovaCore implementation on GitHub.

CovaVM

CovaVM is a Python Virtual Machine bytecode interpreter written in Python. Python is usually considered an interpreted language, but CPython actually compiles source to bytecode for a stack-based virtual machine after parsing (caching it in a .pyc file for quicker subsequent loading) and then runs that bytecode. We have taken advantage of this internal high-level bytecode mechanism to build a custom execution engine with the advanced instrumentation necessary for expressive data policies.
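For illustration, CPython's standard dis module exposes the kind of stack-machine bytecode that such an interpreter steps through:

```python
import dis

def add(a, b):
    return a + b

# CPython compiles the function body to stack-machine bytecode;
# dis prints the instructions a bytecode interpreter executes one by one.
dis.dis(add)
# Typical output (exact opcodes vary by CPython version):
#   LOAD_FAST    a
#   LOAD_FAST    b
#   BINARY_ADD   (BINARY_OP 0 on Python 3.11+)
#   RETURN_VALUE
```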


CovaVM first compiles Python source to extract its code object, loads the data and policy, and runs the checks described by the validators. Once these complete, CovaVM begins executing the code object's bytecode, redirecting program flow back to the policy validators whenever a predefined trigger state is reached. This control over computational activity at the bytecode level gives policy writers a previously unmatched granularity of introspection.
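As a rough sketch of bytecode-level interception: CovaVM implements its own interpreter, but the idea can be approximated with CPython's per-opcode tracing. The trigger set and the validator below are purely illustrative, not CovaVM's actual API.

```python
import dis
import sys

# Hypothetical trigger state: pause before any CALL-type opcode so a
# policy validator can re-check the policy before the call proceeds.
TRIGGER_OPS = {"CALL_FUNCTION", "CALL_METHOD", "CALL"}  # names vary by version

def validate(frame):
    """Illustrative policy validator invoked at trigger opcodes."""
    by_offset = {i.offset: i for i in dis.get_instructions(frame.f_code)}
    instr = by_offset.get(frame.f_lasti)
    if instr and instr.opname in TRIGGER_OPS:
        print(f"policy check before {instr.opname} at offset {frame.f_lasti}")

def tracer(frame, event, arg):
    frame.f_trace_opcodes = True   # ask CPython for per-opcode trace events
    if event == "opcode":
        validate(frame)
    return tracer

# Compile source to extract the code object, then execute it under the tracer.
code = compile("print(sum([1, 2, 3]))", "<task>", "exec")
sys.settrace(tracer)
try:
    exec(code, {})
finally:
    sys.settrace(None)
```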

Gossip Network

COVA Network

The Gossip Network is COVA's way of efficiently allocating jobs and discovering available compute nodes, together with a persistence mechanism for sensitive information, all without permission or control from any centralized entity. In the following sections, we explain the current design and the various parts of the mechanism:

Permissionless Compute Node Registration with Heartbeat

We have assigned 5 routing nodes (100 in the mainnet design) to provision compute nodes and handle general state persistence. A compute node joins the network by sending a heartbeat to 3 random routing nodes, chosen by deterministic hashing seeded with the compute node's public key. The compute node then continues sending heartbeats to those 3 routers at a fixed interval.

These heartbeats are stored on each router. When a compute node misses 3 consecutive heartbeats, it is considered dead.
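A minimal sketch of this scheme, assuming SHA-256-based router selection and illustrative interval and timeout values (the source does not specify the exact hash construction):

```python
import hashlib
import time

ROUTERS = [f"router-{i}" for i in range(5)]  # 5 routers in the current design
HEARTBEAT_INTERVAL = 10                      # seconds; illustrative value
MISSED_LIMIT = 3                             # dead after 3 consecutive misses

def assigned_routers(pubkey: bytes, k: int = 3) -> list:
    """Deterministically pick k routers by hashing the node's public key."""
    digest = hashlib.sha256(pubkey).digest()
    ranked = sorted(range(len(ROUTERS)),
                    key=lambda i: hashlib.sha256(digest + bytes([i])).digest())
    return [ROUTERS[i] for i in ranked[:k]]

def is_dead(last_heartbeat: float, now=None) -> bool:
    """Router-side liveness check based on the last heartbeat timestamp."""
    now = time.time() if now is None else now
    return now - last_heartbeat > MISSED_LIMIT * HEARTBEAT_INTERVAL

# The same public key always maps to the same 3 routers.
print(assigned_routers(b"node-public-key"))
```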

Free Node Exploration

When a router needs a free compute node, it randomly selects one of the 5 routers and asks it for a free node. If the selected router has an available compute node heartbeating to it, it returns that node's IP address and provisions it. After obtaining the node, the requesting router informs the 3 routers the node was heartbeating to that the node is now unavailable, and instructs the node to stop sending heartbeats to those 3 routers and heartbeat only to the requesting router.
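The claim protocol might look like the following in-memory sketch; the Router and ComputeNode types and their methods are hypothetical stand-ins for the real RPCs:

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ComputeNode:
    ip: str
    available: bool = True
    heartbeat_targets: List["Router"] = field(default_factory=list)

@dataclass
class Router:
    name: str
    known_free: List[ComputeNode] = field(default_factory=list)

    def pop_free_node(self) -> Optional[ComputeNode]:
        return self.known_free.pop() if self.known_free else None

def claim_free_node(routers: List[Router], claimer: Router) -> Optional[ComputeNode]:
    peer = random.choice(routers)        # pick one of the 5 routers at random
    node = peer.pop_free_node()
    if node is None:
        return None
    # Tell the node's 3 home routers to mark it unavailable ...
    for r in node.heartbeat_targets:
        if node in r.known_free:
            r.known_free.remove(node)
    # ... and redirect its heartbeats to the claiming router only.
    node.heartbeat_targets = [claimer]
    node.available = False
    return node
```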

When a data user needs a compute node, they send a request with the task details to a routing node. The router replies with a task id and the total amount to be paid for the task. After paying, the data user asks the router again for a free compute node (using the method outlined above). The router checks that payment for the task id is complete, finds a free node, and assigns the task to it.
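Continuing the sketch above, payment-gated assignment could look like this (the Task type and in-memory task table are illustrative, and claim_free_node is reused from the previous sketch):

```python
import uuid
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    price: int
    paid: bool = False

TASKS = {}  # in-memory stand-in for the router's task state

def request_task(details: str, price: int) -> Task:
    """Data user submits task details; router replies with an id and a price."""
    task = Task(task_id=uuid.uuid4().hex, price=price)
    TASKS[task.task_id] = task
    return task

def assign_if_paid(task_id: str, routers, claimer):
    """After payment clears, and only then, the router provisions a node."""
    task = TASKS[task_id]
    if not task.paid:
        raise PermissionError("payment for this task id has not cleared")
    return claim_free_node(routers, claimer)
```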

Secure Key Storage with Threshold Encryption

In the application layer, each encryption key is broken into N pieces using Shamir's Secret Sharing Scheme:

f(x) = K + a_1·x + a_2·x^2 + … + a_(t-1)·x^(t-1)  (mod p),   share_i = (i, f(i))

where K is the encryption key, the a_i are random coefficients, and t is the reconstruction threshold.

Without the consensus of at least half of the routing nodes, the assigned compute node cannot retrieve these keys. This mechanism lets COVA not only store data securely but also preserve its security guarantees even if a few routing nodes are compromised, for example by attacks similar to Meltdown or Spectre.
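A self-contained sketch of the splitting and reconstruction (the field prime, the parameters, and the use of random are illustrative; production code would use a cryptographically secure randomness source and a larger prime):

```python
import random

# A prime field for illustration; a real deployment uses a prime large
# enough that the key fits as a single field element.
PRIME = 2**127 - 1

def split(secret: int, n: int, t: int):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation of the polynomial
            acc = (acc * x + c) % PRIME
        return acc
    return [(i, f(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split(secret=123456789, n=5, t=3)   # 5 routers, majority threshold
assert reconstruct(shares[:3]) == 123456789  # any 3 shares suffice
assert reconstruct(shares[1:4]) == 123456789
```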

Persistence and Fault Tolerance

If the compute node becomes unavailable during computation, the routing node detects the missing heartbeats, assumes the node is dead, and reassigns the task to another free node.

After the computation finishes, the compute node encrypts the result with the data user's public key and sends it back to the router. The router saves it locally and registers the task id in the persistence layer. When the data user finds the task id on the blockchain, they request the result from the router, which then sends it to them.
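A sketch of that hand-off using PyNaCl's SealedBox (the library choice is an assumption; the source does not name one). Sealing to the user's public key means the router can store the result without being able to read it:

```python
from nacl.public import PrivateKey, SealedBox

user_key = PrivateKey.generate()  # the data user's keypair

# Compute node: encrypt the result with the data user's public key.
ciphertext = SealedBox(user_key.public_key).encrypt(b"task result")

# Router: stores `ciphertext` locally and registers the task id on chain.
# Data user: fetches the ciphertext and decrypts with the private key.
plaintext = SealedBox(user_key).decrypt(ciphertext)
assert plaintext == b"task result"
```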

Development So Far

CovaVM

We have finished the first iteration of CovaVM and a JSON-based implementation of Centrifuge, both of which ship with the current version. More information and implementation details are here.

Network Layer

We have finished implementing the Gossip network inside Cova-Core. This enables a robust routing node setup, so that even with a few routing nodes missing, the network is self-correcting and operational.

In addition, thanks to the randomized heartbeat mechanism described above, we have a fast and robust way of finding compute nodes.

Future Feature List

As we plan to keep improving the protocol, we have a long list of future features to add. Some of the key upcoming features are:

  1. Improving security with a custom TLS solution in which remote attestation and transfers between the TEE enclaves of two nodes are fully encrypted
  2. Thread-local design: because regular web frameworks such as Flask or CherryPy do not function inside an SGX enclave, we had to write our own web framework. This required process-local globals, which we plan to refactor into a thread-local design
  3. Improving CovaVM and Centrifuge to allow for more expressive policies
  4. Scaling up the routing nodes and the TEE network, and supporting a mix of TEE and non-TEE nodes
  5. Various security patches to defend against vulnerabilities and side-channel attacks