## Ethereum's Compact Merkle Trie (part I)

Ethereums Compact Merkle Trie (Part II: Merkling) . The EVM persists transactions, receipts and account state in data structures bestdescribed as compact Merkle tries. My goal below is to describe and implement the underlying data structure as clearly and succinctly as possible. While I originally intended to cover the implementation end-to-end in a single post, it becameunwieldy. Ive split the persistence/crypto into afollow-up article Ethereums Compact Merkle Trie (Part II: Merkling) .The below is focused exclusively on an in-memory, immutable trie with the correctcompaction behaviour. Were primarily interested in the data structure itself, and wont spend muchtime discussing the details of how its applied. Ethereum call their data structure a Modified Merkle-Patricia Tree. As modified is vague, and theterm Patricia tree is commonly used in reference to at least two subtlydifferent things, Ill prefer compact Merkle trie. Trie and tree are used interchangeably in this article. Tries are an intuitive means of encoding associative data in a tree structure.Unlike, say, a hash table, a trie structurally represents its keys, enablingthe efficient retrieval of all suffixes, given a partial key. On the right, youll hopefully see abstract depictions of two character-oriented triescontaining word counts for the string the cat casts the cats. The compact trie (bottom) contains fewer nodes, as itdecomposes keys only at points of difference. Of course, the keys need not be textual we could just as well store routing tables bybranching at each bit of an IP address, or almost anything else. Even when keysare essentially textual, we may choose to decompose them at a level more granularthan characters (bits, hex nibbles, etc.). The specific data structure used by Ethereum decomposes k Continue reading >>

## Merkle Patricia Tree

In the following, there will be specific technical terms, where the definitions of these terms are first In the etherbox, the status data for all accounts (including contract accounts, regular accounts) are collectively referred to as world states. refers to only block data stored in the block node node. two blocks that point to the same parent block are generated at the same time. Some miners see one of the blocks and the other miners see another block. This leads to the simultaneous growth of the two block chains. refers to a part of the block structure, used to store the block of the head information, such as the parent block hash, the world state hash, transaction receipt collection hash and so on. The block header stores only some "fixed" length hash fields. As mentioned above, although the prefix tree can serve the purpose of maintaining key-value data, it has a very obvious limitation. Whether it is query operation, or increase or decrease the data, not only inefficient, and the storage space is serious. Therefore, in the ether square, for the MPT tree added several different types of tree nodes, in order to try to compress the overall tree height, reduce the complexity of the operation. In the MPT tree, the tree nodes can be divided into the following four categories: whose hollow nodes are used to represent empty strings. branch node is used to represent all non-leaf nodes in the MPT tree that have more than one child node. The definition is as follows: Children [17] node // Actual trie node data to encode / decode (needs custom encoder // nodeFlag contains caching-related metadata about a node. hash hashNode // cached hash of the node (may be nil) dirty bool // if the node has changes that must be written to the database Like the prefix tree, the MPT also enc Continue reading >>

## Data Structures - Merkle Patricia Tree (ethereum) And Hashtable, Which One Has Faster Search Speed? - Stack Overflow

Merkle Patricia Tree (Ethereum) and Hashtable, which one has faster search speed? I would like to implement a function which has a sequence of inputs "X1,...,Xn" and outputs an ordered list "Xp,..,Xq" where all the elements are distinct but ordered. For every Xi in the sequence "X1,...,Xn", it is a 256 bits long string. The sequence of inputs "X1,...,Xn" may have the same elements, which means that there may exist two elements Xi and Xj to satisfy Xi=Xj. For the same elements in sequence of "X1,...,Xn", we only output one element in the ordered list. The speed of function should be as fast as possible. And it does not matter that how much storage volume is used in function. The size of sequence "X1,...,Xn" is n, and n is a number that no more than 10,000. I use an Array to store the sequence which is initially empty. When inputting Xi, first I search the Hashtable to judge if Xi is already in the above array. If yes, just dropping it. And if not, add Xi to Hashtable and Array. If having inputting all the element of the sequence "X1,...,Xn", I sort the array and output it. And with Merkle Patricia Tree (Ethereum) and Hashtable, which oneshould I choose? For Merkle Patricia Tree (Ethereum) and Hashtable, which one has faster search speed? Or is there a better data structure to satisfy this function? Continue reading >>

## Merkling In Ethereum

Merkle trees are a fundamental part of what makes blockchains tick. Although it is definitely theoretically possible to make a blockchain without Merkle trees, simply by creating giant block headers that directly contain every transaction, doing so poses large scalability challenges that arguably puts the ability to trustlessly use blockchains out of the reach of all but the most powerful computers in the long term. Thanks to Merkle trees, it is possible to build Ethereum nodes that run on all computers and laptops large and small, smart phones, and even internet of things devices such as those that will be produced by Slock.it . So how exactly do these Merkle trees work, and what value do they provide, both now and in the future? First, the basics. A Merkle tree, in the most general sense, is a way of hashing a large number of chunks of data together which relies on splitting the chunks into buckets, where each bucket contains only a few chunks, then taking the hash of each bucket and repeating the same process, continuing to do so until the total number of hashes remaining becomes only one: the root hash. The most common and simple form of Merkle tree is the binary Mekle tree, where a bucket always consists of two adjacent chunks or hashes; it can be depicted as follows: So what is the benefit of this strange kind of hashing algorithm? Why not just concatenate all the chunks together into a single big chunk and use a regular hashing algorithm on that? The answer is that it allows for a neat mechanism known as Merkle proofs: A Merkle proof consists of a chunk, the root hash of the tree, and the branch consisting of all of the hashes going up along the path from the chunk to the root. Someone reading the proof can verify that the hashing, at least for that branch, is c Continue reading >>

## Merkle Tendermint Documentation

For an overview of Merkle trees, see wikipedia . There are two types of Merkle trees used in Tendermint. IAVL+ Tree: An immutable self-balancing binarytree for persistent application state Simple Tree: A simple compact binary tree fora static list of items The purpose of this data structure is to provide persistent storage forkey-value pairs (e.g. account state, name-registrar data, andper-contract data) such that a deterministic merkle root hash can becomputed. The tree is balanced using a variant of the AVLalgorithm so all operationsare O(log(n)). Nodes of this tree are immutable and indexed by its hash. Thus any nodeserves as an immutable snapshot which lets us stage uncommittedtransactions from the mempool cheaply, and we can instantly roll back tothe last committed state to process transactions of a newly committedblock (which may not be the same set of transactions as those from themempool). In an AVL tree, the heights of the two child subtrees of any node differby at most one. Whenever this condition is violated upon an update, thetree is rebalanced by creating O(log(n)) new nodes that point tounmodified nodes of the old tree. In the original AVL algorithm, innernodes can also hold key-value pairs. The AVL+ algorithm (note the plus)modifies the AVL algorithm to keep all values on leaf nodes, while onlyusing branch-nodes to store keys. This simplifies the algorithm whileminimizing the size of merkle proofs In Ethereum, the analog is the Patriciatrie . There are tradeoffs.Keys do not need to be hashed prior to insertion in IAVL+ trees, so thisprovides faster iteration in the key space which may benefit someapplications. The logic is simpler to implement, requiring only twotypes of nodes inner nodes and leaf nodes. The IAVL+ tree is a binarytree, so merkle proofs a Continue reading >>

## Eli5 How Does A Merkle-patricia-trie Tree Work?

Trie (also called digital tree, prefix trie or radix trie) An ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. A node's position in the tree defines the key with which it is associated. A trie for keys "A","to", "tea", "ted", "ten", "i", "in", and "inn". Patricia - Practical Algorithm To Retrieve Information Coded In Alphanumeric ( source )( orginial paper by Donald R. Morrison ). A Patricia trie is a binary radix trie - binary choice at each node when traversing the trie; this is modified in Ethereum. Source: In ethereum, hexadecimal is used - X characters from an 16 character "alphabet". Hence nodes in the trie have 16 child nodes (the 16 character hex "alphabet") and a maximum depth of X. Note a hex character is referred to as a "nibble". As described here , the term Merkle implies that the root node becomes a cryptographic fingerprint of the entire data structure The yellow paper describes a modified merkle patricia trie. This defines three different node types; extension, branch and leaf. These are descibed, using a simplified world state, in the diagram below: Shouldn't a key under world-state contain 8 nibble (32 bytes)? In your example(latest image) it contains 7 nibbles, which is 28 bytes. @atomh33ls alper Dec 15 '17 at 6:57 @Alper thanks... I shortened it to make it more concise. I think 8 nibbles is just 4 bytes... and 7 is 3.5 bytes. I'd need 64 hex chars for 32 bytes... atomh33ls Mar 12 at 12:56 What do the leaf node and extension nodes of the trie do? jamarcus_13 Apr 6 at 16:46 'Trie' comes from the word retrieval, since it only uses the prefix of a word to find it in a dictionary. It is an ordered tree where the keys are usually strings ending with a terminal symbol, and each vertex Continue reading >>

## Package - Merkle-patricia-tree

This is an implementation of the modified merkle patricia tree as speficed in the Ethereum's yellow paper. Last updated 2 months ago by holgerd77 . This is an implementation of the modified merkle patricia tree as specified in the Ethereum's yellow paper . The modified Merkle Patricia tree (trie) provides a persistent data structure to map between arbitrary-length binary data (byte arrays). It is defined in terms of a mutable data structure to map between 256-bit binary fragments and arbitrary-length binary data. The core of the trie, and its sole requirement in terms of the protocol specification is to provide a single 32-byte value that identifies a given set of key-value pairs. The only backing store supported is LevelDB through the levelup module. var Trie = require('merkle-patricia-tree'),levelup = require('levelup'),db = levelup('./testdb'),trie = new Trie(db); trie.put('test', 'one', function () { trie.get('test', function (err, value) { if(value) console.log(value.toString()) });}); Trie.prove(trie, 'test', function (err, prove) { if (err) return cb(err) Trie.verifyProof(trie.root, 'test', prove, function (err, value) { if (err) return cb(err) console.log(value.toString()) cb() })}) Continue reading >>

## Data Structure In Ethereum. Episode 3: Patriciatrie.

Image source: /r/ethereum by BitcoinIsTehFuture Data structure in Ethereum. Episode 3: Patriciatrie. Patricia trie is the main trie used in Ethereum to store data. It is a mixture of Radix trie and Merkle trie . In order to make it easy to understand, I will take us to an inefficient Patricia trie first, then move to improved Patricia trie and finally, the real Patricia trie. This way will help us know the reason why some structures have been generated. It means about motivation and intuition . A node in Ethereum is stored as a key-value with key is hash of node. Value is an array with 17 elements. In 17 elements, first 16 elements are indexed by hex number from 0 to f and the last one is data contained by that node. Please note that, key is used for database lookup (it means to find a node by database mechanism) and path is used for trie lookup (it means to find data by path descending as Radix trie ). If it is still ambiguous, it would be more easy with an example below. We build a trie that represents our dataset. To test it again, lets try searching the value of 395 path step by step. Notations: this is database lookup tdl, this is trie lookup ttl; We descend 395 to 3 parts 3, 9, 5 and use them in sequence. Starting at rootHash, we have to find rootNode corresponding with rootHash (tdl). First part of path is 3, so we get the element indexed by 3 of rootNode and that is hashA (ttl). Looking for hashA (tdl), getting element indexed by 9 (ttl) and value of element is hashC. Looking for hashC (tdl), getting element indexed by 5 (ttl) and value of element is hashD. At this time, we descend entire the path so we will get value contained in data element (the last element) of node corresponding with hashD (ttl), and the result is duck. You can see that, we used path to fi Continue reading >>

## Exploring Ethereum's State Trie With Node.js

Exploring Ethereum's state trie with Node.js If you not familiar with ethereum start here . In this post I will attempt to read cpp-ethereums state db give a root using Node.js and the merkle-patricia-tree module. The merkle-patricia-tree module is hot off the press so there might be bugs. If you find any please let me know. The State DB is like a global ledger; It stores balances of all the accounts was well as the code for all of the contracts and their respective balances. The global state is stored in a modified merkle patricia tree (trie), which you can read more about in the yellow paper or on the wiki . The important part is understanding what it is suppose to do, which is The core of the trie, and its sole requirement in terms of the protocol specification is to provide a single 32-byte value that identifies a given set of key-value pairs. cpp-ethereum stores the trie in a level db. On linux the path to the state db is ~/.ethereum/state. Let try just opening the database and seeing what we get. To do this we are going to need install the levelup module npm install levelup We are going to look up one of the state roots. You can get them from althezero. In the example Im using the genesis root hash from the latest cpp code from the dev branch. Looking up the root should give the top node in the trie. var levelup = require('levelup');var db = levelup('./.ethereum/state');//the genesis state rootvar root = '12582945fc5ad12c3e7b67c4fc37a68fc0d52d995bb7f7291ff41a2739a7ca16';//Note: we are doing everything using binary encoding.db.get(new Buffer(root, 'hex'), { encoding: 'binary'}, function (err, value) { console.log(value);}); We should get a print out of a Buffer not necessarily usefully. Each node in the trie is rlp encoded, so let decoded the node. To do that we a Continue reading >>

## How Does Ethereum Work, Anyway?

Odds are youve heard about the Ethereum blockchain, whether or not you know what it is. Its been in the news a lot lately, including the cover of some major magazines, but reading those articles can be like gibberish if you dont have a foundation for what exactly Ethereum is. So what is it? In essence, a public database that keeps a permanent record of digital transactions. Importantly, this database doesnt require any central authority to maintain and secure it. Instead it operates as a trustless transactional system a framework in which individuals can make peer-to-peer transactions without needing to trust a third party OR one another. Still confused? Thats where this post comes in. My aim is to explain how Ethereum functions at a technical level, without complex math or scary-looking formulas. Even if youre not a programmer, I hope youll walk away with at least better grasp of the tech. If some parts are too technical and difficult to grok, thats totally fine! Theres really no need to understand every little detail. I recommend just focusing on understanding things at a broad level. Many of the topics covered in this post are a breakdown of the concepts discussed in the yellow paper. Ive added my own explanations and diagrams to make understanding Ethereum easier. Those brave enough to take on the technical challenge can also read the Ethereum yellow paper. A blockchain is a cryptographically secure transactional singleton machine with shared-state.[1]Thats a mouthful, isnt it? Lets break it down. * Cryptographically securemeans that the creation of digital currency is secured by complex mathematical algorithms that are obscenely hard to break. Think of a firewall of sorts. They make it nearly impossible to cheat the system (e.g. create fake transactions, erase tra Continue reading >>

## Data Structure In Ethereum. Episode 2: Radix Trie And Merkletrie.

Trie is a main data structure used in Ethereum. Data structure in Ethereum. Episode 2: Radix trie and Merkletrie. In the episode 1 and 1+ , we got familiar with some encoding/decoding algorithms which are used for constructing Ethereum data. Now, we are moving on to the organization of data in Ethereum. I will introduce two kinds of trie, they are Radix trie and Merkle trie. Actually, they are not used in raw, Ethereum mixed them and created a new trie that more optimized and named Patricia trie. This episode is such a preparation for us to understand Patricia trie. If you never ever heard about trie or tries, dont worry . They are very easy to understand. Trie is a word or a terminology that represents digital tree in science computer. Sometime, we can see that tree is used, its ok because of the same meaning. In others word, trie is an ordered data structure that is used to store a dynamic set or associative array which is formed to key-value where the keys are usually strings. We can get familiar with some terminologies of trie by the image above. Further, set of root, internal node, leaf will be called node in common. We will use this data sample for all examples. In dataset, key is strings and value is integers. Radix trie is used to optimize for searching . In radix trie, the key in dataset will be the path to reach the value. As you can see, every path of trie represents a character which is ASCII and it is used for searching value. For example, we are looking for the value of key which is dodo. Just start from the root, try to look for the d path first and keep descenting whole the path. The final result is the red line and green node with value 4. However, the branch of house and houses key was degraded, too many internal nodes with null value. In order to rea Continue reading >>

## Design Rationale | Ethereum Builder's Guide

Although Ethereum borrows many ideas that have already been tried and tested for half a decade in older cryptocurrencies like Bitcoin, there are a number of places in which Ethereum diverges from the most common way of handling certain protocol features, and there are also many situations in which Ethereum has been forced to develop completely new economic approaches because it offers functionality that is not offered by other existing systems. The purpose of this document will be to detail all of the finer potentially nonobvious or in some cases controversial decisions that were made in the process of building the Ethereum protocol, as well as showing the risks involved in both our approach and possible alternatives. The Ethereum protocol design process follows a number of principles: Sandwich complexity model: we believe that the bottom level architecture of Ethereum should be as simple as possible, and the interfaces to Ethereum (including high level programming languages for developers and the user interface for users) should be as easy to understand as possible. Where complexity is inevitable, it should be pushed into the "middle layers" of the protocol, that are not part of the core consensus but are also not seen by end users - high-level-language compilers, argument serialization and deserialization scripts, storage data structure models, the leveldb storage interface and the wire protocol, etc. However, this preference is not absolute. Freedom: users should not be restricted in what they use the Ethereum protocol for, and we should not attempt to preferentially favor or disfavor certain kinds of Ethereum contracts or transactions based on the nature of their purpose. This is similar to the guiding principle behind the concept of "net neutrality". One example o Continue reading >>

## Github - Ethereumjs/merkle-patricia-tree: This Is An Implementation Of The Modified Merkle Patricia Tree As Specified In The Ethereum's Yellow Paper.

This is an implementation of the modified merkle patricia tree as specified in the Ethereum's yellow paper. If nothing happens, download GitHub Desktop and try again. If nothing happens, download GitHub Desktop and try again. If nothing happens, download Xcode and try again. If nothing happens, download the GitHub extension for Visual Studio and try again. This is an implementation of the modified merkle patricia tree as specified in the Ethereum's yellow paper . The modified Merkle Patricia tree (trie) provides a persistent data structure to map between arbitrary-length binary data (byte arrays). It is defined in terms of a mutable data structure to map between 256-bit binary fragments and arbitrary-length binary data. The core of the trie, and its sole requirement in terms of the protocol specification is to provide a single 32-byte value that identifies a given set of key-value pairs. The only backing store supported is LevelDB through the levelup module. There are 3 variants of the tree implemented in this library, namely: BaseTrie, CheckpointTrie and SecureTrie. CheckpointTrie adds checkpointing functionality to the BaseTrie with the methods checkpoint, commit and revert. SecureTrie extends CheckpointTrie and is the most suitable variant for Ethereum applications. It stores values under the keccak256 hash of their keys. const Trie = require('merkle-patricia-tree').BaseTrie, level = require('level'), db = level('./testdb'), trie = new Trie(db)trie.put('test', 'one', function() { trie.get('test', function(err, value) { if (value) console.log(value.toString()) })}) Trie.prove(trie, 'test', function(err, prove) { if (err) return cb(err) Trie.verifyProof(trie.root, 'test', prove, function(err, value) { if (err) return cb(err) console.log(value.toString()) cb() })}) var Continue reading >>

## Patricia Tree Ethereum/wiki Wiki Github

In a basic radix trie, every node looks as follows: Where i0 ... in represent the symbols of the alphabet (often binary or hex), value is the terminal value at the node, and the values in the i0 ... in slots are either NULL or pointers to (in our case, hashes of) other nodes. This forms a basic (key, value) store; for example, if you are interested in the value that is currently mapped to dog in the trie, you would first convert dog into letters of the alphabet (giving 64 6f 67), and then descend down the trie following that path until at the end of the path you read the value. That is, you would first look up the root hash in a flat key/value DB to find the root node of the trie (which is basically an array of keys to other nodes), use the value at index 6 as a key (and look it up in the flat key/value DB) to get the node one level down, then pick index 4 of that to lookup the next value, then pick index 6 of that, and so on, until, once you followed the path: root -> 6 -> 4 -> 6 -> 15 -> 6 -> 7, you look up the value of the node that you have and return the result. Note there is a difference between looking something up in the "trie" vs the underlying flat key/value "DB". They both define key/values arrangements, but the underlying DB can do a traditional 1 step lookup of a key, while looking up a key in the trie requires multiple underlying DB lookups to get to the final value as described above. To eliminate ambiguity, let's refer to the latter as a path. The update and delete operations for radix tries are simple, and can be defined roughly as follows: def update(node,path,value): if path == '': curnode = db.get(node) if node else [ NULL ] * 17 newnode = curnode.copy() newnode[-1] = value else: curnode = db.get(node) if node else [ NULL ] * 17 newnode = curnode.co Continue reading >>

## Understanding The Ethereum Trie

The other day I finally got around to reading the entire ethereum yellow paper and to figuring out how the modified Merkle-patricia-tree (trie) works. So lets go through a brief but hopefully complete explanation of the trie, using examples. A block in the ethereum blockchain consists of a header, a list of transactions, and a list of uncle blocks. Included in the header is a transaction root hash, which is used to validate the list of transactions. While transactions are sent over the wire from peer to peer as a simple list, they must be assembled into a special data structure called a trie to compute the root hash. Note that this data structure is not needed except to verify blocks (and hence of course to mine them), and can technically be discarded once a block has been verified. However, it is implied that the transaction lists are stored locally in a trie, and serialized to lists to send to clients requesting the blockchain. Of course, that client will then reconstruct the transaction list trie for each block to verify the root hash. Note that RLP (recursive length prefix encoding) , ethereums home-rolled encoding system, is used to encode all entries in the trie. A trie is also known as a radix tree , and the ethereum implementation introduces a couple modifications to boost efficiency. In a normal radix tree, a key is the actual path taken through the tree to get to the corresponding value. That is, beginning from the root node of the tree, each character in the key tells you which child node to follow to get to the corresponding value, where the values are stored in the leaf nodes that terminate every path through the tree. Supposing the keys come from an alphabet containing N characters, each node in the tree can have up to N children, and the maximum depth of Continue reading >>