ethereum.ethash

Ethash is a proof-of-work algorithm designed to be [ASIC] resistant through memory hardness.

To achieve memory hardness, computing Ethash requires access to subsets of a large structure. The particular subsets chosen are based on the nonce and block header, while the set itself is changed every [epoch].

At a high level, the Ethash algorithm is as follows:

  1. Create a seed value, generated with [generate_seed] and based on the preceding block numbers.
  2. From the seed, compute a pseudorandom cache with [generate_cache].
  3. From the cache, generate a dataset with [generate_dataset]. The dataset grows over time based on [DATASET_EPOCH_GROWTH_SIZE].
  4. Miners hash slices of the dataset together, which is where the memory hardness is introduced. Verification of the proof-of-work only requires the cache to be able to recompute a much smaller subset of the full dataset.
  1"""
  2Ethash is a proof-of-work algorithm designed to be [ASIC] resistant through
  3[memory hardness][mem-hard].
  4
  5To achieve memory hardness, computing Ethash requires access to subsets of a
  6large structure. The particular subsets chosen are based on the nonce and block
  7header, while the set itself is changed every [`epoch`].
  8
  9At a high level, the Ethash algorithm is as follows:
 10
 111. Create a **seed** value, generated with [`generate_seed`] and based on the
 12   preceding block numbers.
 131. From the seed, compute a pseudorandom **cache** with [`generate_cache`].
 141. From the cache, generate a **dataset** with [`generate_dataset`]. The
 15   dataset grows over time based on [`DATASET_EPOCH_GROWTH_SIZE`].
 161. Miners hash slices of the dataset together, which is where the memory
 17   hardness is introduced. Verification of the proof-of-work only requires the
 18   cache to be able to recompute a much smaller subset of the full dataset.
 19
 20[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
 21[`generate_dataset`]: ref:ethereum.ethash.generate_dataset
 22[`generate_cache`]: ref:ethereum.ethash.generate_cache
 23[`generate_seed`]: ref:ethereum.ethash.generate_seed
 24[`epoch`]: ref:ethereum.ethash.epoch
 25[ASIC]: https://en.wikipedia.org/wiki/Application-specific_integrated_circuit
 26[mem-hard]: https://en.wikipedia.org/wiki/Memory-hard_function
 27"""

from typing import Callable, Tuple, Union

from ethereum.base_types import U32, Bytes8, Uint
from ethereum.crypto.hash import Hash32, Hash64, keccak256, keccak512
from ethereum.utils.numeric import (
    is_prime,
    le_bytes_to_uint32_sequence,
    le_uint32_sequence_to_bytes,
    le_uint32_sequence_to_uint,
)

EPOCH_SIZE = 30000
"""
Number of blocks before a dataset needs to be regenerated (known as an
"epoch"). See [`epoch`].

[`epoch`]: ref:ethereum.ethash.epoch
"""

INITIAL_CACHE_SIZE = 2**24
"""
Size of the cache (in bytes) during the first epoch. Each subsequent epoch's
cache roughly grows by [`CACHE_EPOCH_GROWTH_SIZE`] bytes. See [`cache_size`].

[`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE
[`cache_size`]: ref:ethereum.ethash.cache_size
"""

CACHE_EPOCH_GROWTH_SIZE = 2**17
"""
After the first epoch, the cache size grows by roughly this amount. See
[`cache_size`].

[`cache_size`]: ref:ethereum.ethash.cache_size
"""

INITIAL_DATASET_SIZE = 2**30
"""
Size of the dataset (in bytes) during the first epoch. Each subsequent epoch's
dataset roughly grows by [`DATASET_EPOCH_GROWTH_SIZE`] bytes. See
[`dataset_size`].

[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
[`dataset_size`]: ref:ethereum.ethash.dataset_size
"""

DATASET_EPOCH_GROWTH_SIZE = 2**23
"""
After the first epoch, the dataset size grows by roughly this amount. See
[`dataset_size`].

[`dataset_size`]: ref:ethereum.ethash.dataset_size
"""

HASH_BYTES = 64
"""
Length of a hash, in bytes.
"""

MIX_BYTES = 128
"""
Width of mix, in bytes. See [`generate_dataset_item`].

[`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
"""

CACHE_ROUNDS = 3
"""
Number of times to repeat the [`keccak512`] step while generating the hash. See
[`generate_cache`].

[`keccak512`]: ref:ethereum.crypto.hash.keccak512
[`generate_cache`]: ref:ethereum.ethash.generate_cache
"""

DATASET_PARENTS = 256
"""
Number of parents of each dataset element. See [`generate_dataset_item`].

[`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
"""

HASHIMOTO_ACCESSES = 64
"""
Number of accesses in the [`hashimoto`] loop.

[`hashimoto`]: ref:ethereum.ethash.hashimoto
"""


def epoch(block_number: Uint) -> Uint:
    """
    Obtain the epoch number to which the block identified by `block_number`
    belongs. The first epoch is numbered zero.

    An Ethash epoch is a fixed number of blocks ([`EPOCH_SIZE`]) long, during
    which the dataset remains constant. At the end of each epoch, the dataset
    is generated anew. See [`generate_dataset`].

    [`EPOCH_SIZE`]: ref:ethereum.ethash.EPOCH_SIZE
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    """
    return block_number // EPOCH_SIZE


def cache_size(block_number: Uint) -> Uint:
    """
    Obtain the cache size (in bytes) of the epoch to which `block_number`
    belongs.

    See [`INITIAL_CACHE_SIZE`] and [`CACHE_EPOCH_GROWTH_SIZE`] for the initial
    size and linear growth rate, respectively. The cache is generated in
    [`generate_cache`].

    The actual cache size is smaller than simply multiplying
    `CACHE_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of
    unintended cyclic behavior. It is defined as the highest prime number below
    what linear growth would calculate.

    [`INITIAL_CACHE_SIZE`]: ref:ethereum.ethash.INITIAL_CACHE_SIZE
    [`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    size = INITIAL_CACHE_SIZE + (CACHE_EPOCH_GROWTH_SIZE * epoch(block_number))
    size -= HASH_BYTES
    while not is_prime(size // HASH_BYTES):
        size -= 2 * HASH_BYTES

    return size


def dataset_size(block_number: Uint) -> Uint:
    """
    Obtain the dataset size (in bytes) of the epoch to which `block_number`
    belongs.

    See [`INITIAL_DATASET_SIZE`] and [`DATASET_EPOCH_GROWTH_SIZE`][ds] for the
    initial size and linear growth rate, respectively. The complete dataset is
    generated in [`generate_dataset`], while the slices used in verification
    are generated in [`generate_dataset_item`].

    The actual dataset size is smaller than simply multiplying
    `DATASET_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of
    unintended cyclic behavior. It is defined as the highest prime number below
    what linear growth would calculate.

    [`INITIAL_DATASET_SIZE`]: ref:ethereum.ethash.INITIAL_DATASET_SIZE
    [ds]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    """
    size = INITIAL_DATASET_SIZE + (
        DATASET_EPOCH_GROWTH_SIZE * epoch(block_number)
    )
    size -= MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES

    return size


def generate_seed(block_number: Uint) -> Hash32:
    """
    Obtain the cache generation seed for the block identified by
    `block_number`. See [`generate_cache`].

    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    epoch_number = epoch(block_number)

    seed = b"\x00" * 32
    while epoch_number != 0:
        seed = keccak256(seed)
        epoch_number -= 1

    return Hash32(seed)


def generate_cache(block_number: Uint) -> Tuple[Tuple[U32, ...], ...]:
    """
    Generate the cache for the block identified by `block_number`. See
    [`generate_dataset`] for how the cache is used.

    The cache is generated in two steps: filling the array with a chain of
    [`keccak512`] hashes, then running [`CACHE_ROUNDS`] rounds of Sergio
    Demian Lerner's [RandMemoHash] on those bytes.

    [`keccak512`]: ref:ethereum.crypto.hash.keccak512
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`CACHE_ROUNDS`]: ref:ethereum.ethash.CACHE_ROUNDS
    [RandMemoHash]: http://www.hashcash.org/papers/memohash.pdf
    """
    seed = generate_seed(block_number)
    cache_size_bytes = cache_size(block_number)

    cache_size_words = cache_size_bytes // HASH_BYTES
    cache = [keccak512(seed)]

    for index in range(1, cache_size_words):
        cache_item = keccak512(cache[index - 1])
        cache.append(cache_item)

    for _ in range(CACHE_ROUNDS):
        for index in range(cache_size_words):
            # Converting `cache_size_words` to int as `-1 + Uint(5)` is an
            # error.
            first_cache_item = cache[
                (index - 1 + int(cache_size_words)) % cache_size_words
            ]
            second_cache_item = cache[
                U32.from_le_bytes(cache[index][0:4]) % cache_size_words
            ]
            result = bytes(
                [a ^ b for a, b in zip(first_cache_item, second_cache_item)]
            )
            cache[index] = keccak512(result)

    return tuple(
        le_bytes_to_uint32_sequence(cache_item) for cache_item in cache
    )


def fnv(a: Union[Uint, U32], b: Union[Uint, U32]) -> U32:
    """
    A non-associative substitute for XOR, inspired by the [FNV] hash by Fowler,
    Noll, and Vo. See [`fnv_hash`], [`generate_dataset_item`], and
    [`hashimoto`].

    Note that here we multiply the prime with the full 32-bit input, in
    contrast with the [FNV-1] spec which multiplies the prime with one byte
    (octet) in turn.

    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    [`fnv_hash`]: ref:ethereum.ethash.fnv_hash
    [FNV]: https://w.wiki/XKZ
    [FNV-1]: http://www.isthe.com/chongo/tech/comp/fnv/#FNV-1
    """
    # This is a faster way of doing `number % (2 ** 32)`.
    result = ((Uint(a) * 0x01000193) ^ Uint(b)) & U32.MAX_VALUE
    return U32(result)


def fnv_hash(
    mix_integers: Tuple[U32, ...], data: Tuple[U32, ...]
) -> Tuple[U32, ...]:
    """
    Combines `data` into `mix_integers` using [`fnv`]. See [`hashimoto`] and
    [`generate_dataset_item`].

    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    [`fnv`]: ref:ethereum.ethash.fnv
    """
    return tuple(
        fnv(mix_integers[i], data[i]) for i in range(len(mix_integers))
    )


def generate_dataset_item(
    cache: Tuple[Tuple[U32, ...], ...], index: Uint
) -> Hash64:
    """
    Generate a particular dataset item 0-indexed by `index` by hashing
    pseudorandomly-selected entries from `cache` together. See [`fnv`] and
    [`fnv_hash`] for the digest function, [`generate_cache`] for generating
    `cache`, and [`generate_dataset`] for the full dataset generation
    algorithm.

    [`fnv`]: ref:ethereum.ethash.fnv
    [`fnv_hash`]: ref:ethereum.ethash.fnv_hash
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    mix = keccak512(
        (
            le_uint32_sequence_to_uint(cache[index % len(cache)]) ^ index
        ).to_le_bytes(number_bytes=HASH_BYTES)
    )

    mix_integers = le_bytes_to_uint32_sequence(mix)

    for j in range(DATASET_PARENTS):
        mix_word: U32 = mix_integers[j % 16]
        cache_index = fnv(index ^ j, mix_word) % len(cache)
        parent = cache[cache_index]
        mix_integers = fnv_hash(mix_integers, parent)

    mix = Hash64(le_uint32_sequence_to_bytes(mix_integers))

    return keccak512(mix)

def generate_dataset(block_number: Uint) -> Tuple[Hash64, ...]:
    """
    Generate the full dataset for the block identified by `block_number`.

    This function is present only for demonstration purposes. It is not used
    while validating blocks.
    """
    dataset_size_bytes: Uint = dataset_size(block_number)
    cache: Tuple[Tuple[U32, ...], ...] = generate_cache(block_number)

    # TODO: Parallelize this later on if it adds value
    return tuple(
        generate_dataset_item(cache, Uint(index))
        for index in range(dataset_size_bytes // HASH_BYTES)
    )


def hashimoto(
    header_hash: Hash32,
    nonce: Bytes8,
    dataset_size: Uint,
    fetch_dataset_item: Callable[[Uint], Tuple[U32, ...]],
) -> Tuple[bytes, Hash32]:
    """
    Obtain the mix digest and the final value for a header, by aggregating
    data from the full dataset.

    #### Parameters

    - `header_hash` is a valid [RLP hash] of a block header.
    - `nonce` is the propagated nonce for the given block.
    - `dataset_size` is the size of the dataset. See [`dataset_size`].
    - `fetch_dataset_item` is a function that retrieves a specific dataset item
      based on its index.

    #### Returns

    - The mix digest generated from the header hash and propagated nonce.
    - The final result obtained which will be checked for leading zeros (in
      byte representation) in correspondence with the block difficulty.

    [RLP hash]: ref:ethereum.rlp.rlp_hash
    [`dataset_size`]: ref:ethereum.ethash.dataset_size
    """
    nonce_le = bytes(reversed(nonce))
    seed_hash = keccak512(header_hash + nonce_le)
    seed_head = U32.from_le_bytes(seed_hash[:4])

    rows = dataset_size // 128
    mix = le_bytes_to_uint32_sequence(seed_hash) * (MIX_BYTES // HASH_BYTES)

    for i in range(HASHIMOTO_ACCESSES):
        new_data: Tuple[U32, ...] = ()
        parent = fnv(i ^ seed_head, mix[i % len(mix)]) % rows
        for j in range(MIX_BYTES // HASH_BYTES):
            # Typecasting `parent` from U32 to Uint as 2*parent + j may
            # overflow U32.
            new_data += fetch_dataset_item(2 * Uint(parent) + j)

        mix = fnv_hash(mix, new_data)

    compressed_mix = []
    for i in range(0, len(mix), 4):
        compressed_mix.append(
            fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3])
        )

    mix_digest = le_uint32_sequence_to_bytes(compressed_mix)
    result = keccak256(seed_hash + mix_digest)

    return mix_digest, result


def hashimoto_light(
    header_hash: Hash32,
    nonce: Bytes8,
    cache: Tuple[Tuple[U32, ...], ...],
    dataset_size: Uint,
) -> Tuple[bytes, Hash32]:
    """
    Run the [`hashimoto`] algorithm by generating dataset items using the
    cache instead of loading the full dataset into main memory.

    #### Parameters

    - `header_hash` is a valid [RLP hash] of a block header.
    - `nonce` is the propagated nonce for the given block.
    - `cache` is the cache generated by [`generate_cache`].
    - `dataset_size` is the size of the dataset. See [`dataset_size`].

    #### Returns

    - The mix digest generated from the header hash and propagated nonce.
    - The final result obtained which will be checked for leading zeros (in
      byte representation) in correspondence with the block difficulty.

    [RLP hash]: ref:ethereum.rlp.rlp_hash
    [`dataset_size`]: ref:ethereum.ethash.dataset_size
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    """

    def fetch_dataset_item(index: Uint) -> Tuple[U32, ...]:
        item: Hash64 = generate_dataset_item(cache, index)
        return le_bytes_to_uint32_sequence(item)

    return hashimoto(header_hash, nonce, dataset_size, fetch_dataset_item)
EPOCH_SIZE = 30000

Number of blocks before a dataset needs to be regenerated (known as an "epoch"). See [epoch].

INITIAL_CACHE_SIZE = 16777216

Size of the cache (in bytes) during the first epoch. Each subsequent epoch's cache roughly grows by [CACHE_EPOCH_GROWTH_SIZE] bytes. See [cache_size].

CACHE_EPOCH_GROWTH_SIZE = 131072

After the first epoch, the cache size grows by roughly this amount. See [cache_size].

INITIAL_DATASET_SIZE = 1073741824

Size of the dataset (in bytes) during the first epoch. Each subsequent epoch's dataset roughly grows by [DATASET_EPOCH_GROWTH_SIZE] bytes. See [dataset_size].

DATASET_EPOCH_GROWTH_SIZE = 8388608

After the first epoch, the dataset size grows by roughly this amount. See [dataset_size].

HASH_BYTES = 64

Length of a hash, in bytes.

MIX_BYTES = 128

Width of mix, in bytes. See [generate_dataset_item].

CACHE_ROUNDS = 3

Number of times to repeat the [keccak512] step while generating the hash. See [generate_cache].

DATASET_PARENTS = 256

Number of parents of each dataset element. See [generate_dataset_item].

HASHIMOTO_ACCESSES = 64

Number of accesses in the [hashimoto] loop.

def epoch(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Uint:

Obtain the epoch number to which the block identified by block_number belongs. The first epoch is numbered zero.

An Ethash epoch is a fixed number of blocks ([EPOCH_SIZE]) long, during which the dataset remains constant. At the end of each epoch, the dataset is generated anew. See [generate_dataset].
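The boundary behaviour can be checked with a minimal plain-Python sketch (using `int` in place of `Uint`):

```python
EPOCH_SIZE = 30000  # blocks per epoch

def epoch(block_number: int) -> int:
    # The first epoch is numbered zero.
    return block_number // EPOCH_SIZE

# Blocks 0 through 29999 share epoch 0; block 30000 starts epoch 1.
```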

def cache_size(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Uint:

Obtain the cache size (in bytes) of the epoch to which block_number belongs.

See [INITIAL_CACHE_SIZE] and [CACHE_EPOCH_GROWTH_SIZE] for the initial size and linear growth rate, respectively. The cache is generated in [generate_cache].

The actual cache size is smaller than simply multiplying CACHE_EPOCH_GROWTH_SIZE by the epoch number to minimize the risk of unintended cyclic behavior. It is defined as the highest prime number below what linear growth would calculate.

def dataset_size(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Uint:

Obtain the dataset size (in bytes) of the epoch to which block_number belongs.

See [INITIAL_DATASET_SIZE] and [DATASET_EPOCH_GROWTH_SIZE] for the initial size and linear growth rate, respectively. The complete dataset is generated in [generate_dataset], while the slices used in verification are generated in [generate_dataset_item].

The actual dataset size is smaller than simply multiplying DATASET_EPOCH_GROWTH_SIZE by the epoch number to minimize the risk of unintended cyclic behavior. It is defined as the highest prime number below what linear growth would calculate.
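Both size functions follow the same prime-capping pattern, sketched below in plain Python. Here `is_prime` is a naive trial-division stand-in for `ethereum.utils.numeric.is_prime`, and `capped_size` is an illustrative helper, not part of the module:

```python
EPOCH_SIZE = 30000
HASH_BYTES = 64
MIX_BYTES = 128
INITIAL_CACHE_SIZE = 2**24
CACHE_EPOCH_GROWTH_SIZE = 2**17
INITIAL_DATASET_SIZE = 2**30
DATASET_EPOCH_GROWTH_SIZE = 2**23

def is_prime(n: int) -> bool:
    # Naive trial division; fine at these magnitudes.
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def capped_size(initial: int, growth: int, unit: int, epoch_number: int) -> int:
    # Start from linear growth, then step down (in 2*unit strides)
    # until size // unit is prime.
    size = initial + growth * epoch_number - unit
    while not is_prime(size // unit):
        size -= 2 * unit
    return size

def cache_size(block_number: int) -> int:
    return capped_size(INITIAL_CACHE_SIZE, CACHE_EPOCH_GROWTH_SIZE,
                       HASH_BYTES, block_number // EPOCH_SIZE)

def dataset_size(block_number: int) -> int:
    return capped_size(INITIAL_DATASET_SIZE, DATASET_EPOCH_GROWTH_SIZE,
                       MIX_BYTES, block_number // EPOCH_SIZE)
```

For epoch zero this yields a cache of 16,776,896 bytes and a dataset of 1,073,739,904 bytes, matching the published Ethash size tables.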

def generate_seed(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Bytes32:

Obtain the cache generation seed for the block identified by block_number. See [generate_cache].
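The seed is a hash chain: the 32-byte zero seed of epoch zero is hashed once per elapsed epoch. The sketch below uses `hashlib.sha3_256` as a stand-in for keccak256 (Ethash uses original Keccak padding, so the real byte values differ, but the chaining is identical):

```python
import hashlib

EPOCH_SIZE = 30000

def hash_fn(data: bytes) -> bytes:
    # Stand-in for keccak256; NIST SHA3-256 pads differently.
    return hashlib.sha3_256(data).digest()

def generate_seed(block_number: int) -> bytes:
    # Apply the hash once per epoch that has elapsed.
    seed = b"\x00" * 32
    for _ in range(block_number // EPOCH_SIZE):
        seed = hash_fn(seed)
    return seed
```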

def generate_cache( block_number: ethereum.base_types.Uint) -> Tuple[Tuple[ethereum.base_types.U32, ...], ...]:

Generate the cache for the block identified by block_number. See [generate_dataset] for how the cache is used.

The cache is generated in two steps: filling the array with a chain of [keccak512] hashes, then running [CACHE_ROUNDS] rounds of Sergio Demian Lerner's [RandMemoHash] on those bytes.
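A toy version of those two steps, with `hashlib.sha3_512` standing in for keccak512 and a deliberately tiny cache (`toy_cache` and its parameters are illustrative only):

```python
import hashlib

CACHE_ROUNDS = 3

def hash_fn(data: bytes) -> bytes:
    # Stand-in for keccak512; NIST SHA3-512 pads differently.
    return hashlib.sha3_512(data).digest()

def toy_cache(seed: bytes, cache_size_words: int) -> list:
    # Step 1: fill the array with a chain of hashes.
    cache = [hash_fn(seed)]
    for _ in range(1, cache_size_words):
        cache.append(hash_fn(cache[-1]))

    # Step 2: CACHE_ROUNDS passes of RandMemoHash-style mixing.
    for _ in range(CACHE_ROUNDS):
        for index in range(cache_size_words):
            # Plain ints handle the -1 wrap-around via Python's modulo.
            first = cache[(index - 1) % cache_size_words]
            second = cache[
                int.from_bytes(cache[index][:4], "little") % cache_size_words
            ]
            cache[index] = hash_fn(
                bytes(a ^ b for a, b in zip(first, second))
            )
    return cache
```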

def fnv(a: Union[ethereum.base_types.Uint, ethereum.base_types.U32], b: Union[ethereum.base_types.Uint, ethereum.base_types.U32]) -> ethereum.base_types.U32:

A non-associative substitute for XOR, inspired by the [FNV] hash by Fowler, Noll, and Vo. See [fnv_hash], [generate_dataset_item], and [hashimoto].

Note that here we multiply the prime with the full 32-bit input, in contrast with the [FNV-1] spec which multiplies the prime with one byte (octet) in turn.
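A plain-int sketch makes the 32-bit wrap-around and the lack of commutativity and associativity easy to check (the elementwise `fnv_hash` variant is included for completeness):

```python
U32_MAX = 2**32 - 1
FNV_PRIME = 0x01000193

def fnv(a: int, b: int) -> int:
    # Multiply by the 32-bit FNV prime, XOR in b, wrap to 32 bits.
    return ((a * FNV_PRIME) ^ b) & U32_MAX

def fnv_hash(mix, data):
    # Elementwise combination of two equal-length word sequences.
    return tuple(fnv(m, d) for m, d in zip(mix, data))
```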

def fnv_hash( mix_integers: Tuple[ethereum.base_types.U32, ...], data: Tuple[ethereum.base_types.U32, ...]) -> Tuple[ethereum.base_types.U32, ...]:

Combines data into mix_integers using [fnv]. See [hashimoto] and [generate_dataset_item].

def generate_dataset_item( cache: Tuple[Tuple[ethereum.base_types.U32, ...], ...], index: ethereum.base_types.Uint) -> ethereum.base_types.Bytes64:

Generate a particular dataset item 0-indexed by index by hashing pseudorandomly-selected entries from cache together. See [fnv] and [fnv_hash] for the digest function, [generate_cache] for generating cache, and [generate_dataset] for the full dataset generation algorithm.
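The mixing loop can be sketched over a toy cache, again with `hashlib.sha3_512` standing in for keccak512 (`toy_dataset_item` is illustrative, not the module's implementation):

```python
import hashlib

DATASET_PARENTS = 256
HASH_BYTES = 64
U32_MAX = 2**32 - 1
FNV_PRIME = 0x01000193

def hash_fn(data: bytes) -> bytes:
    # Stand-in for keccak512.
    return hashlib.sha3_512(data).digest()

def fnv(a: int, b: int) -> int:
    return ((a * FNV_PRIME) ^ b) & U32_MAX

def words(data: bytes) -> list:
    # Little-endian 32-bit words.
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]

def toy_dataset_item(cache: list, index: int) -> bytes:
    # Seed the 64-byte mix from one cache entry, keyed by the item index.
    seed = int.from_bytes(cache[index % len(cache)], "little") ^ index
    mix = words(hash_fn(seed.to_bytes(HASH_BYTES, "little")))

    # Fold in DATASET_PARENTS pseudorandomly chosen cache entries.
    for j in range(DATASET_PARENTS):
        cache_index = fnv(index ^ j, mix[j % 16]) % len(cache)
        parent = words(cache[cache_index])
        mix = [fnv(m, p) for m, p in zip(mix, parent)]

    packed = b"".join(w.to_bytes(4, "little") for w in mix)
    return hash_fn(packed)
```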

def generate_dataset( block_number: ethereum.base_types.Uint) -> Tuple[ethereum.base_types.Bytes64, ...]:

Generate the full dataset for the block identified by block_number.

This function is present only for demonstration purposes. It is not used while validating blocks.

def hashimoto( header_hash: ethereum.base_types.Bytes32, nonce: ethereum.base_types.Bytes8, dataset_size: ethereum.base_types.Uint, fetch_dataset_item: Callable[[ethereum.base_types.Uint], Tuple[ethereum.base_types.U32, ...]]) -> Tuple[bytes, ethereum.base_types.Bytes32]:

Obtain the mix digest and the final value for a header, by aggregating data from the full dataset.

Parameters

  • header_hash is a valid [RLP hash] of a block header.
  • nonce is the propagated nonce for the given block.
  • dataset_size is the size of the dataset. See [dataset_size].
  • fetch_dataset_item is a function that retrieves a specific dataset item based on its index.

Returns

  • The mix digest generated from the header hash and propagated nonce.
  • The final result obtained which will be checked for leading zeros (in byte representation) in correspondence with the block difficulty.
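The access loop and final 4-to-1 compression can be sketched end-to-end with stand-in hashes and a fake dataset (`toy_hashimoto` and `toy_fetch` are illustrative; real Ethash uses keccak256/keccak512):

```python
import hashlib

HASHIMOTO_ACCESSES = 64
MIX_BYTES = 128
HASH_BYTES = 64
U32_MAX = 2**32 - 1
FNV_PRIME = 0x01000193

def fnv(a: int, b: int) -> int:
    return ((a * FNV_PRIME) ^ b) & U32_MAX

def words(data: bytes) -> list:
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]

def hash512(data: bytes) -> bytes:
    # Stand-in for keccak512.
    return hashlib.sha3_512(data).digest()

def hash256(data: bytes) -> bytes:
    # Stand-in for keccak256.
    return hashlib.sha3_256(data).digest()

def toy_fetch(index: int) -> bytes:
    # A fake 64-byte dataset item: just the hash of its index.
    return hash512(index.to_bytes(8, "little"))

def toy_hashimoto(header_hash, nonce, dataset_size, fetch_dataset_item):
    seed_hash = hash512(header_hash + bytes(reversed(nonce)))
    seed_head = int.from_bytes(seed_hash[:4], "little")

    rows = dataset_size // MIX_BYTES
    # Replicate the 16 seed words into a 32-word (128-byte) mix.
    mix = words(seed_hash) * (MIX_BYTES // HASH_BYTES)

    for i in range(HASHIMOTO_ACCESSES):
        parent = fnv(i ^ seed_head, mix[i % len(mix)]) % rows
        new_data = []
        for j in range(MIX_BYTES // HASH_BYTES):
            new_data += words(fetch_dataset_item(2 * parent + j))
        mix = [fnv(m, d) for m, d in zip(mix, new_data)]

    # Compress 32 words down to 8 for the mix digest.
    compressed = [
        fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3])
        for i in range(0, len(mix), 4)
    ]
    mix_digest = b"".join(w.to_bytes(4, "little") for w in compressed)
    return mix_digest, hash256(seed_hash + mix_digest)
```

Verification checks that the returned `result` has enough leading zero bytes for the block's difficulty, so only the cheap recomputation above is needed, never the full dataset.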

def hashimoto_light(header_hash: ethereum.base_types.Bytes32, nonce: ethereum.base_types.Bytes8, cache: Tuple[Tuple[ethereum.base_types.U32, ...], ...], dataset_size: ethereum.base_types.Uint) -> Tuple[bytes, ethereum.base_types.Bytes32]:

Run the [hashimoto] algorithm by generating dataset items using the cache instead of loading the full dataset into main memory.

Parameters

  • header_hash is a valid [RLP hash] of a block header.
  • nonce is the propagated nonce for the given block.
  • cache is the cache generated by [generate_cache].
  • dataset_size is the size of the dataset. See [dataset_size].

Returns

  • The mix digest generated from the header hash and propagated nonce.
  • The final result obtained which will be checked for leading zeros (in byte representation) in correspondence with the block difficulty.