ethereum.ethash

Ethash is a proof-of-work algorithm designed to be [ASIC] resistant through memory hardness.

To achieve memory hardness, computing Ethash requires access to subsets of a large structure. The particular subsets chosen are based on the nonce and block header, while the set itself is changed every [epoch].

At a high level, the Ethash algorithm is as follows:

  1. Create a seed value, generated with [generate_seed] and based on the preceding block numbers.
  2. From the seed, compute a pseudorandom cache with [generate_cache].
  3. From the cache, generate a dataset with [generate_dataset]. The dataset grows over time based on [DATASET_EPOCH_GROWTH_SIZE].
  4. Miners hash slices of the dataset together, which is where the memory hardness is introduced. Verification of the proof-of-work only requires the cache to be able to recompute a much smaller subset of the full dataset.
  1"""
  2Ethash is a proof-of-work algorithm designed to be [ASIC] resistant through
  3[memory hardness][mem-hard].
  4
  5To achieve memory hardness, computing Ethash requires access to subsets of a
  6large structure. The particular subsets chosen are based on the nonce and block
  7header, while the set itself is changed every [`epoch`].
  8
  9At a high level, the Ethash algorithm is as follows:
 10
 111. Create a **seed** value, generated with [`generate_seed`] and based on the
 12   preceding block numbers.
 131. From the seed, compute a pseudorandom **cache** with [`generate_cache`].
 141. From the cache, generate a **dataset** with [`generate_dataset`]. The
 15   dataset grows over time based on [`DATASET_EPOCH_GROWTH_SIZE`].
 161. Miners hash slices of the dataset together, which is where the memory
 17   hardness is introduced. Verification of the proof-of-work only requires the
 18   cache to be able to recompute a much smaller subset of the full dataset.
 19
 20[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
 21[`generate_dataset`]: ref:ethereum.ethash.generate_dataset
 22[`generate_cache`]: ref:ethereum.ethash.generate_cache
 23[`generate_seed`]: ref:ethereum.ethash.generate_seed
 24[`epoch`]: ref:ethereum.ethash.epoch
 25[ASIC]: https://en.wikipedia.org/wiki/Application-specific_integrated_circuit
 26[mem-hard]: https://en.wikipedia.org/wiki/Memory-hard_function
 27"""

from typing import Callable, Tuple, Union

from ethereum.base_types import U32, Bytes8, Uint
from ethereum.crypto.hash import Hash32, Hash64, keccak256, keccak512
from ethereum.utils.numeric import (
    is_prime,
    le_bytes_to_uint32_sequence,
    le_uint32_sequence_to_bytes,
    le_uint32_sequence_to_uint,
)

EPOCH_SIZE = 30000
"""
Number of blocks before a dataset needs to be regenerated (known as an
"epoch"). See [`epoch`].

[`epoch`]: ref:ethereum.ethash.epoch
"""

INITIAL_CACHE_SIZE = 2**24
"""
Size of the cache (in bytes) during the first epoch. Each subsequent epoch's
cache roughly grows by [`CACHE_EPOCH_GROWTH_SIZE`] bytes. See [`cache_size`].

[`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE
[`cache_size`]: ref:ethereum.ethash.cache_size
"""

CACHE_EPOCH_GROWTH_SIZE = 2**17
"""
After the first epoch, the cache size grows by roughly this amount. See
[`cache_size`].

[`cache_size`]: ref:ethereum.ethash.cache_size
"""

INITIAL_DATASET_SIZE = 2**30
"""
Size of the dataset (in bytes) during the first epoch. Each subsequent epoch's
dataset roughly grows by [`DATASET_EPOCH_GROWTH_SIZE`] bytes. See
[`dataset_size`].

[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
[`dataset_size`]: ref:ethereum.ethash.dataset_size
"""

DATASET_EPOCH_GROWTH_SIZE = 2**23
"""
After the first epoch, the dataset size grows by roughly this amount. See
[`dataset_size`].

[`dataset_size`]: ref:ethereum.ethash.dataset_size
"""

HASH_BYTES = 64
"""
Length of a hash, in bytes.
"""

MIX_BYTES = 128
"""
Width of mix, in bytes. See [`generate_dataset_item`].

[`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
"""

CACHE_ROUNDS = 3
"""
Number of times to repeat the [`keccak512`] step while generating the hash. See
[`generate_cache`].

[`keccak512`]: ref:ethereum.crypto.hash.keccak512
[`generate_cache`]: ref:ethereum.ethash.generate_cache
"""

DATASET_PARENTS = 256
"""
Number of parents of each dataset element. See [`generate_dataset_item`].

[`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
"""

HASHIMOTO_ACCESSES = 64
"""
Number of accesses in the [`hashimoto`] loop.

[`hashimoto`]: ref:ethereum.ethash.hashimoto
"""


def epoch(block_number: Uint) -> Uint:
    """
    Obtain the epoch number to which the block identified by `block_number`
    belongs. The first epoch is numbered zero.

    An Ethash epoch is a fixed number of blocks ([`EPOCH_SIZE`]) long, during
    which the dataset remains constant. At the end of each epoch, the dataset
    is generated anew. See [`generate_dataset`].

    [`EPOCH_SIZE`]: ref:ethereum.ethash.EPOCH_SIZE
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    """
    return block_number // EPOCH_SIZE


def cache_size(block_number: Uint) -> Uint:
    """
    Obtain the cache size (in bytes) of the epoch to which `block_number`
    belongs.

    See [`INITIAL_CACHE_SIZE`] and [`CACHE_EPOCH_GROWTH_SIZE`] for the initial
    size and linear growth rate, respectively. The cache is generated in
    [`generate_cache`].

    The actual cache size is smaller than simply multiplying
    `CACHE_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of
    unintended cyclic behavior. It is defined as the highest prime number below
    what linear growth would calculate.

    [`INITIAL_CACHE_SIZE`]: ref:ethereum.ethash.INITIAL_CACHE_SIZE
    [`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    size = INITIAL_CACHE_SIZE + (CACHE_EPOCH_GROWTH_SIZE * epoch(block_number))
    size -= HASH_BYTES
    while not is_prime(size // HASH_BYTES):
        size -= 2 * HASH_BYTES

    return size


def dataset_size(block_number: Uint) -> Uint:
    """
    Obtain the dataset size (in bytes) of the epoch to which `block_number`
    belongs.

    See [`INITIAL_DATASET_SIZE`] and [`DATASET_EPOCH_GROWTH_SIZE`][ds] for the
    initial size and linear growth rate, respectively. The complete dataset is
    generated in [`generate_dataset`], while the slices used in verification
    are generated in [`generate_dataset_item`].

    The actual dataset size is smaller than simply multiplying
    `DATASET_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of
    unintended cyclic behavior. It is defined as the highest prime number below
    what linear growth would calculate.

    [`INITIAL_DATASET_SIZE`]: ref:ethereum.ethash.INITIAL_DATASET_SIZE
    [ds]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    """
    size = INITIAL_DATASET_SIZE + (
        DATASET_EPOCH_GROWTH_SIZE * epoch(block_number)
    )
    size -= MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES

    return size


def generate_seed(block_number: Uint) -> Hash32:
    """
    Obtain the cache generation seed for the block identified by
    `block_number`. See [`generate_cache`].

    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    epoch_number = epoch(block_number)

    seed = b"\x00" * 32
    while epoch_number != 0:
        seed = keccak256(seed)
        epoch_number -= 1

    return Hash32(seed)


def generate_cache(block_number: Uint) -> Tuple[Tuple[U32, ...], ...]:
    """
    Generate the cache for the block identified by `block_number`. See
    [`generate_dataset`] for how the cache is used.

    The cache is generated in two steps: filling the array with a chain of
    [`keccak512`] hashes, then running [`CACHE_ROUNDS`] rounds of Sergio
    Demian Lerner's [RandMemoHash] on those bytes.

    [`keccak512`]: ref:ethereum.crypto.hash.keccak512
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`CACHE_ROUNDS`]: ref:ethereum.ethash.CACHE_ROUNDS
    [RandMemoHash]: http://www.hashcash.org/papers/memohash.pdf
    """
    seed = generate_seed(block_number)
    cache_size_bytes = cache_size(block_number)

    cache_size_words = cache_size_bytes // HASH_BYTES
    cache = [keccak512(seed)]

    for index in range(1, cache_size_words):
        cache_item = keccak512(cache[index - 1])
        cache.append(cache_item)

    for _ in range(CACHE_ROUNDS):
        for index in range(cache_size_words):
            # Converting `cache_size_words` to int as `-1 + Uint(5)` is an
            # error.
            first_cache_item = cache[
                (index - 1 + int(cache_size_words)) % cache_size_words
            ]
            second_cache_item = cache[
                U32.from_le_bytes(cache[index][0:4]) % cache_size_words
            ]
            result = bytes(
                [a ^ b for a, b in zip(first_cache_item, second_cache_item)]
            )
            cache[index] = keccak512(result)

    return tuple(
        le_bytes_to_uint32_sequence(cache_item) for cache_item in cache
    )


def fnv(a: Union[Uint, U32], b: Union[Uint, U32]) -> U32:
    """
    A non-associative substitute for XOR, inspired by the [FNV] hash by Fowler,
    Noll, and Vo. See [`fnv_hash`], [`generate_dataset_item`], and
    [`hashimoto`].

    Note that here we multiply the prime with the full 32-bit input, in
    contrast with the [FNV-1] spec which multiplies the prime with one byte
    (octet) in turn.

    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    [`fnv_hash`]: ref:ethereum.ethash.fnv_hash
    [FNV]: https://w.wiki/XKZ
    [FNV-1]: http://www.isthe.com/chongo/tech/comp/fnv/#FNV-1
    """
    # This is a faster way of doing `number % (2 ** 32)`.
    result = ((Uint(a) * 0x01000193) ^ Uint(b)) & U32.MAX_VALUE
    return U32(result)


def fnv_hash(
    mix_integers: Tuple[U32, ...], data: Tuple[U32, ...]
) -> Tuple[U32, ...]:
    """
    Combines `data` into `mix_integers` using [`fnv`]. See [`hashimoto`] and
    [`generate_dataset_item`].

    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    [`fnv`]: ref:ethereum.ethash.fnv
    """
    return tuple(
        fnv(mix_integers[i], data[i]) for i in range(len(mix_integers))
    )


def generate_dataset_item(
    cache: Tuple[Tuple[U32, ...], ...], index: Uint
) -> Hash64:
    """
    Generate a particular dataset item 0-indexed by `index` by hashing
    pseudorandomly-selected entries from `cache` together. See [`fnv`] and
    [`fnv_hash`] for the digest function, [`generate_cache`] for generating
    `cache`, and [`generate_dataset`] for the full dataset generation
    algorithm.

    [`fnv`]: ref:ethereum.ethash.fnv
    [`fnv_hash`]: ref:ethereum.ethash.fnv_hash
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    mix = keccak512(
        (
            le_uint32_sequence_to_uint(cache[index % len(cache)]) ^ index
        ).to_le_bytes(number_bytes=HASH_BYTES)
    )

    mix_integers = le_bytes_to_uint32_sequence(mix)

    for j in range(DATASET_PARENTS):
        mix_word: U32 = mix_integers[j % 16]
        cache_index = fnv(index ^ j, mix_word) % len(cache)
        parent = cache[cache_index]
        mix_integers = fnv_hash(mix_integers, parent)

    mix = Hash64(le_uint32_sequence_to_bytes(mix_integers))

    return keccak512(mix)

def generate_dataset(block_number: Uint) -> Tuple[Hash64, ...]:
    """
    Generate the full dataset for the block identified by `block_number`.

    This function is present only for demonstration purposes. It is not used
    while validating blocks.
    """
    dataset_size_bytes: Uint = dataset_size(block_number)
    cache: Tuple[Tuple[U32, ...], ...] = generate_cache(block_number)

    # TODO: Parallelize this later on if it adds value
    return tuple(
        generate_dataset_item(cache, Uint(index))
        for index in range(dataset_size_bytes // HASH_BYTES)
    )


def hashimoto(
    header_hash: Hash32,
    nonce: Bytes8,
    dataset_size: Uint,
    fetch_dataset_item: Callable[[Uint], Tuple[U32, ...]],
) -> Tuple[bytes, Hash32]:
    """
    Obtain the mix digest and the final value for a header, by aggregating
    data from the full dataset.

    #### Parameters

    - `header_hash` is a valid [RLP hash] of a block header.
    - `nonce` is the propagated nonce for the given block.
    - `dataset_size` is the size of the dataset. See [`dataset_size`].
    - `fetch_dataset_item` is a function that retrieves a specific dataset item
      based on its index.

    #### Returns

    - The mix digest generated from the header hash and propagated nonce.
    - The final result obtained which will be checked for leading zeros (in
      byte representation) in correspondence with the block difficulty.

    [RLP hash]: ref:ethereum.rlp.rlp_hash
    [`dataset_size`]: ref:ethereum.ethash.dataset_size
    """
    nonce_le = bytes(reversed(nonce))
    seed_hash = keccak512(header_hash + nonce_le)
    seed_head = U32.from_le_bytes(seed_hash[:4])

    rows = dataset_size // 128
    mix = le_bytes_to_uint32_sequence(seed_hash) * (MIX_BYTES // HASH_BYTES)

    for i in range(HASHIMOTO_ACCESSES):
        new_data: Tuple[U32, ...] = ()
        parent = fnv(i ^ seed_head, mix[i % len(mix)]) % rows
        for j in range(MIX_BYTES // HASH_BYTES):
            # Typecasting `parent` from U32 to Uint as 2*parent + j may
            # overflow U32.
            new_data += fetch_dataset_item(2 * Uint(parent) + j)

        mix = fnv_hash(mix, new_data)

    compressed_mix = []
    for i in range(0, len(mix), 4):
        compressed_mix.append(
            fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3])
        )

    mix_digest = le_uint32_sequence_to_bytes(compressed_mix)
    result = keccak256(seed_hash + mix_digest)

    return mix_digest, result


def hashimoto_light(
    header_hash: Hash32,
    nonce: Bytes8,
    cache: Tuple[Tuple[U32, ...], ...],
    dataset_size: Uint,
) -> Tuple[bytes, Hash32]:
    """
    Run the [`hashimoto`] algorithm by generating dataset items using the
    cache instead of loading the full dataset into main memory.

    #### Parameters

    - `header_hash` is a valid [RLP hash] of a block header.
    - `nonce` is the propagated nonce for the given block.
    - `cache` is the cache generated by [`generate_cache`].
    - `dataset_size` is the size of the dataset. See [`dataset_size`].

    #### Returns

    - The mix digest generated from the header hash and propagated nonce.
    - The final result obtained which will be checked for leading zeros (in
      byte representation) in correspondence with the block difficulty.

    [RLP hash]: ref:ethereum.rlp.rlp_hash
    [`dataset_size`]: ref:ethereum.ethash.dataset_size
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    """

    def fetch_dataset_item(index: Uint) -> Tuple[U32, ...]:
        item: Hash64 = generate_dataset_item(cache, index)
        return le_bytes_to_uint32_sequence(item)

    return hashimoto(header_hash, nonce, dataset_size, fetch_dataset_item)
EPOCH_SIZE = 30000

Number of blocks before a dataset needs to be regenerated (known as an "epoch"). See [epoch].

INITIAL_CACHE_SIZE = 16777216

Size of the cache (in bytes) during the first epoch. Each subsequent epoch's cache roughly grows by [CACHE_EPOCH_GROWTH_SIZE] bytes. See [cache_size].

CACHE_EPOCH_GROWTH_SIZE = 131072

After the first epoch, the cache size grows by roughly this amount. See [cache_size].

INITIAL_DATASET_SIZE = 1073741824

Size of the dataset (in bytes) during the first epoch. Each subsequent epoch's dataset roughly grows by [DATASET_EPOCH_GROWTH_SIZE] bytes. See [dataset_size].

DATASET_EPOCH_GROWTH_SIZE = 8388608

After the first epoch, the dataset size grows by roughly this amount. See [dataset_size].

HASH_BYTES = 64

Length of a hash, in bytes.

MIX_BYTES = 128

Width of mix, in bytes. See [generate_dataset_item].

CACHE_ROUNDS = 3

Number of times to repeat the [keccak512] step while generating the hash. See [generate_cache].

DATASET_PARENTS = 256

Number of parents of each dataset element. See [generate_dataset_item].

HASHIMOTO_ACCESSES = 64

Number of accesses in the [hashimoto] loop.

def epoch(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Uint:

Obtain the epoch number to which the block identified by block_number belongs. The first epoch is numbered zero.

An Ethash epoch is a fixed number of blocks ([EPOCH_SIZE]) long, during which the dataset remains constant. At the end of each epoch, the dataset is generated anew. See [generate_dataset].
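The boundary behaviour can be checked with a minimal plain-Python sketch (using `int` in place of `Uint`):

```python
EPOCH_SIZE = 30000  # blocks per epoch

def epoch(block_number: int) -> int:
    # The first epoch is numbered zero.
    return block_number // EPOCH_SIZE

# Blocks 0 through 29999 share epoch 0; block 30000 starts epoch 1.
```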

def cache_size(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Uint:

Obtain the cache size (in bytes) of the epoch to which block_number belongs.

See [INITIAL_CACHE_SIZE] and [CACHE_EPOCH_GROWTH_SIZE] for the initial size and linear growth rate, respectively. The cache is generated in [generate_cache].

The actual cache size is smaller than simply multiplying CACHE_EPOCH_GROWTH_SIZE by the epoch number to minimize the risk of unintended cyclic behavior. It is defined as the highest prime number below what linear growth would calculate.

def dataset_size(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Uint:

Obtain the dataset size (in bytes) of the epoch to which block_number belongs.

See [INITIAL_DATASET_SIZE] and [DATASET_EPOCH_GROWTH_SIZE] for the initial size and linear growth rate, respectively. The complete dataset is generated in [generate_dataset], while the slices used in verification are generated in [generate_dataset_item].

The actual dataset size is smaller than simply multiplying DATASET_EPOCH_GROWTH_SIZE by the epoch number to minimize the risk of unintended cyclic behavior. It is defined as the highest prime number below what linear growth would calculate.
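Both size functions follow the same prime-capping pattern, sketched below in plain Python. Here `is_prime` is a naive trial-division stand-in for `ethereum.utils.numeric.is_prime`, and `capped_size` is an illustrative helper, not part of the module:

```python
EPOCH_SIZE = 30000
HASH_BYTES = 64
MIX_BYTES = 128
INITIAL_CACHE_SIZE = 2**24
CACHE_EPOCH_GROWTH_SIZE = 2**17
INITIAL_DATASET_SIZE = 2**30
DATASET_EPOCH_GROWTH_SIZE = 2**23

def is_prime(n: int) -> bool:
    # Naive trial division; fine at these magnitudes.
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def capped_size(initial: int, growth: int, unit: int, epoch_number: int) -> int:
    # Start from linear growth, then step down (in 2*unit strides)
    # until size // unit is prime.
    size = initial + growth * epoch_number - unit
    while not is_prime(size // unit):
        size -= 2 * unit
    return size

def cache_size(block_number: int) -> int:
    return capped_size(INITIAL_CACHE_SIZE, CACHE_EPOCH_GROWTH_SIZE,
                       HASH_BYTES, block_number // EPOCH_SIZE)

def dataset_size(block_number: int) -> int:
    return capped_size(INITIAL_DATASET_SIZE, DATASET_EPOCH_GROWTH_SIZE,
                       MIX_BYTES, block_number // EPOCH_SIZE)
```

For epoch zero this yields a cache of 16,776,896 bytes and a dataset of 1,073,739,904 bytes, matching the published Ethash size tables.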

def generate_seed(block_number: ethereum.base_types.Uint) -> ethereum.base_types.Bytes32:

Obtain the cache generation seed for the block identified by block_number. See [generate_cache].
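The seed is a hash chain: the 32-byte zero seed of epoch zero is hashed once per elapsed epoch. The sketch below uses `hashlib.sha3_256` as a stand-in for keccak256 (Ethash uses original Keccak padding, so the real byte values differ, but the chaining is identical):

```python
import hashlib

EPOCH_SIZE = 30000

def hash_fn(data: bytes) -> bytes:
    # Stand-in for keccak256; NIST SHA3-256 pads differently.
    return hashlib.sha3_256(data).digest()

def generate_seed(block_number: int) -> bytes:
    # Apply the hash once per epoch that has elapsed.
    seed = b"\x00" * 32
    for _ in range(block_number // EPOCH_SIZE):
        seed = hash_fn(seed)
    return seed
```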

def generate_cache( block_number: ethereum.base_types.Uint) -> Tuple[Tuple[ethereum.base_types.U32, ...], ...]:

Generate the cache for the block identified by block_number. See [generate_dataset] for how the cache is used.

The cache is generated in two steps: filling the array with a chain of [keccak512] hashes, then running [CACHE_ROUNDS] rounds of Sergio Demian Lerner's [RandMemoHash] on those bytes.
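A toy version of those two steps, with `hashlib.sha3_512` standing in for keccak512 and a deliberately tiny cache (`toy_cache` and its parameters are illustrative only):

```python
import hashlib

CACHE_ROUNDS = 3

def hash_fn(data: bytes) -> bytes:
    # Stand-in for keccak512; NIST SHA3-512 pads differently.
    return hashlib.sha3_512(data).digest()

def toy_cache(seed: bytes, cache_size_words: int) -> list:
    # Step 1: fill the array with a chain of hashes.
    cache = [hash_fn(seed)]
    for _ in range(1, cache_size_words):
        cache.append(hash_fn(cache[-1]))

    # Step 2: CACHE_ROUNDS passes of RandMemoHash-style mixing.
    for _ in range(CACHE_ROUNDS):
        for index in range(cache_size_words):
            # Plain ints handle the -1 wrap-around via Python's modulo.
            first = cache[(index - 1) % cache_size_words]
            second = cache[
                int.from_bytes(cache[index][:4], "little") % cache_size_words
            ]
            cache[index] = hash_fn(
                bytes(a ^ b for a, b in zip(first, second))
            )
    return cache
```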

def fnv(a: Union[ethereum.base_types.Uint, ethereum.base_types.U32], b: Union[ethereum.base_types.Uint, ethereum.base_types.U32]) -> ethereum.base_types.U32:

A non-associative substitute for XOR, inspired by the [FNV] hash by Fowler, Noll, and Vo. See [fnv_hash], [generate_dataset_item], and [hashimoto].

Note that here we multiply the prime with the full 32-bit input, in contrast with the [FNV-1] spec which multiplies the prime with one byte (octet) in turn.
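A plain-int sketch makes the 32-bit wrap-around and the lack of commutativity and associativity easy to check (the elementwise `fnv_hash` variant is included for completeness):

```python
U32_MAX = 2**32 - 1
FNV_PRIME = 0x01000193

def fnv(a: int, b: int) -> int:
    # Multiply by the 32-bit FNV prime, XOR in b, wrap to 32 bits.
    return ((a * FNV_PRIME) ^ b) & U32_MAX

def fnv_hash(mix, data):
    # Elementwise combination of two equal-length word sequences.
    return tuple(fnv(m, d) for m, d in zip(mix, data))
```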

def fnv_hash( mix_integers: Tuple[ethereum.base_types.U32, ...], data: Tuple[ethereum.base_types.U32, ...]) -> Tuple[ethereum.base_types.U32, ...]:

Combines data into mix_integers using [fnv]. See [hashimoto] and [generate_dataset_item].

def generate_dataset_item( cache: Tuple[Tuple[ethereum.base_types.U32, ...], ...], index: ethereum.base_types.Uint) -> ethereum.base_types.Bytes64:

Generate a particular dataset item 0-indexed by index by hashing pseudorandomly-selected entries from cache together. See [fnv] and [fnv_hash] for the digest function, [generate_cache] for generating cache, and [generate_dataset] for the full dataset generation algorithm.
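The mixing loop can be sketched over a toy cache, again with `hashlib.sha3_512` standing in for keccak512 (`toy_dataset_item` is illustrative, not the module's implementation):

```python
import hashlib

DATASET_PARENTS = 256
HASH_BYTES = 64
U32_MAX = 2**32 - 1
FNV_PRIME = 0x01000193

def hash_fn(data: bytes) -> bytes:
    # Stand-in for keccak512.
    return hashlib.sha3_512(data).digest()

def fnv(a: int, b: int) -> int:
    return ((a * FNV_PRIME) ^ b) & U32_MAX

def words(data: bytes) -> list:
    # Little-endian 32-bit words.
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]

def toy_dataset_item(cache: list, index: int) -> bytes:
    # Seed the 64-byte mix from one cache entry, keyed by the item index.
    seed = int.from_bytes(cache[index % len(cache)], "little") ^ index
    mix = words(hash_fn(seed.to_bytes(HASH_BYTES, "little")))

    # Fold in DATASET_PARENTS pseudorandomly chosen cache entries.
    for j in range(DATASET_PARENTS):
        cache_index = fnv(index ^ j, mix[j % 16]) % len(cache)
        parent = words(cache[cache_index])
        mix = [fnv(m, p) for m, p in zip(mix, parent)]

    packed = b"".join(w.to_bytes(4, "little") for w in mix)
    return hash_fn(packed)
```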

def generate_dataset( block_number: ethereum.base_types.Uint) -> Tuple[ethereum.base_types.Bytes64, ...]:

Generate the full dataset for the block identified by block_number.

This function is present only for demonstration purposes. It is not used while validating blocks.

def hashimoto( header_hash: ethereum.base_types.Bytes32, nonce: ethereum.base_types.Bytes8, dataset_size: ethereum.base_types.Uint, fetch_dataset_item: Callable[[ethereum.base_types.Uint], Tuple[ethereum.base_types.U32, ...]]) -> Tuple[bytes, ethereum.base_types.Bytes32]:

Obtain the mix digest and the final value for a header, by aggregating data from the full dataset.

Parameters

  • header_hash is a valid [RLP hash] of a block header.
  • nonce is the propagated nonce for the given block.
  • dataset_size is the size of the dataset. See [dataset_size].
  • fetch_dataset_item is a function that retrieves a specific dataset item based on its index.

Returns

  • The mix digest generated from the header hash and propagated nonce.
  • The final result obtained which will be checked for leading zeros (in byte representation) in correspondence with the block difficulty.
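The access loop and final 4-to-1 compression can be sketched end-to-end with stand-in hashes and a fake dataset (`toy_hashimoto` and `toy_fetch` are illustrative; real Ethash uses keccak256/keccak512):

```python
import hashlib

HASHIMOTO_ACCESSES = 64
MIX_BYTES = 128
HASH_BYTES = 64
U32_MAX = 2**32 - 1
FNV_PRIME = 0x01000193

def fnv(a: int, b: int) -> int:
    return ((a * FNV_PRIME) ^ b) & U32_MAX

def words(data: bytes) -> list:
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]

def hash512(data: bytes) -> bytes:
    # Stand-in for keccak512.
    return hashlib.sha3_512(data).digest()

def hash256(data: bytes) -> bytes:
    # Stand-in for keccak256.
    return hashlib.sha3_256(data).digest()

def toy_fetch(index: int) -> bytes:
    # A fake 64-byte dataset item: just the hash of its index.
    return hash512(index.to_bytes(8, "little"))

def toy_hashimoto(header_hash, nonce, dataset_size, fetch_dataset_item):
    seed_hash = hash512(header_hash + bytes(reversed(nonce)))
    seed_head = int.from_bytes(seed_hash[:4], "little")

    rows = dataset_size // MIX_BYTES
    # Replicate the 16 seed words into a 32-word (128-byte) mix.
    mix = words(seed_hash) * (MIX_BYTES // HASH_BYTES)

    for i in range(HASHIMOTO_ACCESSES):
        parent = fnv(i ^ seed_head, mix[i % len(mix)]) % rows
        new_data = []
        for j in range(MIX_BYTES // HASH_BYTES):
            new_data += words(fetch_dataset_item(2 * parent + j))
        mix = [fnv(m, d) for m, d in zip(mix, new_data)]

    # Compress 32 words down to 8 for the mix digest.
    compressed = [
        fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3])
        for i in range(0, len(mix), 4)
    ]
    mix_digest = b"".join(w.to_bytes(4, "little") for w in compressed)
    return mix_digest, hash256(seed_hash + mix_digest)
```

Verification checks that the returned `result` has enough leading zero bytes for the block's difficulty, so only the cheap recomputation above is needed, never the full dataset.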

def hashimoto_light(header_hash: ethereum.base_types.Bytes32, nonce: ethereum.base_types.Bytes8, cache: Tuple[Tuple[ethereum.base_types.U32, ...], ...], dataset_size: ethereum.base_types.Uint) -> Tuple[bytes, ethereum.base_types.Bytes32]:

Run the [hashimoto] algorithm by generating dataset items using the cache instead of loading the full dataset into main memory.

Parameters

  • header_hash is a valid [RLP hash] of a block header.
  • nonce is the propagated nonce for the given block.
  • cache is the cache generated by [generate_cache].
  • dataset_size is the size of the dataset. See [dataset_size].

Returns

  • The mix digest generated from the header hash and propagated nonce.
  • The final result obtained which will be checked for leading zeros (in byte representation) in correspondence with the block difficulty.