# ethereum.ethash
Ethash is a proof-of-work algorithm designed to be [ASIC] resistant through
[memory hardness][mem-hard].

To achieve memory hardness, computing Ethash requires access to subsets of a
large structure. The particular subsets chosen are based on the nonce and block
header, while the set itself is changed every [`epoch`].

At a high level, the Ethash algorithm is as follows:

1. Create a **seed** value, generated with [`generate_seed`] and based on the
   preceding block numbers.
2. From the seed, compute a pseudorandom **cache** with [`generate_cache`].
3. From the cache, generate a **dataset** with [`generate_dataset`]. The
   dataset grows over time based on [`DATASET_EPOCH_GROWTH_SIZE`].
4. Miners hash slices of the dataset together, which is where the memory
   hardness is introduced. Verification of the proof-of-work only requires the
   cache to be able to recompute a much smaller subset of the full dataset.

[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
[`generate_dataset`]: ref:ethereum.ethash.generate_dataset
[`generate_cache`]: ref:ethereum.ethash.generate_cache
[`generate_seed`]: ref:ethereum.ethash.generate_seed
[`epoch`]: ref:ethereum.ethash.epoch
[ASIC]: https://en.wikipedia.org/wiki/Application-specific_integrated_circuit
[mem-hard]: https://en.wikipedia.org/wiki/Memory-hard_function
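The epoch schedule driving these steps can be sketched with plain integers (a simplified stand-in for the `Uint`-typed `epoch` helper defined in this module; `EPOCH_SIZE` is 30000 blocks):

```python
EPOCH_SIZE = 30000  # blocks per Ethash epoch


def epoch(block_number: int) -> int:
    """Epoch to which a block belongs; the first epoch is numbered zero."""
    return block_number // EPOCH_SIZE


# The cache and dataset stay fixed for 30000 blocks at a time:
assert epoch(0) == 0        # first block of epoch 0
assert epoch(29999) == 0    # last block of epoch 0
assert epoch(30000) == 1    # dataset regenerated here
```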
1""" 2Ethash is a proof-of-work algorithm designed to be [ASIC] resistant through 3[memory hardness][mem-hard]. 4 5To achieve memory hardness, computing Ethash requires access to subsets of a 6large structure. The particular subsets chosen are based on the nonce and block 7header, while the set itself is changed every [`epoch`]. 8 9At a high level, the Ethash algorithm is as follows: 10 111. Create a **seed** value, generated with [`generate_seed`] and based on the 12 preceding block numbers. 131. From the seed, compute a pseudorandom **cache** with [`generate_cache`]. 141. From the cache, generate a **dataset** with [`generate_dataset`]. The 15 dataset grows over time based on [`DATASET_EPOCH_GROWTH_SIZE`]. 161. Miners hash slices of the dataset together, which is where the memory 17 hardness is introduced. Verification of the proof-of-work only requires the 18 cache to be able to recompute a much smaller subset of the full dataset. 19 20[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE 21[`generate_dataset`]: ref:ethereum.ethash.generate_dataset 22[`generate_cache`]: ref:ethereum.ethash.generate_cache 23[`generate_seed`]: ref:ethereum.ethash.generate_seed 24[`epoch`]: ref:ethereum.ethash.epoch 25[ASIC]: https://en.wikipedia.org/wiki/Application-specific_integrated_circuit 26[mem-hard]: https://en.wikipedia.org/wiki/Memory-hard_function 27""" 28 29from typing import Callable, Tuple, Union 30 31from ethereum.base_types import U32, Bytes8, Uint 32from ethereum.crypto.hash import Hash32, Hash64, keccak256, keccak512 33from ethereum.utils.numeric import ( 34 is_prime, 35 le_bytes_to_uint32_sequence, 36 le_uint32_sequence_to_bytes, 37 le_uint32_sequence_to_uint, 38) 39 40EPOCH_SIZE = 30000 41""" 42Number of blocks before a dataset needs to be regenerated (known as an 43"epoch".) See [`epoch`]. 44 45[`epoch`]: ref:ethereum.ethash.epoch 46""" 47 48INITIAL_CACHE_SIZE = 2**24 49""" 50Size of the cache (in bytes) during the first epoch. 
Each subsequent epoch's 51cache roughly grows by [`CACHE_EPOCH_GROWTH_SIZE`] bytes. See [`cache_size`]. 52 53[`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE 54[`cache_size`]: ref:ethereum.ethash.cache_size 55""" 56 57CACHE_EPOCH_GROWTH_SIZE = 2**17 58""" 59After the first epoch, the cache size grows by roughly this amount. See 60[`cache_size`]. 61 62[`cache_size`]: ref:ethereum.ethash.cache_size 63""" 64 65INITIAL_DATASET_SIZE = 2**30 66""" 67Size of the dataset (in bytes) during the first epoch. Each subsequent epoch's 68dataset roughly grows by [`DATASET_EPOCH_GROWTH_SIZE`] bytes. See 69[`dataset_size`]. 70 71[`DATASET_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE 72[`dataset_size`]: ref:ethereum.ethash.dataset_size 73""" 74 75DATASET_EPOCH_GROWTH_SIZE = 2**23 76""" 77After the first epoch, the dataset size grows by roughly this amount. See 78[`dataset_size`]. 79 80[`dataset_size`]: ref:ethereum.ethash.dataset_size 81""" 82 83HASH_BYTES = 64 84""" 85Length of a hash, in bytes. 86""" 87 88MIX_BYTES = 128 89""" 90Width of mix, in bytes. See [`generate_dataset_item`]. 91 92[`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item 93""" 94 95CACHE_ROUNDS = 3 96""" 97Number of times to repeat the [`keccak512`] step while generating the hash. See 98[`generate_cache`]. 99 100[`keccak512`]: ref:ethereum.crypto.hash.keccak512 101[`generate_cache`]: ref:ethereum.ethash.generate_cache 102""" 103 104DATASET_PARENTS = 256 105""" 106Number of parents of each dataset element. See [`generate_dataset_item`]. 107 108[`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item 109""" 110 111HASHIMOTO_ACCESSES = 64 112""" 113Number of accesses in the [`hashimoto`] loop. 114 115[`hashimoto`]: ref:ethereum.ethash.hashimoto 116""" 117 118 119def epoch(block_number: Uint) -> Uint: 120 """ 121 Obtain the epoch number to which the block identified by `block_number` 122 belongs. The first epoch is numbered zero. 
123 124 An Ethash epoch is a fixed number of blocks ([`EPOCH_SIZE`]) long, during 125 which the dataset remains constant. At the end of each epoch, the dataset 126 is generated anew. See [`generate_dataset`]. 127 128 [`EPOCH_SIZE`]: ref:ethereum.ethash.EPOCH_SIZE 129 [`generate_dataset`]: ref:ethereum.ethash.generate_dataset 130 """ 131 return block_number // EPOCH_SIZE 132 133 134def cache_size(block_number: Uint) -> Uint: 135 """ 136 Obtain the cache size (in bytes) of the epoch to which `block_number` 137 belongs. 138 139 See [`INITIAL_CACHE_SIZE`] and [`CACHE_EPOCH_GROWTH_SIZE`] for the initial 140 size and linear growth rate, respectively. The cache is generated in 141 [`generate_cache`]. 142 143 The actual cache size is smaller than simply multiplying 144 `CACHE_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of 145 unintended cyclic behavior. It is defined as the highest prime number below 146 what linear growth would calculate. 147 148 [`INITIAL_CACHE_SIZE`]: ref:ethereum.ethash.INITIAL_CACHE_SIZE 149 [`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE 150 [`generate_cache`]: ref:ethereum.ethash.generate_cache 151 """ 152 size = INITIAL_CACHE_SIZE + (CACHE_EPOCH_GROWTH_SIZE * epoch(block_number)) 153 size -= HASH_BYTES 154 while not is_prime(size // HASH_BYTES): 155 size -= 2 * HASH_BYTES 156 157 return size 158 159 160def dataset_size(block_number: Uint) -> Uint: 161 """ 162 Obtain the dataset size (in bytes) of the epoch to which `block_number` 163 belongs. 164 165 See [`INITIAL_DATASET_SIZE`] and [`DATASET_EPOCH_GROWTH_SIZE`][ds] for the 166 initial size and linear growth rate, respectively. The complete dataset is 167 generated in [`generate_dataset`], while the slices used in verification 168 are generated in [`generate_dataset_item`]. 169 170 The actual dataset size is smaller than simply multiplying 171 `DATASET_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of 172 unintended cyclic behavior. 
It is defined as the highest prime number below 173 what linear growth would calculate. 174 175 [`INITIAL_DATASET_SIZE`]: ref:ethereum.ethash.INITIAL_DATASET_SIZE 176 [ds]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE 177 [`generate_dataset`]: ref:ethereum.ethash.generate_dataset 178 [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item 179 """ 180 size = INITIAL_DATASET_SIZE + ( 181 DATASET_EPOCH_GROWTH_SIZE * epoch(block_number) 182 ) 183 size -= MIX_BYTES 184 while not is_prime(size // MIX_BYTES): 185 size -= 2 * MIX_BYTES 186 187 return size 188 189 190def generate_seed(block_number: Uint) -> Hash32: 191 """ 192 Obtain the cache generation seed for the block identified by 193 `block_number`. See [`generate_cache`]. 194 195 [`generate_cache`]: ref:ethereum.ethash.generate_cache 196 """ 197 epoch_number = epoch(block_number) 198 199 seed = b"\x00" * 32 200 while epoch_number != 0: 201 seed = keccak256(seed) 202 epoch_number -= 1 203 204 return Hash32(seed) 205 206 207def generate_cache(block_number: Uint) -> Tuple[Tuple[U32, ...], ...]: 208 """ 209 Generate the cache for the block identified by `block_number`. See 210 [`generate_dataset`] for how the cache is used. 211 212 The cache is generated in two steps: filling the array with a chain of 213 [`keccak512`] hashes, then running two rounds of Sergio Demian Lerner's 214 [RandMemoHash] on those bytes. 
215 216 [`keccak512`]: ref:ethereum.crypto.hash.keccak512 217 [`generate_dataset`]: ref:ethereum.ethash.generate_dataset 218 [RandMemoHash]: http://www.hashcash.org/papers/memohash.pdf 219 """ 220 seed = generate_seed(block_number) 221 cache_size_bytes = cache_size(block_number) 222 223 cache_size_words = cache_size_bytes // HASH_BYTES 224 cache = [keccak512(seed)] 225 226 for index in range(1, cache_size_words): 227 cache_item = keccak512(cache[index - 1]) 228 cache.append(cache_item) 229 230 for _ in range(CACHE_ROUNDS): 231 for index in range(cache_size_words): 232 # Converting `cache_size_words` to int as `-1 + Uint(5)` is an 233 # error. 234 first_cache_item = cache[ 235 (index - 1 + int(cache_size_words)) % cache_size_words 236 ] 237 second_cache_item = cache[ 238 U32.from_le_bytes(cache[index][0:4]) % cache_size_words 239 ] 240 result = bytes( 241 [a ^ b for a, b in zip(first_cache_item, second_cache_item)] 242 ) 243 cache[index] = keccak512(result) 244 245 return tuple( 246 le_bytes_to_uint32_sequence(cache_item) for cache_item in cache 247 ) 248 249 250def fnv(a: Union[Uint, U32], b: Union[Uint, U32]) -> U32: 251 """ 252 A non-associative substitute for XOR, inspired by the [FNV] hash by Fowler, 253 Noll, and Vo. See [`fnv_hash`], [`generate_dataset_item`], and 254 [`hashimoto`]. 255 256 Note that here we multiply the prime with the full 32-bit input, in 257 contrast with the [FNV-1] spec which multiplies the prime with one byte 258 (octet) in turn. 259 260 [`hashimoto`]: ref:ethereum.ethash.hashimoto 261 [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item 262 [`fnv_hash`]: ref:ethereum.ethash.fnv_hash 263 [FNV]: https://w.wiki/XKZ 264 [FNV-1]: http://www.isthe.com/chongo/tech/comp/fnv/#FNV-1 265 """ 266 # This is a faster way of doing `number % (2 ** 32)`. 267 result = ((Uint(a) * 0x01000193) ^ Uint(b)) & U32.MAX_VALUE 268 return U32(result) 269 270 271def fnv_hash( 272 mix_integers: Tuple[U32, ...], data: Tuple[U32, ...] 
273) -> Tuple[U32, ...]: 274 """ 275 Combines `data` into `mix_integers` using [`fnv`]. See [`hashimoto`] and 276 [`generate_dataset_item`]. 277 278 [`hashimoto`]: ref:ethereum.ethash.hashimoto 279 [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item 280 [`fnv`]: ref:ethereum.ethash.fnv 281 """ 282 return tuple( 283 fnv(mix_integers[i], data[i]) for i in range(len(mix_integers)) 284 ) 285 286 287def generate_dataset_item( 288 cache: Tuple[Tuple[U32, ...], ...], index: Uint 289) -> Hash64: 290 """ 291 Generate a particular dataset item 0-indexed by `index` by hashing 292 pseudorandomly-selected entries from `cache` together. See [`fnv`] and 293 [`fnv_hash`] for the digest function, [`generate_cache`] for generating 294 `cache`, and [`generate_dataset`] for the full dataset generation 295 algorithm. 296 297 [`fnv`]: ref:ethereum.ethash.fnv 298 [`fnv_hash`]: ref:ethereum.ethash.fnv_hash 299 [`generate_dataset`]: ref:ethereum.ethash.generate_dataset 300 [`generate_cache`]: ref:ethereum.ethash.generate_cache 301 """ 302 mix = keccak512( 303 ( 304 le_uint32_sequence_to_uint(cache[index % len(cache)]) ^ index 305 ).to_le_bytes(number_bytes=HASH_BYTES) 306 ) 307 308 mix_integers = le_bytes_to_uint32_sequence(mix) 309 310 for j in range(DATASET_PARENTS): 311 mix_word: U32 = mix_integers[j % 16] 312 cache_index = fnv(index ^ j, mix_word) % len(cache) 313 parent = cache[cache_index] 314 mix_integers = fnv_hash(mix_integers, parent) 315 316 mix = Hash64(le_uint32_sequence_to_bytes(mix_integers)) 317 318 return keccak512(mix) 319 320 321def generate_dataset(block_number: Uint) -> Tuple[Hash64, ...]: 322 """ 323 Generate the full dataset for the block identified by `block_number`. 324 325 This function is present only for demonstration purposes. It is not used 326 while validating blocks. 327 """ 328 dataset_size_bytes: Uint = dataset_size(block_number) 329 cache: Tuple[Tuple[U32, ...], ...] 
= generate_cache(block_number) 330 331 # TODO: Parallelize this later on if it adds value 332 return tuple( 333 generate_dataset_item(cache, Uint(index)) 334 for index in range(dataset_size_bytes // HASH_BYTES) 335 ) 336 337 338def hashimoto( 339 header_hash: Hash32, 340 nonce: Bytes8, 341 dataset_size: Uint, 342 fetch_dataset_item: Callable[[Uint], Tuple[U32, ...]], 343) -> Tuple[bytes, Hash32]: 344 """ 345 Obtain the mix digest and the final value for a header, by aggregating 346 data from the full dataset. 347 348 #### Parameters 349 350 - `header_hash` is a valid [RLP hash] of a block header. 351 - `nonce` is the propagated nonce for the given block. 352 - `dataset_size` is the size of the dataset. See [`dataset_size`]. 353 - `fetch_dataset_item` is a function that retrieves a specific dataset item 354 based on its index. 355 356 #### Returns 357 358 - The mix digest generated from the header hash and propagated nonce. 359 - The final result obtained which will be checked for leading zeros (in 360 byte representation) in correspondence with the block difficulty. 361 362 [RLP hash]: ref:ethereum.rlp.rlp_hash 363 [`dataset_size`]: ref:ethereum.ethash.dataset_size 364 """ 365 nonce_le = bytes(reversed(nonce)) 366 seed_hash = keccak512(header_hash + nonce_le) 367 seed_head = U32.from_le_bytes(seed_hash[:4]) 368 369 rows = dataset_size // 128 370 mix = le_bytes_to_uint32_sequence(seed_hash) * (MIX_BYTES // HASH_BYTES) 371 372 for i in range(HASHIMOTO_ACCESSES): 373 new_data: Tuple[U32, ...] = () 374 parent = fnv(i ^ seed_head, mix[i % len(mix)]) % rows 375 for j in range(MIX_BYTES // HASH_BYTES): 376 # Typecasting `parent` from U32 to Uint as 2*parent + j may 377 # overflow U32. 
378 new_data += fetch_dataset_item(2 * Uint(parent) + j) 379 380 mix = fnv_hash(mix, new_data) 381 382 compressed_mix = [] 383 for i in range(0, len(mix), 4): 384 compressed_mix.append( 385 fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3]) 386 ) 387 388 mix_digest = le_uint32_sequence_to_bytes(compressed_mix) 389 result = keccak256(seed_hash + mix_digest) 390 391 return mix_digest, result 392 393 394def hashimoto_light( 395 header_hash: Hash32, 396 nonce: Bytes8, 397 cache: Tuple[Tuple[U32, ...], ...], 398 dataset_size: Uint, 399) -> Tuple[bytes, Hash32]: 400 """ 401 Run the [`hashimoto`] algorithm by generating dataset item using the cache 402 instead of loading the full dataset into main memory. 403 404 #### Parameters 405 406 - `header_hash` is a valid [RLP hash] of a block header. 407 - `nonce` is the propagated nonce for the given block. 408 - `cache` is the cache generated by [`generate_cache`]. 409 - `dataset_size` is the size of the dataset. See [`dataset_size`]. 410 411 #### Returns 412 413 - The mix digest generated from the header hash and propagated nonce. 414 - The final result obtained which will be checked for leading zeros (in 415 byte representation) in correspondence with the block difficulty. 416 417 [RLP hash]: ref:ethereum.rlp.rlp_hash 418 [`dataset_size`]: ref:ethereum.ethash.dataset_size 419 [`generate_cache`]: ref:ethereum.ethash.generate_cache 420 [`hashimoto`]: ref:ethereum.ethash.hashimoto 421 """ 422 423 def fetch_dataset_item(index: Uint) -> Tuple[U32, ...]: 424 item: Hash64 = generate_dataset_item(cache, index) 425 return le_bytes_to_uint32_sequence(item) 426 427 return hashimoto(header_hash, nonce, dataset_size, fetch_dataset_item)
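The seed chain in `generate_seed` is a repeated hash of a 32-byte zero seed, once per elapsed epoch. The sketch below keeps that structure but, to stay dependency-free, substitutes `hashlib.sha3_256` for the `keccak256` the module actually uses (the two differ in padding, so the byte values here are illustrative only):

```python
import hashlib

EPOCH_SIZE = 30000


def generate_seed(block_number: int) -> bytes:
    """Hash a zero seed once per elapsed epoch (stand-in hash, see above)."""
    epoch_number = block_number // EPOCH_SIZE
    seed = b"\x00" * 32
    for _ in range(epoch_number):
        seed = hashlib.sha3_256(seed).digest()
    return seed


# Every block in the same epoch shares a seed; epoch 0 uses the zero seed.
assert generate_seed(0) == b"\x00" * 32
assert generate_seed(100) == generate_seed(29999)
assert generate_seed(30000) != generate_seed(0)
```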
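The two-phase structure of `generate_cache` (a hash chain, then `CACHE_ROUNDS` rounds of RandMemoHash-style mixing) can be seen at toy scale. This sketch uses `hashlib.sha512` as a 64-byte stand-in for `keccak512` and a tiny item count, so only the shape, not the byte values, matches the real cache:

```python
import hashlib

CACHE_ROUNDS = 3


def h512(data: bytes) -> bytes:
    return hashlib.sha512(data).digest()  # 64 bytes, like keccak512


def toy_cache(seed: bytes, n_items: int) -> list:
    # Phase 1: fill the cache with a hash chain.
    cache = [h512(seed)]
    for i in range(1, n_items):
        cache.append(h512(cache[i - 1]))

    # Phase 2: RandMemoHash-style rounds; each item becomes the hash of the
    # XOR of its predecessor and a pseudorandomly chosen item, where the
    # choice comes from the item's own first four little-endian bytes.
    for _ in range(CACHE_ROUNDS):
        for i in range(n_items):
            first = cache[(i - 1) % n_items]
            r = int.from_bytes(cache[i][:4], "little") % n_items
            second = cache[r]
            cache[i] = h512(bytes(a ^ b for a, b in zip(first, second)))
    return cache


cache = toy_cache(b"\x00" * 32, 8)
assert len(cache) == 8 and all(len(item) == 64 for item in cache)
```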
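Since `fnv` here is just a multiply-and-XOR truncated to 32 bits, it can be written with plain integers. The mask with `0xFFFFFFFF` is the faster `% (2 ** 32)` the source comment mentions:

```python
FNV_PRIME = 0x01000193  # 16777619, the 32-bit FNV prime


def fnv(a: int, b: int) -> int:
    """Ethash variant: multiply the full 32-bit input by the prime, XOR, truncate."""
    return ((a * FNV_PRIME) ^ b) & 0xFFFFFFFF


assert fnv(0, 0) == 0
assert fnv(1, 0) == FNV_PRIME
# Non-associative, unlike plain XOR:
assert fnv(fnv(1, 2), 3) != fnv(1, fnv(2, 3))
```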
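The parent-selection loop of `generate_dataset_item` needs only `fnv` and the cache length: each of the `DATASET_PARENTS` iterations maps the item index and a mix word to a cache row. A hedged structural sketch — in the real function the mix is updated by `fnv_hash` after every parent, whereas here the mix words are held fixed (and are arbitrary placeholders, not keccak output) to keep the sketch short:

```python
FNV_PRIME = 0x01000193
DATASET_PARENTS = 256


def fnv(a: int, b: int) -> int:
    return ((a * FNV_PRIME) ^ b) & 0xFFFFFFFF


def parent_rows(index: int, mix_words: list, cache_len: int) -> list:
    # The j-th parent mixes the item index with the (j % 16)-th mix word.
    return [
        fnv(index ^ j, mix_words[j % 16]) % cache_len
        for j in range(DATASET_PARENTS)
    ]


# Placeholder mix words; 262139 is the epoch-0 cache row count (16776896 / 64).
rows = parent_rows(12345, list(range(16)), 262139)
assert len(rows) == DATASET_PARENTS
assert all(0 <= r < 262139 for r in rows)
```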
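The final compression step of `hashimoto` folds each group of four 32-bit mix words into one with nested `fnv` calls, shrinking the 32-word (128-byte) mix to the 8-word (32-byte) mix digest. A sketch of just that loop, with the same `fnv` arithmetic:

```python
FNV_PRIME = 0x01000193


def fnv(a: int, b: int) -> int:
    return ((a * FNV_PRIME) ^ b) & 0xFFFFFFFF


def compress_mix(mix: list) -> list:
    """Fold every four mix words into one, as in hashimoto's last loop."""
    return [
        fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3])
        for i in range(0, len(mix), 4)
    ]


# A 128-byte mix is 32 words; the digest is 8 words (32 bytes).
assert len(compress_mix([0] * 32)) == 8
assert compress_mix([0] * 32) == [0] * 8  # fnv(0, 0) == 0
```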