Memory-hard function
In cryptography, a memory-hard function (MHF) is a function that costs a significant amount of memory to evaluate. It differs from a memory-bound function, which incurs cost by slowing down computation through memory latency. MHFs can be used as proof of work.
Memory hard measure
There are different ways to measure the memory hardness of a function, and a commonly seen measure is Cumulative Memory Complexity (CMC). In a parallel model, CMC is the sum of the memory required to compute a function over every time step of the computation.[1][2]
Other viable measures are integrating memory against physical time.[3], and measuring memory bandwidth consumption on a memory bus.[4] Functions requiring high memory bandwidth are sometimes referred to as "bandwidth-hard functions."[5]
Motivation
MHFs were designed to use a lot of memory instead of a different resource, such as CPU cycles. Bitcoin's proof-of-work used repeated evaluation of the SHA-2 function, but modern general-purpose processors, such as off-the-shelf CPUs, are inefficient when computing a fixed function many times over. Specialized hardware, such as application-specific integrated circuits (ASICs) designed for Bitcoin mining can compute these hashes up to 100,000 times faster. This led to concerns about the centralization of mining for Bitcoin and other cryptocurrencies. Because of this inequality between miners using ASICs and miners using CPUs or off-the shelf hardware, designers of later proof-of-work systems wanted to design hash functions for which it was difficult to ASIC that could evaluate the hash function significantly faster than a CPU.
Over time, it has been demonstrated that memory costs remains relatively constant between CPUs and more specialized hardware, which is why MHFs have found use in cryptocurrency mining. They are also useful in password hashing, because they significantly increase the cost of trying many possible passwords against a leaked database of hashed passwords.
Variants
MHFs can be categorized into two different groups based on their evaluation patterns: data-dependent Memory-Hard Functions (dMHF) and data-independent Memory-Hard Functions (iMHF). dMHFs are MHFs where it is not clear which data is needed for the later steps of evaluating the function until the function is evaluated, while iMHFs are functions where there is no such ambiguity. Examples of dMHFs are scrypt and Argon2d. Examples of iMHFs are Argon2i and catena. Many of these MHFs are developed to be used as password hashing functions exactly because of their memory hardness.
A notable problem of dMHFs is that they are prone to side-channel attacks like cache timing. People tend toward iMHFs for this reason, especially for password hashing. However, iMHFs are mathematically proven to have weaker memory hardness properties than dMHFs.[6]
References
- (AS15) Alwen, Serbineko, High Parallel Complexity Graphs and Memory-Hard Functions, 2015
- Alwen, Joel; Blocki, Jeremiah; Pietrzak, Krzysztof (2017-07-07). "Sustained Space Complexity". arXiv:1705.05313 [cs.CR].
- (MO16) Moran, Orlov, Simple Proofs of Space-Time and Rational Proofs of Storage, 2016
- (BR18) Blocki, Ren, Bandwidth-Hard Functions: Reductions and Lower Bounds, 2018
- Blocki, Jeremiah; Liu, Peiyuan; Ren, Ling; Zhou, Samson (2022). "Bandwidth-Hard Functions: Reductions and Lower Bounds" (PDF). Cryptology ePrint Archive. Archived (PDF) from the original on 2023-01-11. Retrieved 2023-01-11.
- Alwen, J., Blocki, J. (2016). Efficiently Computing Data-Independent Memory-Hard Functions.