In the previous article, "Common cache architectures: the penetration cache and the bypass cache", we introduced two common cache architectures: the penetration cache and the bypass cache.
The main difference between the penetration cache and the bypass cache architectures lies in how they handle a request for data that is not in the cache. However, cache capacity is limited. Over time, whether the architecture uses a penetration cache or a bypass cache, the cache will eventually fill up. At that point, in order to keep caching new data, we have to evict some old, rarely accessed data to free up space so the cache can continue to be used. How do we choose which data to evict? That is the job of the cache eviction algorithm.
The most common cache eviction algorithms are the following:
FIFO is short for First In, First Out, and it is arguably the simplest algorithm to implement. The cache maintains a queue: new data is always inserted at the head, and when the cache is full, the data at the tail of the queue, which is the oldest entry, is deleted.
The advantage of this algorithm is its simple implementation, but the disadvantage is just as obvious: eviction depends only on insertion order, so frequently accessed hot data can be evicted simply because it entered the cache early.
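To make this concrete, here is a minimal FIFO cache sketch. It is only an illustration, not code from any library; the class name FIFOCache and its maxsize parameter are chosen just for this example, and collections.OrderedDict is used simply because it preserves insertion order.

from collections import OrderedDict

class FIFOCache:
    """A minimal FIFO cache sketch: evict whichever entry was inserted first."""
    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._data = OrderedDict()  # preserves insertion order

    def get(self, key, default=None):
        # A hit does not change the entry's position: FIFO ignores access patterns entirely
        return self._data.get(key, default)

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # evict the oldest inserted entry
        self._data[key] = value

Note that get never reorders anything, which is exactly why hot entries can still be evicted.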
LFU is short for Least Frequently Used. The idea of the algorithm is to evict the data that has been accessed the fewest times. The implementation is also straightforward: maintain a single queue that stores each entry together with its access count. Every time an entry is accessed, its count is incremented and its node is moved forward in the linked list, so the whole queue stays ordered by access count, decreasing from head to tail. When eviction needs to run, only the entries at the tail of the queue have to be removed.
Compared with FIFO, this algorithm takes the number of accesses into account, which prevents hot data from being evicted before cold data. However, it still has a problem: once an entry has been accessed heavily within a short period, it keeps escaping eviction on the strength of its huge access count, even if it is never accessed again.
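As a rough illustration of the idea (not the linked-list variant described above), here is a minimal LFU sketch that keeps a separate access counter per key and evicts the key with the smallest count using a simple O(n) scan; the class name and structure are assumptions made for this example only.

class LFUCache:
    """A minimal LFU cache sketch: evict the entry with the fewest accesses."""
    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._data = {}    # key -> value
        self._counts = {}  # key -> access count

    def get(self, key, default=None):
        if key in self._data:
            self._counts[key] += 1  # every hit bumps the access count
            return self._data[key]
        return default

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            victim = min(self._counts, key=self._counts.get)  # least frequently used key
            del self._data[victim]
            del self._counts[victim]
        self._data[key] = value
        self._counts.setdefault(key, 0)

The drawback mentioned above is easy to see here: a burst of get calls permanently inflates an entry's count, so it keeps beating entries that are accessed more recently but less often.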
LRU is short for Least Recently Used. This is the most widely used cache eviction algorithm in practice. Its idea is very similar to LFU, but it drops the access count and keeps only the "promote on access" policy in the queue, which avoids the inflated-count problem described above. Because the algorithm is so widely used, we will take Python's common tool for caching a function's parameters and results, functools.lru_cache, and use it to walk through the algorithm in detail.
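Before walking through the standard library's implementation, here is a minimal LRU sketch built on collections.OrderedDict, which supports moving an entry to the end and popping from either end; the class name is illustrative, and functools.lru_cache itself uses a dictionary plus a circular doubly linked list instead, as shown later.

from collections import OrderedDict

class LRUCache:
    """A minimal LRU cache sketch: evict the entry that was used least recently."""
    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key in self._data:
            self._data.move_to_end(key)  # a hit makes this entry the most recently used
            return self._data[key]
        return default

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry
        self._data[key] = value

Changing popitem(last=False) to popitem(last=True) would evict the most recently used entry instead, which is the MRU policy described next.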
In contrast to LRU, MRU is short for Most Recently Used. It works the opposite way: MRU preferentially evicts the most recently used data, and it suits workloads where data that has not been used for a long time is the most likely to be accessed next.
The functools.lru_cache decorator in Python's standard library implements an LRU cache: it stores the mapping from a function's arguments to its return value, improving performance when the function is repeatedly called with the same arguments. For background on Python closures and decorators, see the earlier articles: "The closure property of Python" and "The Python decorator and its principle".
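Typical usage is simply to decorate the function whose results you want to cache; the slow_square function below is a made-up example. A decorated function also gains cache_info() and cache_clear() helpers.

import functools

@functools.lru_cache(maxsize=128)
def slow_square(n):
    # stand-in for an expensive computation
    return n * n

slow_square(3)   # computed and stored in the cache
slow_square(3)   # returned directly from the cache
print(slow_square.cache_info())  # e.g. CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)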
The following is a simplified version of the functools.lru_cache source code from the Python standard library:
from _thread import RLock

kwd_mark = (object(),)  # sentinel tuple separating positional from keyword arguments in the key

def make_key(args, kwds):
    """
    Build the cache key from the decorated function's arguments.
    :param args: positional arguments
    :param kwds: keyword arguments
    :return: a hash of the combined arguments, or the single argument itself if it is of a simple type
    """
    key = args
    if kwds:
        key += kwd_mark
        for item in kwds.items():
            key += item
    elif len(key) == 1 and type(key[0]) in {int, str, frozenset, type(None)}:
        return key[0]
    return hash(key)
def lru_cache(maxsize=128):
    def decorating_function(user_function):
        PREV, NEXT, KEY, RESULT = 0, 1, 2, 3   # field indices of each linked-list node
        cache = {}                  # maps key -> linked-list node holding the cached result
        hits = misses = 0           # cache hit / miss counters
        full = False                # whether the cache has reached maxsize
        cache_len = cache.__len__   # bound method returning the current cache size
        lock = RLock()              # one shared re-entrant lock, created once per decorated function
        root = []                   # sentinel node of the circular doubly linked list
        root[:] = [root, root, None, None]
        def lru_cache_wrapper(*args, **kwds):
            nonlocal root, hits, misses, full
            key = make_key(args, kwds)
            with lock:  # thread-safety lock
                link = cache.get(key)
                # Cache hit: move the hit node and return the cached result directly
                if link is not None:
                    # Unlink the hit node from the list
                    link_prev, link_next, _key, result = link
                    link_prev[NEXT] = link_next
                    link_next[PREV] = link_prev
                    # Re-insert it just before root, i.e. at the most recently used position
                    last = root[PREV]
                    last[NEXT] = root[PREV] = link
                    link[PREV] = last
                    link[NEXT] = root
                    hits += 1
                    return result
""" Cache miss , Call the method to get the return , Create nodes , Elimination algorithm """
result = user_function(*args, **kwds)
with RLock():
if key in cache:
pass
elif full:
""" Buffer full , Eliminate the first element of the team , Delete the corresponding buffer element """
oldroot = root
oldroot[KEY] = key
oldroot[RESULT] = result
root = oldroot[NEXT]
oldkey = root[KEY]
root[KEY] = root[RESULT] = None
del cache[oldkey]
cache[key] = oldroot
else:
""" The buffer is not full , Create nodes directly , insert data """
last = root[PREV]
link = [last, root, key, result]
last[NEXT] = root[PREV] = cache[key] = link
full = (cache_len() >= maxsize)
misses += 1
return result
return lru_cache_wrapper
return decorating_function
The logic of the code above is clear and the implementation is quite clever. It uses a dictionary as the cache and, alongside it, a circular doubly linked list. Each node in the linked list is itself a list whose four elements are the reference to the previous node, the reference to the next node, the key, and the function's return value. The LRU algorithm is realized by keeping the dictionary and the circular doubly linked list in sync.
The following figure shows how the algorithm executes in two scenarios: a cache hit and a cache eviction.
As we mentioned earlier, Python has no tail-call optimization, so recursive algorithms can be very inefficient. In a previous article, "The classic dynamic programming problem: the frog climbing the steps, and recursion optimization in Python", we implemented a simple tail-recursion optimization ourselves for exactly this situation.
Now let's bring in the clock decorator from that article and look at a program that computes the Fibonacci sequence recursively.
import functools
import time

def clock(func):
    """A decorator that prints each call's elapsed time, arguments, and result."""
    @functools.wraps(func)
    def clocked(*args, **kwargs):
        t0 = time.time()
        result = func(*args, **kwargs)
        elapsed = time.time() - t0
        name = func.__name__
        arg_lst = []
        if args:
            arg_lst.append(', '.join(repr(arg) for arg in args))
        if kwargs:
            pairs = ['%s=%r' % (k, w) for k, w in sorted(kwargs.items())]
            arg_lst.append(', '.join(pairs))
        arg_str = ', '.join(arg_lst)
        print('[%0.8fs] %s(%s) -> %r' % (elapsed, name, arg_str, result))
        return result
    return clocked
@clock
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)

if __name__ == '__main__':
    print(fibonacci(6))
Running the program prints:
[0.00000000s] fibonacci(0) -> 0
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(2) -> 1
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(0) -> 0
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(2) -> 1
[0.00000000s] fibonacci(3) -> 2
[0.00050116s] fibonacci(4) -> 3
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(0) -> 0
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(2) -> 1
[0.00000000s] fibonacci(3) -> 2
[0.00000000s] fibonacci(0) -> 0
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(2) -> 1
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(0) -> 0
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(2) -> 1
[0.00000000s] fibonacci(3) -> 2
[0.00000000s] fibonacci(4) -> 3
[0.00000000s] fibonacci(5) -> 5
[0.00050116s] fibonacci(6) -> 8
8
As you can see, there are many calls with repeated arguments, which wastes performance, and as n grows the number of repeated calls multiplies. An effective optimization is to cache the results of these repeated calls so they can be returned directly on subsequent invocations. That is exactly what lru_cache is for. The complete program, with the simplified lru_cache applied, is shown below.
import functools
import time
from _thread import RLock

kwd_mark = (object(),)  # sentinel tuple separating positional from keyword arguments in the key

def make_key(args, kwds):
    """
    Build the cache key from the decorated function's arguments.
    :param args: positional arguments
    :param kwds: keyword arguments
    :return: a hash of the combined arguments, or the single argument itself if it is of a simple type
    """
    key = args
    if kwds:
        key += kwd_mark
        for item in kwds.items():
            key += item
    elif len(key) == 1 and type(key[0]) in {int, str, frozenset, type(None)}:
        return key[0]
    return hash(key)
def lru_cache(maxsize=128):
    def decorating_function(user_function):
        PREV, NEXT, KEY, RESULT = 0, 1, 2, 3   # field indices of each linked-list node
        cache = {}                  # maps key -> linked-list node holding the cached result
        hits = misses = 0           # cache hit / miss counters
        full = False                # whether the cache has reached maxsize
        cache_len = cache.__len__   # bound method returning the current cache size
        lock = RLock()              # one shared re-entrant lock, created once per decorated function
        root = []                   # sentinel node of the circular doubly linked list
        root[:] = [root, root, None, None]
        def lru_cache_wrapper(*args, **kwds):
            nonlocal root, hits, misses, full
            key = make_key(args, kwds)
            with lock:  # thread-safety lock
                link = cache.get(key)
                # Cache hit: move the hit node and return the cached result directly
                if link is not None:
                    # Unlink the hit node from the list
                    link_prev, link_next, _key, result = link
                    link_prev[NEXT] = link_next
                    link_next[PREV] = link_prev
                    # Re-insert it just before root, i.e. at the most recently used position
                    last = root[PREV]
                    last[NEXT] = root[PREV] = link
                    link[PREV] = last
                    link[NEXT] = root
                    hits += 1
                    return result
""" Cache miss , Call the method to get the return , Create nodes , Elimination algorithm """
result = user_function(*args, **kwds)
with RLock():
if key in cache:
pass
elif full:
""" Buffer full , Eliminate the first element of the team , Delete the corresponding buffer element """
oldroot = root
oldroot[KEY] = key
oldroot[RESULT] = result
root = oldroot[NEXT]
oldkey = root[KEY]
root[KEY] = root[RESULT] = None
del cache[oldkey]
cache[key] = oldroot
else:
""" The buffer is not full , Create nodes directly , insert data """
last = root[PREV]
link = [last, root, key, result]
last[NEXT] = root[PREV] = cache[key] = link
full = (cache_len() >= maxsize)
misses += 1
return result
return lru_cache_wrapper
return decorating_function
def clock(func):
    """A decorator that prints each call's elapsed time, arguments, and result."""
@functools.wraps(func)
def clocked(*args, **kwargs):
t0 = time.time()
result = func(*args, **kwargs)
elapsed = time.time() - t0
name = func.__name__
arg_lst = []
if args:
arg_lst.append(', '.join(repr(arg) for arg in args))
if kwargs:
pairs = ['%s=%r' % (k, w) for k, w in sorted(kwargs.items())]
arg_lst.append(', '.join(pairs))
arg_str = ', '.join(arg_lst)
print('[%0.8fs] %s(%s) -> %r ' % (elapsed, name, arg_str, result))
return result
return clocked
@lru_cache()
@clock
def fibonacci(n):
if n < 2:
return n
return fibonacci(n - 2) + fibonacci(n - 1)
if __name__ == '__main__':
print(fibonacci(6))
Running the program now prints:
[0.00000000s] fibonacci(0) -> 0
[0.00000000s] fibonacci(1) -> 1
[0.00000000s] fibonacci(2) -> 1
[0.00000000s] fibonacci(3) -> 2
[0.00000000s] fibonacci(4) -> 3
[0.00000000s] fibonacci(5) -> 5
[0.00000000s] fibonacci(6) -> 8
8
We can see that the function went from the original 25 recursive calls down to only 7, and the improvement in execution time is just as obvious.