Python multiprocessing map function with shared memory object as additional parameter

Suppose you want to create a bunch of processes to consume a list of elements:

import multiprocessing

def consume(ele):
    return ele ** 2

if __name__ == '__main__':
    m_list = [9, 8, 7, 6, 5]
    # A small pool is plenty for a five-element list
    pool = multiprocessing.Pool(processes=4)
    result = pool.map(consume, m_list)
    pool.close()
    pool.join()
    print(result)  # [81, 64, 49, 36, 25]


Now, suppose you want a dictionary that is shared across processes and passed to `consume` as an additional parameter:

import multiprocessing
from multiprocessing import Manager
import functools

def consume(ele, dic):
    if ele not in dic:
        dic[ele] = ele ** 2
    return ele ** 2

if __name__ == '__main__':
    m_list = [9, 8, 7, 6, 5]
    # Use a Manager to create a dict that can be shared across processes
    proc_manager = Manager()
    m_dict = proc_manager.dict([(s, s ** 2) for s in [3, 4, 5, 6]])
    pool = multiprocessing.Pool(processes=4)
    # functools.partial binds dic=m_dict, so each worker call is consume(ele, dic=m_dict)
    result = pool.map(functools.partial(consume, dic=m_dict), m_list)
    pool.close()
    pool.join()
    print(result)  # [81, 64, 49, 36, 25]
    print(m_dict)  # the original keys plus whatever the workers added


Notice that we use `functools.partial` to bind the extra `dic` argument, since `Pool.map` only passes a single element to the worker function, and we use `Manager` to create a dictionary that is shared across processes (each worker receives a proxy that forwards reads and writes to the manager process).
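
If you are on Python 3, `Pool.starmap` is an alternative way to pass the extra argument without `functools.partial`. Here is a minimal sketch, not from the original post, assuming Python 3.3+ where `starmap` and the pool context manager exist:

import multiprocessing
from multiprocessing import Manager

def consume(ele, dic):
    if ele not in dic:
        dic[ele] = ele ** 2
    return ele ** 2

if __name__ == '__main__':
    m_list = [9, 8, 7, 6, 5]
    m_dict = Manager().dict()
    with multiprocessing.Pool(processes=4) as pool:
        # starmap unpacks each (ele, m_dict) tuple into consume's two parameters
        result = pool.starmap(consume, [(ele, m_dict) for ele in m_list])
    print(result)
    print(m_dict)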


P.S.: This post may be helpful for understanding how (and when) processes are forked in Python: http://stackoverflow.com/questions/8640367/python-manager-dict-in-multiprocessing/9536888#9536888
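
For intuition, here is a minimal sketch (not from the original post) contrasting a plain dict with a Manager dict: the plain dict is pickled into each worker, so writes stay in the child's own copy, while the Manager proxy forwards writes back to the manager process, making them visible to the parent:

import multiprocessing
from multiprocessing import Manager
import functools

def fill(ele, dic):
    # Write into whatever dict object this worker received
    dic[ele] = ele ** 2

if __name__ == '__main__':
    plain_dict = {}
    shared_dict = Manager().dict()
    pool = multiprocessing.Pool(processes=4)
    pool.map(functools.partial(fill, dic=plain_dict), [1, 2, 3])
    pool.map(functools.partial(fill, dic=shared_dict), [1, 2, 3])
    pool.close()
    pool.join()
    print(plain_dict)         # {} -- each worker only modified its own copy
    print(dict(shared_dict))  # holds {1: 1, 2: 4, 3: 9} (key order may vary) -- writes went through the proxy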
