Jupyter Parallelism Tutorial

In this post, I introduce my favorite way to run the cells of a Jupyter notebook in parallel.

1. Start a cluster, either from the command line (`ipcluster start -n 2`) or from Python with `subprocess.Popen`. The example below creates a cluster with 2 workers:

from subprocess import Popen
p = Popen(['ipcluster', 'start', '-n', '2'])
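A minimal sketch of wrapping the start and stop commands in helpers, so the cluster can also be shut down from the notebook when you are done. The helper names (`ipcluster_command`, `start_cluster`, `stop_cluster`) are hypothetical; `ipcluster start -n` and `ipcluster stop` are the underlying command-line calls.

```python
from subprocess import Popen


def ipcluster_command(action, n_workers=None):
    """Build the argument list for `ipcluster start` or `ipcluster stop`."""
    cmd = ['ipcluster', action]
    if action == 'start' and n_workers is not None:
        cmd += ['-n', str(n_workers)]
    return cmd


def start_cluster(n_workers):
    """Launch the cluster in the background and return the process handle."""
    return Popen(ipcluster_command('start', n_workers))


def stop_cluster():
    """Ask the running cluster to shut down; wait for the stop command to exit."""
    return Popen(ipcluster_command('stop')).wait()


# Typical use in a notebook (not run here):
#   p = start_cluster(2)
#   ... do the parallel work ...
#   stop_cluster()
```

The cluster may need a few seconds to come up, so connecting a client immediately after `start_cluster` can fail; waiting briefly and retrying is usually enough.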

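2. Then set up the client programmatically. I usually make each worker's view non-blocking, so that I can keep experimenting in other cells even while the workers are running heavy tasks. Calling `activate('0')` registers the magics for that worker under the suffix `0` (`%px0`, `%%px0`, `%pxresult0`):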

from ipyparallel import Client
rc = Client()
dview = rc[:]
print("%d workers are running" % len(rc))

e0 = rc[0]
e0.block = False
e0.activate('0')
e1 = rc[1]
e1.block = False
e1.activate('1')
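With more than two workers, the per-worker setup above gets repetitive. Here is a sketch of the same steps in a loop, assuming `rc` is a connected `Client`. The helper name `activate_workers` is hypothetical, and using the engine id as the magic suffix is just the convention followed in this post.

```python
def activate_workers(rc):
    """Create one non-blocking view per engine and register its magics.

    After this call, engine i is reachable through %pxi, %%pxi and %pxresulti.
    Returns a dict mapping engine id -> view.
    """
    views = {}
    for engine_id in rc.ids:           # ids of the connected engines
        view = rc[engine_id]           # view on a single engine
        view.block = False             # calls on this view return immediately
        view.activate(str(engine_id))  # suffix used by the %px magics
        views[engine_id] = view
    return views


# Typical use in a notebook (not run here):
#   from ipyparallel import Client
#   rc = Client()
#   views = activate_workers(rc)
```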

3. Then, to run a cell on a worker, put the cell magic `%%px0` on the first line of the cell. Here `0` means the cell runs on the first worker. You can replace `0` with the suffix of any worker you activated in step 2. The example below uses `%%px1`, so it runs on the second worker:

%%px1
# An example of asynchronous parallelism
# Note: df must already exist on the worker (see step 4)
print(len(df))
import time
c = 0
print(time.time())
while c < 3:
    time.sleep(3)
    c += 1
print(time.time())

4. You can think of each cell starting with `%%px[num]` as an independent workspace. You therefore need to explicitly import the modules and load the data objects you want to use inside the parallel cells. In the example above, I must write `import time` in the cell in order to use the `time` module there. The alternative is to push data and import modules programmatically:

# Push data to all workers. The objects must be passed as a dict mapping names to values.
dview.push({'churn':churn, 'data_end_date':data_end_date, 'CHURN_INT':CHURN_INT})

# Import any modules you want, both locally and on all workers
with dview.sync_imports():
    import sys
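Since `push` takes a dict of names to values, a small helper can build that dict from a list of variable names. This is only a sketch: `push_names` is a hypothetical helper, and `namespace` would typically be `globals()` in the notebook.

```python
def push_names(view, namespace, names):
    """Push the variables listed in `names` from `namespace` to the workers."""
    missing = [name for name in names if name not in namespace]
    if missing:
        raise KeyError("not defined locally: %s" % ", ".join(missing))
    payload = {name: namespace[name] for name in names}
    return view.push(payload)  # same call as dview.push({...}) above


# Typical use in a notebook (not run here):
#   push_names(dview, globals(), ['churn', 'data_end_date', 'CHURN_INT'])
```

Failing early on a misspelled name is a design choice here; it avoids discovering the missing variable only when a parallel cell raises a `NameError` on the worker.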

5. Finally, to get the results, use `%pxresult0`. (As before, replace `0` with the suffix of the worker you want.)

%pxresult0

Note that `%pxresult0` blocks if the result is not ready yet. If you want to keep experimenting in other cells, don’t run `%pxresult0` too early.
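To check on a task without blocking, you can poll the `AsyncResult` that a non-blocking call returns (for example from `e0.execute(...)` or `e0.apply(...)`) with its `ready()` method. The sketch below uses the standard library's `multiprocessing.pool.ThreadPool` as a stand-in so that it runs without a cluster; its `AsyncResult` offers the same `ready()` and `get()` methods. The names `slow_task` and `result_if_ready` are hypothetical.

```python
import time
from multiprocessing.pool import ThreadPool


def slow_task(seconds):
    """Stand-in for a heavy computation running on a worker."""
    time.sleep(seconds)
    return seconds


def result_if_ready(async_result, default=None):
    """Return the result if it is available, otherwise `default`; never blocks."""
    if async_result.ready():
        return async_result.get()
    return default


pool = ThreadPool(1)
ar = pool.apply_async(slow_task, (0.5,))  # returns immediately
early = result_if_ready(ar, default='still running')
ar.wait()                                 # block until the task finishes
late = result_if_ready(ar)
pool.close()
pool.join()
```

With an ipyparallel view, the same `result_if_ready` helper should work on the object returned by a non-blocking call, which lets you decide when it is safe to run `%pxresult0`.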


References:

http://minrk.github.io/drop/nbconverttest.html

https://ipython.org/ipython-doc/3/parallel/magics.html (for old ipython notebook)

http://ipyparallel.readthedocs.io/en/latest/ (for jupyter)
