If you do not have root privileges and want to install a Python module, you can try the following approach: `python setup.py install --user`. This installs packages into subdirectories of `site.USER_BASE`. To check the value of `site.USER_BASE`, use:

    import site
    print(site.USER_BASE)

Reference: https://docs.python.org/2/install/ Update 2018/01/06: using pip to …
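As a small complement to the check above, here is a minimal sketch that prints both the user base and the user site-packages directory. It assumes Python 3 and uses the standard-library functions `site.getuserbase()` and `site.getusersitepackages()`; the variable names are only illustrative.

```python
import site

# getuserbase() returns the per-user base directory (the value of site.USER_BASE).
user_base = site.getuserbase()

# getusersitepackages() returns the per-user site-packages directory,
# which is where a --user install places importable modules.
user_site = site.getusersitepackages()

print(user_base)
print(user_site)
```

The exact paths depend on the platform and Python version, so no expected output is shown here.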
Category Archives: Python
Right way to put test codes in a Python project
I have struggled for a long time with where to put test files in a Python project. Ideally, I think the cleanest layout is a folder called “test” that holds all the test files. However, the test files nested in the test folder need to import modules from the parent folder. It is troublesome to import Python module …
Jupyter Parallelism Tutorial
In this post, I introduce my favorite way to run cells in a Jupyter notebook in parallel. 1. Initialize a cluster from the command line or with Python's `Popen` (in the example below, I create a cluster with 2 workers):

    from subprocess import Popen
    p = Popen(['ipcluster', 'start', '-n', '2'])

2. Then, programmatically set …
Add permanent key bindings for Jupyter Notebook
This post shows how to add customized permanent key bindings for Jupyter Notebook. 1. Check the location of your Jupyter config folder using the command: `sudo ~/.local/bin/jupyter --config-dir` I am running Ubuntu, where the config folder is `/home/your_user_name/.jupyter` by default. 2. Create a folder `custom` in the config folder. 3. Create a file `custom.js` in the …
slicing in numpy array and list
For a normal list in Python, slicing copies the references without copying the underlying contents. (See the fact that `id(a[1])` and `id(b[0])` are identical below.)

    >>> a = [1,2,3]
    >>> b = a[1:3]
    >>> a
    [1, 2, 3]
    >>> b
    [2, 3]
    >>> id(a[1])
    25231680
    >>> id(b[0])
    25231680
    >>> b[0] = 999
    >>> a
    [1, 2, 3]
    >>> …
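The consequence of copying references is easier to see with mutable elements. The following is a minimal sketch in Python 3 with made-up example data: mutating an object reached through the slice is visible in the original list, while rebinding a slot in the slice is not.

```python
# A list whose elements are themselves mutable lists (hypothetical example data).
a = [[1], [2], [3]]
b = a[1:3]  # new outer list, but it holds references to the same inner lists

# The slice and the original share the same element objects.
same_object = b[0] is a[1]

# Mutating a shared inner list is visible through both a and b.
b[0].append(99)

# Rebinding a slot in b only changes b; a keeps its own reference.
b[1] = "replaced"

print(same_object)
print(a)
print(b)
```

This is why `b[0] = 999` in the session above leaves `a` unchanged: the assignment replaces a reference inside `b` and never touches the object that `a[1]` points to.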
Create 2D array in Python
I used to create a 2D zero array in the following way, for example: `arr = [[0] * 3] * 4` However, `arr` actually references the same list [0,0,0] 4 times. If you set `arr[1][1] = 5`, for example, you will find that all the “other” lists in `arr` contain 5 as well.

    >>> arr[1][1] = 5
    >>> arr
    [[0, …
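One common way to avoid the aliasing is a list comprehension, which builds a fresh inner list for every row. The excerpt is cut off before the post's own fix, so treat this as a sketch of a standard idiom, not necessarily the author's solution; the names `rows`, `cols`, `aliased`, and `independent` are illustrative.

```python
rows, cols = 4, 3

# Outer multiplication repeats a reference to ONE inner list.
aliased = [[0] * cols] * rows

# The comprehension evaluates [0] * cols once per row, so each row is a distinct list.
independent = [[0] * cols for _ in range(rows)]

aliased[1][1] = 5
independent[1][1] = 5

print(aliased)      # every row shows the 5
print(independent)  # only row 1 shows the 5
```

The inner `[0] * cols` is safe because integers are immutable; the problem only appears when the repeated element is itself mutable.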
Configure PySpark in Eclipse/Pydev
Go here and download a prebuilt version of Spark (so that you don’t need to compile Spark locally later). Follow this slide deck: http://www.slideshare.net/prossblad/install-eclipse-for-sparkpython-49100874 to set up the Python interpreter/environment in Eclipse/PyDev.
Use PDB to check variables before crashes
1. Use `python -i your_script.py` to execute your program in interactive mode. This means that after your program finishes executing, or crashes midway, you will enter a Python shell. 2. Suppose your script has a bug, so you enter the Python shell after it crashes. Now you can play with pdb …
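In the shell opened by `python -i`, the standard library offers `pdb.pm()` for post-mortem debugging of the last uncaught exception. The excerpt is truncated, so whether the post uses exactly that call is not visible here. The following is a programmatic variant of the same idea, a sketch only: `average` and `run_with_post_mortem` are hypothetical names, and the wrapper is defined but deliberately not called, because it would open an interactive prompt.

```python
import pdb
import sys


def average(values):
    """Hypothetical buggy function: divides by zero on an empty list."""
    total = sum(values)
    return total / len(values)


def run_with_post_mortem(func, *args):
    """Call func(*args); if it raises, open pdb at the frame that failed."""
    try:
        return func(*args)
    except Exception:
        # sys.exc_info()[2] is the traceback of the exception being handled.
        pdb.post_mortem(sys.exc_info()[2])
        raise


# Interactive usage (not executed here):
#   run_with_post_mortem(average, [])
# At the (Pdb) prompt, `p values` and `p total` print the local variables
# as they were at the moment of the crash.
```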
Python multiprocessing map function with shared memory object as additional parameter
Suppose you want to create a bunch of processes to consume a list of elements:

    import multiprocessing

    def consume(ele):
        return ele ** 2

    if __name__ == '__main__':
        m_list = [9, 8, 7, 6, 5]
        pool = multiprocessing.Pool(processes=300)
        result = pool.map(consume, m_list)
        print(result)

Now, if you want to have a dictionary to be shared across processes, and pass …
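The excerpt ends before the post's own solution, so the following is only a sketch of one common approach, not necessarily the author's: create the dictionary with `multiprocessing.Manager().dict()` and bind it as an extra argument with `functools.partial`. The `offset` key, the `run` helper, and the pool size of 2 are assumptions made for illustration.

```python
import multiprocessing
from functools import partial


def consume(ele, shared):
    """Square ele and add an offset read from the shared mapping."""
    return ele ** 2 + shared["offset"]


def run(m_list, processes=2):
    """Map consume over m_list, passing one managed dict to every worker."""
    with multiprocessing.Manager() as manager:
        # A managed dict lives in a server process; workers receive a proxy to it.
        shared = manager.dict({"offset": 1})
        with multiprocessing.Pool(processes=processes) as pool:
            # partial fixes the second argument, so map still sees a one-argument function.
            return pool.map(partial(consume, shared=shared), m_list)


if __name__ == "__main__":
    print(run([9, 8, 7, 6, 5]))  # prints [82, 65, 50, 37, 26]
```

Every access to the managed dictionary goes through inter-process communication, so this suits small, infrequently accessed shared state better than hot inner loops.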
Enable GPU for Theano
1. Install Theano: http://deeplearning.net/software/theano/install.html 2. Use the following script to test that Theano works at least in CPU mode:

    '''
    test whether theano is using cpu or gpu
    '''
    from theano import function, config, shared, sandbox
    import theano.tensor as T
    import numpy
    import time

    vlen = 10 * 30 * 768  # 10 x …