Python vs NumPy vs Nim
Yesterday I stumbled upon the article Pure Python vs NumPy vs TensorFlow Performance Comparison, in which the author compares the performance of different implementations of the gradient descent algorithm for a simple linear regression example.
Lately I’ve been experimenting with the Nim programming language, which promises a Python-like, easy-to-read syntax with C-like speed. This seemed like a nice, simple example for comparing the speed of Nim and Python.
As everybody would expect, the article showed that the pure Python version is much slower than the other two versions, but nobody would write numerical code like that.
NumPy allows us to write both more readable and much faster code, as it takes advantage of vectorised operations on NumPy arrays, and usually calls optimized C or Fortran code.
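To make the difference concrete, here is a minimal sketch of one gradient evaluation written both ways. The function names gradients_loop and gradients_numpy are invented for this illustration and are not taken from the original article; the formula mirrors the gradient descent update used later in this post, and the sketch assumes NumPy is installed.

```python
import numpy as np


def gradients_loop(x, d, w0, w1):
    """Descent direction for the mean squared error, element by element."""
    n = len(x)
    grad0 = grad1 = 0.0
    for xi, di in zip(x, d):
        err = (2.0 / n) * (di - (w0 + w1 * xi))
        grad0 += err
        grad1 += err * xi
    return grad0, grad1


def gradients_numpy(x, d, w0, w1):
    """Same computation, expressed as whole-array (vectorised) operations."""
    err = (2.0 / x.size) * (d - (w0 + w1 * x))
    return err.sum(), (err * x).sum()


x = np.linspace(0, 2, 50)
d = 3 + 2 * x  # noise-free targets, for illustration only
print(gradients_loop(x, d, 0.0, 0.0))
print(gradients_numpy(x, d, 0.0, 0.0))
```

Both functions should agree up to floating-point rounding; the NumPy version simply moves the loop out of the Python interpreter and into compiled code.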
The code from the original article is used without modifications; I have just re-run it on my machine (i7-970 @ 3.20 GHz) to get the baseline values for the later comparison:
    Python time: 34.62 seconds
    NumPy time: 0.71 seconds
Based on my previous experience of using both Nim and Python, I knew I could expect Nim to be noticeably faster than (pure) Python. The question is – can Nim compete against NumPy’s speed?
We’ll take a “pure Nim” approach: no array/tensor library, meaning we need to iterate over the arrays element by element, something that is known to be very costly in Python.
Let’s go through our program bit by bit:
    import random, times, math

    randomize(444)

    const
      N = 10_000
      sigma = 0.1
      f = 2 / N
      mu = 0.001
      nEpochs = 10_000
Unlike in Python, all imports in Nim can be written on a single line, and importing a module in Nim is analogous to from foo import * in Python.
We use the same seed as in the original, and we define all the needed constants. (All indented lines are part of the const block.)
Nim is a statically typed language, but the types can be inferred from the values.
Next we need to define the vectors x and d (x = np.linspace(0, 2, N) and d = 3 + 2 * x + noise in Python), and we need to do it element by element:
    var x, d: array[N, float]

    for i in 0 ..< N:
      x[i] = f * i.float
      d[i] = 3.0 + 2.0 * x[i] + sigma * randomNormal()
The ..< operator iterates up to, but not including, the upper limit. (The .. operator would iterate to the limit, including it.)
The thing to notice here is that we cannot combine integers and floats: i needs to be converted to float. (Nim has UFCS support, so i.float is the same as float(i).)
randomNormal, which samples from a Gaussian distribution (np.random.randn in the Python version), is taken from Arraymancer, which is Nim’s tensor library (still at an early stage, but being developed rapidly).
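For comparison, the same element-by-element construction can be sketched in pure Python. This is an illustration only: make_data is a hypothetical helper, random.gauss stands in for randomNormal, and the same seed will not reproduce the random numbers generated by Nim or NumPy.

```python
import random

N = 10_000
SIGMA = 0.1


def make_data(n=N, sigma=SIGMA, seed=444):
    """Build x and d one element at a time, as the Nim loop does."""
    rng = random.Random(seed)
    x = [0.0] * n
    d = [0.0] * n
    for i in range(n):  # range(n) excludes n, like 0 ..< n in Nim
        # Python converts i to float implicitly; Nim requires i.float.
        x[i] = (2 / n) * float(i)
        d[i] = 3.0 + 2.0 * x[i] + sigma * rng.gauss(0.0, 1.0)
    return x, d


x, d = make_data()
print(x[0], x[-1])
```

One detail worth noting: with a step of 2 / N the last point stops just short of 2, whereas np.linspace(0, 2, N) includes the endpoint by default, so the two grids differ slightly.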
The remaining thing to do is to define the gradientDescent procedure:
    proc gradientDescent(x, d: array[N, float], mu: float, nEpochs: int): tuple[w0, w1: float] =
      var
        y: array[N, float]  # current predictions, zero-initialised
        err: float
        w0, w1: float

      for n in 1 .. nEpochs:
        # accumulate the descent direction over all samples
        var grad0, grad1: float
        for i in 0 ..< N:
          err = f * (d[i] - y[i])
          grad0 += err
          grad1 += err * x[i]

        # update the weights
        w0 += mu * grad0
        w1 += mu * grad1

        # recompute the predictions with the new weights
        for i in 0 ..< N:
          y[i] = w0 + w1 * x[i]

      return (w0, w1)
Variables are initialized with their default values when they are declared (0.0 for floats), meaning that var y: array[N, float] is similar to y = np.zeros(N, dtype=float) in Python.
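As a point of reference, here is a rough line-by-line port of that procedure to pure Python. It is a sketch rather than the code from the original article: the name gradient_descent, the small noise-free data set, and the larger step size mu=0.1 are all choices made here so that the example runs quickly and converges.

```python
def gradient_descent(x, d, mu, n_epochs):
    """Pure Python port of the Nim gradientDescent proc (illustrative)."""
    n = len(x)
    f = 2.0 / n
    y = [0.0] * n  # like var y: array[N, float], zero-initialised
    w0 = w1 = 0.0
    for _ in range(n_epochs):
        # accumulate the descent direction over all samples
        grad0 = grad1 = 0.0
        for i in range(n):
            err = f * (d[i] - y[i])
            grad0 += err
            grad1 += err * x[i]
        # update the weights, then refresh the predictions
        w0 += mu * grad0
        w1 += mu * grad1
        for i in range(n):
            y[i] = w0 + w1 * x[i]
    return w0, w1


# Small, noise-free problem so the pure Python loops finish quickly.
xs = [2.0 * i / 100 for i in range(100)]
ds = [3.0 + 2.0 * xi for xi in xs]
w0, w1 = gradient_descent(xs, ds, mu=0.1, n_epochs=2000)
print(w0, w1)
```

On this noise-free data the weights should approach 3 and 2. The nested loops are exactly the part that is slow in pure Python and cheap in a compiled language.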
Lastly, we print the values of w0 and w1 to see whether they are close to the ones we expect (w0 = 3, w1 = 2), and we measure the time needed for the calculation:
    let start = cpuTime()
    echo gradientDescent(x, d, mu, nEpochs)
    echo "Nim time: ", cpuTime() - start, " seconds"
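The same measurement pattern can be sketched in Python with time.process_time, which, like cpuTime, reports CPU time used by the process rather than wall-clock time. The timed helper is a hypothetical name used only for this illustration, and summing a range is just a stand-in workload.

```python
import time


def timed(func, *args):
    """Return (result, CPU seconds) for one call of func(*args)."""
    start = time.process_time()
    result = func(*args)
    return result, time.process_time() - start


total, seconds = timed(sum, range(1_000_000))
print(f"sum: {total}, time: {seconds:.6f} seconds")
```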
We compile the program in release mode, which turns off runtime checks and turns on the optimizer, and run it:
    $ nim c -d:release gradDesc.nim
    $ ./gradDesc
    (w0: 2.968954757075724, w1: 2.02593328759163)
    Nim time: 0.226344 seconds
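| Version | Time (s) | Speedup vs Python | Speedup vs NumPy |
| Pure Python | 34.62 | 1x | - |
| NumPy | 0.71 | about 49x | 1x |
| Nim | 0.226 | about 153x | about 3.1x |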
While NumPy offers a significant speedup over the pure Python version, Nim manages to improve on that further, and not just marginally: it is two orders of magnitude faster than Python and, what is more impressive, three times faster than NumPy.
Nim has managed to deliver on its promise: for this example it is about three times faster than NumPy, while keeping the code readable.
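The speedup figures follow directly from the three timings reported above. A small sketch of the arithmetic, where speedup is a helper name chosen for this illustration:

```python
# Timings reported in this post, in seconds.
times = {"Python": 34.62, "NumPy": 0.71, "Nim": 0.226344}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return times[baseline] / times[candidate]


for name in times:
    print(f"{name}: {speedup('Python', name):.1f}x vs Python, "
          f"{speedup('NumPy', name):.1f}x vs NumPy")
```

These are single runs on one machine, so the ratios are best read as rough magnitudes rather than precise measurements.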
Discussion on Reddit and Hacker News.
Source files (both .py and .nim) are available here.