Python and JavaScript performance
I came across a cheeky little demonstration on YouTube comparing the same algorithm written in Python, C, and x86 assembly.
It compares the following pessimal C code:
#include <stdio.h>
int isPrime(int n){
for (int i=2; i <= n/2; i++)
if(!(n%i))
return 0;
return 1;
}
int main(){
int numPrimes = 0;
for (int i=2; i<250001; i++)
numPrimes += isPrime(i);
printf("%d\n", numPrimes);
return 0;
}
... with the naive equivalent in Python:
def isPrime(n):
for i in range(2, n//2 + 1):
if (not (n%i)):
return 0
return 1
numPrimes = 0
for i in range(2, 250001):
numPrimes+=isPrime(i)
print(str(numPrimes))
Here are my own quick-and-dirty performance results:
% gcc -o primes primes.c
% time ./primes
22044
./primes 2.79s user 0.00s system 99% cpu 2.795 total
% time python primes.py
22043
python primes.py 233.53s user 0.44s system 99% cpu 3:54.17 total
Wow! 83 times slower. Nothing too suprising, though: just the difference between an interpreted language like Python, and a compiled language like C.
But wait, I do use another scripting language regularly: JavaScript. Let's see how JavaScript fares.
function isPrime(n) {
for (let i = 2; i <= Math.floor(n / 2); i++) {
if (!(n % i)) return 0;
}
return 1;
}
var numPrimes = 0;
for (let i = 2; i < 250001; i++) {
numPrimes += isPrime(i);
}
console.log(numPrimes)
Since JavaScript is another interpreted language, we'd expect its performance to be closer to Python than to compiled C, right?
% time node primes.js
22044
node primes.js 1.49s user 0.04s system 96% cpu 1.574 total
What?! 87 percent faster than the compiled C program? Is JavaScript really faster than C?
Well, not quite, we forgot to turn optimization on when we compiled our C code:
% gcc -O -o primes primes.c
% time ./primes
22044
./primes 0.87s user 0.01s system 84% cpu 1.032 total
We can let out a sigh of relief, C with basic optimizations is still 71% faster than our JavaScript code.1
So why is JavaScript so fast? Well, node uses V8, which utilizes a tracing JIT compiler. While code is running, it looks for segments of "hot" code to optimize into machine code. Since our program is two big hot loops, they get optimized and JavaScript runs this simple algorithm on the order of as fast as C.
But wait, Python also has an implementation with a tracing JIT compiler, called pypy. So can Python achieve the same performance?
% time pypy primes.py
22044
pypy test.py 102.81s user 0.39s system 99% cpu 1:43.39 total
Over twice as fast as the CPython running time, but still orders of magnitude slower than JavaScript. What's going on? If both pypy and the V8 engine use the same basic techniques, shouldn't the performance be similar?
Let's look at the code: they all have pretty much the same structure. What stands out in the Python version and not in the C and JavaScript versions is the use of Python's range
iterator.
Python doesn't have C-style loops, but we can emulate it with a while loop and post-increment:
def isPrime(n):
i = 2
while i < n//2 + 1:
if (not (n%i)):
return 0
i += 1
return 1
numPrimes = 0
i = 2
while i < 250001:
numPrimes+=isPrime(i)
i += 1
print(str(numPrimes))
And now let's try pypy again:
% time pypy primes2.py
22044
pypy primes2.py 4.18s user 0.07s system 97% cpu 4.364 total
Wow! Of course, this is still 2.8 times slower than our node result, but at least we seem to be in the same ballpark.
Now, before you run off to your Python code bases to remove idiomatic loops and iterators, and replace them with while
loops and imperative looping variables, keep in mind: when running with CPython, there is no such win to be had. A CPython implementation will be just as slow either way. Stick with the nice, readable iterator loops.
The lesson here is that pypy and tracing JIT is no silver bullet to Python performance. V8-like results on idomatic Python are not here yet, and a Python for loop is still not a place you want to be if you're running a \(\Theta(n^2)\) algorithm.
-
Higher optimization levels don't help (in fact, they actually hurt a bit (a not uncommon state of affairs)). ↩