CPython v/s pypy

  • CPython is the official implementation of python in C.
  • pypy is the python implementation in python.

If you want your code to run faster, you should probably just use PyPy.” Guido van Rossum (created Python)

pypy is faster than CPython because the latter is interpreted, while the former performs Just-in-time compilation(JIT). The just-in-time compilation is the middle ground between interpretation and conventional ahead-of-time compilation. Just-in-time compilers do not execute the source code itself but instead generate a set of lower-level instructions (assembly, most often) that are executed almost immediately ( source ).

Enough talk, let's benchmark python and pypy. For this, you need to install pip for both python and pypy.

sudo apt install pypy3 -y  # install pypy
wget https://bootstrap.pypa.io/get-pip.py
sudo python3 get-pip.py  # install pip for python3
sudo pypy get-pip.py # install pip for pypy3
python3 -m pip install pytest pytest-benchmark
pypy3 -m pip install pytest pytest-benchmark

Simple test

Let's do a simple test to generate digits up to 1_000_000 (you can use underscore in python between digits, it's valid and improves readability)

# main.py
def million():
    a = (i for i in range(1_000_000))
    return a

def test_million(benchmark):
    result = benchmark(million)
python3 -m pytest main.py # benchmark this code using python3
pypy3 -m pytest main.py # benchmark this code using pypy3
Name(time in us)MinMaxMean
test_million(python3)399.99991,391,100.0001624.8846
test_million(pypy3)24.1000994.200032.2871

624 / 32 ~ 20, pypy3 is 20 times faster!!😲. Note that the time is in us

Prime number test (brute-force search)

Returns all primes within the given limit.

def is_prime(number: int):
    for i in range(2, number // 2):
        if number % i == 0:
            return False
    return True

def get_primes(limit: int):
    primes = [i for i in range(2, limit + 1) if is_prime(i)]
    return primes

def test_primes_1000(benchmark):
    benchmark(get_primes, 1000)

def test_primes_10000(benchmark):
    benchmark(get_primes, 10000)

def test_primes_100000(benchmark):
    benchmark(get_primes, 100000)
Name(time in ms)MinMaxMean
test_1000(python3)1.82964.54072.0861
test_1000(pypy3)0.42221.58010.4683
test_10000(python3)146.2487183.1678163.0123
test_10000(pypy3)30.576752.859932.6391
test_100000(python3)14,368.893515,069.584014,768.0549
test_100000(pypy3)2510.64162832.09972652.4264

pypy3 ~5x faster than python3😲. Note that the time is in ms

Prime number test (sieve of Eratosthenes )

# pardon the code and the author, if not well written
# does the job and won't affect performance
def get_primes(limit: int):
    numbers = list(True for _ in range(2, limit))
    for index in range(2, limit):
        if numbers[index - 2]:
            value = index ** 2 - 2
            while value < len(numbers):
                numbers[value] = False
                value += index
    start = 2
    outputs = []
    for idx, val in enumerate(numbers):
        if val is True:
            outputs.append(start + idx)
    return outputs

def test_number_1000(benchmark):
    result = benchmark(get_primes, 1000)

def test_number_10000(benchmark):
    result = benchmark(get_primes, 10000)

def test_number_100000(benchmark):
    result = benchmark(get_primes, 100000)
Name(time in us)MinMaxMean
test_1000(python3)266.60001,140.2000315.2115
test_1000(pypy3)19.70007,806.000029.2414
test_10000(python3)2,865.70007,761.80003,490.7817
test_10000(pypy3)227.700018,049.4000378.3946
test_100000(python3)30,466.700065,760.600037,785.9692
test_100000(pypy3)3,250.000026,242.50005,935.1304

pypy3 is faster, ~ 8-10 times🤭. I'm not sure why the Max is worse for pypy3😕(please comment). Note that the time is in us

Final test - pypy JIT v/s numba JIT

# install numba for python3
python3 -m pip install numba

Same bruteforce code, except for numba

import numba

@numba.jit(nopython=True, parallel=True)
def is_prime(number: int):
...
...
Name(time in us)MinMaxMean
test_1000(numba)328.6000348.4000338.7400
test_1000(pypy3)433.20001,849.1000581.9026
test_10000(numba)11,077.000024,149.100012,012.4456
test_10000(pypy3)30,527.100058,231.700034,301.3938
test_100000(numba)766,478.0000794,072.1000782,760.6200
test_100000(pypy3)2,562,814.30002,689,518.70002,615,935.2200

pypy3 getting slower compared to numba😵 🤐. Note that the time is in us

Use PyPy instead of the standard Python interpreter. According to 20 official benchmarks it is on average 7 times faster than Python 2. PyPy2 is in my opinion the best option at the moment for competitive programming. - codeforces

Numba isn't supported by any coding competitions, 👺

Drawbacks of using pypy

  • Not all Python is supported. Most of your code is, but if you deal with low-level CPython implementation details or have some Cython binding, this would not work ( source )
# versions
❯ python -V
Python 3.8.2
❯ py -V
Python 3.6.9 (7.3.1+dfsg-4, Apr 22 2020, 05:15:29)
[PyPy 7.3.1 with GCC 9.3.0]

No Comments Yet