NumExpr: The “Faster than Numpy” Library Most Data Scientists Have Never Used

The other day, I came across a library I'd never heard of before. It was called NumExpr.

I was immediately interested because of some claims made about the library. Specifically, it stated that for some complex numerical calculations, it was up to 15 times faster than NumPy.

I was intrigued because, up until now, NumPy has remained unchallenged in its dominance of numerical computation in Python. In data science especially, NumPy is a cornerstone for machine learning, exploratory data analysis and model training. Anything we can use to squeeze out every last bit of performance in our systems will be welcomed. So, I decided to put the claims to the test myself.

You'll find a link to the NumExpr repository at the end of this article.

What is NumExpr?

According to its GitHub page, NumExpr is a fast numerical expression evaluator for NumPy. Using it, expressions that operate on arrays are accelerated and use less memory than performing the same calculations in Python with other numerical libraries, such as NumPy.

In addition, as it is multithreaded, NumExpr can use all your CPU cores, which generally results in substantial performance scaling compared to NumPy.
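To make the memory point concrete, here is a minimal sketch. NumPy evaluates 2*a + 3*b by building full-size temporary arrays for 2*a and 3*b before adding them, whereas NumExpr works through the compiled expression in chunks. The out argument of ne.evaluate also lets the result land in an array you have already allocated. The array names and sizes are illustrative assumptions, not values taken from the NumExpr documentation.

```python
import numpy as np
import numexpr as ne

# Illustrative inputs (the size is an arbitrary choice)
a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# NumPy: 2*a and 3*b each become a temporary array before the final add
result_np = 2*a + 3*b

# NumExpr: the expression is passed as a string and the result is
# written into a preallocated array through the out argument
result_ne = np.empty_like(a)
ne.evaluate("2*a + 3*b", local_dict={"a": a, "b": b}, out=result_ne)

print(f"Results match: {np.allclose(result_np, result_ne)}")
```

Passing local_dict explicitly is optional. Without it, NumExpr looks the variable names up in the calling scope.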
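As a rough illustration of the threading controls, the sketch below asks NumExpr how many cores it detects and runs the same expression with one thread and then with the previous thread setting. It does not time anything. It only shows the calls involved, and the array sizes are arbitrary assumptions.

```python
import numpy as np
import numexpr as ne

# Number of cores NumExpr detects on this machine
cores = ne.detect_number_of_cores()

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)
variables = {"a": a, "b": b}

# set_num_threads returns the previous setting, so it can be restored later
previous = ne.set_num_threads(1)
single = ne.evaluate("2*a + 3*b", local_dict=variables)

# Restore the earlier thread count and evaluate again
ne.set_num_threads(previous)
multi = ne.evaluate("2*a + 3*b", local_dict=variables)

print(f"Cores detected: {cores}, previous thread setting: {previous}")
print(f"Same result either way: {np.allclose(single, multi)}")
```

Wrapping each ne.evaluate call in timeit, as the examples later in this article do, would show how much of any speed-up on your machine comes from threading.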

Setting up a development environment

Before we start coding, let's set up our development environment. The best practice is to create a separate Python environment where you can install any necessary software and experiment with coding, knowing that anything you do in this environment won't affect the rest of your system. I use conda for this, but you can use whichever method suits you best.

If you want to go down the Miniconda route and don't already have it, you must install Miniconda first. Get it using this link:

https://www.anaconda.com/docs/main

1/ Create our new dev environment and install the required libraries

(base) $ conda create -n numexpr_test python=3.12 -y
(base) $ conda activate numexpr_test
(numexpr_test) $ pip install numexpr
(numexpr_test) $ pip install jupyter

2/ Start Jupyter
Now type jupyter notebook into your command prompt. You should see a Jupyter notebook open in your browser. If that doesn't happen automatically, you'll likely see a screenful of information after the jupyter notebook command. Near the bottom, you will find a URL that you should copy and paste into your browser to launch the Jupyter Notebook.

Your URL will be different to mine, but it should look something like this:

http://127.0.0.1:8888/tree?token=3b9f7bd07b6966b41b68e2350721b2d0b6f388d248cc69

Comparing NumExpr and NumPy performance

To compare the performance, we'll run a series of numerical computations using NumPy and NumExpr, and time both systems.

Example 1: A simple array addition calculation
In this example, we run a vectorised addition of two large arrays 5000 times.

import numpy as np
import numexpr as ne
import timeit

a = np.random.rand(1000000)
b = np.random.rand(1000000)

# Using timeit with lambda functions
time_np_expr = timeit.timeit(lambda: 2*a + 3*b, number=5000)
time_ne_expr = timeit.timeit(lambda: ne.evaluate("2*a + 3*b"), number=5000)

print(f"Execution time (NumPy): {time_np_expr} seconds")
print(f"Execution time (NumExpr): {time_ne_expr} seconds")

>>>>>>>>>>>


Execution time (NumPy): 12.03680682599952 seconds
Execution time (NumExpr): 1.8075962659931974 seconds

I have to say, that's a pretty impressive start from the NumExpr library already. I make that roughly a 6.7 times improvement over the NumPy runtime (12.04 s versus 1.81 s).

Let's double-check that both operations return the same result set.


# Arrays to store the results
result_np = 2*a + 3*b
result_ne = ne.evaluate("2*a + 3*b")

# Ensure the two new arrays are equal
arrays_equal = np.array_equal(result_np, result_ne)
print(f"Arrays equal: {arrays_equal}")

>>>>>>>>>>>>

Arrays equal: True
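np.array_equal demands bit-for-bit identical values. That held in the run above, but two libraries can round a more involved floating-point expression slightly differently. A tolerance-based check is a common alternative. The sketch below is one way to write it, with NumPy's default tolerances spelled out explicitly. The variable names arrays_close and max_abs_diff are my own.

```python
import numpy as np
import numexpr as ne

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

result_np = 2*a + 3*b
result_ne = ne.evaluate("2*a + 3*b", local_dict={"a": a, "b": b})

# Tolerance-based comparison; rtol and atol are NumPy's defaults made explicit
arrays_close = np.allclose(result_np, result_ne, rtol=1e-05, atol=1e-08)

# Largest element-wise difference, useful when the check fails
max_abs_diff = np.max(np.abs(result_np - result_ne))

print(f"Arrays close: {arrays_close}")
print(f"Largest absolute difference: {max_abs_diff}")
```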

Example 2: Calculate Pi using a Monte Carlo simulation

Our second example will examine a more complicated use case with more real-world applications.

Monte Carlo simulations involve running many iterations of a random process to estimate a system's properties, which can be computationally intensive.

In this case, we'll use Monte Carlo to calculate the value of Pi. This is a well-known example where we take a square with a side length of 1 unit and inscribe a quarter circle inside it with a radius of 1 unit. The ratio of the quarter circle's area to the square's area is (π/4)/1, and we can multiply this expression by 4 to get π by itself.

So, if we consider numerous random (x,y) points that all lie within or on the bounds of the square, as the total number of these points tends to infinity, the ratio of points that lie on or inside the quarter circle to the total number of points tends towards π/4. Multiplying that ratio by 4 gives our estimate of Pi.
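Before the timed implementations, here is a small sketch of how the estimate behaves as the number of points grows. It uses NumPy's default_rng generator with an arbitrary seed, and the helper name estimate_pi is my own. Neither comes from the benchmark code in this article. The error of a Monte Carlo estimate typically shrinks in proportion to one over the square root of the sample count, so each extra decimal place costs about 100 times more points.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed is an arbitrary choice

def estimate_pi(num_samples, rng):
    """Estimate pi from the share of random points inside the quarter circle."""
    x = rng.random(num_samples)
    y = rng.random(num_samples)
    inside = (x**2 + y**2) <= 1.0
    # inside.mean() is the fraction of points inside, which approximates pi/4
    return 4 * inside.mean()

for n in (1_000, 100_000, 1_000_000):
    est = estimate_pi(n, rng)
    print(f"{n:>9} samples: estimate {est:.6f}, abs error {abs(est - np.pi):.6f}")
```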

First, the NumPy implementation.

import numpy as np
import timeit

def monte_carlo_pi_numpy(num_samples):
    x = np.random.rand(num_samples)
    y = np.random.rand(num_samples)
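    inside_circle = (x**2 + y**2) <= 1.0
    return 4 * np.sum(inside_circle) / num_samples

num_samples = 1_000_000  # assumed sample count; the original value is not shown
time_np_expr = timeit.timeit(lambda: monte_carlo_pi_numpy(num_samples), number=1000)  # repeat count also assumed
print(f"Estimated Pi (NumPy): {monte_carlo_pi_numpy(num_samples)}")
print(f"Execution Time (NumPy): {time_np_expr} seconds")

>>>>>>>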

Estimated Pi (NumPy): 3.144832
Execution Time (NumPy): 10.642843848007033 seconds

Now, using NumExpr.

import numpy as np
import numexpr as ne
import timeit

def monte_carlo_pi_numexpr(num_samples):
    x = np.random.rand(num_samples)
    y = np.random.rand(num_samples)
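    inside_circle = ne.evaluate("(x**2 + y**2) <= 1.0")
    return 4 * np.sum(inside_circle) / num_samples  # the sum is still done by NumPy

num_samples = 1_000_000  # assumed sample count; the original value is not shown
time_ne_expr = timeit.timeit(lambda: monte_carlo_pi_numexpr(num_samples), number=1000)  # repeat count also assumed
print(f"Estimated Pi (NumExpr): {monte_carlo_pi_numexpr(num_samples)}")
print(f"Execution Time (NumExpr): {time_ne_expr} seconds")

>>>>>>>>>>>>>>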

Estimated Pi (NumExpr): 3.141684
Execution Time (NumExpr): 8.077501275009126 seconds

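OK, so the speed-up was not as impressive that time, but a run time roughly 24% shorter (10.64 s down to 8.08 s) isn't terrible either. Part of the reason is that NumExpr doesn't have an optimised SUM() function, so we had to default back to NumPy for that operation. Both versions also generate their random samples with NumPy inside the timed function, so that cost is the same in each.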
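For completeness, NumExpr does accept a sum() reduction as the outermost operation of an expression, so the count can be computed without leaving NumExpr. The sketch below shows that variant next to the NumPy sum. It was not part of the benchmark above, and I make no claim that it is faster. The variable names are my own, and where() is used to turn the comparison into 1.0 or 0.0 before summing.

```python
import numpy as np
import numexpr as ne

num_samples = 1_000_000
x = np.random.rand(num_samples)
y = np.random.rand(num_samples)

# Reduction inside NumExpr: where() maps each test to 1.0 or 0.0, sum() adds them up
count_ne = ne.evaluate(
    "sum(where((x**2 + y**2) <= 1.0, 1.0, 0.0))",
    local_dict={"x": x, "y": y},
)

# The same count using NumPy for the comparison and the sum
count_np = np.sum((x**2 + y**2) <= 1.0)

pi_ne = 4 * float(count_ne) / num_samples
pi_np = 4 * float(count_np) / num_samples

print(f"Estimated Pi (NumExpr sum): {pi_ne}")
print(f"Estimated Pi (NumPy sum):   {pi_np}")
```

Timing both variants with timeit on your own hardware is the only reliable way to tell which one wins.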

Example 3: Implementing a Sobel image filter

In this example, we'll implement a Sobel filter for images. The Sobel filter is commonly used in image processing for edge detection. It calculates the image intensity gradient at each pixel, highlighting edges and intensity transitions. Our input image is of the Taj Mahal in India.

Original image by Yury Taranik (licensed from Shutterstock)

Let's see the NumPy code running first and time it.
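To see what the filter computes, here is a tiny NumPy-only sketch on a synthetic image with a single vertical edge. The helper filter3x3 is a hypothetical function written for this illustration. It slides the kernel over the image without flipping it (a correlation rather than a true convolution), which changes the sign of the gradients but not their magnitude.

```python
import numpy as np

sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
sobel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def filter3x3(img, kernel):
    """Slide a 3x3 kernel over img with no padding; returns the interior response."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            # Add the contribution of kernel cell (i, j) for every window at once
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

# 5x6 test image: dark left half (0), bright right half (10)
img = np.zeros((5, 6))
img[:, 3:] = 10.0

gx = filter3x3(img, sobel_x)  # horizontal intensity change
gy = filter3x3(img, sobel_y)  # vertical intensity change (zero for this image)
magnitude = np.sqrt(gx**2 + gy**2)

print(magnitude)  # each of the 3 rows is [0, 40, 40, 0]
```

The response is non-zero only in the windows that straddle the jump from 0 to 10, which is exactly the edge-highlighting behaviour described above.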

import numpy as np
from scipy.ndimage import convolve
from PIL import Image
import timeit

# Sobel kernels
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])

def sobel_filter_numpy(image):
    """Apply Sobel filter using NumPy."""
    # Convert to grayscale; the float dtype stops negative gradients wrapping around in uint8
    img_array = np.array(image.convert('L'), dtype=np.float64)
    gradient_x = convolve(img_array, sobel_x)
    gradient_y = convolve(img_array, sobel_y)
    gradient_magnitude = np.sqrt(gradient_x**2 + gradient_y**2)
    gradient_magnitude *= 255.0 / gradient_magnitude.max()  # Normalize to 0-255

    return Image.fromarray(gradient_magnitude.astype(np.uint8))

# Load an example image
image = Image.open("/mnt/d/check/taj_mahal.png")

# Benchmark the NumPy version
time_np_sobel = timeit.timeit(lambda: sobel_filter_numpy(image), number=100)
sobel_image_np = sobel_filter_numpy(image)
sobel_image_np.save("/mnt/d/check/sobel_taj_mahal_numpy.png")

print(f"Execution Time (NumPy): {time_np_sobel} seconds")

>>>>>>>>>

Execution Time (NumPy): 8.093792188999942 seconds

And now the NumExpr code.

import numpy as np
import numexpr as ne
from scipy.ndimage import convolve
from PIL import Image
import timeit

# Sobel kernels
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])

def sobel_filter_numexpr(image):
    """Apply Sobel filter using NumExpr for gradient magnitude computation."""
    # Convert to grayscale; the float dtype stops negative gradients wrapping around in uint8
    img_array = np.array(image.convert('L'), dtype=np.float64)
    gradient_x = convolve(img_array, sobel_x)
    gradient_y = convolve(img_array, sobel_y)
    gradient_magnitude = ne.evaluate("sqrt(gradient_x**2 + gradient_y**2)")
    gradient_magnitude *= 255.0 / gradient_magnitude.max()  # Normalize to 0-255

    return Image.fromarray(gradient_magnitude.astype(np.uint8))

# Load an example image
image = Image.open("/mnt/d/check/taj_mahal.png")

# Benchmark the NumExpr version
time_ne_sobel = timeit.timeit(lambda: sobel_filter_numexpr(image), number=100)
sobel_image_ne = sobel_filter_numexpr(image)
sobel_image_ne.save("/mnt/d/check/sobel_taj_mahal_numexpr.png")

print(f"Execution Time (NumExpr): {time_ne_sobel} seconds")

>>>>>>>>>>>>>

Execution Time (NumExpr): 4.938702256011311 seconds

On this occasion, using NumExpr led to a great result, running roughly 1.6 times faster than NumPy (4.94 s versus 8.09 s).

Here is what the edge-detected image looks like.

Example 4: Fourier series approximation

It's well known that complex periodic functions can be simulated by superimposing a series of sine waves on each other. At the extreme, even a square wave can be modelled in this way. The method is called the Fourier series approximation. Although it is an approximation, we can get as close to the target wave shape as memory and computational capacity allow.

The maths behind all this isn't the primary focus. Just be aware that when we increase the number of iterations, the run-time of the solution rises markedly.
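For readers who want a feel for the approximation before the timed version, the sketch below sums the odd harmonics (4 / (π n)) · sin(2π · n · 5 · t) of a 5-cycle square wave and reports the root-mean-square error for a few cut-offs. The grid size, the cut-offs and the helper name partial_sum are assumptions chosen to keep the run short. They are not the settings used in the benchmark.

```python
import numpy as np

t = np.linspace(0, 1, 10_000)                # smaller grid than the benchmark, for speed
target = np.sign(np.sin(2 * np.pi * 5 * t))  # square wave with 5 cycles on [0, 1]

def partial_sum(t, max_n):
    """Sum the odd harmonics n = 1, 3, 5, ... up to and including max_n."""
    approx = np.zeros_like(t)
    for n in range(1, max_n + 1, 2):
        approx += (4 / (np.pi * n)) * np.sin(2 * np.pi * n * 5 * t)
    return approx

rms_errors = []
for max_n in (1, 9, 99):
    # Root-mean-square distance between the partial sum and the square wave
    err = np.sqrt(np.mean((partial_sum(t, max_n) - target) ** 2))
    rms_errors.append(err)
    print(f"harmonics up to n={max_n:>2}: RMS error {err:.4f}")
```

The error falls as more harmonics are added, but every extra harmonic costs one more full pass over the array, which is why the run time grows with the number of terms.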

import numpy as np
import numexpr as ne
import time
import matplotlib.pyplot as plt

# Define the constant pi explicitly
pi = np.pi

# Generate a time vector and a square wave signal
t = np.linspace(0, 1, 1000000) # Reduced size for better visualization
signal = np.sign(np.sin(2 * np.pi * 5 * t))

# Number of terms in the Fourier series
n_terms = 10000

# Fourier series approximation using NumPy
start_time = time.time()
approx_np = np.zeros_like(t)
for n in range(1, n_terms + 1, 2):
    approx_np += (4 / (np.pi * n)) * np.sin(2 * np.pi * n * 5 * t)
numpy_time = time.time() - start_time

# Fourier series approximation using NumExpr
start_time = time.time()
approx_ne = np.zeros_like(t)
for n in range(1, n_terms + 1, 2):
    approx_ne = ne.evaluate("approx_ne + (4 / (pi * n)) * sin(2 * pi * n * 5 * t)", local_dict={"pi": pi, "n": n, "approx_ne": approx_ne, "t": t})
numexpr_time = time.time() - start_time

print(f"NumPy Fourier series time: {numpy_time:.6f} seconds")
print(f"NumExpr Fourier series time: {numexpr_time:.6f} seconds")

# Plotting the results
plt.figure(figsize=(10, 6))

plt.plot(t, signal, label='Original Signal (Square Wave)', color='black', linestyle='--')
plt.plot(t, approx_np, label='Fourier Approximation (NumPy)', color='blue')
plt.plot(t, approx_ne, label='Fourier Approximation (NumExpr)', color='purple', linestyle='dotted')

plt.title('Fourier Series Approximation of a Square Wave')
plt.xlabel('Time')
plt.ylabel('Amplitude')
plt.legend()
plt.grid(True)
plt.show()

And the output?

That's another pretty good result. NumExpr shows a 5 times improvement over NumPy on this occasion.

Summary

NumPy and NumExpr are both powerful libraries used for Python numerical computations. They each have unique strengths and use cases, making them suitable for different types of tasks. Here, we compared their performance and suitability for specific computational tasks, with examples ranging from simple array addition to more complex applications, like using a Sobel filter for image edge detection.

While I didn't quite see the claimed 15x speed increase over NumPy in my tests, there's no doubt that NumExpr can be significantly faster than NumPy in many cases.

If you're a heavy user of NumPy and need to extract every bit of performance from your code, I recommend trying the NumExpr library. Besides the fact that not all NumPy code can be replicated using NumExpr, there's practically no downside, and the upside might surprise you.

For more details on the NumExpr library, check out its GitHub page.
