Commit 76723ea3 authored by Alessandro Romeo
updating files

parent 6c6461b5
Branches main
%% Cell type:markdown id:10fb3eb9 tags:
# Benchmarking Julia
1. Define the sum function
2. Implementations & benchmarking of the __sum function__ in:
    * Julia (built-in)
    * Julia (hand-written)
    * C (hand-written)
    * Python (built-in)
    * Python (numpy)
    * Python (hand-written)
Consider the __sum__ function `sum(a)`, which computes
$$
\mathrm{sum}(a) = \sum_{i=1}^n a_i,
$$
where $n$ is the length of `a`.
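As a small worked instance of the formula, the sketch below writes out the terms for a vector with $n = 4$ and compares the result with Julia's `sum`. The vector `a_small` and the variable `by_formula` are made up for this illustration; the values are chosen so that the floating-point arithmetic is exact.

```julia
a_small = [0.25, 0.5, 0.75, 1.0]   # hypothetical example vector, n = 4
n = length(a_small)

# Write out a_1 + a_2 + a_3 + a_4 (Julia arrays are 1-based, like the formula).
by_formula = a_small[1] + a_small[2] + a_small[3] + a_small[4]

println("n = ", n)
println("terms added by hand: ", by_formula)
println("sum(a_small):        ", sum(a_small))
```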
%% Cell type:markdown id:5812ea46 tags:
## 1. Julia built-in `sum` function
%% Cell type:code id:a17adb59 tags:
``` julia
import Pkg; Pkg.instantiate()
```
%% Cell type:code id:12536e98 tags:
``` julia
a = rand(10^7) # 1D vector of random numbers, uniform on [0,1]
```
%% Cell type:code id:1c220406 tags:
``` julia
@which sum(a)
```
%% Cell type:code id:8a7ab7a3 tags:
``` julia
sum(a)
```
%% Cell type:markdown id:4fcbcdbe tags:
The expected result is approximately $0.5 \times 10^7$, since each entry has mean 0.5.
Let's measure how long this function takes to run by using the `@time` macro:
%% Cell type:code id:edd112c1 tags:
``` julia
?@time
```
%% Cell type:markdown id:320efc31 tags:
So what is the performance of Julia's built-in sum?
%% Cell type:code id:3b17f7df tags:
``` julia
@time sum(a) # try to repeat the execution of this cell!
```
%% Cell type:markdown id:69d7b634 tags:
The `@time` macro can yield noisy results, so it's not our best choice for benchmarking!
Luckily, Julia has a `BenchmarkTools.jl` package to make benchmarking easy and accurate:
%% Cell type:code id:40217f68 tags:
``` julia
import Pkg; Pkg.add("BenchmarkTools")
```
%% Cell type:code id:a916508a tags:
``` julia
using BenchmarkTools
```
%% Cell type:code id:4efb5fcf tags:
``` julia
@benchmark sum(a)
```
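The run-to-run noise that motivates a dedicated benchmarking package can be seen with Base Julia alone. The sketch below is not how `BenchmarkTools` works internally; it simply times the same call repeatedly with `@elapsed` and reports the spread. The names `a_demo`, `first_call` and `samples` are made up for this illustration, and the vector is smaller than the notebook's so that it runs quickly.

```julia
a_demo = rand(10^6)                  # hypothetical test vector

first_call = @elapsed sum(a_demo)    # the first call may include compilation time
samples = [@elapsed sum(a_demo) for _ in 1:20]   # 20 repeated timings, in seconds

println("first call: ", first_call * 1e3, " ms")
println("minimum:    ", minimum(samples) * 1e3, " ms")
println("maximum:    ", maximum(samples) * 1e3, " ms")
```

The minimum is usually the most reproducible of these numbers, which is consistent with the notebook storing `minimum(...)` of the benchmark times later on.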
%% Cell type:markdown id:3bfc362f tags:
If the expression to benchmark depends on external (global) variables, use `$` to "interpolate" them into the benchmark expression. Any interpolated variable `$x` or expression `$(...)` is evaluated once, before benchmarking begins, which avoids the overhead of accessing untyped globals. This applies to both `@benchmark` and `@btime`: without `$`, the measured time includes that overhead and does not reflect the cost of the expression itself.
%% Cell type:code id:bcc80f47 tags:
``` julia
@benchmark sum($a)
```
%% Cell type:code id:5b8abfde tags:
``` julia
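x = 1 # non-constant global variable
@btime (y = 0; for _ in 1:10^6 y += x end; y)  # reads the global x on every iteration
@btime (y = 0; for _ in 1:10^6 y += $x end; y) # x is interpolated once, before timing starts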
```
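The cost of non-constant globals is not specific to benchmarking macros. The following sketch, which uses only Base Julia, runs the same loop once reading a global and once receiving the value as a function argument. The names `x_demo`, `add_global` and `add_arg` are made up for this illustration, and the timings will vary from machine to machine.

```julia
x_demo = 1    # non-constant global: the compiler cannot assume its type

# Reads the global inside the loop, so each access is resolved at run time.
function add_global(n)
    y = 0
    for _ in 1:n
        y += x_demo
    end
    return y
end

# Receives the value as an argument, so the loop is compiled for a concrete type.
function add_arg(x, n)
    y = 0
    for _ in 1:n
        y += x
    end
    return y
end

add_global(10); add_arg(x_demo, 10)   # warm up, so compilation is not timed

t_global = @elapsed add_global(10^6)
t_arg = @elapsed add_arg(x_demo, 10^6)
println("global:   ", t_global * 1e3, " ms")
println("argument: ", t_arg * 1e3, " ms")
```

Passing the value as an argument plays the same role here that `$x` plays inside `@btime`.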
%% Cell type:markdown id:777708f1 tags:
We have now measured the performance of Julia's built-in `sum` function. Let's save the result in a dictionary:
%% Cell type:code id:ac55aa78 tags:
``` julia
j_bench = @benchmark sum($a)
```
%% Cell type:code id:ddf52068 tags:
``` julia
d = Dict()
d["Julia built-in"] = minimum(j_bench.times) / 1e6 # times are in nanoseconds; store milliseconds
d
```
%% Cell type:markdown id:2af9911a tags:
But the built-in function could be using any number of tricks to be fast, including not being written in Julia at all! It is in fact written in Julia, but how would a naive implementation that we write ourselves perform?
## 2. DIY Julia `sum` function
%% Cell type:code id:05bdb0a4 tags:
``` julia
# Exercise: replace each FIXME to complete the function
FIXME mysum(A)
    s = 0.0
    for FIXME
        s += a
    end
    return FIXME
end
```
%% Cell type:code id:7ef98bdf tags:
``` julia
# TO HIDE
function mysum(A)
    s = 0.0
    for a in A
        s += a
    end
    return s
end
```
%% Cell type:code id:22454cc3 tags:
``` julia
j_bench_hand = @benchmark mysum($a)
```
%% Cell type:code id:d269e7d2 tags:
``` julia
d["Julia hand-written"] = minimum(j_bench_hand.times) / 1e6
d
```
%% Cell type:markdown id:545ab9e4 tags:
So that's about 2x slower than the built-in definition. We'll see why later on.
But first: is this fast? How would we know? Let's compare it to some other languages...
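The notebook defers the explanation of the gap, so the following is only a plausible sketch and not the author's stated reason. A plain loop has to add the elements strictly in order, because floating-point addition is not associative, and that generally keeps the compiler from vectorizing it. The `@simd` macro gives permission to reorder the additions, and `@inbounds` drops the bounds checks. The function name `mysum_simd` is made up for this illustration.

```julia
# Hand-written sum that allows the compiler to reorder and vectorize the additions.
function mysum_simd(A)
    s = zero(eltype(A))
    @inbounds @simd for i in eachindex(A)
        s += A[i]
    end
    return s
end

v = rand(10^5)
println(mysum_simd(v) ≈ sum(v))   # results agree up to floating-point rounding
```

Because the order of additions changes, the result can differ from the naive loop in the last few digits, which is why the comparison uses `≈` rather than `==`.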
%% Cell type:markdown id:7c591bf6 tags:
## 3. C `sum` function
C is often considered the gold standard: hard on the human, easy on the machine. Getting within a factor of 2 of C is often satisfying. Nonetheless, even within C, there are many possible optimizations that a naive C programmer may or may not take advantage of.
If you do not speak C, feel free to skip the cell below; the point is that C code can be embedded in a Julia session, compiled, and run. Note that `"""` delimits a multi-line string.
%% Cell type:code id:6c1e8f7d tags:
``` julia
using Libdl
C_code = """
#include <stddef.h>
double c_sum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}
"""
const Clib = tempname() # make a temporary file
# compile to a shared library by piping C_code to gcc
# (works only if you have gcc installed; -misel is a PowerPC-specific flag,
# so remove it when compiling on another architecture):
open(`gcc -fPIC -O3 -misel -xc -shared -o $(Clib * "." * Libdl.dlext) -`, "w") do f
    print(f, C_code)
end
# define a Julia function that calls the C function:
c_sum(X::Array{Float64}) = ccall(("c_sum", Clib), Float64, (Csize_t, Ptr{Float64}), length(X), X)
```
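The shape of the `ccall` used above (function, return type, tuple of argument types, then the arguments) can be tried without a compiler by calling a function from the C standard library. This is a minimal sketch that assumes a platform where libc symbols such as `strlen` are visible to the Julia process, as is typical on Linux and macOS; the variable name `len` is made up for this illustration.

```julia
# ccall(function, return type, (argument types...), arguments...)
len = ccall(:strlen, Csize_t, (Cstring,), "hello")

println("strlen returned ", Int(len))
```

In the notebook's `c_sum`, the first argument is instead the tuple `("c_sum", Clib)`, which names both the function and the shared library it lives in.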
%% Cell type:code id:d3b912c0 tags:
``` julia
c_sum(a)
c_sum(a) ≈ sum(a) # type \approx and then <TAB> to get the ≈ symbol
```
%% Cell type:code id:f2659088 tags:
``` julia
c_bench = @benchmark c_sum($a)
```
%% Cell type:code id:b64d7233 tags:
``` julia
d["C"] = minimum(c_bench.times) / 1e6 # in milliseconds
d
```
%% Cell type:markdown id:69124ca3 tags:
## 4. Python's built-in `sum`
The `PyCall` package provides a Julia interface to Python:
%% Cell type:code id:34a95ee9 tags:
``` julia
import Pkg; Pkg.add("PyCall")
```
%% Cell type:code id:357dd956 tags:
``` julia
using PyCall
```
%% Cell type:code id:99f540ab tags:
``` julia
# get the Python built-in "sum" function:
pysum = pybuiltin("sum")
```
%% Cell type:code id:986968c8 tags:
``` julia
pysum(a)
```
%% Cell type:code id:89ec8102 tags:
``` julia
py_list_bench = @benchmark $pysum($a)
```
%% Cell type:code id:35c95364 tags:
``` julia
d["Python built-in"] = minimum(py_list_bench.times) / 1e6
d
```
%% Cell type:markdown id:9084038c tags:
## 5. Python's DIY `sum`
%% Cell type:code id:eb87f43b tags:
``` julia
py"""
def py_sum(A):
    s = 0.0
    for a in A:
        s += a
    return s
"""
sum_py = py"py_sum"
```
%% Cell type:code id:d78d2891 tags:
``` julia
py_hand = @benchmark $sum_py($a)
```
%% Cell type:code id:38b153bc tags:
``` julia
sum_py(a)
```
%% Cell type:code id:a55fc8ec tags:
``` julia
d["Python hand-written"] = minimum(py_hand.times) / 1e6
d
```
%% Cell type:markdown id:84e1a285 tags:
## 6. Summary
%% Cell type:code id:7e01b8ed tags:
``` julia
for (key, value) in sort(collect(d), by=last)
    println(rpad(key, 25, "."), lpad(round(value; digits=1), 6, "."))
end
```
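The same table can also be expressed relative to the fastest entry, which makes factors such as "2x slower" easy to read off. The sketch below uses a dictionary `demo` with invented numbers purely to show the formatting; they are not measurements from this notebook, and the names `demo`, `fastest` and `lines` are made up for this illustration.

```julia
# Hypothetical timings in milliseconds, invented only to demonstrate the layout.
demo = Dict("method A" => 4.0, "method B" => 8.0, "method C" => 400.0)

fastest = minimum(values(demo))
lines = String[]
for (key, value) in sort(collect(demo), by=last)
    # Slowdown factor relative to the fastest entry, rounded to one decimal.
    ratio = round(value / fastest; digits=1)
    push!(lines, rpad(key, 25, ".") * lpad(ratio, 8, ".") * "x")
end
foreach(println, lines)
```

With the notebook's own results, replacing `demo` by `d` would give the corresponding relative table.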