DIPimage 2.0 released
Last week the new release of DIPimage and DIPlib was made available at diplib.org.
The change list is pretty substantial, though there should be no real compatibility concerns. One of the most important changes is that, on both Windows and Linux, some image processing functions can now use multithreading to make the best use of multi-processor and multi-core systems. For example, all separable filters use all available cores by default. You can override this behaviour with the new 'NumberOfThreads' preference setting. Let’s first see how many threads DIPimage uses by default on my computer:
dipgetpref('NumberOfThreads')
ans =
4
I’m using an Intel Xeon with four cores. Now let’s see how this affects computation time. Again, I’m using the timeit utility, available from the MATLAB File Exchange:
a = noise(newim(500,500));
f = @() gaussf(a,15);
dipsetpref('NumberOfThreads',4)
t4 = timeit(f)
dipsetpref('NumberOfThreads',2)
t2 = timeit(f)
dipsetpref('NumberOfThreads',1)
t1 = timeit(f)
t4 =
0.0193
t2 =
0.0266
t1 =
0.0443
Using four cores decreases execution time by a little over 50% (from 0.0443 s to 0.0193 s, about 56%). Part of the execution time is the actual calculation, but a significant part is spent on memory access. All processor cores share the same memory, which makes it impossible to obtain the ideal decrease of 75% of the computation time. Note how the transition from one to two cores produces a larger advantage (about 40%) than the transition from two to four cores (about 27%).
MATLAB has the maxNumCompThreads command, which changes the number of threads used for linear algebra calculations:
b = double(a);
f = @() b*b;
n = maxNumCompThreads(4);
t4 = timeit(f)
maxNumCompThreads(2);
t2 = timeit(f)
maxNumCompThreads(1);
t1 = timeit(f)
maxNumCompThreads(n);
t4 =
0.0082
t2 =
0.0142
t1 =
0.0267
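The two sets of timings above can be turned into speedup and parallel efficiency figures with a few lines of plain MATLAB. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the numbers are copied by hand from the output above.

```matlab
% Timings in seconds for 1, 2 and 4 threads, copied from the output above.
threads  = [1 2 4];
t_gaussf = [0.0443 0.0266 0.0193];   % gaussf(a,15)
t_matmul = [0.0267 0.0142 0.0082];   % b*b

% Speedup relative to the single-threaded run.
s_gaussf = t_gaussf(1) ./ t_gaussf;
s_matmul = t_matmul(1) ./ t_matmul;

% Percentage of the single-threaded execution time that is saved.
r_gaussf = 100 * (1 - t_gaussf / t_gaussf(1));
r_matmul = 100 * (1 - t_matmul / t_matmul(1));

% Parallel efficiency: speedup per thread (1 would be ideal scaling).
e_gaussf = s_gaussf ./ threads;
e_matmul = s_matmul ./ threads;

% One row per thread count: threads, speedup, percent saved, efficiency.
disp('gaussf:'); disp([threads' s_gaussf' r_gaussf' e_gaussf']);
disp('b*b:');    disp([threads' s_matmul' r_matmul' e_matmul']);
```

With four threads this gives a speedup of about 2.3 for gaussf (56% of the time saved) and about 3.3 for the matrix product (69% saved), against an ideal of 4 (75%).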
Expect the 'NumberOfThreads' preference to be linked to the maxNumCompThreads command in a future release of DIPimage.
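Until the two settings are linked, they can be changed together by hand. The helper below is a minimal sketch, not part of DIPimage: the name set_threads and the layout of its return value are hypothetical. It only uses the dipgetpref, dipsetpref and maxNumCompThreads calls shown above, and assumes it is saved as set_threads.m on the MATLAB path.

```matlab
function old = set_threads(n)
% SET_THREADS  Hypothetical helper: set the DIPimage and MATLAB thread counts together.
%   old = set_threads(n) returns the previous settings so they can be restored later.
old.dip    = dipgetpref('NumberOfThreads');   % current DIPimage setting
old.matlab = maxNumCompThreads(n);            % sets n, returns the previous MATLAB setting
dipsetpref('NumberOfThreads', n);
end
```

A typical use would be old = set_threads(1); followed by the timing, and then dipsetpref('NumberOfThreads',old.dip) and maxNumCompThreads(old.matlab) to put things back.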
Not all filters are affected by the number of threads. For example, the dilation with a circular structuring element (the default shape) is not separable:
f = @() dilation(a,15);
dipsetpref('NumberOfThreads',4)
t4 = timeit(f)
dipsetpref('NumberOfThreads',2)
t2 = timeit(f)
dipsetpref('NumberOfThreads',1)
t1 = timeit(f)
t4 =
0.0310
t2 =
0.0311
t1 =
0.0311
However, the dilation with a rectangular structuring element is separable:
f = @() dilation(a,15,'rectangular');
dipsetpref('NumberOfThreads',4)
t4 = timeit(f)
dipsetpref('NumberOfThreads',2)
t2 = timeit(f)
dipsetpref('NumberOfThreads',1)
t1 = timeit(f)
t4 =
0.0065
t2 =
0.0075
t1 =
0.0102
Note how the rectangular dilation only reduces execution time by 36% when spread over four cores. The dilation uses relatively few computations, meaning it is limited more by memory access than by computation. The same is true, for example, for the dot-product on matrices, which is therefore not implemented as a multi-threaded operation in MATLAB.
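What "separable" means for the rectangular structuring element can be checked in plain MATLAB, without DIPimage. The sketch below assumes a MATLAB version that provides movmax (R2016a or later) and an odd filter size. It illustrates the principle only and is not meant to show how DIPlib implements its dilation: a square dilation computed as two 1-D running maxima is compared against the maximum over the full 2-D neighbourhood.

```matlab
A = rand(64, 48);     % arbitrary test image
k = 15;               % side of the square structuring element (odd)
h = (k - 1) / 2;      % half-width of the window

% Separable version: a 1-D running maximum down each column,
% followed by a 1-D running maximum along each row.
sep = movmax(movmax(A, k, 1), k, 2);

% Direct version: maximum over the full k-by-k neighbourhood,
% with the window clipped at the image edges.
[nr, nc] = size(A);
direct = zeros(nr, nc);
for i = 1:nr
    for j = 1:nc
        rows = max(1, i-h):min(nr, i+h);
        cols = max(1, j-h):min(nc, j+h);
        direct(i, j) = max(max(A(rows, cols)));
    end
end

% Both approaches give exactly the same image.
assert(isequal(sep, direct));
```

Each 1-D pass treats the image lines independently of one another, which is the kind of work that divides easily over several threads. A disk cannot be written as a horizontal line dilated by a vertical line, so the circular structuring element has no such two-pass form.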