Cris’ Image Analysis Blog | Improving the selection of labeled objects

Improving the selection of labeled objects

By Cris Luengo on Thu 30 May 2024.

Say you have a labeled image, like this:

[Image: a labeled image]

And say, some of those labeled regions are not objects of interest. You want to erase those labels from the image. This is a really simple concept, and very generic. It should be easy to do.

Let's start by measuring the size and circularity of each labeled object:

import diplib as dip
lab = dip.ImageRead("labels.png")  # the labeled image above
lab.SetPixelSize(0.1, "μm")
msr = dip.MeasurementTool.Measure(lab, features=["Size", "Circularity"])

We decided on a threshold of 2.0 μm² for the size (the "Size" feature of a 2D object is its area); anything smaller is not of interest. How do we go about erasing those objects?

The simple method is to paint each object with the measurement value, then threshold the resulting image and use that to mask the original label image:

mask = dip.ObjectToMeasurement(lab, msr["Size"]) >= 2.0
new_lab = lab.Copy()
new_lab *= mask

This is, however, not at all efficient. And doing this for multiple conditions on multiple features quickly becomes comical.

Another way to erase these objects is to iterate over the rows (objects) of the measurement table, and for each row (object) we make a decision whether to keep the object or not. To make erasing objects efficient, we create a lookup table that maps all labels to themselves, except the labels we want to erase, which should map to 0. Finally, we apply the lookup table to the labeled image, and produce the new image with only the selected objects.

In Python, we can use NumPy to avoid loops. For example,

import numpy as np
ids = np.asarray(msr.Objects())
size = np.asarray(msr["Size"]).squeeze()
ids = ids[size >= 2.0]

ids now contains the list of labels we want to keep. We create and apply the lookup table:

lut = np.zeros(msr.NumberOfObjects() + 1, dtype=np.uint32)
lut[ids] = ids
new_lab = dip.LookupTable(lut).Apply(lab)
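
The lookup table above has `msr.NumberOfObjects() + 1` entries, which is enough when the labels run consecutively from 1 to N. Here is a minimal sketch of the same idea on toy data, without DIPlib, that sizes the table by the largest label instead. The label array and the size values are made up for illustration.

```python
import numpy as np

# Toy label image: 0 is background; the objects are labeled 1, 2 and 5
# (deliberately not consecutive).
lab = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 5],
    [2, 0, 0, 0, 5],
    [2, 2, 0, 5, 5],
], dtype=np.uint32)

# Made-up measurement table: object IDs and one size value per object.
ids = np.array([1, 2, 5])
size = np.array([4.0, 1.5, 2.5])

keep = ids[size >= 2.0]  # the labels we want to keep

# Size the table by the largest label, so lut[label] is valid for every label.
lut = np.zeros(ids.max() + 1, dtype=lab.dtype)
lut[keep] = keep  # kept labels map to themselves, everything else maps to 0

new_lab = lut[lab]  # integer-array indexing applies the table to every pixel
```

In this toy case object 2 is erased, and objects 1 and 5 keep their labels.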

In C++, the code would be something like this:

#include <vector>
#include <diplib.h>
#include <diplib/lookup_table.h>
#include <diplib/measurement.h>
#include <diplib/regions.h>
#include <diplib/simple_file_io.h>

int main() {
   dip::Image lab = dip::ImageRead("labels.png");  // the labeled image above
   lab.SetPixelSize(0.1 * dip::Units::Micrometer());
   dip::MeasurementTool measurementTool;
   dip::Measurement msr = measurementTool.Measure(lab, {}, {"Size", "Circularity"});

   std::vector<dip::uint32> lut(msr.NumberOfObjects() + 1, 0);
   auto it = msr["Size"].FirstObject();
   do {
      if (*it >= 2.0) {
         lut[it.ObjectID()] = it.ObjectID();
      }
   } while(++it);
   dip::Image new_lab = dip::LookupTable(lut.begin(), lut.end()).Apply(lab);
   // ...
}

This is not any more complicated than in Python with NumPy, it’s just… different.

Either way, the result is the following, identical to the first image but with the smaller objects removed:

[Image: the modified labeled image]

A simple API

Usually, when a concept is this simple, and it takes me more than a minute to work out how to do it, I get miffed. And when it’s a highly generic concept, I want to add functionality in DIPlib to make it easy to do.

I’ll spare you the messy details of the design process. What I finally came up with is this:

The comparison operators are overloaded to compare a measurement object column (e.g. the result of msr["Size"]) to a value. The result is a new object, dip::LabelMap.
The new object dip::LabelMap maps labels to either themselves or to 0. It’s a specialized version of dip::LookupTable, specifically for object labels and label images.
dip::LabelMap::Apply() applies the lookup table to a label image.

With this new functionality, that last section of C++ code can be written as:

dip::LabelMap map = msr["Size"] >= 2.0;
dip::Image new_lab = map.Apply(lab);

Neat, eh?

But this is not all:

Bitwise logical operators (&, |, ^ and ~) are overloaded to combine multiple dip::LabelMap objects. This means we can be more refined in our selection:
```
dip::LabelMap map = (msr["Size"] >= 2.0) & (msr["Circularity"] < 0.1);
```
The map object can also map labels to other labels. map.Relabel() has the same functionality as dip::Relabel(lab), and we can further modify the mapping by manually assigning labels with map[5] = 4. This should allow for quite interesting workflows, for example relabeling labeled tiles of a larger image before stitching them back together, or reassigning labels of 2D slices of a 3D image, or of frames in a video, so that they match from slice to slice or frame to frame.
map.Apply(msr) returns a new dip::Measurement object with rows deleted and the remaining object IDs mapped according to the dip::LabelMap.
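
To make the idea of a label map concrete, here is a toy model in pure Python. It is only a sketch of the concept, not DIPlib's implementation or API: the class `ToyLabelMap` and its methods `where` and `apply` are hypothetical names, and the measurement columns are plain dictionaries with made-up values.

```python
class ToyLabelMap:
    """Toy model of a label map: each label maps to itself or to 0 (erased)."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)  # label -> new label, 0 means "erase"

    @classmethod
    def where(cls, column, predicate):
        """Keep the labels whose measured value satisfies the predicate."""
        return cls({label: label if predicate(value) else 0
                    for label, value in column.items()})

    def _combine(self, other, op):
        # Element-wise: the "elements" are the labels, not bits.
        labels = self.mapping.keys() | other.mapping.keys()
        return ToyLabelMap({
            label: label if op(bool(self.mapping.get(label, 0)),
                               bool(other.mapping.get(label, 0))) else 0
            for label in labels
        })

    def __and__(self, other):
        return self._combine(other, lambda a, b: a and b)

    def __or__(self, other):
        return self._combine(other, lambda a, b: a or b)

    def __xor__(self, other):
        return self._combine(other, lambda a, b: a != b)

    def __invert__(self):
        return ToyLabelMap({label: 0 if new else label
                            for label, new in self.mapping.items()})

    def apply(self, label_image):
        # Labels the map does not know about are erased (a choice made for this sketch).
        return [[self.mapping.get(pixel, 0) for pixel in row] for row in label_image]


# Made-up measurement columns: object ID -> value.
size = {1: 4.0, 2: 1.5, 3: 2.5}
circularity = {1: 0.05, 2: 0.02, 3: 0.30}

big = ToyLabelMap.where(size, lambda v: v >= 2.0)
circular = ToyLabelMap.where(circularity, lambda v: v < 0.1)
selected = big & circular  # only object 1 satisfies both conditions

lab = [[1, 1, 0],
       [2, 0, 3]]
new_lab = selected.apply(lab)
```

Note that this sketch only tracks whether a label is kept; it does not model mapping a label to a different label, as `map[5] = 4` does.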

Implementation details

If you’re interested in seeing the implementation details, you’ll find them here:

All this functionality is identically available in Python, so the Python code becomes similar to the C++ code:

new_lab = (msr["Size"] >= 2.0).Apply(lab)

I’m sure this functionality will expand in the future. A few notes:

I have not overloaded arithmetic operators for measurement columns, though that would be quite neat to have. It would require a new object class, distinct from the current one, to hold the result of the operation. The current column object references data in the dip::Measurement object, and cannot be updated to hold its own data. If we create such a new class, we’d have to overload all functions that currently take a measurement column to also work with this new object – quite a lot of work!
But, it would be fairly simple to overload comparison operators to compare two measurement columns. Would this be generally useful?
I overloaded the bitwise logical operators (&) rather than the regular logical operators (&&) because the latter, when overloaded, don’t do short-circuiting, which I think might lead to programming errors. But because we don’t really do a bitwise operation, this could also be misleading. So don’t think of these operators as bitwise logical operators. Instead, think of them as element-wise logical operators. For an integer, the elements are the bits. For a dip::LabelMap, the elements are the labels.
Note that in Python, the bitwise logical operators have higher precedence than the comparison operators. So you do need the parentheses in the expression (msr["Size"] >= 2.0) & (msr["Circularity"] < 0.1). In C++ this is not the case (though I like to be explicit and put in the parentheses anyway).
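
The precedence point can be checked without DIPlib by asking Python how it parses the two spellings. In this small sketch `size` and `circ` are placeholder names; the expressions are only parsed, never evaluated.

```python
import ast

without_parens = ast.parse("size >= 2.0 & circ < 0.1", mode="eval").body
with_parens = ast.parse("(size >= 2.0) & (circ < 0.1)", mode="eval").body

# Without parentheses, `&` binds first, leaving a chained comparison:
#     size >= (2.0 & circ) < 0.1
is_chained_compare = isinstance(without_parens, ast.Compare)
middle_is_bitand = isinstance(without_parens.comparators[0].op, ast.BitAnd)

# With parentheses, the outermost operation is a single `&`
# whose two operands are the comparisons.
is_and_of_compares = (
    isinstance(with_parens, ast.BinOp)
    and isinstance(with_parens.op, ast.BitAnd)
    and isinstance(with_parens.left, ast.Compare)
    and isinstance(with_parens.right, ast.Compare)
)

print(is_chained_compare, middle_is_bitand, is_and_of_compares)
```

All three checks come out true, so the unparenthesized form is not the selection you meant to write.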

Edit 2024-05-31

A reader suggested that I warn Python users about operator chaining. You cannot use operator chaining with these dip.LabelMap objects in Python. In Python, 1 < x < 5 evaluates as 1 < x and x < 5. But because we cannot overload and, this does not do what the user was expecting. This is true also for NumPy arrays and DIPlib images, as well as, I imagine, many objects across many packages. So don’t use operator chaining!
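
Here is a small pure-Python sketch of why chaining goes wrong with element-wise comparisons. `Column` and `Selection` are hypothetical toy classes, not DIPlib types. The toy `Selection` is truthy by default, so the chained form silently returns the wrong answer; other libraries may behave differently (NumPy arrays, for instance, raise a `ValueError` in this situation).

```python
class Selection:
    """Toy result of an element-wise comparison: one flag per element."""

    def __init__(self, flags):
        self.flags = list(flags)

    def __and__(self, other):
        return Selection(a and b for a, b in zip(self.flags, other.flags))


class Column:
    """Toy column of values with element-wise comparison operators."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):  # column < number
        return Selection(v < other for v in self.values)

    def __gt__(self, other):  # column > number, also used for `number < column`
        return Selection(v > other for v in self.values)


x = Column([0, 3, 7])

# Python evaluates this as `(1 < x) and (x < 5)`. The first Selection is
# truthy, so `and` simply returns the second one: the `1 < x` test is lost.
chained = 1 < x < 5

# The element-wise operator combines both conditions as intended.
combined = (1 < x) & (x < 5)

print(chained.flags)   # the value 0 is wrongly selected
print(combined.flags)  # only the value 3 is selected
```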

Please let me know what you think!

Filed under tutorials.

Tags: DIPlib, PyDIP, measure, object, selection, lookup table, syntax, performance.
