Flood Algorithm
Operation name: gimp:flood
PDB operation name: gimp-selection-flood
GUI menu item: Select ⇒ Remove holes
Implementation file: gimpoperationflood.c
Description: Flood algorithm
Note: operations starting with gimp: are in GIMP; operations starting with gegl: are in GEGL.
Abstract
The flood operation eliminates “holes” – darker areas surrounded by lighter areas – in single-component (grayscale) images. It is particularly useful for eliminating such holes from selections; see GIMP bug #761060 for more details.
The conceptual model considers the input image as the heightmap of a hilly terrain. After heavy rain completely floods the terrain, the remaining water forms lakes in its depressions.
Figure: A depiction of a one-dimensional flood – a terrain cross-section, with “Water” filling the valleys of the “Ground”. Valleys correspond to “holes” in the input image, filled with “water” according to their surroundings.
The result of the flood operation is the heightmap of the terrain after the flood, taking both ground and water level into account.
More formally, the flood operation assigns to each pixel the minimum, over all paths from the pixel to the “outside”, of the maximal input-image value along the path. That is, the output value ''o(x)'' at pixel ''x'' is given by
[[File:Flood Algorithm - Formula 1.gif|center]]
where ''P(x)'' is the set of all paths from ''x'' to the outside, and ''i(y)'' is the value of the input image at pixel ''y''.
Algorithm
In accord with the conceptual flood model, we refer to the values of the input image as the “ground level”, and to the values of the output image as the “water level”; these values range from 0 to 1, and are considered to be 0 outside the bounds of the image. Take care not to confuse “water level” with “water depth”: we use the term “water level” simply to refer to the elevation of either the ground or the water at a certain point.
Our starting point is modeling the problem as a cellular automaton. The state of each cell (pixel) is its current water level, which is initially 1 everywhere inside the image. The water level at each cell is updated according to the rule
[[File:Flood Algorithm - Formula 2.gif|center]]
where ''w_n(x)'' is the water level at pixel ''x'' on generation ''n'', ''g(x)'' is the ground level at ''x'', and ''N(x)'' is the set of (orthogonal) neighbors of ''x'', including ''x'' itself. In other words, the new water level at each pixel is the maximum of its ground level and the minimum of the water levels of the pixel and its neighbors. This automaton converges to the output of the operation.
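The update rule can be sketched directly as a naive cellular automaton. This is a minimal illustrative implementation, not the code in gimpoperationflood.c; the image is assumed to be a list of rows of floats in [0, 1]:

```python
def flood_naive(ground):
    h, w = len(ground), len(ground[0])
    water = [[1.0] * w for _ in range(h)]  # initially flooded everywhere

    def level(r, c):
        # water level, considered to be 0 outside the image bounds
        return water[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    changed = True
    while changed:  # iterate generations until convergence
        changed = False
        new = [[0.0] * w for _ in range(h)]
        for r in range(h):
            for c in range(w):
                # new level = max(ground, min over the pixel and its
                # orthogonal neighbors of the current water level)
                m = min(level(r, c), level(r - 1, c), level(r + 1, c),
                        level(r, c - 1), level(r, c + 1))
                new[r][c] = max(ground[r][c], m)
                changed |= new[r][c] != water[r][c]
        water = new
    return water
```

Each iteration of the outer loop corresponds to one generation; the loop stops once a generation leaves every pixel unchanged.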
The automaton converges after at most ''n'' generations, where ''n'' is the number of pixels in the image. Therefore, a naive implementation, where at most ''O(n)'' cells are processed on each generation, has a worst-case time complexity of ''O(n^2)''. By making a few observations, we can do better, at least in the most common cases.
First, note that the final state doesn’t depend on the order of the individual steps. That is, we don’t actually have to update the water level an entire generation at a time, but rather we can apply the transformation rule to any pixel, at any time, arbitrarily, until convergence. Furthermore, we don’t even have to consider all the neighbors of a pixel each time we apply the rule: as long as we make sure to never increase the water level of a pixel, i.e., as long as we consider the pixel’s own water level as part of the minimum term, we can take a different subset of the neighbors into account each time.
Second, using the above observation, note that we can solve a one-dimensional automaton (i.e., compute its final state) in linear time, using two passes: on the first pass, we iterate over the pixels left-to-right, applying the transformation rule while considering only the left neighbor of each pixel (using the water level assigned to the neighbor on the previous iteration; recall that the water level of the left neighbor of the leftmost pixel of the image is considered to be 0). On the second pass, we work in reverse – we iterate over the pixels right-to-left, applying the rule while considering only the right neighbors.
Figure: Water level of a one-dimensional automaton: (a) initially; (b) after the first pass; (c) after the second pass.
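The two-pass, linear-time solve for a one-dimensional automaton can be sketched as follows (an illustrative sketch, not the actual GIMP code):

```python
def flood_1d(ground):
    n = len(ground)
    water = [1.0] * n  # initially flooded everywhere

    # First pass, left to right: the left neighbor of pixel 0 lies
    # outside the image, so its water level counts as 0.
    left = 0.0
    for i in range(n):
        water[i] = max(ground[i], min(water[i], left))
        left = water[i]

    # Second pass, right to left, symmetric to the first.
    right = 0.0
    for i in reversed(range(n)):
        water[i] = max(ground[i], min(water[i], right))
        right = water[i]

    return water
```

For example, the valley between the 0.6 ridge and the right edge of `[0.6, 0.2, 0.4, 0.1, 0.5]` fills up to 0.5, the lower of its two surrounding ridges.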
While this technique doesn’t extend directly to two dimensions, we can leverage it by processing one-dimensional strips of pixels in batch, as described above, instead of transforming pixels individually.
Finally, another obvious way to speed things up is to minimize the amount of unnecessary work we’re doing. In particular, we only need to process pixels whose neighbors’ state changed.
Taking all of the above into consideration, this is what we do:
We maintain a queue of “segments” – one-dimensional, contiguous strips of pixels – whose water level needs to be updated as a result of a change in the water level of the pixels of a neighboring segment, referred to as the “source segment”. Although, for efficiency reasons, we allow segments to be either horizontal or vertical, for simplicity we treat all segments as though they’re horizontal, and perform the necessary coordinate-system transformation when dealing with vertical segments, as explained later.
Each segment is processed using the following steps.
1. Vertical propagation
The segment’s pixels are updated, using the above transformation rule, considering only the corresponding neighboring pixels of the source segment. During the process, we inspect which of the segment’s pixels actually changed, and create a list of “dirty ranges” of modified pixels. We construct the ranges such that all pixels of each range have the same water level; this becomes important in the next step.
Figure: The current segment’s pixels are updated according to the neighboring pixels of the source segment, and contiguous runs of modified, equivalent pixels form a list of dirty ranges.
2. Horizontal propagation
The segment’s pixels are updated, considering only their left and right neighbors, using the two-pass process described above. Though semantically equivalent, this process is slightly more intricate than the one described above, since we use the dirty ranges from the previous step to take a few shortcuts.
Recall that all pixels of a single dirty range are equal, and therefore, unless modified as part of the current pass, don’t affect each other’s state. On the other hand, all pixels outside any dirty range didn’t change, and therefore, unless modified as part of the current pass, don’t affect each other’s state either. As a result, initially, only the pixels that directly neighbor a dirty range, in the direction of the pass, need to be updated. If the water level of such a pixel changes, we need to update the following pixel, and so on. Once the water level of a pixel remains the same, we don’t have to update the next pixel, but can rather jump directly to the pixel at the edge of the next dirty range, and so on.
For example, when scanning left-to-right, we start at the pixel directly to the right of the first (leftmost) dirty range. We apply the transformation rule to this pixel, and to the pixels to its right, until the water level of one of them is unaffected. At this point, we jump directly to the pixel to the right of the next dirty range.
Figure: Pixel traversal order on a left-to-right pass – traversal starts to the right of the first dirty range; once a pixel is unaffected, it jumps directly to the right of the next dirty range.
Of course, when scanning right-to-left, everything reverses, and we start to the left of the last (rightmost) dirty range, and so on.
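The traversal shortcut can be sketched as follows. This is a simplified illustration of the left-to-right pass only (the helper name is hypothetical), ignoring range extension and the modified flags:

```python
def left_to_right_pass(water, ground, ranges):
    # ranges: sorted, disjoint (lo, hi) half-open dirty ranges
    for lo, hi in ranges:
        i = hi  # start at the pixel just right of the dirty range
        while i < len(water):
            # transformation rule, considering only the left neighbor
            new = max(ground[i], min(water[i], water[i - 1]))
            if new == water[i]:
                break  # unaffected: jump to the next dirty range
            water[i] = new
            i += 1
    return water
```

Here a drop in a dirty range’s water level (pixels 2–3, lowered to 0.5) propagates rightward until a pixel is left unaffected or the segment ends.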
During each pass, we extend the dirty ranges, in the direction of the scan, to include the newly modified pixels. Note that, while scanning a sequence of pixels next to one of the dirty ranges, we may reach the edge of the next range. In such a case, we keep scanning the pixels of the second range, but we don’t extend the previous range any further, so that the two ranges meet but don’t overlap.
Figure: The dirty ranges are extended, in the direction of the scan, to include the newly modified pixels. The scan can “leak” into existing ranges, in which case the previous range is only extended as far as the leaked-into range.
Note that the rightmost and leftmost ranges may be extended past the bounds of the segment, during the left-to-right and right-to-left passes, respectively (recall that a segment doesn’t necessarily span an entire row).
Also note that, if a dirty range is extended, or if its existing pixels are modified, during the first, left-to-right, pass, then its water level may not be uniform come the second, right-to-left, pass; this seems to break our assumption about the state of the dirty ranges, which allowed us to take the shortcut described above. The shortcut is still valid on the second pass, though: it turns out that we only need the ranges to meet a weaker condition – it’s enough for the water level of the pixels of each dirty range to be monotonically decreasing in the direction of the scan (right-to-left, in our case). This condition is still met at the end of the first pass.
One final detail: each dirty range has an associated ''modified'' flag, which is initially cleared. If, during the above process, the range is extended, or its existing pixels are modified, then its ''modified'' flag is set. This flag is used by the next step.
3. Distribution
The changes to the current segment’s water level may affect the two neighboring rows. For each dirty range, we push two new segments into the queue – one for each neighboring row – using the current row as their source segment.
There’s one exception, however: if a dirty range hasn’t been modified during the horizontal propagation step, i.e., if its ''modified'' flag is clear, then it necessarily doesn’t affect the neighboring pixels of the source segment. Hence, in this case, we can avoid pushing a corresponding segment for the row of the source segment.
Figure: New segments, corresponding to the dirty ranges, are pushed into the queue for each of the current segment’s neighboring rows. No segments are pushed for the row of the source segment for non-modified dirty ranges.
To amortize the cost of maintaining and processing multiple separate segments, dirty ranges that are separated by a small-enough gap are coalesced into a single range prior to this step; the gap between the ranges, if one exists, becomes part of the coalesced range, and the ''modified'' flag of the coalesced range is the logical OR of the ''modified'' flags of the individual ranges.
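Coalescing can be sketched as follows; the range representation and the gap threshold are hypothetical, chosen only for illustration:

```python
GAP = 8  # hypothetical tuning constant: merge ranges this close together

def coalesce(ranges):
    # ranges: sorted, disjoint (lo, hi, modified) triples
    merged = []
    for lo, hi, mod in ranges:
        if merged and lo - merged[-1][1] <= GAP:
            plo, phi, pmod = merged[-1]
            # the gap becomes part of the coalesced range; the modified
            # flag is the logical OR of the individual flags
            merged[-1] = (plo, hi, pmod or mod)
        else:
            merged.append((lo, hi, mod))
    return merged
```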
Start and Termination
Recall that segments are pushed into the queue as a result of a change in the water level of a neighboring segment. To kick this process off, we pretend that the water level outside the image instantaneously dropped from 1 to 0, and push four segments, referred to as the “seed segments”, corresponding to the four edges of the image (there may, in fact, be fewer than four seed segments, if the image is only 1 or 2 pixels wide or high). The source segment of the seed segments hypothetically lies outside the image; in particular, the water level of the neighboring pixels in the vertical propagation step is taken to be 0 for the seed segments.
Figure: The four seed segments – one for each edge of the image.
The process terminates when there are no more segments left in the queue. At this point, the automaton has converged, and the water level corresponds to the output of the flood operation.
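The overall queue-driven process can be illustrated at pixel granularity. This is an illustrative sketch, not the GIMP code: the real implementation queues whole segments and uses the two-pass propagation, but the seeding and termination logic is the same in spirit:

```python
from collections import deque

def flood_queue(ground):
    h, w = len(ground), len(ground[0])
    water = [[1.0] * w for _ in range(h)]
    # seed step: the outside water level "drops" to 0, so every border
    # pixel may need updating
    queue = deque((r, c) for r in range(h) for c in range(w)
                  if r in (0, h - 1) or c in (0, w - 1))
    while queue:  # terminate when no pixels are left to process
        r, c = queue.popleft()
        m = water[r][c]
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            m = min(m, water[nr][nc] if 0 <= nr < h and 0 <= nc < w else 0.0)
        new = max(ground[r][c], m)
        if new < water[r][c]:
            water[r][c] = new
            # a change here may lower the neighbors' water level too
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w:
                    queue.append((nr, nc))
    return water
```

When the queue empties, the automaton has converged, and `water` holds the output of the operation.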
Coordinate Systems
As mentioned above, segments can be either horizontal or vertical, but are treated internally as horizontal. Additionally, the region of interest (ROI) of the operation might not span the entire image; in this case, the operation is performed on the ROI in isolation, and what we’ve been calling the “image” up until now is in fact the ROI (in particular, the ground level outside the ROI is considered to be 0, even if the input image isn’t completely black outside the ROI).
To deal with this, we employ three coordinate systems:

* Image-physical: This is the “real” coordinate system of the operation, used when talking to the outside world (i.e., GEGL). Its origin is at the top-left corner of the image, its x-axis points right, and its y-axis points down.

* Image-virtual: This is the same as the image-physical coordinate system, except that the x- and y-coordinates are swapped when dealing with vertical segments. In other words, when processing a vertical segment, we pretend that the image is transposed (i.e., reflected along the southeast diagonal). We transform to/from this coordinate system on the boundary between GEGL and the rest of the algorithm.

* ROI-virtual: This is the same as the image-virtual coordinate system, except that its origin is translated to the top-left corner of the ROI. Internal coordinates, which aren’t communicated to GEGL, are given in this coordinate system.
Figure: The three coordinate systems: (a) image-physical, (b) image-virtual (shown transposed), and (c) ROI-virtual.
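The transforms between the three coordinate systems can be sketched as follows; the function and parameter names are hypothetical, not taken from the GIMP code:

```python
def physical_to_virtual(x, y, vertical):
    # image-virtual swaps x and y when the segment is vertical,
    # i.e. the image is pretended to be transposed
    return (y, x) if vertical else (x, y)

def virtual_to_roi(x, y, roi_x, roi_y):
    # ROI-virtual translates the origin to the ROI's top-left corner,
    # given in image-virtual coordinates
    return (x - roi_x, y - roi_y)
```

Applying the inverse transforms in reverse order maps internal coordinates back to the image-physical system before they are handed to GEGL.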
To sum it up, internal coordinates (e.g., the y-coordinate of the current segment, or the x-coordinates of the dirty ranges) are given in the ROI-virtual coordinate system. Coordinates of ''GeglRectangle''s (such as the ROI rectangle, or the rectangles used when reading from and writing to the GEGL buffers) are given in the image-virtual coordinate system, but are transformed to/from the image-physical coordinate system before being passed to, or received from, GEGL.