sieveFilter()
is a wrapper for GDALSieveFilter()
in the GDAL Algorithms
API. It removes raster polygons smaller than a provided threshold size
(in pixels) and replaces them with the pixel value of the largest neighbour
polygon.
Usage
sieveFilter(
src_filename,
src_band,
dst_filename,
dst_band,
size_threshold,
connectedness,
mask_filename = "",
mask_band = 0L,
options = NULL,
quiet = FALSE
)
Arguments
- src_filename
Filename of the source raster to be processed.
- src_band
Band number in the source raster to be processed.
- dst_filename
Filename of the output raster. It may be the same as
src_filename
to update the source file in place.- dst_band
Band number in
dst_filename
to write output. It may be the same assrc_band
to update the source raster in place.- size_threshold
Integer. Raster polygons with sizes (in pixels) smaller than this value will be merged into their largest neighbour.
- connectedness
Integer. Either
4
indicating that diagonal pixels are not considered directly adjacent for polygon membership purposes, or8
indicating they are.- mask_filename
Optional filename of raster to use as a mask.
- mask_band
Band number in
mask_filename
to use as a mask. All pixels in the mask band with a value other than zero will be considered suitable for inclusion in polygons.- options
Algorithm options as a character vector of name=value pairs. None currently supported.
- quiet
Logical scalar. If
TRUE
, a progress bar will not be displayed. Defaults toFALSE
.
Details
Polygons are determined as regions of the raster where the pixels all have the same value, and that are contiguous (connected). Pixels determined to be "nodata" per the mask band will not be treated as part of a polygon regardless of their pixel values. Nodata areas will never be changed nor affect polygon sizes. Polygons smaller than the threshold with no neighbours that are as large as the threshold will not be altered. Polygons surrounded by nodata areas will therefore not be altered.
The algorithm makes three passes over the input file to enumerate the polygons and collect limited information about them. Memory use is proportional to the number of polygons (roughly 24 bytes per polygon), but is not directly related to the size of the raster. So very large raster files can be processed effectively if there aren't too many polygons. But extremely noisy rasters with many one pixel polygons will end up being expensive (in memory) to process.
The input dataset is read as integer data which means that floating point values are rounded to integers.
Examples
## remove single-pixel polygons from the vegetation type layer (EVT)
evt_file <- system.file("extdata/storml_evt.tif", package="gdalraster")
# create a blank raster to hold the output
evt_mmu_file <- file.path(tempdir(), "storml_evt_mmu2.tif")
rasterFromRaster(srcfile = evt_file,
dstfile = evt_mmu_file,
init = 32767)
#> initializing destination raster...
#> done
# create a mask to exclude water pixels from the algorithm
# recode water (7292) to 0
expr <- "ifelse(EVT == 7292, 0, EVT)"
mask_file <- calc(expr = expr,
rasterfiles = evt_file,
var.names = "EVT")
#> calculating from 1 input layer(s)...
#> ================================================================================
#> output written to: /tmp/Rtmpc9VA3S/rastcalc1cc319aba4f4.tif
# create a version of EVT with two-pixel minimum mapping unit
sieveFilter(src_filename = evt_file,
src_band = 1,
dst_filename = evt_mmu_file,
dst_band = 1,
size_threshold = 2,
connectedness = 8,
mask_filename = mask_file,
mask_band = 1)
#> 0...10...20...30...40...50...60...70...80...90...100 - done.