polygonize()
creates vector polygons for all connected regions of pixels
in a source raster sharing a common pixel value. Each polygon is created
with an attribute indicating the pixel value of that polygon. A raster mask
may also be provided to determine which pixels are eligible for processing.
The function will create the output vector layer if it does not already
exist, otherwise it will try to append to an existing one.
This function is a wrapper of GDALPolygonize
in the GDAL Algorithms API.
It provides essentially the same functionality as the gdal_polygonize.py
command-line program (https://gdal.org/programs/gdal_polygonize.html).
Usage
polygonize(
raster_file,
out_dsn,
out_layer,
fld_name = "DN",
out_fmt = NULL,
connectedness = 4,
src_band = 1,
mask_file = NULL,
nomask = FALSE,
overwrite = FALSE,
dsco = NULL,
lco = NULL,
quiet = FALSE
)
Arguments
- raster_file
Filename of the source raster.
- out_dsn
The destination vector filename to which the polygons will be written (or database connection string).
- out_layer
Name of the layer for writing the polygon features. For single-layer file formats such as
"ESRI Shapefile"
, the layer name is the same as the filename without the path or extension (e.g.,out_dsn = "path_to_file/polygon_output.shp"
, the layer name is"polygon_output"
).- fld_name
Name of an integer attribute field in
out_layer
to which the pixel values will be written. Will be created if necessary when using an existing layer.- out_fmt
GDAL short name of the output vector format. If unspecified, the function will attempt to guess the format from the filename/connection string.
- connectedness
Integer scalar. Must be either
4
or8
. For the default 4-connectedness, pixels with the same value are considered connected only if they touch along one of the four sides, while 8-connectedness also includes pixels that touch at one of the corners.- src_band
The band on
raster_file
to build the polygons from (default is1
).- mask_file
Use the first band of the specified raster as a validity mask (zero is invalid, non-zero is valid). If not specified, the default validity mask for the input band (such as nodata, or alpha masks) will be used (unless
nomask
is set toTRUE
).- nomask
Logical scalar. If
TRUE
, do not use the default validity mask for the input band (such as nodata, or alpha masks). Default isFALSE
.- overwrite
Logical scalar. If
TRUE
, overwriteout_layer
if it already exists. Default isFALSE
.- dsco
Optional character vector of format-specific creation options for
out_dsn
("NAME=VALUE"
pairs).- lco
Optional character vector of format-specific creation options for
out_layer
("NAME=VALUE"
pairs).- quiet
Logical scalar. If
TRUE
, a progress bar will not be displayed. Defaults toFALSE
.
Details
Polygon features will be created on the output layer, with polygon geometries representing the polygons. The polygon geometries will be in the georeferenced coordinate system of the raster (based on the geotransform of the source dataset). It is acceptable for the output layer to already have features. If the output layer does not already exist, it will be created with coordinate system matching the source raster.
The algorithm attempts to minimize memory use so that very large rasters can be processed. However, if the raster has many polygons or very large/complex polygons, the memory use for holding polygon enumerations and active polygon geometries may grow to be quite large.
The algorithm will generally produce very dense polygon geometries, with edges that follow exactly on pixel boundaries for all non-interior pixels. For non-thematic raster data (such as satellite images) the result will essentially be one small polygon per pixel, and memory and output layer sizes will be substantial. The algorithm is primarily intended for relatively simple thematic rasters, masks, and classification results.
Note
The source pixel band values are read into a signed 64-bit integer buffer
(Int64
) by GDALPolygonize
, so floating point or complex bands will be
implicitly truncated before processing.
When 8-connectedness is used, many of the resulting polygons will likely be
invalid due to ring self-intersection (in the strict OGC definition of
polygon validity). They may be suitable as-is for certain purposes such as
calculating geometry attributes (area, perimeter). Package sf has
st_make_valid()
, PostGIS has ST_MakeValid()
, and QGIS has vector
processing utility "Fix geometries" (single polygons can become MultiPolygon
in the case of self-intersections).
If writing to a SQLite database format as either GPKG
(GeoPackage
vector) or SQLite
(Spatialite vector), setting the
SQLITE_USE_OGR_VFS
and OGR_SQLITE_JOURNAL
configuration options may
increase performance substantially. If writing to PostgreSQL
(PostGIS vector), setting PG_USE_COPY=YES
is faster:
# SQLite: GPKG (.gpkg) and Spatialite (.sqlite)
# enable extra buffering/caching by the GDAL/OGR I/O layer
set_config_option("SQLITE_USE_OGR_VFS", "YES")
# set the journal mode for the SQLite database to MEMORY
set_config_option("OGR_SQLITE_JOURNAL", "MEMORY")
# PostgreSQL / PostGIS
# use COPY for inserting data rather than INSERT
set_config_option("PG_USE_COPY", "YES")
Examples
evt_file <- system.file("extdata/storml_evt.tif", package="gdalraster")
dsn <- file.path(tempdir(), "storm_lake.gpkg")
layer <- "lf_evt"
fld <- "evt_value"
set_config_option("SQLITE_USE_OGR_VFS", "YES")
set_config_option("OGR_SQLITE_JOURNAL", "MEMORY")
polygonize(evt_file, dsn, layer, fld)
#> 0...10...20...30...40...50...60...70...80...90...100 - done.
set_config_option("SQLITE_USE_OGR_VFS", "")
set_config_option("OGR_SQLITE_JOURNAL", "")