Vector Read Benchmarks
Source: vignettes/articles/vector-read-benchmarks.Rmd
Vector read benchmarks for R package gdalraster
2025-05-25 (v. 1.0)
These benchmark tests follow the format of the benchmarks described in GDAL RFC 86: Column-oriented read API for vector layers, and use the same dataset. The timings here cannot be compared directly with the GDAL timings because the hardware differs (the hardware used for the GDAL benchmarks is not specified). The timings reported here should be conservative, since the test machine was a relatively slow laptop, six years old at the time of writing (Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz, 8 GB RAM, SSD). The benchmarks are intended as a sanity check on read performance: they place the R results within the range of timings seen for two different I/O methods across multiple implementations, namely the compiled code and Python libraries described for the GDAL benchmarks, and the R packages described here.
Software environment
Linux Ubuntu 24.04, R 4.5.0 (2025-04-11), GDAL 3.10.3 (2025-04-01), gdalraster 2.0.0.9002, sf 1.0.21
Vector data
NZ Building Outlines, https://data.linz.govt.nz/layer/101290-nz-building-outlines/, from Land Information New Zealand: “This dataset provides current outlines of buildings within mainland New Zealand captured from the latest aerial imagery.”
Tests used the GeoPackage file nz-building-outlines.gpkg (1.5 GB). The layer contains 3.3 million features, each with 13 attribute fields (2 fields of type Integer, 8 of type String, 3 of type DateTime) and polygon geometries.
Benchmark programs
Each program reads all features from the layer and populates an R data frame. Code for the programs, along with output generated by reprex::reprex(), is given in a separate section further below.
bench_gdalraster_fetch.R
Uses the class method GDALVector$fetch() in gdalraster for traditional row-level reading, done in C++ by iterating over features with OGRLayer::GetNextFeature() in the GDAL API. The method is an analog of the function DBI::dbFetch() in the DBI R package.
bench_gdalraster_arrow_stream.R (requires GDAL >= 3.6)
Uses the class method GDALVector$getArrowStream() in gdalraster to expose an Arrow C stream on the layer as a nanoarrow_array_stream object (external pointer to an ArrowArrayStream). This provides direct access to the stream object and retrieves features in a column-oriented memory layout. The required package nanoarrow provides S3 methods for as.data.frame() to import a nanoarrow_array (one batch at a time), or the nanoarrow_array_stream itself (pulling all batches in the stream).
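The two import paths described above (one batch at a time, or the whole stream at once) can be sketched with nanoarrow alone. This is a minimal illustration rather than the benchmark code: the helper make_stream() is hypothetical and builds a toy two-batch stream that stands in for the object returned by lyr$getArrowStream(), and the column names are made up.

```r
library(nanoarrow)

# Hypothetical helper: a toy two-batch stream standing in for the
# nanoarrow_array_stream that lyr$getArrowStream() would return.
make_stream <- function() {
  basic_array_stream(list(
    as_nanoarrow_array(data.frame(fid = 1:3, name = c("a", "b", "c"))),
    as_nanoarrow_array(data.frame(fid = 4:5, name = c("d", "e")))
  ))
}

# Path 1: pull one batch (a nanoarrow_array) at a time.
# get_next() returns NULL once the stream is exhausted.
stream <- make_stream()
batches <- list()
while (!is.null(batch <- stream$get_next())) {
  batches[[length(batches) + 1L]] <- as.data.frame(batch)
}
stream$release()
d_batched <- do.call(rbind, batches)

# Path 2: pull all batches in the stream with a single call.
stream <- make_stream()
d_all <- as.data.frame(stream)
stream$release()

# Both paths should yield the same number of rows and columns.
stopifnot(identical(dim(d_batched), dim(d_all)))
```

Batch-wise reading may be preferable when the full layer does not fit comfortably in memory, since each batch can be processed and discarded before the next is pulled. The benchmark below uses the single-call form.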
bench_gdalraster_fetch_conv_to_sf.R
The same as bench_gdalraster_fetch.R (traditional row-level access) but with conversion to a classed sf data frame via sf::st_sf() included in the timing.
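To illustrate what the conversion step involves, the following sketch parses a list column of WKB raw vectors into a classed sfc column and builds an sf data frame from it. It is an assumption-laden toy example: the two points and the column names are made up, and the explicit sf::st_as_sfc() call is shown only to make the parsing step visible. The benchmark itself simply calls sf::st_sf() on the fetched data frame.

```r
library(sf)

# Toy geometries, serialized to WKB to mimic a geometry list column of
# raw vectors (made-up data, not the benchmark layer).
pts <- st_sfc(st_point(c(1, 2)), st_point(c(3, 4)), crs = 2193)
wkb <- unclass(st_as_binary(pts))  # plain list of raw vectors

# Plain data frame with the WKB list column
d <- data.frame(id = 1:2)
d$geom <- wkb

# Parse each WKB raw vector into an sf geometry, then build the sf object.
# Every geometry must be decoded, so the cost grows with the feature count.
geom_sfc <- st_as_sfc(structure(d$geom, class = "WKB"), crs = 2193)
d_sf <- st_sf(id = d$id, geometry = geom_sfc)

print(d_sf)
```

Leaving geometries as WKB until they are actually needed avoids this per-feature parsing cost, which is consistent with the gap between the fetch-only and fetch-plus-conversion timings reported below.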
bench_sf_read_sf.R
Traditional row-level read using the function sf::read_sf() in package sf. Populates a classed data frame, with geometries contained in a classed list column.
bench_sf_read_sf_use_stream.R (requires GDAL >= 3.6)
Uses sf::read_sf() with argument use_stream = TRUE: “use the experimental columnar interface introduced in GDAL 3.6”.
Timings
Bench program | Time (s) | Data frame class | Geom list column |
---|---|---|---|
bench_gdalraster_fetch.R | 25.90 | OGRFeatureSet | WKB raw vectors |
bench_gdalraster_arrow_stream.R | 5.90 | base data.frame | WKB raw vectors |
bench_gdalraster_fetch_conv_to_sf.R | 56.98 | sf | classed sfc |
bench_sf_read_sf.R | 176.56 | sf (tibble) | classed sfc |
bench_sf_read_sf_use_stream.R | 25.30 | sf (tibble) | classed sfc |
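As a reading aid, the elapsed times in the table can be turned into relative speedups and approximate throughput with a few lines of base R. The values are copied from the table and the feature count from the reprex output; the variable names are illustrative only. Note that the programs do not all produce the same output class, so the ratios are not strictly like-for-like.

```r
# Elapsed times in seconds, copied from the Timings table
elapsed <- c(
  gdalraster_fetch            = 25.90,
  gdalraster_arrow_stream     = 5.90,
  gdalraster_fetch_conv_to_sf = 56.98,
  sf_read_sf                  = 176.56,
  sf_read_sf_use_stream       = 25.30
)

# Number of features in the layer, as reported by lyr$getFeatureCount()
n_features <- 3289574

# Speedup of each program relative to the slowest one, sf::read_sf()
speedup_vs_read_sf <- round(elapsed[["sf_read_sf"]] / elapsed, 1)

# Approximate read throughput in features per second
features_per_sec <- round(n_features / elapsed)

print(data.frame(elapsed, speedup_vs_read_sf, features_per_sec))
```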
Code
bench_gdalraster_fetch.R
library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#> Driver : GeoPackage (GPKG)
#> DSN : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#> Layer : nz_building_outlines
#> CRS : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#> Geom : MULTIPOLYGON
lyr$getFeatureCount()
#> [1] 3289574
system.time(d <- lyr$fetch(-1))
#> user system elapsed
#> 25.006 0.881 25.901
(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE
head(d)
#> OGR feature set
#> FID building_id name use suburb_locality town_city territorial_authority
#> 1 1 2292051 Unknown Marton Marton Rangitikei District
#> 2 2 2292353 Unknown Durie Hill Whanganui Whanganui District
#> 3 3 2292407 Unknown Durie Hill Whanganui Whanganui District
#> 4 4 2292675 Unknown Feilding Feilding Manawatu District
#> 5 5 2292771 Unknown Feilding Feilding Manawatu District
#> 6 6 2292825 Unknown Feilding Feilding Manawatu District
#> capture_method capture_source_group capture_source_id
#> 1 Feature Extraction NZ Aerial Imagery 1042
#> 2 Feature Extraction NZ Aerial Imagery 1042
#> 3 Feature Extraction NZ Aerial Imagery 1042
#> 4 Feature Extraction NZ Aerial Imagery 1042
#> 5 Feature Extraction NZ Aerial Imagery 1042
#> 6 Feature Extraction NZ Aerial Imagery 1042
#> capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> capture_source_to last_modified geom
#> 1 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 2 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 3 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 4 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 5 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 6 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
lyr$close()
Created on 2025-05-25 with reprex v2.1.1
bench_gdalraster_arrow_stream.R
library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#> Driver : GeoPackage (GPKG)
#> DSN : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#> Layer : nz_building_outlines
#> CRS : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#> Geom : MULTIPOLYGON
lyr$getFeatureCount()
#> [1] 3289574
lyr$testCapability()$FastGetArrowStream
#> [1] TRUE
options(nanoarrow.warn_unregistered_extension = FALSE)
(stream <- lyr$getArrowStream())
#> <nanoarrow_array_stream struct<fid: int64, building_id: int32, name: string, use: string, suburb_locality: string, town_city: string, territorial_authority: string, capture_method: string, capture_source_group: string, capture_source_id: int32, capture_source_name: string, capture_source_from: date32, capture_source_to: date32, last_modified: date32, geom: ogc.wkb{binary}>>
#> $ get_schema:function ()
#> $ get_next :function (schema = x$get_schema(), validate = TRUE)
#> $ release :function ()
system.time(d <- as.data.frame(stream))
#> user system elapsed
#> 7.388 1.390 5.895
stream$release()
(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE
head(d)
#> fid building_id name use suburb_locality town_city territorial_authority
#> 1 1 2292051 Unknown Marton Marton Rangitikei District
#> 2 2 2292353 Unknown Durie Hill Whanganui Whanganui District
#> 3 3 2292407 Unknown Durie Hill Whanganui Whanganui District
#> 4 4 2292675 Unknown Feilding Feilding Manawatu District
#> 5 5 2292771 Unknown Feilding Feilding Manawatu District
#> 6 6 2292825 Unknown Feilding Feilding Manawatu District
#> capture_method capture_source_group capture_source_id
#> 1 Feature Extraction NZ Aerial Imagery 1042
#> 2 Feature Extraction NZ Aerial Imagery 1042
#> 3 Feature Extraction NZ Aerial Imagery 1042
#> 4 Feature Extraction NZ Aerial Imagery 1042
#> 5 Feature Extraction NZ Aerial Imagery 1042
#> 6 Feature Extraction NZ Aerial Imagery 1042
#> capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> capture_source_to last_modified geom
#> 1 2016-04-21 2019-01-04 blob[102 B]
#> 2 2016-04-21 2019-01-04 blob[102 B]
#> 3 2016-04-21 2019-01-04 blob[230 B]
#> 4 2016-04-21 2019-01-04 blob[102 B]
#> 5 2016-04-21 2019-01-04 blob[118 B]
#> 6 2016-04-21 2019-01-04 blob[102 B]
lyr$close()
Created on 2025-05-25 with reprex v2.1.1
bench_gdalraster_fetch_conv_to_sf.R
library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#> Driver : GeoPackage (GPKG)
#> DSN : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#> Layer : nz_building_outlines
#> CRS : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#> Geom : MULTIPOLYGON
lyr$getFeatureCount()
#> [1] 3289574
system.time({
d <- lyr$fetch(-1)
d <- sf::st_sf(d, crs = lyr$getSpatialRef())
})
#> user system elapsed
#> 54.795 1.938 56.979
(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE
head(d)
#> Simple feature collection with 6 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> FID building_id name use suburb_locality town_city territorial_authority
#> 1 1 2292051 Unknown Marton Marton Rangitikei District
#> 2 2 2292353 Unknown Durie Hill Whanganui Whanganui District
#> 3 3 2292407 Unknown Durie Hill Whanganui Whanganui District
#> 4 4 2292675 Unknown Feilding Feilding Manawatu District
#> 5 5 2292771 Unknown Feilding Feilding Manawatu District
#> 6 6 2292825 Unknown Feilding Feilding Manawatu District
#> capture_method capture_source_group capture_source_id
#> 1 Feature Extraction NZ Aerial Imagery 1042
#> 2 Feature Extraction NZ Aerial Imagery 1042
#> 3 Feature Extraction NZ Aerial Imagery 1042
#> 4 Feature Extraction NZ Aerial Imagery 1042
#> 5 Feature Extraction NZ Aerial Imagery 1042
#> 6 Feature Extraction NZ Aerial Imagery 1042
#> capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> capture_source_to last_modified geom
#> 1 2016-04-21 2019-01-04 MULTIPOLYGON (((1796394 556...
#> 2 2016-04-21 2019-01-04 MULTIPOLYGON (((1776394 557...
#> 3 2016-04-21 2019-01-04 MULTIPOLYGON (((1776322 557...
#> 4 2016-04-21 2019-01-04 MULTIPOLYGON (((1818268 554...
#> 5 2016-04-21 2019-01-04 MULTIPOLYGON (((1818172 554...
#> 6 2016-04-21 2019-01-04 MULTIPOLYGON (((1818436 554...
lyr$close()
Created on 2025-05-25 with reprex v2.1.1
bench_sf_read_sf.R
library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.10.3, PROJ 9.4.1; sf_use_s2() is TRUE
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
system.time(d <- read_sf(f, "nz_building_outlines"))
#> user system elapsed
#> 156.162 7.710 176.556
nrow(d)
#> [1] 3289574
head(d)
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> # A tibble: 6 × 14
#> building_id name use suburb_locality town_city territorial_authority
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 2292051 "" Unknown Marton Marton Rangitikei District
#> 2 2292353 "" Unknown Durie Hill Whanganui Whanganui District
#> 3 2292407 "" Unknown Durie Hill Whanganui Whanganui District
#> 4 2292675 "" Unknown Feilding Feilding Manawatu District
#> 5 2292771 "" Unknown Feilding Feilding Manawatu District
#> 6 2292825 "" Unknown Feilding Feilding Manawatu District
#> # ℹ 8 more variables: capture_method <chr>, capture_source_group <chr>,
#> # capture_source_id <int>, capture_source_name <chr>,
#> # capture_source_from <date>, capture_source_to <date>, last_modified <date>,
#> # geom <MULTIPOLYGON [m]>
Created on 2025-05-25 with reprex v2.1.1
bench_sf_read_sf_use_stream.R
library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.10.3, PROJ 9.4.1; sf_use_s2() is TRUE
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
system.time(d <- read_sf(f, "nz_building_outlines", use_stream = TRUE))
#> user system elapsed
#> 25.749 2.556 25.295
nrow(d)
#> [1] 3289574
head(d)
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> # A tibble: 6 × 14
#> building_id name use suburb_locality town_city territorial_authority
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 2292051 "" Unknown Marton Marton Rangitikei District
#> 2 2292353 "" Unknown Durie Hill Whanganui Whanganui District
#> 3 2292407 "" Unknown Durie Hill Whanganui Whanganui District
#> 4 2292675 "" Unknown Feilding Feilding Manawatu District
#> 5 2292771 "" Unknown Feilding Feilding Manawatu District
#> 6 2292825 "" Unknown Feilding Feilding Manawatu District
#> # ℹ 8 more variables: capture_method <chr>, capture_source_group <chr>,
#> # capture_source_id <int>, capture_source_name <chr>,
#> # capture_source_from <date>, capture_source_to <date>, last_modified <date>,
#> # geom <MULTIPOLYGON [m]>
Created on 2025-05-25 with reprex v2.1.1