Vector read benchmarks for R package gdalraster
2025-05-25 (v. 1.0)

Benchmark tests follow the format of the benchmarks described in GDAL RFC 86: Column-oriented read API for vector layers, and use the same dataset. The timings here cannot be compared directly with the GDAL timings because the hardware differs (the hardware used for the GDAL benchmarks is not specified). The timings reported here should be conservative, in the sense that the hardware was a relatively slow, six-year-old laptop at the time of writing (Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz, 8 GB RAM, SSD). The benchmarks are intended as a sanity check on read performance, set against the ranges seen for two I/O methods (row-level and column-oriented) across multiple implementations: the compiled code and Python libraries described for the GDAL benchmarks, and the R packages described here.

Software environment

Linux Ubuntu 24.04, R 4.5.0 (2025-04-11), GDAL 3.10.3 (2025-04-01), gdalraster 2.0.0.9002, sf 1.0.21

Vector data

NZ Building Outlines, https://data.linz.govt.nz/layer/101290-nz-building-outlines/, from Land Information New Zealand: “This dataset provides current outlines of buildings within mainland New Zealand captured from the latest aerial imagery.”

Tests used the GeoPackage file nz-building-outlines.gpkg (1.5 GB). The layer contains 3.3 million features, each with 13 attribute fields (2 of type Integer, 8 of type String, 3 of type Date) and polygon geometries.

Benchmark programs

Each program reads all features from the layer and populates an R data frame. Code for the programs, along with output generated by reprex::reprex(), is given in the Code section below.

bench_gdalraster_fetch.R

Uses the class method GDALVector$fetch() in gdalraster for traditional row-level reading, implemented in C++ by iterating over features with OGRLayer::GetNextFeature() in the GDAL API. The method is analogous to DBI::dbFetch() in the DBI R package.
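Because the method is modeled on DBI::dbFetch(), a layer can also be read in pages rather than all at once. The following is a minimal sketch of that pattern, not one of the benchmark programs. It assumes the small sample GeoPackage bundled with gdalraster (extdata/ynp_fires_1984_2022.gpkg, layer mtbs_perims) instead of the benchmark file, and that fetch() returns a zero-row data frame once no features remain. The page size of 25 is an arbitrary value chosen for illustration.

```r
library(gdalraster)

# Assumption: sample data shipped with gdalraster, used here so the
# sketch is runnable without the 1.5 GB benchmark file.
f <- system.file("extdata/ynp_fires_1984_2022.gpkg", package = "gdalraster")
lyr <- new(GDALVector, f, "mtbs_perims")

page_size <- 25  # hypothetical page size, for illustration only
n_read <- 0

repeat {
    # fetch the next `page_size` features from the current cursor position
    page <- lyr$fetch(page_size)
    if (nrow(page) == 0)
        break  # no features remain
    n_read <- n_read + nrow(page)
}

n_total <- lyr$getFeatureCount()
lyr$close()

n_read == n_total
```

The benchmark itself calls lyr$fetch(-1), which reads all features in a single call.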

bench_gdalraster_arrow_stream.R (requires GDAL >= 3.6)

Uses the class method GDALVector$getArrowStream() in gdalraster to expose an Arrow C stream on the layer as a nanoarrow_array_stream object (external pointer to an ArrowArrayStream). Provides direct access to the stream object and retrieves features in a column-oriented memory layout. The required package nanoarrow provides S3 methods for as.data.frame() to import a nanoarrow_array (one batch at a time), or the nanoarrow_array_stream itself (pulling all batches in the stream).
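The batch-at-a-time alternative mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than one of the benchmark programs: it uses the sample GeoPackage bundled with gdalraster (extdata/ynp_fires_1984_2022.gpkg, layer mtbs_perims), and it relies on the stream's get_next() function returning NULL once the stream is exhausted. With a layer this small, all features may well arrive in a single batch.

```r
library(gdalraster)

# silence the warning about the unregistered ogc.wkb extension type,
# as done in the benchmark program
options(nanoarrow.warn_unregistered_extension = FALSE)

# Assumption: sample data shipped with gdalraster, not the benchmark file.
f <- system.file("extdata/ynp_fires_1984_2022.gpkg", package = "gdalraster")
lyr <- new(GDALVector, f, "mtbs_perims")

stream <- lyr$getArrowStream()

n_batches <- 0
n_rows <- 0

# pull one nanoarrow_array (batch) at a time until the stream is exhausted
while (!is.null(batch <- stream$get_next())) {
    d <- as.data.frame(batch)  # convert only this batch
    n_batches <- n_batches + 1
    n_rows <- n_rows + nrow(d)
}

stream$release()
n_total <- lyr$getFeatureCount()
lyr$close()

n_rows == n_total
```

Processing one batch at a time keeps only the current batch in R memory, whereas as.data.frame(stream), as used in the benchmark, materializes the whole layer.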

bench_gdalraster_fetch_conv_to_sf.R

The same as bench_gdalraster_fetch.R (traditional row-level access) but with conversion to a classed sf data frame via sf::st_sf() included in the timing.
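The extra time in this program comes from turning WKB raw vectors into classed sf geometries. As a rough illustration of that kind of work, the sketch below round-trips two made-up points through WKB using sf alone. The coordinates are hypothetical and not taken from the dataset, and the code is not necessarily the path that sf::st_sf() takes internally when given an OGRFeatureSet.

```r
library(sf)

# Two made-up points in EPSG:2193 (illustrative only, not benchmark data)
pts <- st_sfc(st_point(c(1796000, 5560000)),
              st_point(c(1776000, 5570000)),
              crs = 2193)

# Encode as WKB: a list of raw vectors, one per geometry
wkb <- st_as_binary(pts)

# Parse the WKB back into a classed sfc geometry column
geom <- st_as_sfc(wkb, crs = 2193)

# Attach the geometries to attribute data to obtain a classed sf data frame
d <- st_sf(data.frame(id = 1:2), geometry = geom)

class(d)
```

Each geometry has to be parsed from its binary encoding into R objects, which is one plausible reason the conversion is costly for several million polygons.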

bench_sf_read_sf.R

Traditional row-level read using the function sf::read_sf() in package sf. Populates a classed data frame, with geometries contained in a classed list column.

bench_sf_read_sf_use_stream.R (requires GDAL >= 3.6)

Uses sf::read_sf() with argument use_stream = TRUE: “use the experimental columnar interface introduced in GDAL 3.6”.

Timings

Read nz-building-outlines.gpkg (1.5 GB) and populate a data frame.
Bench program                          Time (s)   Data frame class   Geom list column
-------------------------------------  --------   ----------------   ----------------
bench_gdalraster_fetch.R                  25.90   OGRFeatureSet      WKB raw vectors
bench_gdalraster_arrow_stream.R            5.90   base data.frame    WKB raw vectors
bench_gdalraster_fetch_conv_to_sf.R       56.98   sf                 classed sfc
bench_sf_read_sf.R                       176.56   sf (tibble)        classed sfc
bench_sf_read_sf_use_stream.R             25.30   sf (tibble)        classed sfc
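
To put the timings on a common scale, the short base R sketch below derives throughput (features per second) and elapsed time relative to the fastest program. The elapsed times and the feature count are copied from the reprex output in the Code section. The variable names are introduced here for illustration only.

```r
# Elapsed times in seconds, copied from the system.time() output
elapsed <- c(
    bench_gdalraster_fetch.R            = 25.901,
    bench_gdalraster_arrow_stream.R     = 5.895,
    bench_gdalraster_fetch_conv_to_sf.R = 56.979,
    bench_sf_read_sf.R                  = 176.556,
    bench_sf_read_sf_use_stream.R       = 25.295
)

# Feature count reported by lyr$getFeatureCount()
n_features <- 3289574

summary_df <- data.frame(
    program = names(elapsed),
    elapsed_s = unname(elapsed),
    features_per_s = round(n_features / unname(elapsed)),
    relative_to_fastest = round(unname(elapsed) / min(elapsed), 2)
)

# order from fastest to slowest
summary_df <- summary_df[order(summary_df$elapsed_s), ]
print(summary_df, row.names = FALSE)
```

On these numbers, the slowest program takes roughly 30 times as long as the fastest. The two programs differ in output class as well as I/O method, so the ratio is not a measure of I/O method alone.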

Code

bench_gdalraster_fetch.R

library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1

f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#>  Driver : GeoPackage (GPKG)
#>  DSN    : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#>  Layer  : nz_building_outlines
#>  CRS    : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#>  Geom   : MULTIPOLYGON

lyr$getFeatureCount()
#> [1] 3289574

system.time(d <- lyr$fetch(-1))
#>    user  system elapsed 
#>  25.006   0.881  25.901

(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE

head(d)
#> OGR feature set
#>   FID building_id name     use suburb_locality town_city territorial_authority
#> 1   1     2292051      Unknown          Marton    Marton   Rangitikei District
#> 2   2     2292353      Unknown      Durie Hill Whanganui    Whanganui District
#> 3   3     2292407      Unknown      Durie Hill Whanganui    Whanganui District
#> 4   4     2292675      Unknown        Feilding  Feilding     Manawatu District
#> 5   5     2292771      Unknown        Feilding  Feilding     Manawatu District
#> 6   6     2292825      Unknown        Feilding  Feilding     Manawatu District
#>       capture_method capture_source_group capture_source_id
#> 1 Feature Extraction    NZ Aerial Imagery              1042
#> 2 Feature Extraction    NZ Aerial Imagery              1042
#> 3 Feature Extraction    NZ Aerial Imagery              1042
#> 4 Feature Extraction    NZ Aerial Imagery              1042
#> 5 Feature Extraction    NZ Aerial Imagery              1042
#> 6 Feature Extraction    NZ Aerial Imagery              1042
#>                                       capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#>   capture_source_to last_modified                                  geom
#> 1        2016-04-21    2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 2        2016-04-21    2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 3        2016-04-21    2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 4        2016-04-21    2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 5        2016-04-21    2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 6        2016-04-21    2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...

lyr$close()

Created on 2025-05-25 with reprex v2.1.1

bench_gdalraster_arrow_stream.R

library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1

f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#>  Driver : GeoPackage (GPKG)
#>  DSN    : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#>  Layer  : nz_building_outlines
#>  CRS    : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#>  Geom   : MULTIPOLYGON

lyr$getFeatureCount()
#> [1] 3289574

lyr$testCapability()$FastGetArrowStream
#> [1] TRUE

options(nanoarrow.warn_unregistered_extension = FALSE)

(stream <- lyr$getArrowStream())
#> <nanoarrow_array_stream struct<fid: int64, building_id: int32, name: string, use: string, suburb_locality: string, town_city: string, territorial_authority: string, capture_method: string, capture_source_group: string, capture_source_id: int32, capture_source_name: string, capture_source_from: date32, capture_source_to: date32, last_modified: date32, geom: ogc.wkb{binary}>>
#>  $ get_schema:function ()  
#>  $ get_next  :function (schema = x$get_schema(), validate = TRUE)  
#>  $ release   :function ()

system.time(d <- as.data.frame(stream))
#>    user  system elapsed 
#>   7.388   1.390   5.895

stream$release()

(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE

head(d)
#>   fid building_id name     use suburb_locality town_city territorial_authority
#> 1   1     2292051      Unknown          Marton    Marton   Rangitikei District
#> 2   2     2292353      Unknown      Durie Hill Whanganui    Whanganui District
#> 3   3     2292407      Unknown      Durie Hill Whanganui    Whanganui District
#> 4   4     2292675      Unknown        Feilding  Feilding     Manawatu District
#> 5   5     2292771      Unknown        Feilding  Feilding     Manawatu District
#> 6   6     2292825      Unknown        Feilding  Feilding     Manawatu District
#>       capture_method capture_source_group capture_source_id
#> 1 Feature Extraction    NZ Aerial Imagery              1042
#> 2 Feature Extraction    NZ Aerial Imagery              1042
#> 3 Feature Extraction    NZ Aerial Imagery              1042
#> 4 Feature Extraction    NZ Aerial Imagery              1042
#> 5 Feature Extraction    NZ Aerial Imagery              1042
#> 6 Feature Extraction    NZ Aerial Imagery              1042
#>                                       capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#>   capture_source_to last_modified        geom
#> 1        2016-04-21    2019-01-04 blob[102 B]
#> 2        2016-04-21    2019-01-04 blob[102 B]
#> 3        2016-04-21    2019-01-04 blob[230 B]
#> 4        2016-04-21    2019-01-04 blob[102 B]
#> 5        2016-04-21    2019-01-04 blob[118 B]
#> 6        2016-04-21    2019-01-04 blob[102 B]

lyr$close()

Created on 2025-05-25 with reprex v2.1.1

bench_gdalraster_fetch_conv_to_sf.R

library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1

f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#>  Driver : GeoPackage (GPKG)
#>  DSN    : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#>  Layer  : nz_building_outlines
#>  CRS    : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#>  Geom   : MULTIPOLYGON

lyr$getFeatureCount()
#> [1] 3289574

system.time({
    d <- lyr$fetch(-1)
    d <- sf::st_sf(d, crs = lyr$getSpatialRef())
})
#>    user  system elapsed 
#>  54.795   1.938  56.979

(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE

head(d)
#> Simple feature collection with 6 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#>   FID building_id name     use suburb_locality town_city territorial_authority
#> 1   1     2292051      Unknown          Marton    Marton   Rangitikei District
#> 2   2     2292353      Unknown      Durie Hill Whanganui    Whanganui District
#> 3   3     2292407      Unknown      Durie Hill Whanganui    Whanganui District
#> 4   4     2292675      Unknown        Feilding  Feilding     Manawatu District
#> 5   5     2292771      Unknown        Feilding  Feilding     Manawatu District
#> 6   6     2292825      Unknown        Feilding  Feilding     Manawatu District
#>       capture_method capture_source_group capture_source_id
#> 1 Feature Extraction    NZ Aerial Imagery              1042
#> 2 Feature Extraction    NZ Aerial Imagery              1042
#> 3 Feature Extraction    NZ Aerial Imagery              1042
#> 4 Feature Extraction    NZ Aerial Imagery              1042
#> 5 Feature Extraction    NZ Aerial Imagery              1042
#> 6 Feature Extraction    NZ Aerial Imagery              1042
#>                                       capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016)          2015-12-27
#>   capture_source_to last_modified                           geom
#> 1        2016-04-21    2019-01-04 MULTIPOLYGON (((1796394 556...
#> 2        2016-04-21    2019-01-04 MULTIPOLYGON (((1776394 557...
#> 3        2016-04-21    2019-01-04 MULTIPOLYGON (((1776322 557...
#> 4        2016-04-21    2019-01-04 MULTIPOLYGON (((1818268 554...
#> 5        2016-04-21    2019-01-04 MULTIPOLYGON (((1818172 554...
#> 6        2016-04-21    2019-01-04 MULTIPOLYGON (((1818436 554...

lyr$close()

Created on 2025-05-25 with reprex v2.1.1

bench_sf_read_sf.R

library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.10.3, PROJ 9.4.1; sf_use_s2() is TRUE

f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
system.time(d <- read_sf(f, "nz_building_outlines"))
#>    user  system elapsed 
#> 156.162   7.710 176.556

nrow(d)
#> [1] 3289574

head(d)
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> # A tibble: 6 × 14
#>   building_id name  use     suburb_locality town_city territorial_authority
#>         <int> <chr> <chr>   <chr>           <chr>     <chr>                
#> 1     2292051 ""    Unknown Marton          Marton    Rangitikei District  
#> 2     2292353 ""    Unknown Durie Hill      Whanganui Whanganui District   
#> 3     2292407 ""    Unknown Durie Hill      Whanganui Whanganui District   
#> 4     2292675 ""    Unknown Feilding        Feilding  Manawatu District    
#> 5     2292771 ""    Unknown Feilding        Feilding  Manawatu District    
#> 6     2292825 ""    Unknown Feilding        Feilding  Manawatu District    
#> # ℹ 8 more variables: capture_method <chr>, capture_source_group <chr>,
#> #   capture_source_id <int>, capture_source_name <chr>,
#> #   capture_source_from <date>, capture_source_to <date>, last_modified <date>,
#> #   geom <MULTIPOLYGON [m]>

Created on 2025-05-25 with reprex v2.1.1

bench_sf_read_sf_use_stream.R

library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.10.3, PROJ 9.4.1; sf_use_s2() is TRUE

f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
system.time(d <- read_sf(f, "nz_building_outlines", use_stream = TRUE))
#>    user  system elapsed 
#>  25.749   2.556  25.295

nrow(d)
#> [1] 3289574

head(d)
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> # A tibble: 6 × 14
#>   building_id name  use     suburb_locality town_city territorial_authority
#>         <int> <chr> <chr>   <chr>           <chr>     <chr>                
#> 1     2292051 ""    Unknown Marton          Marton    Rangitikei District  
#> 2     2292353 ""    Unknown Durie Hill      Whanganui Whanganui District   
#> 3     2292407 ""    Unknown Durie Hill      Whanganui Whanganui District   
#> 4     2292675 ""    Unknown Feilding        Feilding  Manawatu District    
#> 5     2292771 ""    Unknown Feilding        Feilding  Manawatu District    
#> 6     2292825 ""    Unknown Feilding        Feilding  Manawatu District    
#> # ℹ 8 more variables: capture_method <chr>, capture_source_group <chr>,
#> #   capture_source_id <int>, capture_source_name <chr>,
#> #   capture_source_from <date>, capture_source_to <date>, last_modified <date>,
#> #   geom <MULTIPOLYGON [m]>

Created on 2025-05-25 with reprex v2.1.1