Professional geospatial services


Crowdfunding

Spatialys proposes several projects to improve the open source geospatial software in which we have recognized expertise (GDAL, PROJ, etc.). Your financial contribution is needed to carry them out: these projects are large enough that it is hard to find a single sponsor to fund them.

Completed past projects

PROJ: use of remote grids

1. Stakes

PROJ 6 brings undeniable advances in the management of coordinate transformations between datums, by relying on information available in the PROJ database. To get accurate results, most of the time a grid file is needed. The proj-datumgrid project centralizes all the grids that are available under an open data license, and bundles them in different archives split along big geographical regions of the world. That approach assumes that the user of PROJ has downloaded and installed those optional, but strongly recommended, packages; otherwise they will get inaccurate results. However, there are use cases, like serverless compute solutions, where it is impossible to bundle all of them, given their size (currently more than 700 MB of uncompressed data, and growing) and the restrictions set by the cloud provider.

2. Summary of the improvements brought by this project

The goal of this project is to provide additional capabilities to PROJ:

Grids that are available locally will of course still be used, and this will remain the default behaviour.
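
The lookup order implied here (local files first, remote access only as an opt-in fallback) can be sketched in a few lines of Python. This is only an illustration of the policy, not PROJ code: the function name `locate_grid`, the `network_enabled` flag and the `CDN_BASE_URL` value are hypothetical, and the sketch assumes that remote access is switched on explicitly, as described in the detailed tasks.

```python
import os
from typing import Iterable, Tuple

# Hypothetical CDN root, for illustration only.
CDN_BASE_URL = "https://cdn.example.invalid/"


def locate_grid(name: str, search_dirs: Iterable[str],
                network_enabled: bool = False) -> Tuple[str, str]:
    """Return ("local", path) or ("remote", url) for a grid file name."""
    # Local files always win: the network is not consulted when one exists.
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return ("local", candidate)
    # Remote access is opt-in and disabled by default.
    if network_enabled:
        return ("remote", CDN_BASE_URL + name)
    raise FileNotFoundError(
        f"grid {name!r} not found locally and network access is disabled"
    )
```

With the flag left at its default, a missing grid is reported as an error rather than silently fetched, which keeps the existing behaviour unchanged for users who never enable the network.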

3. Detailed tasks

We have structured this work around three work packages: a core package and two optional, but strongly desirable, additions.
  • Work Package 1 (core)
    • curl will be an optional build dependency in autoconf/cmake (if curl available, used by default)
    • Network access abstracted through an interface (with C callbacks through C API) attached to the PROJ context, and curl used as the default implementation when available.
    • Download of grids will not be enabled by default: the user will have to set an environment variable or set an attribute on the PROJ context.
    • When enabled, all grids known to PROJ in the database (that is, in the grid_alternatives table) will be assumed to be available through the CDN, and thus, for the sorting and filtering logic in createOperations(), will be treated as if they were local files.
    • Deep refactoring of the PROJ code dealing with grid use, to avoid ingesting everything into memory as is done currently. This should also benefit local access to big grids.
    • Network access will only be attempted if the file doesn't exist locally.
    • The network layer will use an in-memory cache of chunks, like GDAL /vsicurl/, both to limit the number of small GET requests and to provide a caching effect.
    • In download mode, download failures will be propagated as PROJ errors for coordinate computations.
    • The currently supported grid formats (CTable2, NTv1, NTv2 .gsb, GTX), which use a line-based approach, will still be used.
    • Upload of the content of the existing proj-datumgrid packages to the CDN storage.
  • Work Package 2 (local disk cache)
    • A SQLite3 database, located in a user-writable directory, will hold chunks of partially downloaded grid files.
    • Access to it will be thread-safe and multiprocess-safe.
  • Work Package 3 (tiled GeoTIFF files for grids)
    • Tiles are better suited for piece-wise download than scanline-oriented formats (although it is a bit difficult to anticipate how much benefit this will bring in practice).
    • GeoTIFF is a well-known format with more tools available to handle it. A number of websites already publish grids in that format. TIFF has a built-in capability to add metadata, whereas dedicated grid formats often have few and limited provisions for it.
    • As we cannot use libgeotiff, since libgeotiff uses PROJ, a minimal parsing of GeoTIFF information (extraction of the upper left pixel coordinate and resolution from the GeoKeys) will be done in the PROJ codebase.
    • libtiff will be added as an optional dependency, but it will be required to use GeoTIFF grids and to use the download capability. Grids distributed on the CDN will only be in GeoTIFF format, while grids distributed as proj-datumgrid will be available with two options: with .gsb/.gtx files or with GeoTIFF files (this might be subject to adjustment after discussion with the larger PROJ community).
    • The NTv2 format has an extra capability when compared to other formats, which is the possibility to have sub-grids. This is for example used by the original Canadian NTv2 grid, ntv2_0.gsb. However, the way such subgrids are organized in the NTv2 format is not cloud optimized (the descriptor of each subgrid is immediately before the data of that subgrid, so the descriptors are spread all over the file, and opening that NTv2 file may thus require a lot of seeks). In a TIFF storage, we could implement this subgrid concept with the TIFF IFD concept, and use a technique similar to Cloud Optimized GeoTIFF (COG) files, where the descriptors for all subgrids would be put at the beginning of the file, so they can be fetched in a single GET request. So that subtask includes improvements in the GDAL GTiff driver to be able to write such an optimized TIFF file from a source dataset with subdatasets, and making PROJ able to use it.
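
To illustrate the in-memory chunk cache mentioned in Work Package 1, here is a minimal Python sketch of the general technique: reads are served from fixed-size chunks, and each chunk is fetched at most once while it stays in the cache. It is not the PROJ or GDAL implementation. The `ChunkCache` class, the `fetch_range(url, offset, size)` callback (standing in for an HTTP range GET) and the 16 KB chunk size are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable

CHUNK_SIZE = 16 * 1024  # assumed chunk size in bytes; not specified in the text


class ChunkCache:
    """LRU cache of fixed-size chunks, filled through a range-fetch callback."""

    def __init__(self, fetch_range: Callable[[str, int, int], bytes],
                 max_chunks: int = 64) -> None:
        # fetch_range(url, offset, size) is a hypothetical stand-in for an
        # HTTP range GET; it returns the requested bytes (fewer at end of file).
        self._fetch_range = fetch_range
        self._max_chunks = max_chunks
        self._chunks = OrderedDict()  # (url, chunk index) -> bytes

    def _get_chunk(self, url: str, index: int) -> bytes:
        key = (url, index)
        if key in self._chunks:
            self._chunks.move_to_end(key)  # mark as most recently used
            return self._chunks[key]
        data = self._fetch_range(url, index * CHUNK_SIZE, CHUNK_SIZE)
        self._chunks[key] = data
        if len(self._chunks) > self._max_chunks:
            self._chunks.popitem(last=False)  # evict the least recently used
        return data

    def read(self, url: str, offset: int, length: int) -> bytes:
        """Return `length` bytes starting at `offset`, served chunk by chunk."""
        if length <= 0:
            return b""
        first = offset // CHUNK_SIZE
        last = (offset + length - 1) // CHUNK_SIZE
        buf = b"".join(self._get_chunk(url, i) for i in range(first, last + 1))
        start = offset - first * CHUNK_SIZE
        return buf[start:start + length]
```

Two small reads that fall in the same chunk trigger a single call to `fetch_range`, which is the effect sought when limiting the number of small GET requests. A disk cache such as the one in Work Package 2 could sit behind the same callback, although that is a design assumption rather than something stated above.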

4. Costs and state of funding

  • WP 1 (core). Funding target: 8000 euros. Reached
  • WP 2 (local disk cache). Funding target: 2000 euros. Reached
  • WP 3 (GeoTIFF grids). Funding target: 4800 euros. Reached

We would like to thank the sponsors of this crowdfunding:

GDAL Coordinate System Barn Raising

The gdalbarn.com website is dedicated to this successful campaign, which led to the release of PROJ 6.0, GDAL 3.0 and libgeotiff 1.5.