Folks, Merry Christmas and Happy New Year! 🎅 🎄 We at
@wherobots are thrilled to announce the Havasu spatial table format, an
@ApacheIceberg based innovation integrated into SedonaDB. For more insights, visit our blog:
wherobots.com/havasu-a-table…
As Havasu garners interest among Wherobots Cloud users, we've prepared FAQs to clarify its features and uses.
Q1: What is Havasu format?
Havasu is a spatial data lake format powered by Iceberg. It comprises a specification and an implementation.
The open-source specification outlines storage methods for spatial data within Iceberg's framework (Apache-2.0 license):
github.com/wherobots/havasu.
SedonaDB, a proprietary database engine based on @ApacheSedona (sedona.apache.org/), implements Havasu and Iceberg.
Q2: Difference between Havasu and Apache Iceberg? Are you reinventing the wheel?
Definitely not! Havasu is simply an extension of the Apache Iceberg format. It adds geometry and raster data types, spatial column metadata, and a specific encoding for spatial data in Parquet files, without altering Iceberg's table file structure.
Q3: Havasu vs. the @GeoParquet file format?
Havasu uses the GeoParquet standard for geometry in Parquet files, ensuring compatibility with Parquet / GeoParquet readers. Its raster data is stored in a Parquet-native array<struct> format that supports both integer and floating-point values.
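For readers curious what "GeoParquet geometry" looks like on disk: GeoParquet stores geometries as Well-Known Binary (WKB) values in a Parquet column. Below is a minimal sketch of the WKB layout for a single 2D point, using only the Python standard library. The function names are hypothetical and for illustration only; this is not Havasu or SedonaDB code.

```python
import struct

# WKB for a 2D point: 1 byte for byte order (1 = little-endian),
# a 4-byte unsigned geometry type (1 = Point), then x and y as 8-byte doubles.
_POINT_LAYOUT = "<BIdd"

def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    return struct.pack(_POINT_LAYOUT, 1, 1, x, y)

def wkb_to_point(wkb: bytes) -> tuple:
    """Decode a little-endian WKB point back into (x, y)."""
    byte_order, geom_type, x, y = struct.unpack(_POINT_LAYOUT, wkb)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("not a little-endian WKB point")
    return (x, y)
```

Any reader that understands WKB can decode such a value, which is what makes the geometry column portable across GeoParquet-aware tools.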
Q4: Benefits of Havasu format?
Spatial DBMS Experience: Offers ACID-compliant spatial transactions, supporting concurrent insert/delete/query/update operations from multiple applications on the same Havasu table, with SedonaDB enforcing spatial integrity at ingestion time.
Performance: Reduces the need for parsing and transforming spatial data at the application level, thanks to Havasu's efficient data handling.
Spatial Filtering and Query Optimization: Utilizes spatial statistics for Parquet files to prune irrelevant data files and optimize spatial join queries.
Inherits all Apache Iceberg features.
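To illustrate the file-pruning idea in general terms, here is a minimal sketch that assumes each data file carries a bounding box as its spatial statistic. The thread does not describe how Havasu actually lays out its statistics, so the `BBox` type, the `prune_files` function, and the file names are all hypothetical.

```python
from typing import Dict, List, NamedTuple

class BBox(NamedTuple):
    """Axis-aligned bounding box (hypothetical per-file spatial statistic)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

def intersects(a: BBox, b: BBox) -> bool:
    """True when the two boxes overlap or touch on both axes."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)

def prune_files(file_stats: Dict[str, BBox], query: BBox) -> List[str]:
    """Keep only the files whose bounding box can contain matching rows."""
    return [path for path, bbox in file_stats.items() if intersects(bbox, query)]

# Hypothetical statistics for three data files.
stats = {
    "part-0.parquet": BBox(0, 0, 10, 10),
    "part-1.parquet": BBox(20, 20, 30, 30),
    "part-2.parquet": BBox(5, 5, 25, 25),
}
survivors = prune_files(stats, BBox(8, 8, 12, 12))
```

Files whose box does not intersect the query window are never opened, so a selective spatial filter reads far less data.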
Q5: Is Havasu open-source? I don't want to run into vendor lock-in.
Yes, Havasu's specification is open-source, building on Apache Iceberg to avoid vendor lock-in. Data is stored on AWS S3 in Parquet format, readable by any Parquet reader.
SedonaDB ensures Iceberg compatibility for all non-spatial applications. Plans to integrate some Havasu features into Apache Iceberg itself are underway.
#geospatial #bigdata #cloudarchitecture #parquet