Google Cloud Platform – GCP BigQuery for dimensional star schema data warehouse build performance
Google states that BigQuery is intended for data warehouses that are append-heavy by and large, with relatively few updates.
For a star-schema-based DWH with optional fact table attributes that may be updated, and dimensions that are historized, is this viable, or do we need the Redshift approach of generating small staging tables with the new or updated data and folding them in via an UPSERT (MERGE) query?
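BigQuery does support MERGE DML natively, so I assume the staging-table upsert would translate roughly like this. A minimal sketch via the Python client, where the dataset, table, and column names (my_dataset, CUSTOMERS_AT_REST, CUST_DELTA, col_key, col_a, col_b) are placeholders, not real objects:

from google.cloud import bigquery

client = bigquery.Client()

# Upsert the staged delta rows into the target table.
# BigQuery MERGE does not allow UPDATE SET *, so updated columns
# must be listed explicitly; INSERT ROW copies all columns by name.
client.query("""
    MERGE `my_dataset.CUSTOMERS_AT_REST` T
    USING `my_dataset.CUST_DELTA` S
    ON T.col_key = S.col_key
    WHEN MATCHED THEN
      UPDATE SET col_a = S.col_a, col_b = S.col_b
    WHEN NOT MATCHED THEN
      INSERT ROW
""").result()  # block until the DML job completes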
Is this type of approach possible in BigQuery using Spark?
spark.sql(""" MERGE INTO CUSTOMERS_AT_REST
USING CUST_DELTA
ON CUSTOMERS_AT_REST.col_key = CUST_DELTA.col_key
WHEN MATCHED THEN
UPDATE SET *
WHEN NOT MATCHED THEN
INSERT *
""")
This works fine against Delta Lake tables on GCP Cloud Storage.
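For BigQuery itself, my understanding (an assumption, not a confirmed pattern) is that the spark-bigquery connector only reads and writes tables and cannot issue a MERGE, so the Spark-side equivalent would be to land the delta in a BigQuery staging table via the connector and then run the MERGE through the BigQuery client. A sketch, assuming an active SparkSession with the connector on the classpath; CUST_DELTA, my_dataset, and my-gcs-bucket are placeholders:

from google.cloud import bigquery

cust_delta_df = spark.table("CUST_DELTA")

# 1. Land the new/changed rows in a BigQuery staging table.
(cust_delta_df.write
    .format("bigquery")
    .option("table", "my_dataset.CUST_DELTA")
    .option("temporaryGcsBucket", "my-gcs-bucket")
    .mode("overwrite")
    .save())

# 2. Fold the staging table into the target with the same MERGE
#    DML as in the earlier sketch.
bigquery.Client().query("""
    MERGE `my_dataset.CUSTOMERS_AT_REST` T
    USING `my_dataset.CUST_DELTA` S
    ON T.col_key = S.col_key
    WHEN MATCHED THEN UPDATE SET col_a = S.col_a, col_b = S.col_b
    WHEN NOT MATCHED THEN INSERT ROW
""").result()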