Google states that BigQuery is intended for data warehouses that are largely append-only, with relatively few updates.
For a star-schema DWH with optional fact-table attributes that may be updated, and with historized dimensions, is BigQuery a viable choice? Or do we need the Redshift-style approach: load the new or changed rows into small staging tables, then apply them to the target with an UPSERT (MERGE) query?
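For what it's worth, BigQuery does support a MERGE DML statement natively, so the staging-table pattern can be expressed directly in BigQuery SQL. A minimal sketch (table and column names here are illustrative, not from the question; note that BigQuery requires the updated columns to be listed explicitly, or `INSERT ROW` for a full-row insert, rather than Delta's `UPDATE SET *` / `INSERT *` shorthand):

```sql
-- Illustrative upsert from a staging (delta) table into the target fact/dimension table.
-- Table and column names are assumptions for the sketch.
MERGE mydataset.customers_at_rest AS T
USING mydataset.cust_delta AS S
ON T.col_key = S.col_key
WHEN MATCHED THEN
  UPDATE SET col_status = S.col_status, col_updated_at = S.col_updated_at
WHEN NOT MATCHED THEN
  INSERT ROW;
```

Whether this performs acceptably still depends on how frequently the deltas arrive, since BigQuery DML is optimized for batch rather than row-by-row changes.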
Is this type of approach possible in BigQuery using Spark?
spark.sql("""
  MERGE INTO CUSTOMERS_AT_REST
  USING CUST_DELTA
  ON CUSTOMERS_AT_REST.col_key = CUST_DELTA.col_key
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *
""")
This works fine with Delta Lake on GCP Cloud Storage.