How to Delete Rows That do not Exist in Another Table
Often at work, we will pull data from an upstream system into a delta table, delete the matching rows in the main table and insert the updated rows. This works great and is fast but has a major assumption : That the rows from the upstream system will NEVER disappear.
As part of a process I am working on, I needed to be able to the delta table run but then remove any rows that disappeared from the upstream system. Below are a few ways you can do this. I suggest you use the first as it is the most efficient of the three assuming you have a index on your key columns.
Using LEFT JOIN/IS NULL:
DELETE FROM BLOB b LEFT JOIN FILES f ON f.id = b.fileid WHERE f.id IS NULL
Using NOT EXISTS:
DELETE FROM BLOB WHERE NOT EXISTS(SELECT NULL FROM FILES f WHERE f.id = fileid)
Using NOT IN:
DELETE FROM BLOB WHERE fileid NOT IN (SELECT f.id FROM FILES f)