Here we will load the HUBS and LINKS using simple Oracle PL/SQL. Loading HUBS is pretty straight forward. Following is a flow diagram. Using PL/SQL, I will use the Oracle MERGE command to perform a set-based UPSERT. This technique scales well.
Here is a code snippet in (PL/SQL) , note that we only have one STG source for this HUB. If there were more, I simply UNION all the sources together in the SELECT.
MERGE INTO HUB_GLOBAL_NAME hub
USING
(
SELECT DISTINCT
GLOBAL_NAME AS GLOBAL_NAME,
STG_DATA_SOURCE, STG_LAST_UPDATE_DTTM
FROM STG_ALL_TABLES
) stg
ON (hub.GLOBAL_NAME = stg.GLOBAL_NAME)
WHEN MATCHED THEN
UPDATE
SET
hub.HUB_LAST_SEEN_DTTM = hub.HUB_LOAD_DTTM
where (HUB_LAST_SEEN_DTTM != HUB_LOAD_DTTM)
WHEN NOT MATCHED THEN
INSERT
(
HUB_GLOBAL_NAME_SQN, GLOBAL_NAME, HUB_LOAD_DTTM,
HUB_DATA_SOURCE,HUB_LAST_SEEN_DTTM
)
VALUES (
HUB_GLOBAL_NAME_SQN.NEXTVAL,
stg.GLOBAL_NAME,
stg.STG_LAST_UPDATE_DTTM,
stg.STG_DATA_SOURCE,
stg.STG_LAST_UPDATE_DTTM
)
;
I will comment on DATEs used later, suffice to say the LOAD_DTTM comes from the STAGE table and the LAST_SEEN_DTTM is an indicator that show the date of the last point in time the Business Key(s) were not found in ANY feed. Loading LINKS is very similar to HUBs; the source for the mapping is a list of distinct combinations of keys. Again, the Oracle MERGE can do the trick.
MERGE INTO LNK_ALL_TABLES lnk
USING
(
SELECT /*+ RULE */ DISTINCT
HUB_GLOBAL_NAME.HUB_GLOBAL_NAME_SQN,
HUB_OWNER.HUB_OWNER_SQN,
HUB_TABLE_NAME.HUB_TABLE_NAME_SQN, STG_ALL_TABLES.STG_LAST_UPDATE_DTTM, STG_ALL_TABLES.STG_DATA_SOURCE
FROM STG_ALL_TABLES,
HUB_GLOBAL_NAME,
HUB_OWNER,
HUB_TABLE_NAME
WHERE
STG_ALL_TABLES.GLOBAL_NAME = HUB_GLOBAL_NAME.GLOBAL_NAME
AND HUB_GLOBAL_NAME.HUB_LOAD_DTTM = HUB_GLOBAL_NAME.HUB_LAST_SEEN_DTTM
AND STG_ALL_TABLES.OWNER = HUB_OWNER.OWNER
AND HUB_OWNER.HUB_LOAD_DTTM = HUB_OWNER.HUB_LAST_SEEN_DTTM
AND STG_ALL_TABLES.TABLE_NAME = HUB_TABLE_NAME.TABLE_NAME
AND HUB_TABLE_NAME.HUB_LOAD_DTTM = HUB_TABLE_NAME.HUB_LAST_SEEN_DTTM
) stg
ON (
stg.HUB_GLOBAL_NAME_SQN = lnk.HUB_GLOBAL_NAME_SQN
and stg.HUB_OWNER_SQN = lnk.HUB_OWNER_SQN
and stg.HUB_TABLE_NAME_SQN = lnk.HUB_TABLE_NAME_SQN
)
WHEN MATCHED THEN
UPDATE
SET
lnk.LNK_LAST_SEEN_DTTM = lnk.LNK_LOAD_DTTM
WHERE (LNK_LAST_SEEN_DTTM != LNK_LOAD_DTTM)
WHEN NOT MATCHED THEN
INSERT
(
LNK_ALL_TABLES_SQN,HUB_GLOBAL_NAME_SQN,HUB_OWNER_SQN,HUB_TABLE_NAME_SQN,LNK_LAST_SEEN_DTTM,LNK_LOAD_DTTM,LNK_DATA_SOURCE)
VALUES (
LNK_ALL_TABLES_SQN.NEXTVAL,
stg.HUB_GLOBAL_NAME_SQN,
stg.HUB_OWNER_SQN,
stg.HUB_TABLE_NAME_SQN,
stg.STG_LAST_UPDATE_DTTM, stg.STG_LAST_UPDATE_DTTM, stg.STG_DATA_SOURCE
);
This approach scales very nicely too. Next time I will load the SATELLITE srtucture.


