The merge feature enables you to merge two or more tables into one.
A typical use case for this feature is to blend information from similar-but-different source systems. For example, an insurance company might have different legacy systems for each type of cover that it offers. While the concept of “policy” may be similar, the structure and content of the data in each system may be different. The merge operation helps you to combine that data into one consolidated policy table.
In this tutorial, we will walk through the creation of a merge table from a data source.
Create Test Data
Using a text editor, create two CSV files with the following data:
merge1.csv: PolicyNumber,PolicyName,SumInsured 12345,Ron Dunn,10000 23456,Ian Nicholson,20000 34567,Rob Dumont,30000 45678,Ngo Phi Nga,15000 merge2.csv: Policy,Insured,PolicyHolder ABCDE,12000,Dan Drum FGHIJ,8000,Au How Leng KLMNO,23000,Russell Park-Miller PQRST,12345,Erik Baardse
Save the files in one of the following locations:
Windows: c:\ajilius2\samples\merge1.csv c:\ajilius2\samples\merge2.csv Linux / OSX: /opt/ajilius2/samples/merge1.csv /opt/ajilius2/samples/merge2.csv
In this step we will load the merge1.csv file into the data warehouse.
Using the CHINOOK sample data warehouse, drill into the CSV Files data source:
Load the metadata for the merge1 table:
Run the scripts to create and load the load_merge1 table:
View the load_merge1 table to confirm that the data has been loaded:
In this step we will load file merge2.csv into the data warehouse.
Repeat the steps that you took to load merge1.csv in the previous section, but this time load and verify merge2.csv instead of merge1.
When you view load_merge2 it should look like this:
Transform load_merge1 to stage_merge1
We now create a staging table from the load_merge1 load table. This will form the target for the later merge operation.
From the LOAD TABLES list, select CREATE STAGE on the context menu for the load_merge1 table that you created in the previous section:
Create the stage table:
Run the scripts to create and populate the stage table:
View the table data to confirm that it has been loaded:
Transform load_merge2 to stage_merge1
Now we are going to define the merge relationship between our second load table (load_merge2) and the original staging table (stage_merge1). This is different from the standard transform, in that we won’t be creating a new table as a result, but using an existing table instead.
From the LOAD TABLES list, select CREATE MERGE from the context menu for the load_merge2 table:
Create the merge definition by updating the table name (stage_merge2) and description, and selecting stage_merge1 as is the merge target:
Note the difference in the heading for the right-hand list of columns. For merge tables, we show the columns of the target table (stage_merge1):
Now, select the COLUMNS option from the context menu for the stage_merge2 table that you just created:
The COLUMN MAP screen shows all of the columns from the target table (stage_merge1), with a drop down list of the columns from the source table (load_merge2). Update the column mapping:
Run the scripts to populate the merge. Note that there is no CREATE script to run, because we are targeting a table that has already been created for stage_merge1.
Finally, view the data for stage_merge1, to verify that the table now contains the combined contents of both load tables:
Merge transforms help you to bring together different tables into a consolidated view of data. Their most common use is to combine similar data from different computer systems.
In this tutorial we demonstrated the creation of a merge from LOAD tables. Merges may also be created from STAGE tables, enabling you to clean and transform data before the merge.
The steps demonstrated in this tutorial are:
- Load two different sets of data
- Create a target table from one set of data
- Merge a second set of data into the target table
If you have any questions, please contact us for support.