Loading....

Oracle GoldenGate not only provides us a replication solution that is Oracle version independent as well as platform independent, but we can also use it to do data transformation and data manipulation between the source and the target.

So we can use GoldenGate when the source and database database differ in table structure as well as an ETL tool in a Datawarehouse type environment.

We will discuss below two examples to demonstrate this feature – column mapping and filtering of data.

In example 1, we will filter the records that are extracted on the source and applied on the target – only rows where the JOB column value equals ‘MANAGER” in the MYEMP table will be considered for extraction.

In example 2, we will deal with a case where the table structure is different between the source database and the target database and see how column mapping is performed in such cases.

Example 1

Initial load of all rows which match the filter from source to target. The target database MYEMP table will only be populated with rows from the EMP table where filter criteria of JOB=’MANAGER’ is met.

On Source

GGSCI (redhat346.localdomain) 4> add extract myload1, sourceistable
EXTRACT added.

GGSCI (redhat346.localdomain) 5> edit params myload1

EXTRACT myload1
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST devu007, MGRPORT 7809
RMTTASK replicat, GROUP myload1
TABLE scott.myemp, FILTER (@STRFIND (job, “MANAGER”) > 0);

On Target

GGSCI (devu007) 2> add replicat myload1, specialrun
REPLICAT added.

GGSCI (devu007) 3> edit params myload1

“/u01/oracle/software/goldengate/dirprm/myload1.prm” [New file] REPLICAT myload1
USERID ggs_owner, PASSWORD ggs_owner
ASSUMETARGETDEFS
MAP scott.myemp, TARGET sh.myemp;

On Source – start the initial load extract

GGSCI (redhat346.localdomain) 6> start extract myload1

Sending START request to MANAGER …
EXTRACT MYLOAD1 starting

On SOURCE

SQL> select count(*) from myemp;

COUNT(*)
———-
14

SQL> select count(*) from myemp where job=’MANAGER’;

COUNT(*)
———-
9

On TARGET

SQL> select count(*) from myemp where job=’MANAGER’;

COUNT(*)
———-
9

Create an online change extract and replicat group using a Filter

GGSCI (redhat346.localdomain) 10> add extract myload2, tranlog, begin now
EXTRACT added.

GGSCI (redhat346.localdomain) 11> add rmttrail /u01/oracle/software/goldengate/dirdat/bb, extract myload2
RMTTRAIL added.

GGSCI (redhat346.localdomain) 11> edit params myload2

EXTRACT myload2
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST 10.53.200.225, MGRPORT 7809
RMTTRAIL /u01/oracle/software/goldengate/dirdat/bb
TABLE scott.myemp, FILTER (@STRFIND (job, “MANAGER”) > 0);

On Target

GGSCI (devu007) 2> add replicat myload2, exttrail /u01/oracle/software/goldengate/dirdat/bb
REPLICAT added.

GGSCI (devu007) 3> edit params myload2

“/u01/oracle/software/goldengate/dirprm/myload2.prm” [New file] REPLICAT myload2
ASSUMETARGETDEFS
USERID ggs_owner, PASSWORD ggs_owner
MAP scott.myemp, TARGET sh.myemp;

On Source – start the online extract group

GGSCI (redhat346.localdomain) 13> start extract myload2

Sending START request to MANAGER …
EXTRACT MYLOAD2 starting

GGSCI (redhat346.localdomain) 14> info extract myload2

EXTRACT MYLOAD2 Last Started 2010-02-23 11:04 Status RUNNING
Checkpoint Lag 00:27:39 (updated 00:00:08 ago)
Log Read Checkpoint Oracle Redo Logs
2010-02-23 10:36:51 Seqno 214, RBA 103988

On Target

GGSCI (devu007) 4> start replicat myload2

Sending START request to MANAGER …
REPLICAT MYLOAD2 starting

GGSCI (devu007) 5> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:05 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:08 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
First Record RBA 989

On Source we now insert two rows into the MYEMP table – one which has the JOB value of ‘MANAGER’ and the other row which has the job value of ‘SALESMAN’


On SOURCE

SQL> INSERT INTO MYEMP
2 (empno,ename,job,sal)
3 VALUES
4 (1234,’GAVIN’,’MANAGER‘,10000);

1 row created.

SQL> commit;

Commit complete.

SQL> INSERT INTO MYEMP
2 (empno,ename,job,sal)
3 VALUES
4 (1235,’BOB’,’SALESMAN‘,1000);

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from myemp;
COUNT(*)
———-
16

SQL> select count(*) from myemp where job=’MANAGER’;

COUNT(*)
———-
10

On Target, we will see that even though two rows have been inserted into the source MYEMP table, on the target MYEMP table only one row is inserted because the filter has been applied which only includes the rows where the JOB value equals ‘MANAGER’.

SQL> select count(*) from myemp;

COUNT(*)
———-
10

Example 2 – source and target table differ in column structure

In the source MYEMP table we have a column named SAL whereas on the target, the same MYEMP table has the column defined as SALARY.

Create a definitions file on the source using DEFGEN utility and then copy that definitions file to the target system

GGSCI (redhat346.localdomain) > EDIT PARAMS defgen

DEFSFILE /u01/oracle/ggs/dirsql/myemp.sql
USERID ggs_owner, PASSWORD ggs_owner
TABLE scott.myemp;

[oracle@redhat346 ggs]$ ./defgen paramfile /u01/oracle/ggs/dirprm/defgen.prm

***********************************************************************
Oracle GoldenGate Table Definition Generator for Oracle
Version 10.4.0.19 Build 002
Linux, x64, 64bit (optimized), Oracle 11 on Sep 18 2009 00:09:13

Copyright (C) 1995, 2009, Oracle and/or its affiliates. All rights reserved.

Starting at 2010-02-23 11:22:17
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Wed Dec 17 11:41:38 EST 2008, Release 2.6.18-128.el5
Node: redhat346.localdomain
Machine: x86_64
soft limit hard limit
Address Space Size : unlimited unlimited
Heap Size : unlimited unlimited
File Size : unlimited unlimited
CPU Time : unlimited unlimited

Process id: 14175

***********************************************************************
** Running with the following parameters **
***********************************************************************
DEFSFILE /u01/oracle/ggs/dirsql/myemp.sql
USERID ggs_owner, PASSWORD *********
TABLE scott.myemp;
Retrieving definition for SCOTT.MYEMP

Definitions generated for 1 tables in /u01/oracle/ggs/dirsql/myemp.sql

If we were to try and run the replicat process on the target without copying the definitions file, we will see an error as shown below which pertains to the fact that the columns in the source and target database are different and GoldenGate is not able to resolve that.

2010-02-23 11:31:07 GGS WARNING 218 Aborted grouped transaction on ‘SH.MYEMP’, Database error 904 (ORA-00904: “SAL”: invalid identifier).

2010-02-23 11:31:07 GGS WARNING 218 SQL error 904 mapping SCOTT.MYEMP to SH.MYEMP OCI Error ORA-00904: “SAL”: invalid identifier (status = 904), SQL .

We then ftp the definitions file from the source to the target system – in this case to the dirsql directory located in the top level GoldenGate installed software directory

We now go and make a change to the original replicat parameter file and change the parameter ASSUMEDEFS to SOURCEDEFS which provides GoldenGate with the location of the definitions file.

The other parameter which is included is the COLMAP parameter which tells us how the column mapping has been performed. The ‘USEDEFAULTS’ keyword denotes that all the other columns in both tables are identical except for the columns SAL and SALARY which differ in both tables and now we are mapping the SAL columsn in source to the SALARY column on the target.

REPLICAT myload2
SOURCEDEFS /u01/oracle/software/goldengate/dirsql/myemp.sql
USERID ggs_owner, PASSWORD ggs_owner
MAP scott.myemp, TARGET sh.myemp,
COLMAP (usedefaults,
salary = sal);

We now go and start the originall replicat process myload2 which had abended because of the column mismatch (which has now been corrected via the parameter change) and we see that the process now is running without any error.

now go and start the process which had failed after table modification

GGSCI (devu007) 2> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:05 Status ABENDED
Checkpoint Lag 00:00:03 (updated 00:11:44 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
2010-02-23 11:31:03.999504 RBA 1225

GGSCI (devu007) 3> start replicat myload2

Sending START request to MANAGER …
REPLICAT MYLOAD2 starting

GGSCI (devu007) 4> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:43 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:03 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
2010-02-23 11:31:03.999504 RBA 1461

Coming Next! – Monitoring the GoldenGate environment …..

Please follow and like us:

Last Update: May 4, 2017  

February 25, 2010 117 Gavin Soorma
Total 0 Votes:
0

Tell us how can we improve this post?

+ = Verify Human or Spambot ?

Add A Knowledge Base Question !

You will receive an email when your question will be answered.

+ = Verify Human or Spambot ?

, , , ,

9 thoughts on “Oracle Goldengate Tutorial 8 – Filtering and Mapping data

  1. shubha

    Hi

    Ur documents are very good for basic configuration.

    We have one query – can we replicate data if table having differnt number of columns on source & target database. suppose source database have 1 table with 10 columns,target database have 5 columns on that table.

    Waiting for your reply

    Thanks in advance

  2. Gavin Soorma

    Hi Shubha – yes it is certainly possible – we can change table structure on the target – column names are different – data types are different – columns are fewer etc etc – read page 247 of the GoldenGate Administration Guide – it has reference to COLS and COLSEXCEPT keywords which basically enables us to specify the columns to include or exclude from the replication process – try it out and let me know how it goes as I have not tried that feature – thanks

  3. Pingback: GoldenGate – using FILTER, COMPUTE and SQLEXEC commands « Oracle DBA – Tips and Techniques

  4. Vinay Kumar

    Hi Gavin,

    Is it possible to map tables with different structure.
    Like in Source table we have 3 columns but in target table we want only 2 columns.
    How do we achieve that?
    Thanks in advance.

    Regards,
    Vinay Kumar G

  5. victor

    Hi Gavin,

    we are building a real-time ODS sourcing several Oracle OLTP systems, and want to run PL/SQL package at target to do an ETL-like transform at the end of GG replication, all synchronous, using GG as the invoker of the PL/SQL package. could you be kind enough to give an example of how one would source let’s say INVOICE headers, lines, and CUSTOMERS using GG, and then transform related records, namely invoice header, associated lines, and associated customer with a PL/SQL call, combining new CDC data for the sub-set that came from GG with already resident data on the target.. basic ETL pattern using both source and target data, all in GG.. Is this doable?

    thanks so much in advance!

  6. Pradeep Gupta

    Good work Gavin, I am having 3 years of good exp in GoldenGate. I went thru some of your documents and found very clear and nicely written.

    Many thanks for all your blogs, I will surely recommend your site for beginners.

    Cheers

  7. Parry

    Hi Gavin,

    Thanks for your tutorial.

    Question.. how would we handle records to be loaded in a target table only when they are new inserts? I don’t need to load any updates to a table, as I need to preserve the initial inserted value.

    Thanks,
    Parry

Leave a Comment

Your email address will not be published. Required fields are marked *

*
*

Social media & sharing icons powered by UltimatelySocial
Back To Top

Add A Knowledge Base Question !

You will receive an email when your question will be answered.

+ = Verify Human or Spambot ?