Our BI environment uses Oracle GoldenGate, which replicates tables from the source environment on Solaris servers to it.
The BI environment is the same as the source, since we use a downstream configuration for GoldenGate. The architecture uses SQL/PLSQL scripts to process data from the replicated tables.
We have another environment which uses Big Data. It uses StreamSets/Talend to extract data from the source and process it, instead of using SQL/PLSQL scripts for the transformation as in the architecture above.
I don't know anything about StreamSets/Talend. Can they perform CDC?
GoldenGate is reliable and it can do CDC, as you know.
It is the best solution for doing CDC in Oracle environments.
PLSQL transformation is cheap, but it should be managed well. It should be under the control of some sort of application. Otherwise, you will do lots of maintenance.
The 2 diagrams that you sent are telling me different stories.
In one of them, you use Hadoop and in the other you use Oracle RDBMS and RAC.
I don't know your business story and purpose for building this kind of environment, but you should decide on the correct environment according to your needs.
Using Oracle and Hadoop together is an alternative that we consider when dealing with enterprise data warehouses. We generally use Hadoop for offloading some of the processing, and we still use Exadata (Oracle RAC) in the front line.
Orchestration and integration between the platforms are also important.
So, as for CDC, there are other solutions available on the market.
One of them is Striim. We (as GTech) are the distributor of it. So maybe you can take a look at it as well.
Of course, if you want to have a POC or consultancy, we are happy to help.
We have already implemented this solution in lots of enterprise companies.
After GoldenGate extracts data from the source and replicates it to our BI environment, we use SQL/PLSQL scripts scheduled in crontab to process data from the replicated tables and load it into a temporary table. The issue is that as the data (tables) grows, the processing time will increase and more resources will be needed.
I would like to connect Hadoop with StreamSets to our BI platform so that we can avoid using SQL/PLSQL scripts.
StreamSets seems to be something like Big Data SQL. We have the Big Data SQL option, you know that, right?
Oracle Big Data SQL and Gluent are alternatives for this approach.
The idea is correct. You offload some of the processing needs to cheap Big Data platforms (like Hadoop).
As for PLSQL-related transformation, we generally use ODI.
Lastly, crontab-based scheduling is very primitive; you should replace that first. Actions should be event-driven: one should start after another finishes, or they should run in parallel when necessary. You can even write your own code to manage that transformation, code for orchestrating it.
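To make the "one should start after another, or in parallel" idea concrete, here is a minimal sketch in Python of what such home-grown orchestration code could look like. It is only an illustration under my own assumptions: the function name run_pipeline and the step names (stage_orders, stage_customers, build_report) are hypothetical, and in a real setup each step would call one of your SQL/PLSQL procedures instead of appending to a list.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_pipeline(steps, deps, max_workers=4):
    """Run each step as soon as all of its dependencies have finished.

    steps: dict mapping a step name to a zero-argument callable
    deps:  dict mapping a step name to the set of steps it must wait for
    Returns the step names in the order in which they completed.
    """
    # Dependencies still outstanding for every step that has not started.
    remaining = {name: set(deps.get(name, ())) for name in steps}
    finished = []
    running = {}  # future -> step name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Start every step that is no longer waiting for anything.
            ready = [name for name, waiting in remaining.items() if not waiting]
            for name in ready:
                del remaining[name]
                running[pool.submit(steps[name])] = name

            if not running:
                # Nothing is running and nothing can start.
                raise ValueError(
                    "circular or unknown dependency: %s" % sorted(remaining)
                )

            # Block until at least one running step completes (the "event").
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any error from the step
                finished.append(name)
                for waiting in remaining.values():
                    waiting.discard(name)

    return finished


if __name__ == "__main__":
    log = []
    # Hypothetical steps; real ones would invoke the transformation scripts.
    steps = {
        "stage_orders": lambda: log.append("stage_orders"),
        "stage_customers": lambda: log.append("stage_customers"),
        "build_report": lambda: log.append("build_report"),
    }
    deps = {"build_report": {"stage_orders", "stage_customers"}}
    print(run_pipeline(steps, deps))
```

Here the two staging steps have no dependencies, so they run in parallel, and build_report starts only after both have finished. Unlike fixed crontab times, a step never starts before its inputs are ready and never waits longer than it has to. A tool like ODI gives you this kind of dependency handling out of the box, so custom code like this only makes sense for small pipelines.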
That may not be enabled in the trial version, but we are checking. I will get back to you.
If you are considering a Striim implementation, we can guide you. We have a department that is responsible for Striim.
So if you are considering it seriously, send me an email (email@example.com) from your company email address, so that I can redirect you to the Striim consultants.