Wiki source code of migration-path-aws-redshift-refresh
                  Last modified by Erik Bakker on 2024/09/09 12:37
              
      Show last authors
| author | version | line-number | content | 
|---|---|---|---|
| 1 | {{container}}{{container layoutStyle="columns"}}((( | ||
| 2 | |||
| 3 | Below you will find a document describing the migration path to add the AWS Redshift Refresh functionality to a data pipeline solution you have previously built. | ||
| 4 | If you want to implement a new data pipeline, import it from the store, guaranteeing that this functionality will be included. | ||
| 5 | |||
| 6 | Should you have any questions, please get in touch with academy@emagiz.com.= | ||
| 7 | |||
| 8 | == 1. Prerequisites == | ||
| 9 | |||
| 10 | * Basic knowledge of the eMagiz platform | ||
| 11 | * Understanding of Data pipelining concepts | ||
| 12 | * A existing Data pipeline solution within your eMagiz project. | ||
| 13 | |||
| 14 | == 2. Key concepts == | ||
| 15 | |||
| 16 | * This functionality automatically refreshes the materialized view in Redshift | ||
| 17 | * By making eMagiz refresh it automatically, the data in your dashboards are kept up to date | ||
| 18 | |||
| 19 | == 3. Migration Path - AWS Redshist Refresh == | ||
| 20 | |||
| 21 | Below you will find a document describing the migration path to add the AWS Redshift Refresh functionality to a data pipeline solution you have previously built. | ||
| 22 | If you want to implement a new data pipeline, import it from the store, guaranteeing that this functionality will be included. | ||
| 23 | |||
| 24 | === 3.1 Remove unnecessary components === | ||
| 25 | |||
| 26 | First, we will delete components that have become obsolete as of late. The parts you can remove from the flow are: | ||
| 27 | |||
| 28 | * support.bus-connection-plain | ||
| 29 | * support.bus-connection-caching | ||
| 30 | |||
| 31 | Furthermore, you could remove the following debug components as every interesting step is already monitored and can therefore be tracked without the help of the debugger: | ||
| 32 | |||
| 33 | * global channel interceptor | ||
| 34 | * activate.debug-bridge | ||
| 35 | * send.debug | ||
| 36 | * entry.channel.debug-queue | ||
| 37 | * debugBridgeChannel | ||
| 38 | |||
| 39 | === 3.2 Add new components to refresh AWS Redshift === | ||
| 40 | |||
| 41 | AWS Redshift offers you the option to create a materialized view. A materialized view is a table that combines information from several source tables into one view. | ||
| 42 | To keep updating this materialized view when a source table is updated eMagiz has created functionality to refresh the materialized view. | ||
| 43 | |||
| 44 | To make sure that your existing data pipeline will function in the same way you should execute the following steps: | ||
| 45 | |||
| 46 | * Add a channel called job-execution | ||
| 47 | * Add a channel called dummy | ||
| 48 | * Add a channel called job-completed | ||
| 49 | * Add a channel called job-completed-delay | ||
| 50 | * Add a channel called log | ||
| 51 | * Add a job execution listener gateway and configure it as follows | ||
| 52 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-job-execution-listener.png]] | ||
| 53 | * Add a standard filter and configure it as follows (the expression checks whether the Job is already finished) -> payload.status == T(org.springframework.batch.core.BatchStatus).COMPLETED | ||
| 54 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-execution-status.png]] | ||
| 55 | * Add a standard activator and configure it as follows (the expression puts the thread to sleep for 60 seconds) -> T(java.lang.Thread).sleep(60000) ?: 'AWS Materialized view refreshed.' | ||
| 56 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-delay.png]] | ||
| 57 | * Add a JDBC outbound channel adapter and configure it as follows (query can be found below) | ||
| 58 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-jdbc-outbound.png]] | ||
| 59 | * Add a logging channel adapter and configure it as follows | ||
| 60 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-log.png]] | ||
| 61 | * Make sure that the job-completed-delay channel is wiretapped with the help of the log channel | ||
| 62 | |||
| 63 | === 3.3 Query you need for refresh === | ||
| 64 | |||
| 65 | The following query is needed to refresh the AWS materialized view. First, make sure that the name of the materialized view is filled in as a property value. | ||
| 66 | |||
| 67 | REFRESH MATERIALIZED VIEW ${dp.jdbc.distinctive-name.tablename}\_mv; | ||
| 68 | COMMIT; | ||
| 69 | |||
| 70 | === 3.4 Intermediate Result === | ||
| 71 | |||
| 72 | The result should look something like this: | ||
| 73 | |||
| 74 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-result-part-one.png]] | ||
| 75 | |||
| 76 | === 3.5 Add listener to job === | ||
| 77 | |||
| 78 | If you only add the listener section in the flow, you would have a solution that only partly works. | ||
| 79 | To make sure the complete chain of events starts working in unison, you need to tell the job which listeners need to listen to the execution of the job. | ||
| 80 | |||
| 81 | To do so, open the component called support. job.{technicalnameofmessagetype}. On the advanced tab, you need to add two listeners: | ||
| 82 | |||
| 83 | * A Gateway Listener -> Select the job execution listener that you have added in step 2 | ||
| 84 | * A Exception Listener -> This one makes sure that when an error occurs in AWS Redshift, the job status will be Failed instead of Completed. | ||
| 85 | |||
| 86 | The configuration should look as follows: | ||
| 87 | |||
| 88 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-add-job-listeners.png]] | ||
| 89 | |||
| 90 | === 3.6 End Result === | ||
| 91 | |||
| 92 | The result of all this should look something like this: | ||
| 93 | |||
| 94 | [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-end-result.png]] | ||
| 95 | |||
| 96 | == 4. Key takeaways == | ||
| 97 | |||
| 98 | * This functionality automatically refreshes the materialized view in Redshift | ||
| 99 | * By making eMagiz refresh it automatically, the data in your dashboards are kept up to date | ||
| 100 | |||
| 101 | == 5. Suggested Additional Readings == | ||
| 102 | |||
| 103 | * [[AWS Redshift (Search Result)>>url:https://docs.emagiz.com/bin/view/Main/Search?sort=score&sortOrder=desc&highlight=true&facet=true&r=1&f_space_facet=0%2FMain.&f_type=DOCUMENT&f_locale=en&f_locale=&f_locale=en&text=%22aws+redshift%22||target="blank"]] | ||
| 104 | |||
| 105 | |||
| 106 | )))((({{toc/}}))){{/container}}{{/container}} | 
