Last modified by Erik Bakker on 2024/09/09 12:37

Show last authors
1 {{container}}{{container layoutStyle="columns"}}(((
2
3 Below you will find a document describing the migration path to add the AWS Redshift Refresh functionality to a data pipeline solution you have previously built.
4 If you want to implement a new data pipeline, import it from the store, guaranteeing that this functionality will be included.
5
6 Should you have any questions, please get in touch with academy@emagiz.com.=
7
8 == 1. Prerequisites ==
9
10 * Basic knowledge of the eMagiz platform
11 * Understanding of Data pipelining concepts
12 * A existing Data pipeline solution within your eMagiz project.
13
14 == 2. Key concepts ==
15
16 * This functionality automatically refreshes the materialized view in Redshift
17 * By making eMagiz refresh it automatically, the data in your dashboards are kept up to date
18
19 == 3. Migration Path - AWS Redshist Refresh ==
20
21 Below you will find a document describing the migration path to add the AWS Redshift Refresh functionality to a data pipeline solution you have previously built.
22 If you want to implement a new data pipeline, import it from the store, guaranteeing that this functionality will be included.
23
24 === 3.1 Remove unnecessary components ===
25
26 First, we will delete components that have become obsolete as of late. The parts you can remove from the flow are:
27
28 * support.bus-connection-plain
29 * support.bus-connection-caching
30
31 Furthermore, you could remove the following debug components as every interesting step is already monitored and can therefore be tracked without the help of the debugger:
32
33 * global channel interceptor
34 * activate.debug-bridge
35 * send.debug
36 * entry.channel.debug-queue
37 * debugBridgeChannel
38
39 === 3.2 Add new components to refresh AWS Redshift ===
40
41 AWS Redshift offers you the option to create a materialized view. A materialized view is a table that combines information from several source tables into one view.
42 To keep updating this materialized view when a source table is updated eMagiz has created functionality to refresh the materialized view.
43
44 To make sure that your existing data pipeline will function in the same way you should execute the following steps:
45
46 * Add a channel called job-execution
47 * Add a channel called dummy
48 * Add a channel called job-completed
49 * Add a channel called job-completed-delay
50 * Add a channel called log
51 * Add a job execution listener gateway and configure it as follows
52 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-job-execution-listener.png]]
53 * Add a standard filter and configure it as follows (the expression checks whether the Job is already finished) -> payload.status == T(org.springframework.batch.core.BatchStatus).COMPLETED
54 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-execution-status.png]]
55 * Add a standard activator and configure it as follows (the expression puts the thread to sleep for 60 seconds) -> T(java.lang.Thread).sleep(60000) ?: 'AWS Materialized view refreshed.'
56 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-delay.png]]
57 * Add a JDBC outbound channel adapter and configure it as follows (query can be found below)
58 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-jdbc-outbound.png]]
59 * Add a logging channel adapter and configure it as follows
60 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-log.png]]
61 * Make sure that the job-completed-delay channel is wiretapped with the help of the log channel
62
63 === 3.3 Query you need for refresh ===
64
65 The following query is needed to refresh the AWS materialized view. First, make sure that the name of the materialized view is filled in as a property value.
66
67 REFRESH MATERIALIZED VIEW ${dp.jdbc.distinctive-name.tablename}\_mv;
68 COMMIT;
69
70 === 3.4 Intermediate Result ===
71
72 The result should look something like this:
73
74 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-result-part-one.png]]
75
76 === 3.5 Add listener to job ===
77
78 If you only add the listener section in the flow, you would have a solution that only partly works.
79 To make sure the complete chain of events starts working in unison, you need to tell the job which listeners need to listen to the execution of the job.
80
81 To do so, open the component called support. job.{technicalnameofmessagetype}. On the advanced tab, you need to add two listeners:
82
83 * A Gateway Listener -> Select the job execution listener that you have added in step 2
84 * A Exception Listener -> This one makes sure that when an error occurs in AWS Redshift, the job status will be Failed instead of Completed.
85
86 The configuration should look as follows:
87
88 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-aws-redshift-refresh-add-job-listeners.png]]
89
90 === 3.6 End Result ===
91
92 The result of all this should look something like this:
93
94 [[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-end-result.png]]
95
96 == 4. Key takeaways ==
97
98 * This functionality automatically refreshes the materialized view in Redshift
99 * By making eMagiz refresh it automatically, the data in your dashboards are kept up to date
100
101 == 5. Suggested Additional Readings ==
102
103 * [[AWS Redshift (Search Result)>>url:https://docs.emagiz.com/bin/view/Main/Search?sort=score&sortOrder=desc&highlight=true&facet=true&r=1&f_space_facet=0%2FMain.&f_type=DOCUMENT&f_locale=en&f_locale=&f_locale=en&text=%22aws+redshift%22||target="blank"]]
104
105
106 )))((({{toc/}}))){{/container}}{{/container}}