Last modified by Erik Bakker on 2024/09/09 12:38

{{container}}{{container layoutStyle="columns"}}(((
Below you will find a document describing the migration path for adding the Job Dashboard Cleanup functionality to a data pipeline solution you have previously built.
If you want to implement a new data pipeline, import it from the store; this guarantees that the functionality is included.

Should you have any questions, please get in touch with academy@emagiz.com.

== 1. Prerequisites ==

* Basic knowledge of the eMagiz platform
* Understanding of data pipelining concepts
* An existing data pipeline solution within your eMagiz project

== 2. Key concepts ==

* This functionality ensures that the Job Dashboard remains available and only shows the relevant data of the last 30 days

== 3. Job Dashboard Cleanup ==

=== 3.1 Remove unnecessary components ===

First, we will delete components that have become obsolete. The parts you can remove from the flow are:

* support.bus-connection-plain
* support.bus-connection-caching

Furthermore, you can remove the following debug components, as every relevant step is already monitored and can therefore be tracked without the help of the debugger:

* global channel interceptor
* activate.debug-bridge
* send.debug
* entry.channel.debug-queue
* debugBridgeChannel
=== 3.2 Add new components to clean up the Job Dashboard ===

With the new functionality, the Job Dashboard is cleaned up automatically, ensuring that you keep access to the job information of the last month of job activity.

To make sure that your existing data pipeline keeps functioning in the same way, execute the following steps:

* Add a support object called top level poller and configure it as follows
[[image:Main.Images.Migrationpath.WebHome@migration-path-job-dashboard-cleanup--migration-path-job-dashboard-cleanup-top-level-poller-config.png]]
* Add a channel called clean
* Add a standard inbound channel adapter called clean.cron and configure it as follows (as you can see, it cleans the job dashboard every day at five in the morning)
[[image:Main.Images.Migrationpath.WebHome@migration-path-job-dashboard-cleanup--migration-path-job-dashboard-cleanup-clean-cron-config.png]]
* Add a standard inbound channel adapter called startup.cron and configure it as follows (it cleans the job dashboard on startup)
[[image:Main.Images.Migrationpath.WebHome@migration-path-job-dashboard-cleanup--migration-path-job-dashboard-cleanup-startup-cron-config.png]]
* Add a JDBC outbound channel adapter to your flow
* Use the clean channel as input
* Link it to the H2 database that is in your flow
* Enter the query that you can find below
[[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-result-part-one.png]]
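
The clean.cron trigger fires every day at five in the morning (the exact cron expression is shown in the screenshot above). Purely as an illustration of that schedule, the following sketch (a hypothetical helper, not part of the eMagiz flow) computes the next daily 05:00 trigger time:

```python
from datetime import datetime, time, timedelta

def next_daily_run(now: datetime, at: time = time(5, 0)) -> datetime:
    """Next occurrence of a daily trigger, e.g. 05:00 for clean.cron."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so the trigger fires tomorrow.
        candidate += timedelta(days=1)
    return candidate

# At 12:38 the 05:00 slot has passed, so the next run is tomorrow morning.
print(next_daily_run(datetime(2024, 9, 9, 12, 38)))
```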

=== 3.3 Query you need for cleanup ===

The following query cleans up all relevant parts of the job dashboard so that only the last month's jobs remain visible.

{{code language=sql}}
DELETE FROM BATCH_JOB_EXECUTION_CONTEXT WHERE
JOB_EXECUTION_ID IN (SELECT JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION WHERE DATEADD(MONTH, 1, CREATE_TIME) < CURDATE());
DELETE FROM BATCH_JOB_EXECUTION_PARAMS WHERE
JOB_EXECUTION_ID IN (SELECT JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION WHERE DATEADD(MONTH, 1, CREATE_TIME) < CURDATE());
DELETE FROM BATCH_STEP_EXECUTION_CONTEXT WHERE
STEP_EXECUTION_ID IN (SELECT STEP_EXECUTION_ID FROM BATCH_STEP_EXECUTION WHERE
JOB_EXECUTION_ID IN (SELECT JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION WHERE DATEADD(MONTH, 1, CREATE_TIME) < CURDATE()));
DELETE FROM BATCH_STEP_EXECUTION WHERE
JOB_EXECUTION_ID IN (SELECT JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION WHERE DATEADD(MONTH, 1, CREATE_TIME) < CURDATE());
DELETE FROM BATCH_JOB_EXECUTION WHERE DATEADD(MONTH, 1, CREATE_TIME) < CURDATE();
DELETE FROM BATCH_JOB_INSTANCE WHERE
JOB_INSTANCE_ID NOT IN (SELECT JOB_INSTANCE_ID FROM BATCH_JOB_EXECUTION);
{{/code}}
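
The order of the statements matters: child tables (execution contexts, params, step executions) are cleared before the job executions they reference, and job instances are only removed once they have no executions left. As a minimal sketch of that interaction, the following replays the same delete order against an in-memory SQLite database (an assumption for illustration: SQLite's `DATE(..., '+1 month')` stands in for H2's `DATEADD(MONTH, 1, ...)`, and the table definitions are reduced to the columns the query touches):

```python
import sqlite3
from datetime import datetime, timedelta

# Reduced Spring Batch job-repository schema: only the columns the cleanup touches.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE BATCH_JOB_INSTANCE (JOB_INSTANCE_ID INTEGER PRIMARY KEY);
CREATE TABLE BATCH_JOB_EXECUTION (JOB_EXECUTION_ID INTEGER PRIMARY KEY,
  JOB_INSTANCE_ID INTEGER, CREATE_TIME TEXT);
CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (JOB_EXECUTION_ID INTEGER);
CREATE TABLE BATCH_JOB_EXECUTION_CONTEXT (JOB_EXECUTION_ID INTEGER);
CREATE TABLE BATCH_STEP_EXECUTION (STEP_EXECUTION_ID INTEGER PRIMARY KEY,
  JOB_EXECUTION_ID INTEGER);
CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (STEP_EXECUTION_ID INTEGER);
""")

# One execution older than a month (should be purged) and one recent one.
old_time = (datetime.now() - timedelta(days=60)).strftime("%Y-%m-%d %H:%M:%S")
new_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
cur.executescript(f"""
INSERT INTO BATCH_JOB_INSTANCE VALUES (1), (2);
INSERT INTO BATCH_JOB_EXECUTION VALUES (10, 1, '{old_time}'), (20, 2, '{new_time}');
INSERT INTO BATCH_JOB_EXECUTION_PARAMS VALUES (10), (20);
INSERT INTO BATCH_JOB_EXECUTION_CONTEXT VALUES (10), (20);
INSERT INTO BATCH_STEP_EXECUTION VALUES (100, 10), (200, 20);
INSERT INTO BATCH_STEP_EXECUTION_CONTEXT VALUES (100), (200);
""")

# Same statements, same order as the migration query; only the date function differs.
expired = ("SELECT JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION "
           "WHERE DATE(CREATE_TIME, '+1 month') < DATE('now')")
cur.executescript(f"""
DELETE FROM BATCH_JOB_EXECUTION_CONTEXT WHERE JOB_EXECUTION_ID IN ({expired});
DELETE FROM BATCH_JOB_EXECUTION_PARAMS WHERE JOB_EXECUTION_ID IN ({expired});
DELETE FROM BATCH_STEP_EXECUTION_CONTEXT WHERE STEP_EXECUTION_ID IN
  (SELECT STEP_EXECUTION_ID FROM BATCH_STEP_EXECUTION WHERE JOB_EXECUTION_ID IN ({expired}));
DELETE FROM BATCH_STEP_EXECUTION WHERE JOB_EXECUTION_ID IN ({expired});
DELETE FROM BATCH_JOB_EXECUTION WHERE DATE(CREATE_TIME, '+1 month') < DATE('now');
DELETE FROM BATCH_JOB_INSTANCE WHERE JOB_INSTANCE_ID NOT IN
  (SELECT JOB_INSTANCE_ID FROM BATCH_JOB_EXECUTION);
""")

remaining = cur.execute("SELECT JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION").fetchall()
print(remaining)  # only the recent execution survives
```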

=== 3.4 Result ===

The result should look something like this:

[[image:Main.Images.Migrationpath.WebHome@migration-path-aws-redshift-refresh--migration-path-job-dashboard-cleanup-result-part-one.png]]

[[image:Main.Images.Migrationpath.WebHome@migration-path-job-dashboard-cleanup--migration-path-job-dashboard-cleanup-result.png]]

== 4. Key takeaways ==

* This functionality ensures that the Job Dashboard remains available and only shows the relevant data of the last 30 days

)))((({{toc/}}))){{/container}}{{/container}}