Last modified by Erik Bakker on 2024/08/26 12:37

From version 30.2
edited by Erik Bakker
on 2022/06/10 13:23
Change comment: Update document after refactoring.
To version 39.1
edited by Erik Bakker
on 2022/10/31 09:06
Change comment: There is no comment for this version

Page properties
Title
... ... @@ -1,1 +1,1 @@
1 -novice-file-based-connectivity-characterset
1 +Volume Mapping (On-premise)
Default language
... ... @@ -1,0 +1,1 @@
1 +en
Content
... ... @@ -1,11 +1,8 @@
1 1  {{container}}{{container layoutStyle="columns"}}(((
2 -In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
2 +When you need to read and write files on an on-premise disk, you need to know the path where the data is stored and make sure that the Docker container running your runtime(s) has access to this path. There are several ways of dealing with this challenge. This microlearning discusses the various alternatives and the best approach for each scenario.
3 3  
4 4  Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].
5 5  
6 -* Last update: May 31th, 2021
7 -* Required reading time: 7 minutes
8 -
9 9  == 1. Prerequisites ==
10 10  
11 11  * Basic knowledge of the eMagiz platform
... ... @@ -12,93 +12,99 @@
12 12  
13 13  == 2. Key concepts ==
14 14  
15 -This microlearning centers around learning how to process an incoming file per line.
12 +This microlearning centers around learning how to set up your volume mapping correctly so you can exchange file-based data on-premise.
16 16  
17 -By processing per line, we mean: Splitting up the input into discernable pieces that each will become a unique message
14 +By volume mapping, we mean: Creating a configuration through which the docker container can read and write data on a specific path on an on-premise machine.
18 18  
19 -* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
20 -* Ability to process each line based on distinctive logic that is relevant on line level
21 -* Can be used for flat file as well as XML input files
16 +There are several options for volume mapping for your on-premise machine.
17 +* Volume
18 +* Bind mount
19 +* Temporary file system
20 +* Named pipe
22 22  
23 -== 3. Processing a File per Line ==
22 +== 3. Volume Mapping (On-premise) ==
24 24  
25 -In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
24 +When you need to read and write files on an on-premise disk, you need to know the path where the data is stored and make sure that the Docker container running your runtime(s) has access to this path. There are several ways of dealing with this challenge. This microlearning discusses the various alternatives and the best approach for each scenario.
26 26  
27 -To make this work in eMagiz you need to navigate to the Create phase of eMagiz and open the entry flow in which you want to retrieve the file to a certain location. Within the context of this flow, we need to add functionality that will ensure that each line is read and processed separately and will become its unique message. To do so first enter "Start Editing" mode on flow level. After you have done so please add a file item reader message source to the flow. We will use this component to read and process our input file on a per-line basis.
26 +There are several options for volume mapping for your on-premise machine.
27 +* Volume
28 +* Bind mount
29 +* Temporary file system
30 +* Named pipe
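
As a rough illustration of how these four options differ, the sketch below builds the value of a Docker `--mount` flag for each type. It assumes that the eMagiz volume mapping corresponds to Docker's standard mount types (`volume`, `bind`, `tmpfs`, `npipe`); the helper `mount_flag` and all names and paths are hypothetical and are not part of eMagiz.

```python
from typing import Optional

# Docker mount types, assumed to correspond to the four options listed above.
MOUNT_TYPES = {"volume", "bind", "tmpfs", "npipe"}


def mount_flag(mount_type: str, target: str, source: Optional[str] = None) -> str:
    """Build the value of a Docker `--mount` flag for one mapping (hypothetical helper)."""
    if mount_type not in MOUNT_TYPES:
        raise ValueError(f"unknown mount type: {mount_type}")
    if mount_type == "tmpfs":
        # A temporary file system lives in memory, so it has no source on the host.
        if source is not None:
            raise ValueError("tmpfs mounts do not take a source")
        return f"type=tmpfs,target={target}"
    if source is None:
        # This sketch always requires a source for the other types.
        raise ValueError(f"{mount_type} mounts need a source")
    return f"type={mount_type},source={source},target={target}"


# Hypothetical names and paths, for illustration only.
print(mount_flag("volume", "/data/archive", "archive-volume"))
print(mount_flag("bind", "/data/in", "/opt/files/in"))
print(mount_flag("tmpfs", "/data/tmp"))
```

A volume is referenced by name and managed by Docker, whereas a bind mount points at an existing path on the on-premise machine, which is why the path on that machine has to be known up front.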
28 28  
29 -The first step would be to define the directory from which we read our messages. As always reference to the directory with the help of a property.
32 +Below we explain the differences between the options available for your volume mapping. Before we do, we first explain how to set up this configuration within eMagiz. Navigate to Deploy -> Architecture on the model level. In this overview, you can access the volume mapping of each runtime deployed on-premise by right-clicking on the runtime to open the context menu.
30 30  
31 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--file-item-reader-directory.png]]
34 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--volume-option-context-menu.png]]
32 32  
33 -Secondly, just as when reading the file as a whole ensure that you use a filter to retrieve only the correct files from the directory.
36 +When you click this option, you will see the following pop-up. In this pop-up, you can define the machine-level and runtime-level volumes. More on that later. This is the starting point for configuring your volume mapping. We will walk through each available option and explain how they work and should be configured.
34 34  
35 -=== 3.1 Item reader Type ===
38 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--volume-mapping-pop-up.png]]
36 36  
37 -Now it is time to select our Item reader Type. As the help text of the eMagiz component suggest there are two choices with this component. The first (and most frequently used) option is the Flat file item reader. With this option, you can read each line within the flat file input file and output is at a separate message. The second option is called the Stax event item reader. With this option, you can read your input XML and output messages on a per-record basis.
40 +{{info}}Note that you should be in "Start editing" mode to make any changes to the configuration of your volume mapping.{{/info}}
38 38  
39 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--item-reader-type-options.png]]
42 +=== 3.1 Volume ===
40 40  
41 -Based on your choice the exact configuration will differ.
44 +To make this work in eMagiz, navigate to the Create phase and open the entry flow in which you want to archive the files. Within this flow, we need to add functionality that ensures that each input file is archived and cleaned up once it is older than three days. To do so, first enter "Start Editing" mode on flow level. The first decision we have to make is how to name the files in the archive. The best practice, in this case, is the original filename with the current time as a suffix. You can define this by dragging a format file name generator (support object) to the canvas.
42 42  
43 -==== 3.1.1 Stax Event Item Reader ====
46 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--file-name-generator.png]]
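
To make the naming convention concrete, here is a minimal sketch of "original filename plus current time as a suffix". The function `archive_name`, the dot separator, and the timestamp layout are assumptions chosen for illustration; the actual pattern is whatever you configure in the file name generator.

```python
from datetime import datetime


def archive_name(original: str, now: datetime) -> str:
    """Append the current time to the original file name (hypothetical format)."""
    # Assumed layout: year, month, day, hour, minute, second.
    return f"{original}.{now:%Y%m%d%H%M%S}"


print(archive_name("orders.csv", datetime(2022, 10, 31, 9, 6, 0)))
```

Because the suffix changes from second to second, files that arrive with the same name at different times no longer collide in the archive directory.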
44 44  
45 -For the Stax event item reader, you need to define the name of the element on which you want to split the XML and define whether you want to throw an error in case no such element exists in the input file (By (de)selecting the option Strict). The default setting of eMagiz is advisable for this option.
48 +After we have done this please add a file outbound channel adapter to the flow including an input channel. Ensure that you use a property for the directory that references another directory compared to the input directory to prevent creating an infinite loop.
46 46  
47 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--stax-event-item-reader-config.png]]
50 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--archiving-config-file-outbound-basic.png]]
48 48  
49 -==== 3.1.2 Flat File Item Reader ====
52 +Now that we have configured the basics let us turn our attention to the advanced configuration. In the advanced tab of this component, we need to select the file name generator to ensure that the files are named correctly. In case you process each line separately you have to choose whether to save them as separate files in the archive or by appending them again. This can be achieved by selecting the correct Mode. In most cases, however, the default Mode of Replace will suffice.
50 50  
51 -For the Flat File item reader, there are some more choices and configurations to be made. There are three options you can choose from:
52 -- Pass through line mapper
53 -- Default line mapper
54 -- Pattern matching composite line mapper
54 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--archiving-config-file-outbound-advanced.png]]
55 55  
56 -Each of these options has some advantages and disadvantages. Adhering to the best practices of eMagiz (i.e. no transformation in the entry) the best option would be to use the pass-through line mapper. As the name suggests this option does nothing except give a string back to the flow on a per line basis. However, choosing this option means that the actual transformation from that string to XML needs to happen later in the process (most likely in the onramp) with the help of a flat-file to XML transformer (more on that component in a later course).
56 +Once you are satisfied, press Save. Now that we have configured this, it is time to determine how we get the input that we want to write to our archive. In the example we are using here, we want to archive our input file, so we need to ensure that the data we received is written to the archive as soon as possible. To do so, place a wiretap on the first channel after retrieving the file. This makes sure that the message is archived before it is processed further. The result should look similar to what is shown below. Note that the same piece of logic can be applied in other flows within the eMagiz platform in a similar manner.
57 57  
58 -The other two options transform the input line into an XML output. So you win one step in the process. However, no standard eMagiz error handling is advisable when you start transforming data within the entry. So in case, something goes wrong to analyze the error will become more difficult. Furthermore, another potential disadvantage is that when one line fails the processing of the rest of the file also halts.
58 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--archiving-result.png]]
59 59  
60 -For the remainder of this microlearning, we will assume that the option pass through line mapper is chosen.
60 +=== 3.2 Clean up the Archive ===
61 61  
62 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough.png]]
62 +To ensure that the data is not kept indefinitely we need to clean up the archive. We do so to prevent problems with disk space but also to prevent data leaks of old data that could impact the privacy of others. Before we can set up the logic in eMagiz we need to talk to the customer to see what an acceptable term is within which the data is kept. In most cases, this is a week or two weeks. In this example, we have chosen three days.
63 63  
64 -As you can see on the Basic level we are done. However, it is always good to check out the settings on the Advanced tab, especially in this case, to see if there are additional configuration options that could benefit us. The setting of most interest, in this case, is the Lines to Skip setting (default setting is 0). With this setting, you can define whether or not you want to process the header line(s) that exists within your input file. The remainder of the settings is (in most cases) good the way eMagiz has set them up.
64 +Now that we know the limit it is time to configure the components. We start with a composite file filter (support object). Within this filter, we at least define how old a file must be before it can be deleted (in milliseconds). If we turn three days into milliseconds we get 259200000. Furthermore, we at least define that we only want to delete regular files.
65 65  
66 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough-advanced.png]]
66 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--file-list-filter-for-archive-cleanup.png]]
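
The conversion from days to milliseconds can be checked with a short calculation. The sketch below is only a convenience for working out the value to enter in the filter; the function name `retention_ms` is hypothetical.

```python
from datetime import timedelta


def retention_ms(days: int) -> int:
    """Convert a retention period in days to whole milliseconds."""
    # Dividing two timedeltas with // gives an exact integer result.
    return timedelta(days=days) // timedelta(milliseconds=1)


# 3 days * 24 hours * 60 minutes * 60 seconds * 1000 milliseconds
print(retention_ms(3))  # 259200000
```

The same helper gives 604800000 for a retention period of one week.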
67 67  
68 -=== 3.2 Poller ===
68 +Having done so, we can add a file inbound channel adapter to the canvas, including an output channel. Ensure that the property reference for the directory matches the one you used before in the outbound channel adapter. Furthermore, link the filter to the component and define the poller according to the best practice.
69 69  
70 -Now that we have selected and configured the item reader type it becomes time to fill in the last part of the configuration, the poller. For polling eMagiz offers three options:
70 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--file-inbound-archive-cleanup.png]]
71 71  
72 -- Fixed Delay Trigger
73 -- Fixed Rate Trigger
74 -- Cron Trigger
72 +One thing we should not forget in this configuration is to set the Max messages per poll on the Advanced tab of the poller configuration to a sufficiently high number (e.g. 50). If you forget to do so and you only check once a day, only one message will be deleted that day.
75 75  
76 -Of these options, the cron trigger is used most frequently in eMagiz. The reason being is that you can define this option via a property that you can alter without having to alter the flow version in Create.
74 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--file-inbound-archive-cleanup-max-messages-per-poll.png]]
77 77  
78 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--poller-config.png]]
76 +Now eMagiz will check at a set time interval whether there are files older than three days that are ready for deletion. One last step to go. This last step ensures that all files that match the filter are deleted from the archive. Simply add a standard service activator to the canvas and define the following SpEL expression within the component: payload.delete().
79 79  
80 -After finishing all these configuration steps we can press Save to save our work and ensure that we can process the input file on a per-line basis.
78 +[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-archiving--archive-cleanup-deletion.png]]
81 81  
80 +This will ensure that each file that is retrieved will indeed be deleted from the archive.
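
The combined effect of the age filter, the regular-file filter, the maximum number of messages per poll, and the delete expression can be sketched outside eMagiz as follows. This is a standalone illustration of the logic under stated assumptions, not the way the platform implements it; the function `cleanup_archive` and its parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def cleanup_archive(directory: Path, max_age_ms: int, max_per_poll: int,
                    now: Optional[float] = None) -> List[Path]:
    """Delete at most max_per_poll regular files older than max_age_ms in one poll."""
    now = time.time() if now is None else now
    deleted: List[Path] = []
    for entry in sorted(directory.iterdir()):
        if len(deleted) >= max_per_poll:
            break  # the remaining files have to wait for the next poll
        if not entry.is_file():
            continue  # only regular files are eligible, directories are skipped
        age_ms = (now - entry.stat().st_mtime) * 1000
        if age_ms >= max_age_ms:
            entry.unlink()  # plays the role of payload.delete() in the flow
            deleted.append(entry)
    return deleted
```

Calling `cleanup_archive(Path("/path/to/archive"), 259200000, 50)` on a schedule removes up to 50 expired files per run. With a limit of 1 and a daily schedule, only one file would disappear per day, which is the pitfall described above.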
81 +
82 82  == 4. Assignment ==
83 83  
84 -Configure an entry in which you define the component and configuration needed to process a file on a per-line basis.
84 +Configure an entry in which you build the archiving and the clean up of the archiving.
85 85  This assignment can be completed with the help of the (Academy) project that you have created/used in the previous assignment.
86 86  
87 87  == 5. Key takeaways ==
88 88  
89 -* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
90 -* Ability to process each line based on distinctive logic that is relevant on line level
91 -* Can be used for flat file as well as XML input files
92 -* Try to avoid complex transformations within the entry
89 +* Archiving is used for audit purposes
90 +* Archiving is used for retry scenarios
91 +* Ensure that data is cleaned after a retention period to keep in control of the data
92 +* Don't forget the max messages per poll
93 93  
94 94  == 6. Suggested Additional Readings ==
95 95  
96 -There are no suggested additional readings on this topic
96 +If you are interested in this topic and want more information on it please read the help text provided by eMagiz and check out the following store content:
97 97  
98 +* [[File Archiving>>doc:Main.eMagiz Store.Accelerators.File Archiving.WebHome||target="blank"]]
99 +* [[Delete Folder(s)>>doc:Main.eMagiz Store.Accelerators.Delete Folder(s).WebHome||target="blank"]]
100 +
98 98  == 7. Silent demonstration video ==
99 99  
100 100  This video demonstrates how you could have handled the assignment and gives you some context on what you have just learned.
101 101  
102 -{{video attachment="novice-file-based-connectivity-processing-a-file-per-line.mp4" reference="Main.Videos.Microlearning.WebHome"/}}
105 +{{video attachment="novice-file-based-connectivity-characterset.mp4" reference="Main.Videos.Microlearning.WebHome"/}}
103 103  
104 104  )))((({{toc/}}))){{/container}}{{/container}}