Changes for page Processing a File per Line
Last modified by Erik Bakker on 2024/08/26 12:29
From version 31.1
edited by Erik Bakker
on 2022/08/22 14:17
Change comment:
There is no comment for this version
Summary
- Page properties (5 modified, 0 added, 0 removed)
- Attachments (0 modified, 4 added, 0 removed)
Details
- Page properties
- Title
... ... @@ -1,1 +1,0 @@
1 -Processing a File per Line
- Parent
... ... @@ -1,1 +1,0 @@
1 -WebHome
- Author
... ... @@ -1,1 +1,1 @@
1 -XWiki.ebakker1
1 +XWiki.marijn
- Default language
... ... @@ -1,1 +1,0 @@
1 -en
- Content
... ... @@ -1,93 +1,84 @@
1 -{{container}}{{container layoutStyle="columns"}}(((
2 -In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
1 +{{html wiki="true"}}
2 +<div class="ez-academy">
3 + <div class="ez-academy_body">
3 3
4 -Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].
5 +<div class="doc">
5 5
6 -== 1. Prerequisites ==
7 7
8 -* Basic knowledge of the eMagiz platform
9 9
10 -== 2. Key concepts ==
9 += Annotations =
11 11
12 -This microlearning centers around learning how to process an incoming file per line.
11 +In this microlearning, we will focus on using annotations to clarify your thought process. In the annotation, you either describe a best practice everyone should follow when they change that flow (i.e. within the asynchronous routing), describe how the (more complex) parts of the flow work or describe (parts of) of your message definitions (i.e. CDM, API Gateway Data model, system message, etc.). This will help yourself and others every time changes are needed.
13 13
14 -By processing per line, we mean: Splitting up the input into discernable pieces that each will become a unique message
13 +Should you have any questions, please contact academy@emagiz.com.
15 15
16 -* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
17 -* Ability to process each line based on distinctive logic that is relevant on line level
18 -* Can be used for flat file as well as XML input files
15 +* Last update: May 9th, 2021
16 +* Required reading time: 5 minutes
19 19
20 -== 3. Processing a File per Line ==
18 +== 1. Prerequisites ==
21 21
22 -In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
20 +* Basic knowledge of the eMagiz platform
23 23
24 -To make this work in eMagiz you need to navigate to the Create phase of eMagiz and open the entry flow in which you want to retrieve the file to a certain location. Within the context of this flow, we need to add functionality that will ensure that each line is read and processed separately and will become its unique message. To do so first enter "Start Editing" mode on flow level. After you have done so please add a file item reader message source to the flow. We will use this component to read and process our input file on a per-line basis.
22 +== 2. Key concepts ==
25 25
26 -The first step would be to define the directory from which we read our messages. As always reference to the directory with the help of a property.
24 +This microlearning centers around using annotations.
25 +With annotations, we mean: A piece of text to explain something to yourself and others
27 27
28 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--file-item-reader-directory.png]]
27 +Annotations can be used for:
29 29
30 -Secondly, just as when reading the file as a whole ensure that you use a filter to retrieve only the correct files from the directory.
29 +* Describing a best practice everyone should follow
30 +* Describing (more complex) parts of the flow
31 +* Describe (parts of) your message definitions
31 31
32 -=== 3.1 Item reader Type ===
33 33
34 -Now it is time to select our Item reader Type. As the help text of the eMagiz component suggest there are two choices with this component. The first (and most frequently used) option is the Flat file item reader. With this option, you can read each line within the flat file input file and output is at a separate message. The second option is called the Stax event item reader. With this option, you can read your input XML and output messages on a per-record basis.
35 35
36 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--item-reader-type-options.png]]
35 +== 3. Annotations ==
37 37
38 -Based on your choice the exact configuration will differ.
37 +In this microlearning, we will focus on using annotations on the flow level to clarify our thought process. In the annotation, you either describe a best practice everyone should follow when they change that flow (i.e. within the asynchronous routing) or describe how the (more complex) parts of the flow work. This will help yourself and others every time changes are needed within the flow.
39 39
40 -==== 3.1.1 Stax Event Item Reader ====
39 +Annotations can be used for:
41 41
42 -For the Stax event item reader, you need to define the name of the element on which you want to split the XML and define whether you want to throw an error in case no such element exists in the input file (By (de)selecting the option Strict). The default setting of eMagiz is advisable for this option.
41 +* Describing a best practice everyone should follow
42 +* Describing (more complex) parts of the flow
43 +* Describe (parts of) your message definitions
43 43
44 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--stax-event-item-reader-config.png]]
45 +To clarify the use cases let us take a look at how annotations can be added within the eMagiz platform. In our first example, we will take a look at asynchronous routing. In many eMagiz projects, a best practice is followed on how to add something to the asynchronous routing (or change something within the asynchronous routing). Because the best practice contains multiple steps it makes sense to use the annotation functionality of eMagiz to define all these steps and register them at the place you need them (i.e. the asynchronous routing). Having done so will result in something like this:
45 45
46 -==== 3.1.2 Flat File Item Reader ====
47 +<p align="center">[[image:novice-devops-perspectives-annotations--annotation-best-practice-async-routing.png||]]</p>
47 47
48 -For the Flat File item reader, there are some more choices and configurations to be made. There are three options you can choose from:
49 -- Pass through line mapper
50 -- Default line mapper
51 -- Pattern matching composite line mapper
49 +The second example is about using annotations on the flow level to describe parts of the flow. In this example, we will use the annotation to describe that we use a filter to determine which messages are picked up from a local directory and how we filter. That way anyway opening the flow has to merely read the annotation to get the context. Having done so will result in something like this:
52 52
53 -Each of these options has some advantages and disadvantages. Adhering to the best practices of eMagiz (i.e. no transformation in the entry) the best option would be to use the pass-through line mapper. As the name suggests this option does nothing except give a string back to the flow on a per line basis. However, choosing this option means that the actual transformation from that string to XML needs to happen later in the process (most likely in the onramp) with the help of a flat-file to XML transformer (more on that component in a later course).
51 +<p align="center">[[image:novice-devops-perspectives-annotations--describe-parts-of-flow.png||]]</p>
54 54
55 -The other two options transform the input line into an XML output. So you win a step in the process. However, no standard eMagiz error handling is advisable when you start transforming data within the entry. So in case, something goes wrong to analyze the error will become more difficult. Furthermore, another potential disadvantage is that when one line fails the processing of the rest of the file also halts.
53 +The third example does not take place on the flow level but the message definition level. Therefore instead of going to Create, we go to Design. In Design when you navigate to the CDM, API Gateway Data model, Event Streaming Data model, message definitions, etc. you have the option to add annotations to the canvas. In this example, we want to make clear to all that make changes that a certain part of our CDM is used by a lot of integrations within eMagiz and therefore everyone should be careful and think twice before adjusting anything related to that part. Having done so will result in something like this:
56 56
57 -For the remainder of this microlearning, we will assume that the option pass through line mapper is chosen.
55 +<p align="center">[[image:novice-devops-perspectives-annotations--describe-crucial-part-of-cdm.png||]]</p>
58 58
59 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough.png]]
57 +Now that we saw some examples let us turn our attention to the how. How can I add an annotation and how can I link it. Adding the annotation is simple. You drag the annotation icon from the left context menu onto the canvas. As a result, an empty annotation will be shown on the canvas. By double-clicking on it you can type whatever you want. Note that you need to be in "Start Editing" mode to change anything, including annotations.
60 60
61 -As you can see on the Basic level we are done. However, it is always good to check out the settings on the Advanced tab, especially in this case, to see if there are additional configuration options that could benefit us. The setting of most interest, in this case, is the Lines to Skip setting (default setting is 0). With this setting, you can define whether or not you want to process the header line(s) that exists within your input file. The remainder of the settings is (in most cases) good the way eMagiz has set them up.
59 +<p align="center">[[image:novice-devops-perspectives-annotations--annotation-icon-context-menu.png||]]</p>
62 62
63 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough-advanced.png]]
61 +When you are satisfied with what you have written down you can press the Save button. After you have done so you can rescale the annotation to ensure that the complete text is visible. To link the annotation to a component (on flow level) or an entity (on message definition level) you hover over the annotation until your mouse indicator changes to a + icon, execute a right-click and drag from the annotation to the component in question.
64 64
65 -=== 3.2 Poller ===
63 +Now you know what annotations are good for and how you can add them within the eMagiz platform.
66 66
67 -Now that we have selected and configured the item reader type it becomes time to fill in the last part of the configuration, the poller. For polling eMagiz offers three options:
65 +===== Practice =====
68 68
69 -- Fixed Delay Trigger
70 -- Fixed Rate Trigger
71 -- Cron Trigger
72 -
73 -Of these options, the cron trigger is used most frequently in eMagiz. The reason being is that you can define this option via a property that you can alter without having to alter the flow version in Create.
74 -
75 -[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--poller-config.png]]
76 -
77 -After finishing all these configuration steps we can press Save to save our work and ensure that we can process the input file on a per-line basis.
78 -
79 79 == 4. Assignment ==
80 80
81 -Configure an entry in which you define the component and configuration needed to process a file on a per-line basis.
69 +Add annotation on the flow level that describes how (a part of) a flow works.
82 82 This assignment can be completed with the help of the (Academy) project that you have created/used in the previous assignment.
83 83
84 84 == 5. Key takeaways ==
85 85
86 -* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
87 -* Ability to process each line based on distinctive logic that is relevant on line level
88 -* Can be used for flat file as well as XML input files
89 -* Try to avoid complex transformations within the entry
74 +* Annotations can be used for:
75 + * Describing a best practice everyone should follow
76 + * Describing (more complex) parts of the flow
77 + * Describe (parts of) your message definitions
78 +* You can add annotations by dragging and dropping the annotation icon on the canvas.
90 90
80 +
81 +
91 91 == 6. Suggested Additional Readings ==
92 92
93 93 There are no suggested additional readings on this topic
... ... @@ -96,6 +96,11 @@
96 96
97 97 This video demonstrates how you could have handled the assignment and gives you some context on what you have just learned.
98 98
99 -{{video attachment="novice-file-based-connectivity-processing-a-file-per-line.mp4" reference="Main.Videos.Microlearning.WebHome"/}}
90 +<iframe width="1280" height="720" src="../../vid/microlearning/novice-devops-perspectives-annotations.mp4" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
100 100
101 -)))((({{toc/}}))){{/container}}{{/container}}
92 +</div>
93 +
94 +</div>
95 +</div>
96 +
97 +{{/html}}
- novice-devops-perspectives-annotations--annotation-best-practice-async-routing.png
- Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.marijn
- Size
... ... @@ -1,0 +1,1 @@
1 +25.7 KB
- Content
- novice-devops-perspectives-annotations--annotation-icon-context-menu.png
- Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.marijn
- Size
... ... @@ -1,0 +1,1 @@
1 +774 bytes
- Content
- novice-devops-perspectives-annotations--describe-crucial-part-of-cdm.png
- Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.marijn
- Size
... ... @@ -1,0 +1,1 @@
1 +15.5 KB
- Content
- novice-devops-perspectives-annotations--describe-parts-of-flow.png
- Author
... ... @@ -1,0 +1,1 @@
1 +XWiki.marijn
- Size
... ... @@ -1,0 +1,1 @@
1 +18.9 KB
- Content