Changes for page Processing a File per Line
Last modified by Erik Bakker on 2024/08/26 12:29
From version 16.1
edited by marijn
on 2022/05/22 21:30
Change comment:
There is no comment for this version
To version 32.1
edited by Erik Bakker
on 2023/01/23 08:24
Change comment:
There is no comment for this version
Summary
Page properties (5 modified, 0 added, 0 removed)
Attachments (0 modified, 0 added, 4 removed)
Details
Page properties
Title
@@ -1,0 +1,1 @@
+Processing a File per Line
Parent
@@ -1,0 +1,1 @@
+WebHome
Author
@@ -1,1 +1,1 @@
-XWiki.marijn
+XWiki.ebakker
Default language
@@ -1,0 +1,1 @@
+en
Content
@@ -1,84 +1,93 @@
-{{html wiki="true"}}
-<div class="ez-academy">
- <div class="ez-academy_body">
+{{container}}{{container layoutStyle="columns"}}(((
+In some cases, you want to treat each unique part of your input file as its own message instead of processing the complete file as a single message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.

-<div class="doc">
+Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].

+== 1. Prerequisites ==

+* Basic knowledge of the eMagiz platform

-= Annotations =
+== 2. Key concepts ==

-In this microlearning, we will focus on using annotations to clarify your thought process. In the annotation, you either describe a best practice everyone should follow when they change that flow (i.e. within the asynchronous routing), describe how the (more complex) parts of the flow work or describe (parts of) your message definitions (i.e. CDM, API Gateway Datamodel, system message, etc.). This will help yourself and others every time changes are needed.
+This microlearning centers around learning how to process an incoming file per line.

-Should you have any questions, please contact academy@emagiz.com.
+By processing per line, we mean: Splitting up the input into discernible pieces that each will become a unique message

-* Last update: May 9th, 2021
-* Required reading time: 5 minutes
+* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
+* Ability to process each line based on distinctive logic that is relevant on line level
+* Can be used for flat file as well as XML input files

-== 1. Prerequisites ==
+== 3. Processing a File per Line ==

-* Basic knowledge of the eMagiz platform
+In some cases, you want to treat each unique part of your input file as its own message instead of processing the complete file as a single message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.

-== 2. Key concepts ==
+To make this work in eMagiz you need to navigate to the Create phase of eMagiz and open the entry flow in which you want to retrieve the file from a certain location. Within the context of this flow, we need to add functionality that will ensure that each line is read and processed separately and will become its own unique message. To do so, first enter "Start Editing" mode on flow level. After you have done so, please add a file item reader message source to the flow. We will use this component to read and process our input file on a per-line basis.

-This microlearning centers around using annotations.
-With annotations, we mean: A piece of text to explain something to yourself and others
+The first step would be to define the directory from which we read our messages. As always, reference the directory with the help of a property.

-Annotations can be used for:
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--file-item-reader-directory.png]]

-* Describing a best practice everyone should follow
-* Describing (more complex) parts of the flow
-* Describe (parts of) your message definitions
+Secondly, just as when reading the file as a whole, ensure that you use a filter to retrieve only the correct files from the directory.

+=== 3.1 Item reader Type ===

+Now it is time to select our Item reader Type. As the help text of the eMagiz component suggests, there are two choices with this component. The first (and most frequently used) option is the Flat file item reader. With this option, you can read each line within the flat file input file and output it as a separate message. The second option is called the Stax event item reader. With this option, you can read your input XML and output messages on a per-record basis.

-== 3. Annotations ==
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--item-reader-type-options.png]]

-In this microlearning, we will focus on using annotations on the flow level to clarify our thought process. In the annotation, you either describe a best practice everyone should follow when they change that flow (i.e. within the asynchronous routing) or describe how the (more complex) parts of the flow work. This will help yourself and others every time changes are needed within the flow.
+Based on your choice the exact configuration will differ.

-Annotations can be used for:
+==== 3.1.1 Stax Event Item Reader ====

-* Describing a best practice everyone should follow
-* Describing (more complex) parts of the flow
-* Describe (parts of) your message definitions
+For the Stax event item reader, you need to define the name of the element on which you want to split the XML and define whether you want to throw an error in case no such element exists in the input file (by (de)selecting the option Strict). The default setting of eMagiz is advisable for this option.

-To clarify the use cases let us take a look at how annotations can be added within the eMagiz platform. In our first example, we will take a look at asynchronous routing. In many eMagiz projects, a best practice is followed on how to add something to the asynchronous routing (or change something within the asynchronous routing). Because the best practice contains multiple steps it makes sense to use the annotation functionality of eMagiz to define all these steps and register them at the place you need them (i.e. the asynchronous routing). Having done so will result in something like this:
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--stax-event-item-reader-config.png]]

-<p align="center">[[image:novice-devops-perspectives-annotations--annotation-best-practice-async-routing.png||]]</p>
+==== 3.1.2 Flat File Item Reader ====

-The second example is about using annotations on the flow level to describe parts of the flow. In this example, we will use the annotation to describe that we use a filter to determine which messages are picked up from a local directory and how we filter. That way anyone opening the flow has to merely read the annotation to get the context. Having done so will result in something like this:
+For the Flat File item reader, there are some more choices and configurations to be made. There are three options you can choose from:
+* Pass through line mapper
+* Default line mapper
+* Pattern matching composite line mapper

-<p align="center">[[image:novice-devops-perspectives-annotations--describe-parts-of-flow.png||]]</p>
+Each of these options has some advantages and disadvantages. Adhering to the best practices of eMagiz (i.e. no transformation in the entry) the best option would be to use the pass-through line mapper. As the name suggests, this option does nothing except give a string back to the flow on a per-line basis. However, choosing this option means that the actual transformation from that string to XML needs to happen later in the process (most likely in the onramp) with the help of a flat-file to XML transformer (more on that component in a later course).

-The third example does not take place on the flow level but the message definition level. Therefore instead of going to Create, we go to Design. In Design when you navigate to the CDM, API Gateway Datamodel, Event Streaming Datamodel, message definitions, etc. you have the option to add annotations to the canvas. In this example, we want to make clear to all that make changes that a certain part of our CDM is used by a lot of integrations within eMagiz and therefore everyone should be careful and think twice before adjusting anything related to that part. Having done so will result in something like this:
+The other two options transform the input line into an XML output. So you win one step in the process. However, no standard eMagiz error handling is available when you start transforming data within the entry. So in case something goes wrong, analyzing the error will become more difficult. Furthermore, another potential disadvantage is that when one line fails, the processing of the rest of the file also halts.

-<p align="center">[[image:novice-devops-perspectives-annotations--describe-crucial-part-of-cdm.png||]]</p>
+For the remainder of this microlearning, we will assume that the option pass through line mapper is chosen.

-Now that we saw some examples let us turn our attention to the how. How can I add an annotation and how can I link it. Adding the annotation is simple. You drag the annotation icon from the left context menu onto the canvas. As a result, an empty annotation will be shown on the canvas. By double-clicking on it you can type whatever you want. Note that you need to be in "Start Editing" mode to change anything, including annotations.
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough.png]]

-<p align="center">[[image:novice-devops-perspectives-annotations--annotation-icon-context-menu.png||]]</p>
+As you can see, on the Basic level we are done. However, it is always good to check out the settings on the Advanced tab, especially in this case, to see if there are additional configuration options that could benefit us. The setting of most interest, in this case, is the Lines to Skip setting (default setting is 0). With this setting, you can define whether or not you want to process the header line(s) that exist within your input file. The remainder of the settings is (in most cases) good the way eMagiz has set them up.

-When you are satisfied with what you have written down you can press the Save button. After you have done so you can rescale the annotation to ensure that the complete text is visible. To link the annotation to a component (on flow level) or an entity (on message definition level) you hover over the annotation until your mouse indicator changes to a + icon, execute a right-click and drag from the annotation to the component in question.
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough-advanced.png]]

-Now you know what annotations are good for and how you can add them within the eMagiz platform.
+=== 3.2 Poller ===

-===== Practice =====
+Now that we have selected and configured the item reader type, it is time to fill in the last part of the configuration, the poller. For polling eMagiz offers three options:

+- Fixed Delay Trigger
+- Fixed Rate Trigger
+- Cron Trigger
+
+Of these options, the cron trigger is used most frequently in eMagiz. The reason is that you can define this option via a property that you can alter without having to alter the flow version in Create.
+
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--poller-config.png]]
+
+After finishing all these configuration steps we can press Save to save our work and ensure that we can process the input file on a per-line basis.
+
 == 4. Assignment ==

-Add annotation on the flow level that describes how (a part of) a flow works.
+Configure an entry in which you define the component and configuration needed to process a file on a per-line basis.
 This assignment can be completed with the help of the (Academy) project that you have created/used in the previous assignment.

 == 5. Key takeaways ==

-* Annotations can be used for:
-  * Describing a best practice everyone should follow
-  * Describing (more complex) parts of the flow
-  * Describe (parts of) your message definitions
-* You can add annotations by dragging and dropping the annotation icon on the canvas.
+* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
+* Ability to process each line based on distinctive logic that is relevant on line level
+* Can be used for flat file as well as XML input files
+* Try to avoid complex transformations within the entry

-
-
 == 6. Suggested Additional Readings ==

 There are no suggested additional readings on this topic
@@ -87,11 +87,6 @@

 This video demonstrates how you could have handled the assignment and gives you some context on what you have just learned.

-<iframe width="1280" height="720" src="../../vid/microlearning/novice-devops-perspectives-annotations.mp4" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
+{{video attachment="novice-file-based-connectivity-processing-a-file-per-line.mp4" reference="Main.Videos.Microlearning.WebHome"/}}

-</div>
-
-</div>
-</div>
-
-{{/html}}
+)))((({{toc/}}))){{/container}}{{/container}}
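Editorial illustration (not part of the recorded change): the new page content describes two reader behaviours, a flat file reader that emits one message per line (optionally skipping header lines via Lines to Skip) and a Stax event reader that emits one message per named XML element (optionally failing via Strict when no such element exists). The sketch below mimics those behaviours in plain Python under stated assumptions. It is not the eMagiz implementation; the function names `read_per_line` and `read_per_record`, their parameters, and the sample data are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, TextIO


def read_per_line(source: TextIO, lines_to_skip: int = 0) -> Iterator[str]:
    """Yield every line as its own message, untouched (pass-through).

    `lines_to_skip` is a hypothetical stand-in for the Lines to Skip setting:
    the first N lines (for example a header) are read but not emitted.
    """
    for index, line in enumerate(source):
        if index < lines_to_skip:
            continue  # header line(s): do not turn these into messages
        yield line.rstrip("\r\n")  # one line in, one string message out


def read_per_record(source: BinaryIO, element_name: str,
                    strict: bool = True) -> Iterator[str]:
    """Yield every `element_name` element of an XML stream as its own message.

    `strict` is a hypothetical stand-in for the Strict option: raise an error
    when the input contains no element with the given name.
    """
    found = False
    # iterparse walks the document incrementally instead of loading it whole
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == element_name:
            found = True
            element.tail = None  # drop whitespace that follows the element
            yield ET.tostring(element, encoding="unicode")
            element.clear()  # release the record's children once emitted
    if strict and not found:
        raise ValueError(f"no <{element_name}> element found in input")


# Hypothetical sample input: a flat file with one header line.
flat_file = io.StringIO("id;name\n1;Alice\n2;Bob\n")
messages = list(read_per_line(flat_file, lines_to_skip=1))

# Hypothetical sample input: an XML file split on the <order> element.
xml_file = io.BytesIO(b"<orders><order>1</order><order>2</order></orders>")
records = list(read_per_record(xml_file, "order"))
```

Both functions are generators, so only one line or record is held at a time, which is the "low on memory" property the page mentions. Turning each string into XML is deliberately left out, in line with the page's advice to avoid transformation in the entry.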
novice-devops-perspectives-annotations--annotation-best-practice-async-routing.png
Author
@@ -1,1 +1,0 @@
-XWiki.marijn
Size
@@ -1,1 +1,0 @@
-25.7 KB
Content

novice-devops-perspectives-annotations--annotation-icon-context-menu.png
Author
@@ -1,1 +1,0 @@
-XWiki.marijn
Size
@@ -1,1 +1,0 @@
-774 bytes
Content

novice-devops-perspectives-annotations--describe-crucial-part-of-cdm.png
Author
@@ -1,1 +1,0 @@
-XWiki.marijn
Size
@@ -1,1 +1,0 @@
-15.5 KB
Content

novice-devops-perspectives-annotations--describe-parts-of-flow.png
Author
@@ -1,1 +1,0 @@
-XWiki.marijn
Size
@@ -1,1 +1,0 @@
-18.9 KB
Content