Changes for page Volume Mapping (On-premise)
Last modified by Erik Bakker on 2024/08/26 12:37
From version 30.1
edited by Erik Bakker
on 2022/06/10 13:22
Change comment:
There is no comment for this version
To version 70.1
edited by Erik Bakker
on 2024/03/05 13:03
Change comment:
There is no comment for this version
Summary
Page properties (3 modified, 0 added, 0 removed)
Details
- Page properties
- Title
@@ -1,1 +1,1 @@
-Processing a File per Line
+Volume Mapping (On-premise)
- Default language
@@ -1,0 +1,1 @@
+en
- Content
@@ -1,11 +1,8 @@
 {{container}}{{container layoutStyle="columns"}}(((
-In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
+When you need to read and write files from an on-premise disk, you need to know the path in which the data is stored and ensure that the docker container in your runtime(s) running has access to this path. There are several ways of dealing with this challenge. This microlearning will discuss the various alternatives and best approaches in these scenarios.

 Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].

-* Last update: May 31th, 2021
-* Required reading time: 7 minutes
-
 == 1. Prerequisites ==

 * Basic knowledge of the eMagiz platform
@@ -12,93 +12,151 @@

 == 2. Key concepts ==

-This microlearning centers around learning how to process an incoming file per line.
+This microlearning centers around learning how to correctly set up your volume mapping so you can exchange file-based data on-premise.

-By processing per line, we mean: Splitting up the input into discernable pieces that each will become a unique message
+By volume mapping, we mean creating a configuration through which the docker container can read and write data on a specific path on an on-premise machine. Note that the data can also be stored inside the docker container when (1) the other party writing or reading the data can access this path or (2) when the data is only relevant within the context of eMagiz.
-* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
-* Ability to process each line based on distinctive logic that is relevant on line level
-* Can be used for flat file as well as XML input files
+There are several options for volume mapping for your on-premise machine.
+* Machine volume
+* Bind mount
+* Network volume
+* Temporary file system
+* Named pipe

-== 3. Processing a File per Line ==
+== 3. Volume Mapping (On-premise) ==

-In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
+When you need to read and write files from an on-premise disk, you need to know the path in which the data is stored and ensure that the docker container in your runtime(s) running has access to this path. There are several ways of dealing with this challenge. This microlearning will discuss the various alternatives and best approaches in these scenarios.

-To make this work in eMagiz you need to navigate to the Create phase of eMagiz and open the entry flow in which you want to retrieve the file to a certain location. Within the context of this flow, we need to add functionality that will ensure that each line is read and processed separately and will become its unique message. To do so first enter "Start Editing" mode on flow level. After you have done so please add a file item reader message source to the flow. We will use this component to read and process our input file on a per-line basis.
+There are several options for volume mapping for your on-premise machine.
+* Machine volume
+* Bind mount
+* Network volume
+* Temporary file system
+* Named pipe

-The first step would be to define the directory from which we read our messages. As always reference to the directory with the help of a property.
+Below, we will explain the differences between the various options available for your volume mapping. But before we do this, we explain how to set up this configuration within eMagiz. First, you must navigate to Deploy -> Architecture on the model level. This overview lets you access the Volume mapping per runtime deployed on-premise. And then, you can right-click on the runtime to access the context menu.

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--file-item-reader-directory.png]]
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--volume-option-context-menu.png]]

-Secondly, just as when reading the file as a whole ensure that you use a filter to retrieve only the correct files from the directory.
+Right after you click this option, you will see the following pop-up. In this pop-up, you can define the machine-level, runtime-level, and network-level volumes (more on this volume levels later). This pop-up page is the starting point for configuring your volume mapping. We will walk through each available option and explain how they work and should be configured.

-=== 3.1 Item reader Type ===
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--volume-mapping-pop-up.png]]

-Now it is time to select our Item reader Type. As the help text of the eMagiz component suggest there are two choices with this component. The first (and most frequently used) option is the Flat file item reader. With this option, you can read each line within the flat file input file and output it as a separate message. The second option is called the Stax event item reader. With this option, you can read your input XML and output messages on a per-record basis.
+{{info}}Note that you should be in "Start editing" mode to make any changes to the configuration of your volume mapping.{{/info}}

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--item-reader-type-options.png]]
+=== 3.1 Volume ===

-Based on your choice the exact configuration will differ.
+The first Type available to you is volume. With this option, you create one or more folders on Docker relevant to that runtime to read and write **persistent** data. To configure this Type, you need to link the runtime volume to a machine volume (or network volume) you can create within the same pop-up. This means you can re-use a "Machine volume" or a "Network volume" over multiple runtimes (i.e., containers). We first need to define a machine (or network) volume to do so. Once we have done that, we can learn how to link the volume to the machine or network volume.

-==== 3.1.1 Stax Event Item Reader ====
+==== 3.1.1 Define Machine Volume ====

-For the Stax event item reader, you need to define the name of the element on which you want to split the XML and define whether you want to throw an error in case no such element exists in the input file (By (de)selecting the option Strict). The default setting of eMagiz is advisable for this option.
+So, we first open the tab called "Machine volume." Then, by pressing the "New" button, we can define a new "Machine volume." In the following pop-up, we can specify the name of a machine volume and tell whether the volume already exists on your docker installation.
-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--stax-event-item-reader-config.png]]
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--machine-volumes-configuration.png]]

-==== 3.1.2 Flat File Item Reader ====
+Once you have done so, we press "Save" and switch back to the "Runtime volumes" tab.

-For the Flat File item reader, there are some more choices and configurations to be made. There are three options you can choose from:
-- Pass through line mapper
-- Default line mapper
-- Pattern matching composite line mapper
+{{info}}When stating that the machine volume already exists, you can re-use the same machine volume across multiple runtimes (i.e., containers). This is especially useful when archiving data. You can create a central volume in which the data is stored, and through the linkage of the volume to the machine volume, you can subsequently structure your archiving folder. The paths will then look as follows, "/archive/runtimename"{{/info}}

-Each of these options has some advantages and disadvantages. Adhering to the best practices of eMagiz (i.e. no transformation in the entry) the best option would be to use the pass-through line mapper. As the name suggests this option does nothing except give a string back to the flow on a per line basis. However, choosing this option means that the actual transformation from that string to XML needs to happen later in the process (most likely in the onramp) with the help of a flat-file to XML transformer (more on that component in a later course).
+==== 3.1.2 Define Network Volume ====

-The other two options transform the input line into an XML output. So you win one step in the process. However, no standard eMagiz error handling is advisable when you start transforming data within the entry. So in case, something goes wrong to analyze the error will become more difficult. Furthermore, another potential disadvantage is that when one line fails the processing of the rest of the file also halts.
+So, we first open the tab called "Network volume." Then, by pressing the "New" button, we can define a new "Network volume." In the following pop-up, we can specify the name of a machine volume and configure the relevant information for a network volume. In most cases, a CIFS is used, and the only pertinent options that need to be filled in are the host, path, username, and password.

-For the remainder of this microlearning, we will assume that the option pass through line mapper is chosen.
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--network-volumes-configuration.png]]

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough.png]]
+Once you have done so, we press "Save" and switch back to the "Runtime volumes" tab.

-As you can see on the Basic level we are done. However, it is always good to check out the settings on the Advanced tab, especially in this case, to see if there are additional configuration options that could benefit us. The setting of most interest, in this case, is the Lines to Skip setting (default setting is 0). With this setting, you can define whether or not you want to process the header line(s) that exists within your input file. The remainder of the settings is (in most cases) good the way eMagiz has set them up.
+{{warning}}When configuring a network volume, the following information is relevant to know:
+* When you create a network volume to a folder that contains sub-folders, all sub-folders are shared automatically and can be accessed from the flow level
+* When dealing with multiple hosts, you must create a specific entry per host, as this follows the guiding security principles of the underlying infrastructure.{{/warning}}

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough-advanced.png]]
+==== 3.1.3 Link Volume ====

-=== 3.2 Poller ===
+In the "Runtime volumes" tab, we push the "New" button to create a new "Runtime volume." In the following pop-up, we must select the Type we want to use. For this example, we use the Type called "Volume."

-Now that we have selected and configured the item reader type it becomes time to fill in the last part of the configuration, the poller. For polling eMagiz offers three options:
+{{info}} The relevant input fields will change based on your selection. {{/info}}

-- Fixed Delay Trigger
-- Fixed Rate Trigger
-- Cron Trigger
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--runtime-volumes-configuration-type-volume.png]]

-Of these options, the cron trigger is used most frequently in eMagiz. The reason being is that you can define this option via a property that you can alter without having to alter the flow version in Create.
+The first thing we need to select is the "Volume." Once we have chosen our "Volume," we must set the Target specific for this runtime. This target defines the second part of the path to which the runtime will gain access. For example, when you fill in "/target", we can combine this with the "Volume" name to arrive at the correct directory from which eMagiz needs to read data (or write data to). So, in our case, in which we link the volume to the machine volume we created earlier, this would be "/file-directory/target."

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--poller-config.png]]
+The last setting we need to configure is to define the rights we will grant our runtime on the volume we create. The default setting is read/write rights for the runtime, which is usually sufficient. The result of following these steps will be the following.

-After finishing all these configuration steps we can press Save to save our work and ensure that we can process the input file on a per-line basis.
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--runtime-volumes-configuration-type-volume-filled-in.png]]

-== 4. Assignment ==
+{{warning}}Note the following when considering using the Volume option:
+* In the case of using the Volume option in combination with a Machine volume, the external system with which you exchange data on-premise via a file-based method needs to be able to write or read the data from the volume (i.e., directory) you have configured. Should this be a problem, the Bind mount alternative discussed below should be considered.
+* The Volume option and Machine volume combination can also be used for eMagiz-only information that needs to be persistable, such as archiving.
+* In the case of using the Volume option in combination with a Network volume, the path to read and write from becomes what you define in the target field.
+* In case of mapping a volume on a windows host machine to another one on a windows docker runtime, the following small adjustment is required when writing the source/target paths:
+** All “\” in the source/target path should be written as “/”. For example: C:\Users\xxxx\tmp should be written as C:/Users/xxxx/tmp.

-Configure an entry in which you define the component and configuration needed to process a file on a per-line basis.
-This assignment can be completed with the help of the (Academy) project that you have created/used in the previous assignment.
+{{/warning}}

-== 5. Key takeaways ==
+=== 3.2 Bind mount ===

-* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
-* Ability to process each line based on distinctive logic that is relevant on line level
-* Can be used for flat file as well as XML input files
-* Try to avoid complex transformations within the entry
+An alternative option to read and write **persistent** data is the "Bind mount" option. We generally advise using the "Volume" option because they perform better, and bind mounts depend on the host machine's directory structure and OS. However, only some external systems can adapt to this that easily. For example, the "Bind mount" option can interest your use case.

-== 6. Suggested Additional Readings ==
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--runtime-volumes-configuration-type-bind-mount.png]]

-There are no suggested additional readings on this topic
+To configure a "Bind mount," you need to define a source and a target directory linked to each other. The source directory represents the directory on your local system (that might already be used currently to exchange files). The target directory defines a directory on your docker installation that the runtime can access.
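The source/target pairing and the Windows path rule described above can be illustrated with a minimal Python sketch. It is not how eMagiz generates its configuration (the page does not show that); it only mirrors the general Docker `--mount` notation for a bind mount. The function names and the example paths are hypothetical.

```python
def to_forward_slashes(path: str) -> str:
    """Rewrite every backslash in a path as a forward slash."""
    return path.replace("\\", "/")


def bind_mount_spec(source: str, target: str, read_only: bool = False) -> str:
    """Build a Docker-style bind mount value (illustrative assumption only).

    source: directory on the host machine
    target: directory the container (runtime) will see
    """
    parts = [
        "type=bind",
        f"source={to_forward_slashes(source)}",
        f"target={to_forward_slashes(target)}",
    ]
    if read_only:
        # Docker's --mount notation marks a read-only mount with "readonly"
        parts.append("readonly")
    return ",".join(parts)


if __name__ == "__main__":
    # Hypothetical host directory and container directory
    print(bind_mount_spec(r"C:\Users\xxxx\tmp", "/file-directory/in"))
```

In this sketch, the flow would refer to the target directory, while the external system keeps writing to the source directory on the host.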
-== 7. Silent demonstration video ==
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--runtime-volumes-configuration-type-bind-mount-filled-in.png]]

-This video demonstrates how you could have handled the assignment and gives you some context on what you have just learned.
+{{info}}Note that when you use this option, your directory reference in your flow should refer to the "target" directory configured here.{{/info}}

-{{video attachment="novice-file-based-connectivity-processing-a-file-per-line.mp4" reference="Main.Videos.Microlearning.WebHome"/}}
+{{warning}}
+When configuring a bind mount on a windows host machine to another one on a windows docker runtime, the following small adjustment is required when writing the source/target paths:
+** All “\” in the source/target path should be written as “/”. For example: C:\Users\xxxx\tmp should be written as C:/Users/xxxx/tmp.
+{{/warning}}

+=== 3.3 Temporary file system ===
+
+{{info}}This option is only relevant when running on **Linux**.{{/info}}
+
+The temporary file system option is for you if you do not want to work with **persistent** data but require **non-persistent** data. This way, you can increase the container's performance by avoiding writing into the container's writable layer.
+
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--runtime-volumes-configuration-type-temp-file-storage.png]]
+
+To configure this option, you need a target location. On top of that, you can define the maximum size of the temporary file system.
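The two settings named above, a target location and a maximum size, can be sketched in Python as follows. This assumes the general Docker `--mount` notation for tmpfs mounts, where the size is given in bytes. How eMagiz passes these values on is not shown on this page, and the function name and example target are hypothetical.

```python
def tmpfs_mount_spec(target: str, max_size_mb: int) -> str:
    """Build a Docker-style tmpfs mount value (illustrative assumption only).

    target: directory inside the container backed by memory
    max_size_mb: upper bound for the temporary file system, in mebibytes
    """
    if max_size_mb <= 0:
        raise ValueError("max_size_mb must be a positive number")
    size_bytes = max_size_mb * 1024 * 1024  # convert MiB to bytes
    return f"type=tmpfs,destination={target},tmpfs-size={size_bytes}"


if __name__ == "__main__":
    # Hypothetical target directory capped at 64 MiB
    print(tmpfs_mount_spec("/tmp/work", 64))
```

Rejecting a missing or non-positive size in the sketch reflects the idea that an explicit limit keeps a memory-backed file system from growing unchecked.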
+
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-volume-mapping-on-premise--runtime-volumes-configuration-type-temp-file-storage-filled-in.png]]
+
+{{warning}}
+We strongly advise you to define this number so that you can limit the potential impact this solution can have on the stability of your machine. {{/warning}}
+
+=== 3.4 Named pipe ===
+
+{{info}}This option is only relevant when running on **Windows**.{{/info}}
+
+A named pipe is a named, one-way or duplex pipe for communication between the pipe server and one or more pipe clients. All instances of a named pipe share the same pipe name, but each instance has its own buffers and handles, and provides a separate conduit for client/server communication. Any process can access named pipes, subject to security checks, making named pipes an easy form of communication between related or unrelated processes.
+
+*The named pipe option can be selected, but we yet have to see a valid use case within the context of eMagiz for using this option. Therefore, we won't discuss this option further in this microlearning.
+
+{{warning}}
+* When configuring a pipe path on a windows host machine to another one on a windows docker runtime, the following small adjustment is required when writing the source/target paths:
+** All “\” in the source/target path should be written as “/”. For example: C:\Users\xxxx\tmp should be written as C:/Users/xxxx/tmp.{{/warning}}
+
+=== 3.5 Deployment consequences ===
+
+{{warning}}
+* Note that the runtimes cannot be deployed correctly when the source directory **does not exist**. Consequently, no runtime on that machine will start up. One of the following two configurations displayed below are needed to find the source directory:
+** /mnt/host/{local-directory}
+** /run/desktop/mnt/host/{local-directory}
+* When the source directory can be found but the user has no access, the deployment will **fail** for the specific runtime in question with the volume mapping configured. All other runtimes (i.e., containers) will start up (pending other configuration issues).{{/warning}}
+
+== 4. Key takeaways ==
+
+* File-based communication on-premise changes in the new runtime architecture
+* There are two ways to store **persistent** data
+ ** Volume
+ ** Bind mount
+* The Volume option is considered the best alternative because they have better performance, and bind mounts are dependent on the directory structure and OS of the host machine
+* Before deploying, ensure that the various sources in your configuration exist and that access is granted to avoid problems while deploying.
+* The Temporary file storage option is the way to go when dealing with **non-persistent** data.
+
+== 5. Suggested Additional Readings ==
+
+If you are interested in this topic and want more information, please read the help text provided by eMagiz.
+
 )))((({{toc/}}))){{/container}}{{/container}}