Changes for page Configure your SOAP web service
Last modified by Erik Bakker on 2024/08/26 12:38
From version 30.1
edited by Erik Bakker
on 2022/06/10 13:22
Change comment:
There is no comment for this version
To version 31.1
edited by Erik Bakker
on 2022/06/10 13:29
Change comment:
There is no comment for this version
Summary
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
- Title
@@ -1,1 +1,1 @@
-Processing a File per Line
+Character set
- Content
@@ -1,5 +1,5 @@
 {{container}}{{container layoutStyle="columns"}}(((
-In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
+In some cases, the input you receive or the output that you need to send to an external party cannot handle all characters, or the input or output is written in a specific character set. In this microlearning, we will learn how you can define the character set for file-based connectivity to ensure that you can process and deliver files according to the specifications.

 Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].

@@ -12,73 +12,32 @@

 == 2. Key concepts ==

-This microlearning centers around learning how to process an incoming file per line.
+This microlearning centers around learning how to define the character set to ensure that eMagiz processes the information correctly.

-By processing per line, we mean: Splitting up the input into discernable pieces that each will become a unique message
+By character set, we mean: The composite number of different characters that are being used and supported by computer software and hardware. It consists of codes, bit patterns, or natural numbers used in defining some particular character.

-* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
-* Ability to process each line based on distinctive logic that is relevant on line level
-* Can be used for flat file as well as XML input files
+* Some external systems talk in a different character set
+* eMagiz talks in UTF-8 by default and assumes everyone else does too
+* In case of a mismatch, correct it at the point where you talk with the other system (i.e. entry or exit)

-== 3. Processing a File per Line ==
+== 3. 
Character set ==

-In some cases, you want to treat each unique part of your input file as its message instead of processing the complete file as its message. In this microlearning, we will learn how you can process a (large) file on a per-line basis.
+In some cases, the input you receive or the output that you need to send to an external party cannot handle all characters, or the input or output is written in a specific character set. In this microlearning, we will learn how you can define the character set for file-based connectivity to ensure that you can process and deliver files according to the specifications.

-To make this work in eMagiz you need to navigate to the Create phase of eMagiz and open the entry flow in which you want to retrieve the file to a certain location. Within the context of this flow, we need to add functionality that will ensure each line is read and processed separately and will become its unique message. To do so first enter "Start Editing" mode on flow level. After you have done so please add a file item reader message source to the flow. We will use this component to read and process our input file on a per-line basis.
+Sometimes external systems only talk in a specific character set. To ensure that all the data is properly communicated between eMagiz and the other system, we need to define which character set that is so we can tell it to eMagiz via a component. That way eMagiz will deviate from its default (i.e. UTF-8) and will process the file according to that different character set. In practice, we mainly see windows-1252 as an alternative that pops up once in a while. In various components that deal with file handling, you can define the character set on which eMagiz should act. Examples of such components are:

-The first step would be to define the directory from which we read our messages. As always reference to the directory with the help of a property.
+- File to string transformer
+- Flat file to XML transformer
+- File outbound channel adapter

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--file-item-reader-directory.png]]
+In all these components you have the option to define the character set within the Advanced tab of the component. In this microlearning, we will use the File to string transformer to illustrate how that will look.

-Secondly, just as when reading the file as a whole, ensure that you use a filter to retrieve only the correct files from the directory.
+[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-characterset--characterset-configuration.png]]

-=== 3.1 Item reader Type ===
+In this field, you can define the character set of your choice. To make this work in eMagiz you need to navigate to the Create phase of eMagiz and open the entry flow in which you want to retrieve the file to a certain location. Within the context of this flow, we need to add functionality that will ensure that the correct character set is used. To do so, first enter "Start Editing" mode on flow level. After that, open the File to string transformer, navigate to the Advanced tab, and fill in the correct character set. After you have defined the correct character set, the only thing left to do is to Save the component. See the suggested additional readings section for the complete list of character sets that are supported by Java 8.

-Now it is time to select our Item reader Type. As the help text of the eMagiz component suggests there are two choices with this component. The first (and most frequently used) option is the Flat file item reader. With this option, you can read each line within the flat file input file and output it as a separate message. The second option is called the Stax event item reader. 
With this option, you can read your input XML and output messages on a per-record basis.
+Congratulations, you have successfully learned how to specify the character set.

-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--item-reader-type-options.png]]
-
-Based on your choice the exact configuration will differ.
-
-==== 3.1.1 Stax Event Item Reader ====
-
-For the Stax event item reader, you need to define the name of the element on which you want to split the XML and define whether you want to throw an error in case no such element exists in the input file (By (de)selecting the option Strict). The default setting of eMagiz is advisable for this option.
-
-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--stax-event-item-reader-config.png]]
-
-==== 3.1.2 Flat File Item Reader ====
-
-For the Flat File item reader, there are some more choices and configurations to be made. There are three options you can choose from:
-- Pass through line mapper
-- Default line mapper
-- Pattern matching composite line mapper
-
-Each of these options has some advantages and disadvantages. Adhering to the best practices of eMagiz (i.e. no transformation in the entry) the best option would be to use the pass-through line mapper. As the name suggests this option does nothing except give a string back to the flow on a per line basis. However, choosing this option means that the actual transformation from that string to XML needs to happen later in the process (most likely in the onramp) with the help of a flat-file to XML transformer (more on that component in a later course).
-
-The other two options transform the input line into an XML output. So you win one step in the process. However, no standard eMagiz error handling is advisable when you start transforming data within the entry. 
So in case, something goes wrong to analyze the error will become more difficult. Furthermore, another potential disadvantage is that when one line fails the processing of the rest of the file also halts.
-
-For the remainder of this microlearning, we will assume that the option pass through line mapper is chosen.
-
-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough.png]]
-
-As you can see on the Basic level we are done. However, it is always good to check out the settings on the Advanced tab, especially in this case, to see if there are additional configuration options that could benefit us. The setting of most interest, in this case, is the Lines to Skip setting (default setting is 0). With this setting, you can define whether or not you want to process the header line(s) that exists within your input file. The remainder of the settings is (in most cases) good the way eMagiz has set them up.
-
-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--flat-file-item-reader-passthrough-advanced.png]]
-
-=== 3.2 Poller ===
-
-Now that we have selected and configured the item reader type it becomes time to fill in the last part of the configuration, the poller. For polling eMagiz offers three options:
-
-- Fixed Delay Trigger
-- Fixed Rate Trigger
-- Cron Trigger
-
-Of these options, the cron trigger is used most frequently in eMagiz. The reason being is that you can define this option via a property that you can alter without having to alter the flow version in Create.
-
-[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-processing-a-file-per-line--poller-config.png]]
-
-After finishing all these configuration steps we can press Save to save our work and ensure that we can process the input file on a per-line basis.
-
 == 4. 
Assignment ==

 Configure an entry in which you define the component and configuration needed to process a file on a per-line basis.
@@ -86,19 +86,23 @@

 == 5. Key takeaways ==

-* Easy way of reading a file line by line and sending it to eMagiz (Low on memory)
-* Ability to process each line based on distinctive logic that is relevant on line level
-* Can be used for flat file as well as XML input files
-* Try to avoid complex transformations within the entry
+* Some external systems talk in a different character set
+* eMagiz talks in UTF-8 by default and assumes everyone else does too
+* In case of a mismatch, correct it at the point where you talk with the other system (i.e. entry or exit)
+* eMagiz provides several components within which you can define the character set

 == 6. Suggested Additional Readings ==

-There are no suggested additional readings on this topic
+If you are interested in this topic and want more information on it, please read the help text provided by eMagiz and read the following links:

+* https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html
+* https://www.techopedia.com/definition/941/character-set
+* https://www.smashingmagazine.com/2012/06/all-about-unicode-utf8-character-sets/
+
 == 7. Silent demonstration video ==

 This video demonstrates how you could have handled the assignment and gives you some context on what you have just learned.

-{{video attachment="novice-file-based-connectivity- processing-a-file-per-line.mp4" reference="Main.Videos.Microlearning.WebHome"/}}
+{{video attachment="/novice-file-based-connectivity-characterset.mp4" reference="Main.Videos.Microlearning.WebHome"/}}

 )))((({{toc/}}))){{/container}}{{/container}}
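
To make the character set mismatch described in this revision concrete, here is a minimal sketch in plain Python. It is not eMagiz configuration and does not use any eMagiz component. The sample text and variable names are assumptions chosen for illustration. It shows what can happen when bytes written as windows-1252 are read as UTF-8, and the other way around.

```python
# Illustrative sketch only: plain Python, not an eMagiz component.
# The sample text is a made-up example containing non-ASCII characters.
text = "Café München €5"

# Bytes as an external system using windows-1252 would write them to a file.
raw = text.encode("windows-1252")

# Reading with the matching character set restores the original text.
decoded_ok = raw.decode("windows-1252")

# Reading the same bytes as UTF-8 raises an error, because these
# windows-1252 bytes do not form valid UTF-8 sequences.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# The reverse mismatch does not raise an error here; it silently produces
# garbled characters, which is harder to notice.
mojibake = text.encode("utf-8").decode("windows-1252")

print(decoded_ok == text)  # matching character set round-trips cleanly
print(utf8_failed)         # mismatched read was rejected
print(mojibake)            # garbled text from the reverse mismatch
```

The second case is why the mismatch is best corrected at the entry or exit: a silent misread passes through as valid text and only shows up later as wrong characters in the receiving system.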