Wiki source code of Character set

Version 31.1 by Erik Bakker on 2022/06/10 13:29

{{container}}{{container layoutStyle="columns"}}(((
In some cases, the input you receive, or the output you need to send to an external party, is written in a specific character set, or the other system cannot handle all characters. In this microlearning, we will learn how you can define the character set for file-based connectivity so that you can process and deliver files according to the specifications.

Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].

* Last update: May 31st, 2021
* Required reading time: 7 minutes

== 1. Prerequisites ==

* Basic knowledge of the eMagiz platform

== 2. Key concepts ==

This microlearning centers around learning how to define the character set to ensure that eMagiz processes the information correctly.

By character set, we mean: the complete collection of characters that computer software and hardware use and support. Each character in the set is identified by a code, bit pattern, or natural number.

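To make the definition concrete, the sketch below shows that the same character is stored as different bytes depending on the character set. It is plain Java for illustration only, not eMagiz code; the class name `CharsetBytes` and the helper `toHex` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetBytes {
    // Encodes the text in the given character set and returns the bytes as hex.
    static String toHex(String text, Charset charset) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        Charset windows1252 = Charset.forName("windows-1252");
        String text = "\u00E9"; // the character é, written as an escape

        // é takes two bytes in UTF-8 and a single byte in windows-1252.
        System.out.println("UTF-8:        " + toHex(text, StandardCharsets.UTF_8)); // C3 A9
        System.out.println("windows-1252: " + toHex(text, windows1252));            // E9
    }
}
```

Because the byte patterns differ, a receiver has to know which character set the sender used before it can turn the bytes back into text.
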
* Some external systems talk in a different character set
* eMagiz talks in UTF-8 by default and assumes everyone else does as well
* In case of a mismatch, correct it at the point where you talk with the other system (i.e. entry or exit)

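The effect of a mismatch can be sketched in a few lines of plain Java. This is an illustration under assumptions, not a description of how eMagiz works internally; the class name `CharsetMismatch` and the sample text are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {
    // Turns raw bytes into text using the given character set.
    static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        Charset windows1252 = Charset.forName("windows-1252");

        // Pretend an external system wrote "café" in windows-1252.
        byte[] fromExternalSystem = "caf\u00E9".getBytes(windows1252);

        // Assuming UTF-8 garbles the last character (it becomes U+FFFD).
        String wrong = decode(fromExternalSystem, StandardCharsets.UTF_8);
        // Declaring the real character set at the entry point restores it.
        String right = decode(fromExternalSystem, windows1252);

        System.out.println(wrong.equals("caf\u00E9")); // false
        System.out.println(right.equals("caf\u00E9")); // true
    }
}
```

The correction happens exactly once, where the bytes enter the system. After that, the text can be handled without knowing where it came from.
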
== 3. Character set ==

Sometimes external systems only talk in a specific character set. To ensure that all data is communicated properly between eMagiz and the other system, we need to determine which character set that is and tell eMagiz via a component. That way, eMagiz will deviate from its default (i.e. UTF-8) and process the file according to that character set. In practice, the alternative we encounter most is windows-1252, and even that only once in a while. In various components that deal with file handling, you can define the character set on which eMagiz should act. Examples of such components are:

- File to string transformer
- Flat file to XML transformer
- File outbound channel adapter

In all these components, you can define the character set within the Advanced tab of the component. In this microlearning, we will use the File to string transformer to illustrate how that looks.

[[image:Main.Images.Microlearning.WebHome@novice-file-based-connectivity-characterset--characterset-configuration.png]]

In this field, you can define the character set of your choice. To make this work in eMagiz, navigate to the Create phase and open the entry flow in which you want to retrieve the file. Within this flow, we need to add functionality that ensures the correct character set is used. To do so, first enter "Start Editing" mode on flow level. After that, open the File to string transformer, navigate to the Advanced tab, and fill in the correct character set. Once you have defined the character set, the only thing left to do is save the component. See the suggested additional readings section for the complete list of character sets that are supported by Java 8.

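If you are unsure whether a character set name is valid before entering it, you can check it against a Java runtime. The sketch below is plain Java and only approximates what a file-to-string step conceptually does. It is not the eMagiz component itself, and the class name `CharsetFileRead` and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class CharsetFileRead {
    // Resolves a character set name, failing with a clear message
    // when this Java runtime does not support it.
    static Charset resolve(String name) {
        if (!Charset.isSupported(name)) {
            throw new IllegalArgumentException("Unsupported character set: " + name);
        }
        return Charset.forName(name);
    }

    // Reads the whole file and decodes it with the named character set.
    static String readFile(Path file, String charsetName) throws IOException {
        return new String(Files.readAllBytes(file), resolve(charsetName));
    }

    public static void main(String[] args) throws IOException {
        // Write a small windows-1252 file so the example is self-contained.
        Path file = Files.createTempFile("charset-demo", ".txt");
        Files.write(file, "na\u00EFve".getBytes(Charset.forName("windows-1252")));

        String content = readFile(file, "windows-1252");
        System.out.println(content.equals("na\u00EFve")); // true
        Files.delete(file);
    }
}
```

A misspelled or unknown name fails immediately with a descriptive error instead of silently producing garbled text.
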
Congratulations, you have successfully learned how to specify the character set.

== 4. Assignment ==

Configure an entry in which you define the component and configuration needed to process a file on a per-line basis.
This assignment can be completed with the help of the (Academy) project that you have created/used in the previous assignment.

== 5. Key takeaways ==

* Some external systems talk in a different character set
* eMagiz talks in UTF-8 by default and assumes everyone else does as well
* In case of a mismatch, correct it at the point where you talk with the other system (i.e. entry or exit)
* eMagiz provides several components within which you can define the character set

== 6. Suggested Additional Readings ==

If you are interested in this topic and want more information on it, please read the help text provided by eMagiz and the following links:

* https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html
* https://www.techopedia.com/definition/941/character-set
* https://www.smashingmagazine.com/2012/06/all-about-unicode-utf8-character-sets/

== 7. Silent demonstration video ==

This video demonstrates how you could have handled the assignment and gives you some context on what you have just learned.

{{video attachment="/novice-file-based-connectivity-characterset.mp4" reference="Main.Videos.Microlearning.WebHome"/}}

)))((({{toc/}}))){{/container}}{{/container}}