Wiki source code of Interpreting Queue Statistics
Last modified by Erik Bakker on 2024/02/20 13:53
| author | version | line-number | content |
|---|---|---|---|
| 1 | {{container}}{{container layoutStyle="columns"}}((( | ||
| 2 | |||
| 3 | In this microlearning, we will focus on how you can read the information available on queue level for all integrations that use the messaging pattern. | ||
| 4 | |||
| 5 | Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]]. | ||
| 6 | |||
| 7 | == 1. Prerequisites == | ||
| 8 | * Basic knowledge of the eMagiz platform | ||
| 9 | |||
| 10 | == 2. Key concepts == | ||
| 11 | This microlearning centers around how you can read the information in the queue statistics and what you can learn from it. | ||
| 12 | By queue statistics we mean: information on queue level that helps you interpret the data passing through that queue. | ||
| 13 | |||
| 14 | There are four parts to the queue statistics: | ||
| 15 | * Total messages in queue | ||
| 16 | * Total messages added to queue | ||
| 17 | * Number of consumers | ||
| 18 | * Data measurements | ||
| 19 | |||
| 20 | == 3. Interpreting queue statistics == | ||
| 21 | |||
| 22 | In many cases, you want to validate your assumptions by checking the queue statistics. The queue statistics section in eMagiz is divided into four parts: | ||
| 23 | |||
| 24 | * Total messages in queue | ||
| 25 | * Total messages added to queue | ||
| 26 | * Number of consumers | ||
| 27 | * Data measurements | ||
| 28 | |||
| 29 | For example, when you want to verify how many messages have arrived on a certain queue within a certain time window, you can use the queue statistics overview. | ||
| 30 | |||
| 31 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--complete-overview.png]] | ||
| 32 | |||
| 33 | Below we will delve into each of these four parts and explain a bit more about them. | ||
| 34 | That way you can use the queue statistics to interpret what happened within your messaging flows at any given moment. | ||
| 35 | To assist you in tracking anomalies in your project, you can use eMagiz alerting. To learn more, please take a look at the microlearning on that subject. | ||
| 36 | |||
| 37 | === 3.1 Total messages in queue === | ||
| 38 | |||
| 39 | The first metric we are going to look at is the total messages in queue. | ||
| 40 | As the name implies, this metric tells us how many messages currently reside on the queue and have **yet to be** processed. | ||
| 41 | |||
| 42 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-in-queue-flat-line.png]] | ||
| 43 | |||
| 44 | A flat line, such as in the picture above, can mean two things: | ||
| 45 | * The queue can keep up with the supply (i.e. the number of messages added to the queue is **lower** than or equal to the number of messages that are processed on the queue) | ||
| 46 | * There is no supply (i.e. the number of messages added to the queue equals zero) | ||
| 47 | |||
| 48 | As this interpretation shows, one metric by itself will never paint the whole picture. | ||
| 49 | Let's continue by looking at a scenario in which the total messages in queue gradually increases. | ||
| 50 | |||
| 51 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-in-queue-increase.png]] | ||
| 52 | |||
| 53 | Seeing this behavior can also mean two things: | ||
| 54 | * The queue is **not** able to keep up with the supply (i.e. the number of messages added to the queue is **higher** than the number of messages that are processed on the queue) | ||
| 55 | * Nobody is processing the data on the queue (i.e. no consumer wants to consume the data from the queue and process it) | ||
| 56 | |||
| 57 | The first of these two potential reasons is most likely temporary in nature. | ||
| 58 | When a sudden burst of data is supplied to the queue, the consumer will need some time to process all of it. | ||
| 59 | In such a scenario, you will see the metric increase at first and afterward decrease to zero (flat line) again. | ||
| 60 | |||
| 61 | The second of these two potential reasons could be more structural. | ||
| 62 | This can happen because the consumer is broken or not active. This will lead to a steady increase in data that won't return to zero without a user action correcting the behavior. | ||
| 63 | |||
| 64 | === 3.2 Total messages added to queue === | ||
| 65 | |||
| 66 | The second metric we are going to look at is the total messages added to queue. | ||
| 67 | As the name implies, this metric tells us how many messages were supplied to the queue within a given timeframe. | ||
| 68 | |||
| 69 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-added-to-queue-standard.png]] | ||
| 70 | |||
| 71 | In the example shown above, you can see that between minute 24 and minute 46 we have a steady flow of data that is supplied to this queue. | ||
| 72 | Depending on the actual implementation of your flows, the pattern in which data is supplied to your queue can differ. | ||
| 73 | The main expectation is that data is supplied to the queue at some point in time. In some cases, you will expect data almost all the time. | ||
| 74 | In other cases, you will expect data only between 9:00 and 17:00 on weekdays, or only in the evening when nobody will be hindered by it. | ||
| 75 | |||
| 76 | If this metric shows that data is supplied to the queue and the total messages in queue metric shows a flat line at zero messages, everything works as expected. | ||
| 77 | When this metric shows that data is supplied to the queue and the total messages in queue metric shows the **exact** same increase, you can conclude that no data is being consumed from the queue. | ||
| 78 | When this metric shows that data is supplied to the queue and the total messages in queue metric shows a **less steep** increase, you can conclude that data is being consumed but the queue is **not** able to keep up with the supply. | ||
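
The comparison described above can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name, its parameters, and the returned labels are hypothetical and not part of eMagiz. It assumes you have read both metrics for the same time window from the graphs.

```python
def interpret_queue(added: int, depth_start: int, depth_end: int) -> str:
    """Classify queue behaviour over one time window (hypothetical helper).

    added       -- total messages added to queue during the window
    depth_start -- total messages in queue at the start of the window
    depth_end   -- total messages in queue at the end of the window
    """
    growth = depth_end - depth_start  # how much the backlog grew in the window

    if added == 0:
        # Flat line on "messages added": nothing was supplied.
        return "no supply"
    if growth <= 0:
        # Data was supplied but the backlog did not grow.
        return "keeping up"
    if growth >= added:
        # The backlog grew by exactly what was added: nothing was consumed.
        return "not consumed"
    # The backlog grew, but less steeply than the supply.
    return "falling behind"
```

For instance, `interpret_queue(100, 0, 40)` describes a window in which 100 messages were added and 40 of them were still waiting at the end, which the sketch labels as falling behind.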
| 79 | |||
| 80 | Just as with the total messages in queue, the total messages added to queue may not change over a period, leading to a flat-line graph. | ||
| 81 | In itself this means nothing. However, if you expected messages to be supplied to that queue within that time frame, | ||
| 82 | you need to analyze the complete chain of flows more thoroughly to see what is causing the unexpected behavior. | ||
| 83 | |||
| 84 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-added-to-queue-flat-line.png]] | ||
| 85 | |||
| 86 | |||
| 87 | === 3.3 Number of consumers === | ||
| 88 | |||
| 89 | The third metric we are going to look at is the number of consumers. | ||
| 90 | This metric tells us how many consumers are activated to consume data from that specific queue. | ||
| 91 | |||
| 92 | In most asynchronous (messaging and event streaming) flows the expected number is 1 or 2 consumers. In API Gateway exit gates the number can vary between 1 and 5 depending on the supply on the queue. | ||
| 93 | Knowing the expected number of consumers is crucial for a good interpretation of what the reported number of consumers tells you. | ||
| 94 | |||
| 95 | Let's assume for the sake of this microlearning that the expected number of consumers on my queue is 1. | ||
| 96 | |||
| 97 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--number-of-consumers-expected.png]] | ||
| 98 | |||
| 99 | With that in mind, the number of consumers that is reported can either be 1 (i.e. what we expect), 0 (i.e. too low), or 2 and higher (i.e. too high). | ||
| 100 | |||
| 101 | If the reported number of consumers equals the expected number of consumers, data will be processed as expected. | ||
| 102 | This means that one message per consumer is processed at any given moment, assuming that a message is available to be processed. | ||
| 103 | |||
| 104 | However, if the number of consumers is too low or too high, you need to analyze the situation. | ||
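
The check described above can be written down as a minimal Python sketch. The function name and its parameters are hypothetical and only serve as an illustration. The expected range is something you supply yourself, for example 1 to 1 for the queue in this microlearning, or 1 to 5 for an API Gateway exit gate.

```python
def check_consumers(reported: int, expected_min: int = 1, expected_max: int = 1) -> str:
    """Compare the reported consumer count with the expected range (hypothetical helper)."""
    if reported < expected_min:
        # Fewer consumers than expected, e.g. 0 instead of 1.
        return "too low"
    if reported > expected_max:
        # More consumers than expected, e.g. 2 or higher instead of 1.
        return "too high"
    return "as expected"
```

With the defaults, `check_consumers(0)` reports a count that is too low, while `check_consumers(3, expected_min=1, expected_max=5)` falls within the expected range.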
| 105 | |||
| 106 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--number-of-consumers-too-low.png]] | ||
| 107 | |||
| 108 | As shown above, you can see that the number of consumers drops from 1 to 0 at a certain point in time. This could be caused by: | ||
| 109 | |||
| 110 | * A user stopped the queue ensuring no data would be processed anymore | ||
| 111 | * The runtime on which the queue is running is stopped (either by the user or by an automatic process) | ||
| 112 | * The complete eMagiz project is not running at the moment (either due to a user action or by an automatic process) | ||
| 113 | |||
| 114 | Whatever the reason may be, you should investigate the cause to keep your environment running stably. | ||
| 115 | |||
| 116 | Apart from the consumer count being too low, it can also happen that the consumer count is too high. This could be caused by: | ||
| 117 | |||
| 118 | * Incorrect configuration of the flow by a user that leads to an unexpected number of consumers | ||
| 119 | * Linking a Test, Acceptance, and Production system to the same eMagiz environment | ||
| 120 | * Fringe situations that appear once in a while | ||
| 121 | |||
| 122 | === 3.4 Data measurements === | ||
| 123 | |||
| 124 | The fourth and last metric tells us whether metrics (i.e. data measurements) are coming in. | ||
| 125 | |||
| 126 | [[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--data-measurements.png]] | ||
| 127 | |||
| 128 | With the help of this metric, we can establish if the queue in question is sending data measurements to the portal or not. | ||
| 129 | As you can see from the example shown above, the queue was not activated until somewhere after the 14th minute of the hour. | ||
| 130 | This is entirely consistent with the behavior of the other graphs. | ||
| 131 | |||
| 132 | == 4. Key takeaways == | ||
| 133 | |||
| 134 | * There are four parts to the queue statistics: | ||
| 135 | ** Total messages in queue | ||
| 136 | ** Total messages added to queue | ||
| 137 | ** Number of consumers | ||
| 138 | ** Data measurements | ||
| 139 | * Interpret them together, as that approach gives you the most context | ||
| 140 | * To assist in anomaly detection, use eMagiz alerting | ||
| 141 | |||
| 142 | == 5. Suggested Additional Readings == | ||
| 143 | |||
| 144 | If you are interested in this topic and want more information on it, please read the help text provided by eMagiz when executing these actions.)))((({{toc}}))) |