Interpreting Queue Statistics

{{container}}{{container layoutStyle="columns"}}(((

In this microlearning, we will focus on how to read the information available at the queue level for all integrations that use the messaging pattern.

Should you have any questions, please contact [[academy@emagiz.com>>mailto:academy@emagiz.com]].

== 1. Prerequisites ==

* Basic knowledge of the eMagiz platform

== 2. Key concepts ==

This microlearning centers on how to read the queue statistics and what you can learn from them.

By queue statistics we mean: information at the queue level that helps you interpret the data passing through that queue.

There are four parts to the queue statistics:

* Total messages in queue
* Total messages added to queue
* Number of consumers
* Data measurements

== 3. Interpreting queue statistics ==

In many cases, you want to validate your assumptions by checking the queue statistics. The queue statistics section in eMagiz is divided into four parts:

* Total messages in queue
* Total messages added to queue
* Number of consumers
* Data measurements

For example, when you want to verify how many messages arrived on a certain queue within a certain time window, you can use the queue statistics overview.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--complete-overview.png]]

Below we will delve into each of these four parts and explain a bit more about them.
That way you can use the queue statistics to interpret what happened within your messaging flows at any given moment.
To help you track anomalies in your project, you can use eMagiz alerting. To learn more, please take a look at the microlearning on that subject.

=== 3.1 Total messages in queue ===

The first metric we are going to look at is the total messages in queue.
As the name implies, this metric tells us how many messages currently reside on the queue and have **yet to be** processed.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-in-queue-flat-line.png]]

A flat-line, such as in the picture above, can mean two things:

* The queue can keep up with the supply (i.e. the number of messages added to the queue is **not higher** than the number of messages that are processed on the queue)
* There is no supply (i.e. the number of messages added to the queue equals zero)

As you can see from the above interpretation, one metric by itself will never paint the whole picture.
Let's continue by looking at a scenario in which the total messages in queue are gradually increasing.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-in-queue-increase.png]]

Seeing this behavior can also mean two things:

* The queue is **not** able to keep up with the supply (i.e. the number of messages added to the queue is **higher** than the number of messages that are processed on the queue)
* Nobody is processing the data on the queue (i.e. no consumer is consuming the data from the queue and processing it)

The first of these two potential reasons is most likely temporary in nature.
When a sudden burst of data is supplied to the queue, the queue will need some time to process all data.
In such a scenario, you will see the metric increase at first and afterward decrease to zero (flat-line) again.

The second of these two potential reasons could be more structural.
This can happen because the consumer is broken or not active. This will lead to a steady increase in data that won't return to zero without a user action correcting the behavior.
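
The patterns described in this section can be sketched in a few lines of Python. This is only an illustration, assuming you have copied a series of "total messages in queue" readings (oldest first) from the graph yourself. The function name and the labels are hypothetical and not part of eMagiz.

```python
from typing import Sequence


def classify_queue_depth(samples: Sequence[int]) -> str:
    """Label a series of 'total messages in queue' readings, oldest first.

    Hypothetical helper for illustration only; not an eMagiz API.
    """
    if len(samples) < 2:
        return "not enough data"
    first, last = samples[0], samples[-1]
    lowest, highest = min(samples), max(samples)
    if lowest == highest:
        # Every reading is identical: the queue keeps up, or there is no supply.
        return "flat-line"
    if last > first:
        # More messages are waiting at the end of the window than at the start.
        return "growing"
    if last < highest:
        # The backlog peaked earlier and has been (partly) worked off since.
        return "draining"
    return "fluctuating"


print(classify_queue_depth([0, 0, 0, 0]))    # flat-line
print(classify_queue_depth([0, 5, 10, 15]))  # growing
print(classify_queue_depth([0, 40, 25, 0]))  # draining
```

A "growing" label alone does not tell you whether the cause is a temporary burst or a missing consumer. For that, you need the other metrics as well.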

=== 3.2 Total messages added to queue ===

The second metric we are going to look at is the total messages added to queue.
As the name implies, this metric tells us how many messages were supplied to the queue within a given timeframe.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-added-to-queue-standard.png]]

In the example shown above, you can see that between minute 24 and minute 46 a steady flow of data is supplied to this queue.
Depending on the actual implementation of your flows, the pattern in which data is supplied to your queue can differ.
The main expectation is that data is supplied to the queue at some point in time. In some cases, you will expect data almost all the time.
In other cases, only between 9:00 and 17:00 on weekdays. And in yet other cases, you only expect data in the evening, when nobody will be hindered by it.

If this metric shows that data is supplied to the queue and the total messages in queue metric shows a flat-line at zero messages, everything works as expected.
When this metric shows that data is supplied to the queue and the total messages in queue metric shows the **exact** same increase, you can conclude that no data is being consumed from the queue.
In case this metric shows that data is supplied to the queue and the total messages in queue metric shows a **less steep** increase, you can conclude that the queue is **not** able to keep up with the supply.
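
The three rules above can be combined into a small decision function. The sketch below assumes you have read two numbers for the same time window from the graphs: how many messages were added, and by how much the total messages in queue changed. The function and parameter names are made up for this example.

```python
def interpret_window(added: int, queue_growth: int) -> str:
    """Combine both metrics for one time window.

    added: messages added to the queue during the window.
    queue_growth: change in 'total messages in queue' over the same window.
    Hypothetical helper for illustration only; not an eMagiz API.
    """
    if added == 0:
        # Nothing was supplied, so the in-queue metric alone says little.
        return "no supply"
    if queue_growth <= 0:
        # Data came in, but the backlog did not grow.
        return "works as expected"
    if queue_growth >= added:
        # The backlog grew by exactly what was supplied.
        return "no data is being consumed"
    # The backlog grew, but less steeply than the supply.
    return "not able to keep up"


print(interpret_window(added=100, queue_growth=0))
print(interpret_window(added=100, queue_growth=100))
print(interpret_window(added=100, queue_growth=30))
```

Note that the sketch treats any non-growing backlog as healthy, whereas the text above speaks of a flat-line at zero. If the queue stays flat at a non-zero value, you should still check the number of consumers.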

Just as with the total messages in queue, the total messages added to queue may not change over a period, leading to a flat-line graph.
In itself, this means nothing. However, if you expected messages to be supplied to that queue within that time frame, you need to analyze the complete chain of flows more thoroughly to see what is causing the unexpected behavior.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--total-messages-added-to-queue-flat-line.png]]

=== 3.3 Number of consumers ===

The third metric we are going to look at is the number of consumers.
This metric tells us how many consumers are activated to consume data from that specific queue.

In most asynchronous (messaging and event streaming) flows, the expected number is 1 or 2 consumers. In API Gateway exit gates, the number can vary between 1 and 5 depending on the supply on the queue.
Knowing the expected number of consumers is crucial for correctly interpreting what the reported number of consumers tells you.

Let's assume for the sake of this microlearning that the expected number of consumers on the queue is 1.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--number-of-consumers-expected.png]]

With that in mind, the number of consumers that is reported can either be 1 (i.e. what we expect), 0 (i.e. too low), or 2 and higher (i.e. too high).

If the reported number of consumers equals the expected number of consumers, data will be processed as expected.
This means that one message per consumer is processed at any given moment, assuming that a message is available to be processed.

However, if the number of consumers is too low or too high, you need to analyze the situation.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--number-of-consumers-too-low.png]]

As shown above, the number of consumers drops from 1 to 0 at a certain point in time. This could be caused by:

* A user stopped the queue, ensuring no data would be processed anymore
* The runtime on which the queue is running is stopped (either by the user or by an automatic process)
* The complete eMagiz project is not running at the moment (either due to a user action or an automatic process)

Whatever the reason may be, you should investigate the cause to keep your environment running stably.

Apart from the consumer count being too low, it can also happen that the consumer count is too high. This could be caused by:

* Incorrect configuration of the flow by a user that leads to an unexpected number of consumers
* Linking a Test, Acceptance, and Production system to the same eMagiz environment
* Fringe situations that appear once in a while
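
The comparison between the reported and the expected number of consumers can be written down as a simple range check. The sketch below is an illustration only. The function name is hypothetical, and the expected range is something you supply yourself (for example 1 to 1 for the queue in this microlearning, or 1 to 5 for an API Gateway exit gate).

```python
def check_consumers(reported: int, expected_min: int = 1, expected_max: int = 1) -> str:
    """Compare the reported consumer count with the range you expect.

    Hypothetical helper for illustration only; not an eMagiz API.
    """
    if expected_min > expected_max:
        raise ValueError("expected_min must not exceed expected_max")
    if reported < expected_min:
        # For example: the queue, the runtime, or the project is stopped.
        return "too low"
    if reported > expected_max:
        # For example: a misconfigured flow or multiple linked systems.
        return "too high"
    return "as expected"


print(check_consumers(1))        # as expected
print(check_consumers(0))        # too low
print(check_consumers(2))        # too high
print(check_consumers(3, 1, 5))  # as expected
```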

=== 3.4 Data measurements ===

The fourth and last metric tells us whether metrics (i.e. data measurements) are coming in.

[[image:Main.Images.Microlearning.WebHome@crashcourse-messaging-interpreting-queue-statistics--data-measurements.png]]

With the help of this metric, we can establish whether the queue in question is sending data measurements to the portal.
As you can see from the example shown above, the queue was not activated until somewhere after the 14th minute of the hour.
This is entirely consistent with the behavior of the other graphs.
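
If you want to reason about such a gap more explicitly, a sketch like the one below can help. It assumes, purely for illustration, that you have a list of the minutes of the hour in which a data measurement was received and that you expect one measurement per minute. Both the function name and that assumption are hypothetical and not taken from eMagiz.

```python
from typing import Iterable, List


def missing_minutes(minutes_with_data: Iterable[int], start: int = 0, end: int = 59) -> List[int]:
    """Return the minutes in [start, end] without a data measurement.

    Assumes one measurement per minute is expected (an assumption made
    for this example, not a documented eMagiz guarantee).
    """
    seen = set(minutes_with_data)
    return [minute for minute in range(start, end + 1) if minute not in seen]


# Measurements only start coming in at minute 15 of the hour.
gap = missing_minutes(range(15, 60))
print(gap[0], gap[-1], len(gap))
```

A gap at the start of the window, as in this example, matches a queue that was activated partway through the hour.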

== 4. Key takeaways ==

* There are four parts to the queue statistics:
** Total messages in queue
** Total messages added to queue
** Number of consumers
** Data measurements
* You can best interpret them together, as that approach gives you the most context
* To assist in anomaly detection, use eMagiz alerting

== 5. Suggested Additional Readings ==

If you are interested in this topic and want more information, please read the help text provided by eMagiz when executing these actions.)))((({{toc}})))
