I've hit an issue that I think is going to prevent us from deploying InfluxDB. The following is an example of this issue, but in reality it manifests in different ways in many of our use cases.
I have a set of nodes, each reporting the rate at which different SIP messages are being received (in messages per second). Each point has the following fields:
timestamp | tag_node | tag_message_type | value
Each node writes these stats to InfluxDB roughly every 5s.
I want to draw charts showing the total rate of different types of SIP messages being processed across my deployment, split by message_type.
Ideally I would do this like so:
SELECT sum("value") FROM "sipMessages" WHERE time > now() - 7d GROUP BY time(1h), "tag_message_type"
The problem is that this sums across the different values of node (which I want), but it also sums the multiple points within each series (which I don't). With one point roughly every 5s there are about 720 points per series in each 1h interval, so the number returned is ~720 times higher than it should be. If I knew there were always exactly 720 measurements in each interval I could divide by 720, but that figure is only an approximation. (Also, when I use Grafana to draw the graphs, this "1h" period changes automatically based on the time range being graphed.)
What I really need to do is to run an average aggregation within each series over each GROUP BY time period (so that for each node / message_type combination I have one data point per time period), and then sum the results of those aggregations across the different series.
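To make the difference concrete, here is a minimal Python sketch of the two aggregations. It is not InfluxDB code: the sample points and the function names `naive_sum` and `mean_then_sum` are made up for illustration, and the points are deliberately irregular (node2 reports fewer times than node1) to show why dividing by a fixed count does not work.

```python
from collections import defaultdict

# Hypothetical sample points:
# (seconds_since_start, node, message_type, rate_in_messages_per_second)
POINTS = [
    (0, "node1", "INVITE", 10.0),
    (5, "node1", "INVITE", 12.0),
    (10, "node1", "INVITE", 14.0),
    (0, "node2", "INVITE", 20.0),
    (6, "node2", "INVITE", 22.0),    # node2 reported only twice in this bucket
    (3605, "node1", "INVITE", 8.0),  # falls into the second 1h bucket
]


def naive_sum(points, bucket_seconds):
    """Sum every point per (time bucket, message_type), like sum("value")."""
    totals = defaultdict(float)
    for ts, _node, msg_type, value in points:
        totals[(ts // bucket_seconds, msg_type)] += value
    return dict(totals)


def mean_then_sum(points, bucket_seconds):
    """Average within each series per bucket, then sum the averages across nodes."""
    per_series = defaultdict(list)
    for ts, node, msg_type, value in points:
        per_series[(ts // bucket_seconds, msg_type, node)].append(value)

    totals = defaultdict(float)
    for (bucket, msg_type, _node), values in per_series.items():
        totals[(bucket, msg_type)] += sum(values) / len(values)
    return dict(totals)


print(naive_sum(POINTS, 3600))      # inflated by the number of points per series
print(mean_then_sum(POINTS, 3600))  # total rate across nodes
```

With this sample data the naive sum for the first bucket is 78.0, while the per-series mean followed by a sum across nodes gives 33.0 (12.0 for node1 plus 21.0 for node2), which is the total rate I actually want to chart.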
I don't believe this is possible though, and I can't see any way to work around it.
What I don't understand is why this isn't a critical issue for lots of users as this seems like a very standard use case. Is there a reason that other users are able to avoid this issue?
Would adding this capability (to do per series aggregation within "GROUP BY time" intervals) be technically difficult?