Advanced event loop monitoring not aggregated correctly across cluster #418

Closed
mkreidenweis-schulmngr opened this issue Jan 8, 2021 · 2 comments · Fixed by #422

mkreidenweis-schulmngr commented Jan 8, 2021

Hi everybody,

Could it be that the cluster support (#147) was overlooked when the advanced event loop monitoring was introduced in #278?

The old event loop metric has an aggregator set, whereas none of the new ones do:

aggregator: 'average'

I was looking at some metrics from a Node cluster today and couldn't quite make sense of them. It looks to me like something other than the default sum aggregator should be used for most of the other lag metrics, right?
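
To illustrate the concern, here is a minimal sketch in plain JavaScript (it does not call prom-client) of how the same per-worker lag readings look under a sum and under an average. The worker values and the helper names `aggregateSum` and `aggregateAverage` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical event loop lag readings, one per cluster worker, in milliseconds.
const workerLagMs = [10, 20, 30, 40];

// Roughly what a sum aggregation does: add the values reported by all workers.
function aggregateSum(values) {
  return values.reduce((total, v) => total + v, 0);
}

// Roughly what an average aggregation does: arithmetic mean across workers.
function aggregateAverage(values) {
  return aggregateSum(values) / values.length;
}

const summedLagMs = aggregateSum(workerLagMs);       // 100
const averagedLagMs = aggregateAverage(workerLagMs); // 25

console.log({ summedLagMs, averagedLagMs });
```

With four workers, the summed value (100 ms) is larger than the lag of any single worker, while the average (25 ms) stays within the range the workers actually reported. That is why a sum is hard to interpret for a lag gauge.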

zbjornson added a commit to zbjornson/prom-client that referenced this issue Jan 23, 2021
These are not entirely accurate: they're the "mean of the mean" and "mean of percentiles," but that's as good as we can get.

Fixes siimon#418
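
The caveat in the commit message can be seen in a small, self-contained sketch: averaging per-worker percentiles is generally not the same as taking the percentile over all samples pooled together. The sample data, the nearest-rank `percentile` helper, and the `mean` helper are assumptions made for this illustration, not prom-client code.

```javascript
// Nearest-rank percentile over an array of samples, for p in (0, 100].
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function mean(values) {
  return values.reduce((total, v) => total + v, 0) / values.length;
}

// Hypothetical lag samples in milliseconds: worker A is idle, worker B stalls once.
const workerA = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
const workerB = [1, 1, 1, 1, 1, 1, 1, 1, 1, 200];

// "Mean of percentiles": what averaging the per-worker p99 gauges yields.
const meanOfP99 = mean([percentile(workerA, 99), percentile(workerB, 99)]);

// p99 over the samples of both workers pooled together.
const pooledP99 = percentile([...workerA, ...workerB], 99);

console.log({ meanOfP99, pooledP99 }); // { meanOfP99: 100.5, pooledP99: 200 }
```

Each worker only exports its own summary values, so the pooled figure cannot be recovered from them after the fact. In that sense the averaged value is an approximation.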
@zbjornson
Collaborator

Opened #422 if you'd like to give it a try or provide feedback!

@zbjornson changed the title from "Advanced event loop monitoring not aggregated correctly accross cluster" to "Advanced event loop monitoring not aggregated correctly across cluster" on Jan 23, 2021
@zbjornson zbjornson self-assigned this Jan 23, 2021
zbjornson added a commit to zbjornson/prom-client that referenced this issue Jan 25, 2021
@mkreidenweis-schulmngr
Author

Thanks for looking into this. This has dropped in our priorities, as we have moved away from the Node cluster module. The module also caused other problems with Prometheus metrics, e.g. random spikes when individual cluster processes were restarted; increase() and rate() can't deal with that properly. So it looks like I won't find time to try it out.

Maybe somebody else wants to give it a try? @dominathan seemed to be interested. :-)
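
One plausible mechanism for the spikes mentioned above, as a rough sketch: when counters from several workers are summed into one series and a single worker restarts, the sum drops only partially. A reset-aware increase calculation then treats the drop as a full counter reset. The `increaseWithResets` function below is a simplified model of that reset handling (no extrapolation, no time windows), and all sample values are hypothetical.

```javascript
// Simplified model of reset-aware increase: any decrease between two samples
// is treated as a counter reset, so the post-reset value is counted in full.
function increaseWithResets(samples) {
  let total = 0;
  for (let i = 1; i < samples.length; i++) {
    const prev = samples[i - 1];
    const curr = samples[i];
    total += curr >= prev ? curr - prev : curr;
  }
  return total;
}

// Hypothetical request counters scraped at four points in time.
const workerA = [1000, 1010, 1020, 1030]; // keeps running
const workerB = [1000, 1010, 5, 15];      // restarts before the third scrape

// Per-worker series: the reset of worker B is handled cleanly.
const perWorkerIncrease =
  increaseWithResets(workerA) + increaseWithResets(workerB); // 30 + 25 = 55

// Summed series, as exposed when the cluster aggregates before scraping.
const summedSeries = workerA.map((v, i) => v + workerB[i]); // [2000, 2020, 1025, 1045]
const aggregatedIncrease = increaseWithResets(summedSeries); // 20 + 1025 + 20 = 1065

console.log({ perWorkerIncrease, aggregatedIncrease });
```

In this model, the aggregated series reports an increase of 1065 where the workers actually handled 55 requests, which would show up as a spike at the moment of the restart.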

zbjornson added a commit to zbjornson/prom-client that referenced this issue Jul 31, 2021
zbjornson added a commit that referenced this issue Jul 31, 2021
These are not entirely accurate: they're the "mean of the mean" and "mean of percentiles," but that's as good as we can get.

Fixes #418