mirror of https://github.com/TryGhost/Ghost.git synced 2025-01-27 22:49:56 -05:00
Commit graph

5 commits

Author SHA1 Message Date
Chris Raible
431719080e
Added prometheus metric for time to acquire connection (#21628)
ref
https://linear.app/ghost/issue/ENG-1769/improve-pool-utilization-metric

- Currently the connection pool metrics are all point-in-time metrics,
and with a scrape interval of 15s this doesn't tell us a whole lot about
what's happening in the pool.
- This commit adds a Summary metric to track the elapsed time each
transaction has to wait to acquire a connection from the pool, which
should be a good indication of contention in the pool.
- Also moved the call to `prometheusClient.instrumentKnex` to after `initCore` in the boot process, because the metric depends on event listeners on `knex.client.pool`, and the pool gets destroyed and recreated in `initCore`, which removes the listeners.
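
The acquire-time measurement described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `trackAcquireTime` and `observe` are hypothetical names, and the event names `acquireRequest`, `acquireSuccess` and `acquireFail` are assumed to match what `knex.client.pool` emits. A plain `EventEmitter` and a fake clock stand in for the real pool so the sketch runs without a database.

```javascript
const {EventEmitter} = require('node:events');

// Hypothetical helper: records how long each acquire request waits.
// `pool` is any emitter that fires 'acquireRequest' and 'acquireSuccess'
// with a shared event id (assumed here to mirror knex.client.pool).
// `observe` receives the elapsed milliseconds; with a real metrics
// library this would be a Summary's observe method.
function trackAcquireTime(pool, observe, now = () => performance.now()) {
    const startedAt = new Map();

    pool.on('acquireRequest', (eventId) => {
        startedAt.set(eventId, now());
    });

    pool.on('acquireSuccess', (eventId) => {
        const start = startedAt.get(eventId);
        if (start === undefined) {
            return; // request was made before the listeners were attached
        }
        startedAt.delete(eventId);
        observe(now() - start);
    });

    pool.on('acquireFail', (eventId) => {
        startedAt.delete(eventId); // avoid leaking entries for failed acquires
    });
}

// Stand-in pool and a fake clock so the sketch runs on its own.
const fakePool = new EventEmitter();
let clock = 0;
const observed = [];
trackAcquireTime(fakePool, (ms) => observed.push(ms), () => clock);

fakePool.emit('acquireRequest', 1);
clock = 25;
fakePool.emit('acquireSuccess', 1);
console.log(observed); // [ 25 ]
```

Because the listeners live on the pool object itself, they have to be attached after anything that destroys and recreates the pool, which is the reason given above for moving the instrumentation to after `initCore`.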
2024-11-14 21:14:40 -08:00
Chris Raible
85408d10b7
Added connection pool metrics to prometheus client (#21576)
ref
https://linear.app/ghost/issue/ENG-1592/start-monitoring-connection-pool-utilization-in-ghost

- This commit adds prometheus metrics to the connection pool so we can
start to track connection pool utilization, number of pending acquires,
and also adds some basic SQL query summary metrics like queries per
minute and query duration percentiles.
- The connection pool has been theorized for some time to be a main
constraint of Ghost, but it's been challenging to get actual visibility
into the state of the connection pool. With this change, we should be
able to directly observe, monitor and alert on the connection pool.
- Updated the Grafana version to pick up a query editor bug fix from
8.3, even though this is a couple of versions ahead of production
Chris Raible
8aba92e444
Fixed CPU Usage chart in grafana dashboard (#21568)
ref
https://linear.app/ghost/issue/ENG-1505/start-monitoring-ghosts-constraints-and-our-3-goals-using-prometheus

- Using `irate` for aggregating CPU usage was resulting in some strange
behavior: the CPU Usage chart would zero out after a few minutes of
running. Switching to regular `rate` seems to have fixed the issue
completely.
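
For readers unfamiliar with the two functions, the general difference can be sketched as below. This is a simplified illustration with made-up samples, not an explanation of the behavior seen in the dashboard: `simpleRate` and `simpleIrate` are hypothetical helpers that ignore counter resets and the boundary extrapolation Prometheus applies.

```javascript
// Samples of a monotonically increasing counter: [timestampSeconds, value].
// Hypothetical data, chosen only to show how the two functions differ.
const samples = [
    [0, 0],
    [15, 30],
    [30, 60],
    [45, 60],
    [60, 60]
];

// Simplified rate(): average per-second increase across the whole window.
function simpleRate(points) {
    const [t0, v0] = points[0];
    const [t1, v1] = points[points.length - 1];
    return (v1 - v0) / (t1 - t0);
}

// Simplified irate(): per-second increase between the last two samples only.
function simpleIrate(points) {
    return simpleRate(points.slice(-2));
}

console.log(simpleRate(samples));  // 1
console.log(simpleIrate(samples)); // 0
```

Since `irate` looks only at the last two samples in the window, it reacts to the most recent change and can read zero while the window average is still non-zero; `rate` smooths over the whole window.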
2024-11-07 10:18:41 -08:00
Chris Raible
a26f63dc11
Configured local prometheus and pushgateway in docker-compose (#21538)
ref
https://linear.app/ghost/issue/ENG-1746/enable-ghost-to-push-metrics-to-a-pushgateway

- Added prometheus job to scrape the pushgateway
- Updated grafana dashboard to use the metrics from the pushgateway
- Added some logging to prometheus client to log errors when pushing
metrics to pushgateway
2024-11-06 11:36:37 -08:00
Chris Raible
8b26b52513
Added prometheus and grafana services to docker compose (#21213)
ref
https://linear.app/tryghost/issue/ENG-1591/add-prometheus-and-grafana-services-to-docker-compose

This commit adds 2 new services to the docker compose file to enable
monitoring metrics from Ghost locally in real-time:
1. Prometheus - a service that scrapes Ghost's new `/metrics` endpoint
introduced in this
[commit](768336efad).
2. Grafana - a service that consumes the metrics from prometheus and
exposes them in a dashboard that you can view locally at
`localhost:3000`.

# Usage
Both of these services are selectively enabled using docker compose
[profiles](https://docs.docker.com/compose/how-tos/profiles/). This way,
if you don't opt in to using these monitoring tools, they won't start
and consume resources on your host machine. To enable these services,
enable the `monitoring` profile by either setting the `COMPOSE_PROFILES`
environment variable to `monitoring`, or specifying the `--profile
monitoring` CLI argument to any `docker compose ...` commands.

I've found the easiest way to configure this in an 'always on' fashion
is to create a `.env` file in the project's root directory and add
`COMPOSE_PROFILES=monitoring` to it. As an added convenience, you can
also set `COMPOSE_FILE=.github/scripts/docker-compose.yml`, which will
allow you to run `docker compose ...` commands from the root directory
without specifying the full path each time.

# Intended for development only
These services are meant for local development only, and are not
configured for a production use-case. For example, the Grafana instance
is configured with _no authentication_, so you won't need a
username/password to log in at `localhost:3000`. Prometheus is also
configured to scrape the metrics once every second, which is likely
excessive for production use-cases, but may be useful for getting more
granular metrics while e.g. load testing locally.

# Dashboards
The Grafana instance includes a default dashboard covering most of the
main default metrics provided by our prometheus client integration. The
dashboard is defined in a JSON file at
`.github/scripts/docker/grafana/dashboards/main-dashboard.json` and can
be modified & committed to add new visualizations that will be available
to anyone working on Ghost locally. You can also add other dashboards to
the same directory for specific use-cases, which should be picked up and
made available in the Grafana UI. [Read
more](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/view-dashboard-json-model/)
about Grafana's JSON schema for dashboards.
2024-10-03 14:43:07 -07:00