This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
|
[parse-prometheus-text-format](https://redirect.github.com/yunyu/parse-prometheus-text-format)
| devDependencies | pin | [`^1.1.1` ->
`1.1.1`](https://renovatebot.com/diffs/npm/parse-prometheus-text-format/1.1.1/1.1.1)
|
Add the preset `:preserveSemverRanges` to your config if you don't want
to pin your dependencies.
---
### Configuration
📅 **Schedule**: Branch creation - "every weekday" (UTC), Automerge - At
any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/TryGhost/Ghost).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC45Ny4wIiwidXBkYXRlZEluVmVyIjoiMzguOTcuMCIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOltdfQ==-->
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
ref
https://linear.app/tryghost/issue/ENG-1505/add-prometheus-metrics-server-to-allow-monitoring-ghost-metrics
# Summary
This commit includes two main components: a prometheus client class to
collect metrics from Ghost, and a standalone metrics server that exposes
a /metrics endpoint at a separate port (9416 by default) from the main
Ghost app.
The prometheus client is a very thin wrapper around
[prom-client](https://github.com/siimon/prom-client). We could use
prom-client directly, but this approach should make it easier to switch
to a different prometheus client package (or make our own) if we ever
need to down the line.
The list of default metrics this enables is specified in an e2e test
[here](https://github.com/TryGhost/Ghost/pull/21192/files#diff-ebc52236be2cd14b40be89220ae961f48d3f837693f7d1da76db292348915941R66-R92).
This also gives us the ability to create and collect custom metrics,
although none are included in this commit yet.
# Configuration
The prometheus client and the metrics server are both disabled by
default, but can be enabled by setting the metrics_server:enabled flag
to true.
You can also define a custom host and port using `metrics_server:host`
and `metrics_server:port`.
## Why not expose the /metrics endpoint in one of the existing express
apps?
The standalone express app exists for two main reasons:
1. We don't want these metrics to be public, and the easiest way to
accomplish that is to expose the /metrics endpoint at a different port
that won't be exposed to the internet.
2. Creating a standalone express instance decouples the metrics endpoint
from the Ghost server, so if Ghost is not responding for whatever
reason, we should still be able to scrape metrics to understand what's
going on internally.
## Impact on Boot & Shut down time
The prometheus client is initialized early in the boot process so we can
collect metrics during the boot sequence. Testing locally has shown that
this increases boot time by ~20ms. The metrics server which exposes the
/metrics endpoint is not initialized until after the background
services, and it is not awaited, to avoid impacting boot time. None of
this code, including the requires, will run if the
metrics_server:enabled flag is set to false (or not set).
Shutting down the metrics server is added as a cleanup task for the main
Ghost server instance, and is setup to shut down with 0 grace period to
avoid impacting shut down time.
no issue
- This commit removes all OpenTelemetry related code and dependencies
from Ghost.
- The initial implementation was done as a POC but it raised some
performance concerns with boot time, so we never actually enabled it
widely.
- We can revisit this in the future, but in the meantime it's just
adding unnecessary dependencies and bloating the codebase.
refs https://github.com/nrwl/nx/releases/tag/19.8.0
- this commit updates Nx to v19
- we need to add some extra commands to the dev script to stop and
restart the Nx daemon, so it's ready and running before we execute a
bunch of Nx commands concurrently
- this also updates nx.json to the format needed for the latest version
ref https://linear.app/tryghost/issue/ONC-323
When the store gets into a bad state for new posts that causes saves to fail we can detect that by looking at the `model.isNew` property. Currently our best approach to fix this state is to restart the app.
- added a `didTransition()` hook to our `lexical-edit.new` route
- detects the bad state, logs the error, and triggers a browser refresh
- logs with a `recreatedPostIsGood` property that will let us know if we could instead just try recreating the post and avoiding a full refresh (so far we have no reproduction case so we need to learn what we can)
- added `sinon-chai` dependency for better assertions on spies/stubs
- added `sentry-testkit` dependency so we can test our Sentry integration calls
- we can't use sinon for these calls because of the way Sentry's es6 imports work
- extracted our full Sentry config object generation to a util function so it can be re-used in unit tests
- updated our integrations list to disable the default `dedupe` integration because it can cause very unexpected/difficult to debug test failures when you're asserting using `sentry-testkit`
refs https://linear.app/tryghost/issue/DEV-23/workaround-for-yarn-caching-issues
- in our build pipeline, we add some more dependencies for our internal
adapters
- recently we've been seeing caching issues with these dependencies, not
sure why
- to workaround that, we'll just include them here and eventually bring
the adapters into the OSS repo
ref https://linear.app/tryghost/issue/ENG-661
Logging the full lexical objects to Sentry for the "showing leave editor modal" event isn't very useful because they almost always truncated due to size or stripped due to potentially sensitive data which makes the reports difficult to debug and action.
- added `code` to the context for each report to make identification easier
- updated `reason` for the lexical diverged reason to better match what's happened
- stripped `lexical`, `scratch`, and `secondaryLexical` from the context that is logged to Sentry because they aren't actionable there and just add noise
- added a diff to the logged context for the lexical change reason to more easily identify the changes that triggered an unexpected modal display
- previously we didn't get full objects in Sentry so couldn't do a comparison and the local workflow was to grab the logged scratch and secondaryLexical values and run a manual diff - this should help in both cases