mirror of https://github.com/TryGhost/Ghost.git synced 2025-02-10 23:36:14 -05:00

Fixed Tinybird KPI results (#21914)

ref https://linear.app/ghost/issue/ANAL-120/bounce-rate-data-seems-to-mix-units
closes https://linear.app/ghost/issue/ANAL-119/visit-duration-metric-inaccurate
closes https://linear.app/ghost/issue/ANAL-118/charts-are-empty-with-only-1-data-point

- The original [web analytics starter kit KPI endpoint](ad1efb766e/tinybird/pipes/kpis.pipe (L122)) had this simpler endpoint, but as I've messed around adding features, I've unintentionally overcomplicated it and introduced a tonne of bugs.
- This reverts the KPI endpoint back towards the original structure, and moves all the calculations and where statements up to the `data` node.
- This means that the left join at the end works and pulls in all the dates from the `timeseries` node correctly, without the need for `WITH FILL STEP 1`, which generated a result for every second when looking at a single day's data.
- Moving the where clause handling up to the `data` node, rather than keeping it on the endpoint, still works as expected, which confused me when I first started working with Tinybird.
- This should resolve several bugs we've experienced with the visit duration, with missing data points and empty charts, and perhaps even the bounce rate (but that needs a closer look).
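
As a rough illustration of the join described above (not the pipe itself), the sketch below models the `timeseries` node as a list of dates and the `data` node as pre-filtered per-date aggregates, then left-joins them so every date gets a row. The helper names `date_spine` and `left_join_on_date` are hypothetical, the sample values are taken from the fixture rows in this commit, and the zero fill for unmatched dates is an assumption based on those fixtures.

```python
from datetime import date, timedelta

# Metrics reported for a date with no matching row in `data`.
# Assumption: the join fills gaps with zeros, as the fixture rows suggest.
EMPTY = {"visits": 0, "pageviews": 0, "bounce_rate": 0, "avg_session_sec": 0}


def date_spine(start, end):
    """Return every date from start to end inclusive (stand-in for the timeseries node)."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def left_join_on_date(spine, data):
    """Emit one row per spine date, taking metrics from `data` when present."""
    return [{"date": d, **data.get(d, EMPTY)} for d in spine]


# Pre-filtered, per-date aggregates (stand-in for the data node).
data = {
    date(2100, 1, 1): {"visits": 1, "pageviews": 2, "bounce_rate": 0, "avg_session_sec": 630},
    date(2100, 1, 3): {"visits": 1, "pageviews": 3, "bounce_rate": 0, "avg_session_sec": 1115},
}

rows = left_join_on_date(date_spine(date(2100, 1, 1), date(2100, 1, 7)), data)
for row in rows:
    print(row["date"].isoformat(), row["visits"], row["pageviews"],
          row["bounce_rate"], row["avg_session_sec"])
```

Because filtering happens before the join in this sketch, dates with no matching sessions still come through as rows, which is the behaviour the charts need when only one or two days have data.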
Hannah Wolfe 2024-12-18 16:59:24 +00:00 committed by GitHub
parent ad44d7ac61
commit 79e5991ac2
GPG key ID: B5690EEEBB952194
5 changed files with 44 additions and 7230 deletions


@ -185,36 +185,14 @@ DESCRIPTION >
General KPIs per date, works for both summary metrics and trends charts.
SQL >
%
select
site_uuid,
date,
member_status,
device,
browser,
location,
source,
pathname,
uniq(session_id) as visits,
sum(pageviews) as pageviews,
sum(case when latest_view_aux = first_view_aux then 1 else 0 end) / visits as bounce_rate,
avg(latest_view_aux - first_view_aux) as avg_session_sec
from pageviews
group by date, site_uuid, member_status, device, browser, location, source, pathname
NODE endpoint
DESCRIPTION >
Join and generate timeseries with metrics
SQL >
%
select
a.date as date,
sum(b.visits) as visits,
sum(b.pageviews) as pageviews,
avg(b.bounce_rate) as bounce_rate,
sum(b.avg_session_sec) as avg_session_sec
from timeseries a
left join data b using date
where
site_uuid = {{String(site_uuid, 'mock_site_uuid', description="Tenant ID", required=True)}}
@ -224,5 +202,13 @@ SQL >
{% if defined(source) %} and source = {{ String(source, description="Source to filter on", required=False) }} {% end %}
{% if defined(location) %} and location = {{ String(location, description="Location to filter on", required=False) }} {% end %}
{% if defined(pathname) %} and pathname = {{ String(pathname, description="Pathname to filter on", required=False) }} {% end %}
group by date
order by date WITH FILL STEP 1
group by date
NODE endpoint
DESCRIPTION >
Join and generate timeseries with metrics
SQL >
select a.date, b.visits, b.pageviews, b.bounce_rate, b.avg_session_sec
from timeseries a
left join data b using date


@ -1,8 +1,8 @@
"date","visits","pageviews","bounce_rate","avg_session_sec"
"2100-01-01",3,5,0.3333333333333333,1741
"2100-01-01",3,5,0.3333333333333333,580.3333333333334
"2100-01-02",1,3,0,1027
"2100-01-03",3,7,0,9999
"2100-01-04",3,7,0.3333333333333333,1716
"2100-01-05",2,5,0,616
"2100-01-03",3,7,0,3333
"2100-01-04",3,7,0.3333333333333333,572
"2100-01-05",2,5,0,308
"2100-01-06",2,2,1,0
"2100-01-07",2,2,1,0

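The `avg_session_sec` changes in this fixture (for example 1741 becoming 580.3333333333334 on a day with 3 visits) are consistent with the old endpoint summing per-group averages and the new `data` node averaging once per date. A minimal sketch of that arithmetic follows; the individual session durations are hypothetical, chosen only so that they add up to the old fixture value, and the one-session-per-group split is an assumption.

```python
from statistics import mean


def sum_of_group_averages(groups):
    """Old behaviour (as assumed here): average within each group, then sum the averages."""
    return sum(mean(g) for g in groups)


def per_date_average(groups):
    """New behaviour (as assumed here): one average over all sessions on the date."""
    sessions = [duration for g in groups for duration in g]
    return mean(sessions) if sessions else 0


# Hypothetical durations in seconds for three sessions on 2100-01-01.
# Each session sits in its own group, as happens when the grouping
# dimensions (device, browser, pathname, ...) differ per session.
groups = [[1000], [741], [0]]

old_value = sum_of_group_averages(groups)  # total seconds across sessions
new_value = per_date_average(groups)       # seconds per session

print(old_value, new_value)
```

With one session per group the old calculation collapses to a total rather than an average, which would explain why the corrected values are the old ones divided by the visit count.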
File diff suppressed because it is too large


@ -2,6 +2,7 @@
"2100-01-01",1,2,0,1111
"2100-01-02",0,0,0,0
"2100-01-03",1,3,0,1115
"2100-01-04",3,7,0.3333333333333333,1716
"2100-01-04",3,7,0.3333333333333333,572
"2100-01-05",1,3,0,493
"2100-01-06",1,1,1,0
"2100-01-07",0,0,0,0


@ -2,3 +2,7 @@
"2100-01-01",1,2,0,630
"2100-01-02",0,0,0,0
"2100-01-03",1,3,0,1115
"2100-01-04",0,0,0,0
"2100-01-05",0,0,0,0
"2100-01-06",0,0,0,0
"2100-01-07",0,0,0,0