GA4 sessions magick. Which hit makes a session source / medium?

GA4 paranormal behavior experts

The goal

The goal of this series of tests is to understand which hit sets the session source / medium in GA4. Is it only page_view? Do auto-events change the session source / medium? If an event has campaign_source and campaign_medium fields but the URL has different UTM params, which source / medium will the session get? How do the GA4 Config tag settings change the source / medium? If an event fires before a page_view and they have different source / medium values, which one is applied to the session? And so on.

This is a rather long post. At the beginning I share my interpretation of the test results (the same information I posted on LinkedIn), after that I show the GTM setup I used for testing, and at the end comes the long part: the steps for each test, with dataLayer pushes, request payloads, GA4 screenshots, BigQuery screenshots and my comments. As I said, there are a lot of details for those who like breathtaking data collection stories in the GA4 universe.

Interpretation of test results

Note: Everything below is about sessions inside the GA4 interface. I tested both with and without user_id.

  • The auto-events first_visit and session_start don’t change the source / medium

  • The source / medium is defined by the first hit. It doesn’t matter whether it is an event or a page_view. There are a few ways to set source / medium; they are listed here by priority:

    • campaign_source, campaign_medium event fields set directly inside GTM Tag
    • campaign_source, campaign_medium event fields set by GA4 config tag
    • UTM params in the URL
  • If a hit has more than one of the listed parameters, the one highest in the list is applied.

  • If the first hit doesn’t have any of these parameters (and has no referrer or gclid), the session will be (direct) / (none)

  • The source / medium values of the second and later hits don’t change the source / medium of the session
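The rules above can be written down as a small function. This is only a minimal sketch of my reading of the test results, not GA4's actual attribution logic. The hit shape (`tagFields`, `configFields`, `url`) and the function names are hypothetical, and referrer and gclid are ignored.

```javascript
// Illustrative model only: GA4's real attribution logic is not public.
// A "hit" here is a hypothetical object:
//   { tagFields: {source, medium}, configFields: {source, medium}, url: "..." }

// Read utm_source / utm_medium from a URL, or return null if either is missing.
function fromUtm(url) {
  const params = new URL(url).searchParams;
  const source = params.get("utm_source");
  const medium = params.get("utm_medium");
  return source && medium ? { source, medium } : null;
}

// Accept a fields object only if it carries both source and medium.
function complete(fields) {
  return fields && fields.source && fields.medium ? fields : null;
}

function resolveSessionSourceMedium(hits) {
  const first = hits[0]; // only the first hit of the session is considered
  if (!first) return null;
  const winner =
    complete(first.tagFields) ||             // 1. fields set in the GTM tag
    complete(first.configFields) ||          // 2. fields set by the GA4 config tag
    (first.url ? fromUtm(first.url) : null); // 3. UTM params in the URL
  return winner
    ? `${winner.source} / ${winner.medium}`
    : "(direct) / (none)";
}

const session = [
  {
    tagFields: { source: "event_source", medium: "event_medium" },
    url: "https://example.com/?utm_source=url_source&utm_medium=url_medium",
  },
  { tagFields: { source: "pv_source", medium: "pv_medium" } },
];
console.log(resolveSessionSourceMedium(session)); // "event_source / event_medium"
```

The second hit is never inspected, which mirrors the observation that later hits don't change the session source / medium.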

There’s one special case whose results I didn’t expect.

A session will have source / medium equal to (direct) / (none) if both conditions are met:

  • The session starts with hits without user_id, but later has hits with user_id (aka a registration session)
  • The source / medium changes during the session.

I suppose it’s a very important case, as we can lose the registration source / medium. I ran more than 15 tests and still don’t understand the logic or how to prevent the problem. It’s definitely the subject of the next post I’m working on.
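To make the two conditions concrete, here is a minimal sketch of a check that flags sessions matching the pattern. It only encodes what I observed in these tests; the hit shape (`userId`, `source`, `medium`, where source / medium is whatever the hit effectively carries, including values taken from UTM params) and the function name are hypothetical.

```javascript
// Illustrative check only, based on the pattern observed in these tests.
// Each hit is a hypothetical object: { userId, source, medium }.

function looksLikeRiskyRegistrationSession(hits) {
  if (hits.length === 0) return false;
  // Condition 1: the session starts without user_id and gets one later.
  const startsAnonymous = !hits[0].userId;
  const laterIdentified = hits.slice(1).some((hit) => Boolean(hit.userId));
  // Condition 2: more than one distinct source / medium pair in the session.
  const pairs = new Set(
    hits
      .filter((hit) => hit.source && hit.medium)
      .map((hit) => `${hit.source} / ${hit.medium}`)
  );
  return startsAnonymous && laterIdentified && pairs.size > 1;
}

const registrationSession = [
  { source: "pv1_source", medium: "pv1_medium" },
  { userId: "100802", source: "pv2_source", medium: "pv2_medium" },
];
console.log(looksLikeRiskyRegistrationSession(registrationSession)); // true
```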

Based on the results of the experiments, I first thought of these basic rules for a GA4 setup:

  • Extract source, medium, campaign and gclid from the URL as GTM URL variables;
  • In the GA4 Config tag, under Fields to Set, set campaign_source, campaign_medium, campaign_name and gclid to the values from the previous step;
  • Fire the GA4 Config tag once per page;
  • For all GA4 event tags, add the GA4 Config tag as a setup tag in the Tag Sequencing section;
  • Inside GA4 event tags, don’t pass campaign_source, campaign_medium, campaign_name or gclid

But this setup is unsafe for registration sessions. Here we can have source and medium both in the URL (as UTM params), and in the hit (as campaign_source, campaign_medium params) and during my tests I got (direct) / (none) when I used such a setup for registration sessions. That’s why at the moment I can’t recommend this setup.
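For reference, the first two rules of that setup could look roughly like this in plain JavaScript. In GTM itself this is done with URL variables and the Fields to Set section rather than code, so treat this as a sketch; the function name `campaignFieldsFromUrl` is hypothetical, and the caveat about registration sessions still applies.

```javascript
// Sketch of the first two rules: read campaign values from the URL and map
// them to the field names used in the GA4 config tag.
function campaignFieldsFromUrl(pageUrl) {
  const params = new URL(pageUrl).searchParams;
  const mapping = {
    campaign_source: "utm_source",
    campaign_medium: "utm_medium",
    campaign_name: "utm_campaign",
    gclid: "gclid",
  };
  const fields = {};
  for (const [field, queryKey] of Object.entries(mapping)) {
    const value = params.get(queryKey);
    if (value) fields[field] = value; // skip missing or empty parameters
  }
  return fields;
}

console.log(
  campaignFieldsFromUrl(
    "https://example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=june"
  )
);
```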

GTM Setup for experiments

Note: If you want you can download the GTM Container I used.

For testing I decided to use dataLayer pushes to fire different types of GA4 tags. I created Tags for GA4 Config, GA4 events, GA4 page_view.

GA4 config tag

GA4 config GTM tag

The settings are fairly basic, with just a few things to mention. I disabled the «Send a page view» checkbox, because in some cases I want to understand how the GA4 config tag works without a page_view. I also created a bunch of helper variables.

lt.medium.config

lt.medium.config GTM variable

A lookup variable which gets the medium from the dataLayer and adds a config_ prefix. For the GA4 event and GA4 page_view tags I created similar variables with different prefixes. This way each tag always has a unique medium value, and in the GA4 interface I can easily see which hit set the session source / medium. The same logic applies to lt.campaign.config and lt.source.config.
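In code terms, the idea behind these variables is roughly the following. This is a hypothetical stand-in, not the GTM variable itself: the function name `withTagPrefix` is made up, and returning `undefined` for a missing value is an assumption, chosen to match the payloads where no source / medium is sent at all.

```javascript
// Hypothetical stand-in for the lt.medium.* lookup variables.
function withTagPrefix(prefix, value) {
  // Assumption: no value in the dataLayer means no field is sent at all.
  if (value === undefined || value === null || value === "") return undefined;
  return `${prefix}_${value}`;
}

console.log(withTagPrefix("config", "medium100705")); // "config_medium100705"
console.log(withTagPrefix("event", "medium100705"));  // "event_medium100705"
console.log(withTagPrefix("pv", undefined));          // undefined
```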

lt.site-id variable

lt.site-id GTM variable

A lookup variable that gets user_id either from the URL or from the dataLayer. For some tests I just set site-id as a URL parameter, so all tags get the same value; for other tests I want a different user_id for different events, so I can always pass it inside a dataLayer push.
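A minimal sketch of that lookup, assuming the dataLayer value wins when both are present (the post doesn't state the order, so this priority is an assumption). The function name `resolveSiteId` and the `dataLayerState` argument are hypothetical; the `site-id` URL parameter and the `site_id` dataLayer key are the ones used in the tests.

```javascript
// Hypothetical stand-in for the lt.site-id lookup variable.
function resolveSiteId(pageUrl, dataLayerState) {
  // Assumption: a value pushed to the dataLayer overrides the URL parameter.
  if (dataLayerState && dataLayerState.site_id) {
    return String(dataLayerState.site_id);
  }
  const fromUrl = new URL(pageUrl).searchParams.get("site-id");
  return fromUrl || undefined; // undefined means the hit is sent without user_id
}

console.log(resolveSiteId("https://ga4000.weebly.com/?site-id=100700", {}));
console.log(resolveSiteId("https://ga4000.weebly.com/?test=100802", { site_id: "100802" }));
console.log(resolveSiteId("https://ga4000.weebly.com/?test=100801", {}));
```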

For the GA4 config tag I created a special trigger:

GA4 config GTM trigger

This way, if I want to fire the GA4 config tag with some source / medium parameters, I can make this dataLayer push:

dataLayer.push({
    event:  "send_config",
    source: "source_name",
    medium: "medium_name",
    campaign: "campaign_name",
})

GA4 event tag

GA4 event tag

GA4 page_view tag

GA4 page_view tag

These tags have pretty much the same settings, but use variables with different prefixes. And of course they have different triggers.

This one is for GA4 page_view tag

GA4 page_view tag

And this one is for GA4 event tag

GA4 page_view tag

That’s all for the setup part. One small comment: since I usually pass source / medium through dataLayer pushes, when I want an event without any predefined source / medium values I simply clear them with this dataLayer push:

dataLayer.push({
  event: "clear_sources",
  source: undefined,
  medium: undefined,
  campaign: undefined,
})
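Why the clearing push is needed can be shown with a very simplified model of how pushes accumulate into one state. GTM's real data layer model is more involved, so this is only a sketch under that simplification; `applyPushes` is a hypothetical helper.

```javascript
// Very simplified model: every push is merged into one state object,
// and later pushes overwrite earlier keys.
function applyPushes(pushes) {
  const merged = {};
  for (const push of pushes) {
    Object.assign(merged, push); // keys set to undefined overwrite old values too
  }
  return merged;
}

const withoutClear = applyPushes([
  { event: "send_event", source: "event_source", medium: "event_medium" },
  { event: "send_config" },
]);
console.log(withoutClear.source); // "event_source" (still there for the config tag)

const withClear = applyPushes([
  { event: "send_event", source: "event_source", medium: "event_medium" },
  { event: "clear_sources", source: undefined, medium: undefined, campaign: undefined },
  { event: "send_config" },
]);
console.log(withClear.source); // undefined
```

Without the clearing push, the values from an earlier push would still be visible to every tag that fires later.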

Fun (experiments) part

I split the tests into two parts. The first part contains experiments whose results are more or less clear to me and which I can interpret; the second part contains tests whose results I don’t understand and which break my expectations.

I created a table of contents for both parts, and added a 🔥 symbol for the most interesting cases, so you can read through all the tests or jump to the test which for some reason you find cooler.

For GA4 screenshots I created two exploration reports.

For experiments with user_id:

  • ROWS : site_id, Session source / medium
  • VALUES : Sessions, Views, Event count,

For experiments without user_id:

  • ROWS : Page path + query string, Session source / medium
  • VALUES : Sessions, Views, Event count,

I also wrote a SQL query that keeps only a couple of event_params. I hope it makes the BigQuery results more readable on the screenshots; feel free to reuse it:

SELECT
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    (
        ARRAY(
            select
                STRUCT(
                    key as key,
                    IFNULL(
                        value.string_value,
                        CAST(value.int_value AS STRING)
                    ) as value
                )
            from
                unnest(e.event_params)
            where
                key in (
                    'ga_session_id',
                    'page_location',
                    'source',
                    'medium'
                )
        )
    ) as event_params_prepared,
FROM
    `<project_prefix>.analytics_<analytics_id>.events_<date>` as e
where
    user_id = '<user_id>'
order by
    event_timestamp asc

Experiments with expected results

GA4 config with source / medium different from the UTM params in the URL

Steps

Go to:

https://ga4000.weebly.com/?site-id=100700&test=100700&utm_source=url_source_100700&utm_medium=url_medium_100700&utm_campaign=url_campaing_100700

dataLayer push:

dataLayer.push({
    event:  "send_config",
    source: "source100700",
    medium: "medium100700",
    campaign: "campaign100700",
})

Request payload

  • cid: 1476251450.1654411795
  • uid: 100700
  • cm: medium100700
  • cs: source100700
  • cn: campaign100700
  • sid: 1654411795
  • dl: https://ga4000.weebly.com/?site-id=100700&test=100700&utm_source=url_source_100700&utm_medium=url_medium_100700&utm_campaign=url_campaing_100700
  • en: page_view
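If you want to reproduce the tests, the fields listed above can be pulled out of a request URL copied from the browser's network tab with a few lines of JavaScript. This is a minimal sketch: `pickPayloadFields` is a hypothetical helper, and the example request URL is assembled by hand for illustration, not captured from a real request.

```javascript
// Keys listed for each request payload in this post.
const PAYLOAD_KEYS = ["cid", "uid", "cm", "cs", "cn", "sid", "dl", "en"];

function pickPayloadFields(requestUrl) {
  const params = new URL(requestUrl).searchParams;
  const picked = {};
  for (const key of PAYLOAD_KEYS) {
    // searchParams returns URL-decoded values
    if (params.has(key)) picked[key] = params.get(key);
  }
  return picked;
}

// Hypothetical request URL assembled for illustration, not a captured one.
const exampleRequest =
  "https://www.google-analytics.com/g/collect?v=2" +
  "&cid=1476251450.1654411795&uid=100700" +
  "&cs=source100700&cm=medium100700&cn=campaign100700" +
  "&sid=1654411795&en=page_view" +
  "&dl=" + encodeURIComponent("https://ga4000.weebly.com/?site-id=100700");

console.log(pickPayloadFields(exampleRequest));
```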

Comment

For this test I enabled the «Send a page view» checkbox in the GA4 config tag.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • Auto-events don’t have source or medium event_params, and they don’t change the session source / medium
  • The campaign_source and campaign_medium fields have priority over UTM params in the URL

Event with source / medium before GA4 config without source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=1007031&test=1007031&utm_source=url_source_1007031&utm_medium=url_medium_1007031&utm_campaign=url_campaing_1007031

dataLayer push with event:

dataLayer.push({
    event:  "send_event",
    source: "event_source1007031",
    medium: "event_medium1007031",
    campaign: "event_campaign1007031",
})

Request payload

  • cid: 1528891532.1654412537
  • uid: 1007031
  • cn: event_campaign1007031
  • cs: event_source1007031
  • cm: event_medium1007031
  • sid: 1654412536
  • dl: https://ga4000.weebly.com/?site-id=1007031&test=1007031&utm_source=url_source_1007031&utm_medium=url_medium_1007031&utm_campaign=url_campaing_1007031
  • en: test

dataLayer push to clear dataLayer:

dataLayer.push({
    event: "clear_sources",
    source: undefined,
    medium: undefined,
    campaign: undefined,
})

dataLayer push with GA4 Config:

dataLayer.push({
    event:  "send_config",
})

Request payload

  • cid: 1528891532.1654412537
  • uid: 1007031
  • sid: 1654412536
  • dl: https://ga4000.weebly.com/?site-id=1007031&test=1007031&utm_source=url_source_1007031&utm_medium=url_medium_1007031&utm_campaign=url_campaing_1007031
  • en: page_view

Comment

For this test I enabled the «Send a page view» checkbox in the GA4 config tag.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • The event’s (first hit’s) source and medium have a higher priority than the UTM params in the URL

Event with source / medium before GA4 config with source / medium 🔥

Steps

Go to:

https://ga4000.weebly.com/?site-id=100704&test=100704&utm_source=url_source_100704&utm_medium=url_medium_100704&utm_campaign=url_campaing_100704

dataLayer push with event:

dataLayer.push({
    event:  "send_event",
    source: "event_source100704",
    medium: "event_medium100704",
    campaign: "event_campaign100704",
})

Request payload

  • cid: 1453219504.1654412703
  • uid: 100704
  • cn: event_campaign100704
  • cs: event_source100704
  • cm: event_medium100704
  • sid: 1654412702
  • dl: https://ga4000.weebly.com/?site-id=100704&test=100704&utm_source=url_source_100704&utm_medium=url_medium_100704&utm_campaign=url_campaing_100704
  • en: test

dataLayer push to clear dataLayer:

dataLayer.push({
    event: "clear_sources",
    source: undefined,
    medium: undefined,
    campaign: undefined,
})

dataLayer push with GA4 Config:

dataLayer.push({
    event:  "send_config",
    source: "config_source100704",
    medium: "config_medium100704",
    campaign: "config_campaign100704",
})

Request payload

  • cid: 1453219504.1654412703
  • uid: 100704
  • cm: config_medium100704
  • cs: config_source100704
  • cn: config_campaign100704
  • sid: 1654412702
  • dl: https://ga4000.weebly.com/?site-id=100704&test=100704&utm_source=url_source_100704&utm_medium=url_medium_100704&utm_campaign=url_campaing_100704
  • en: page_view

Comment

For this test I enabled the «Send a page view» checkbox in the GA4 config tag.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • This time the second hit (page_view) has its own source and medium, not taken from the URL, but the result is the same: the first hit’s source and medium have the higher priority.

Event and Config at the same trigger. All hits have different source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100705&test=100705&utm_source=url_source_100705&utm_medium=url_medium_100705&utm_campaign=url_campaing_100705

dataLayer push with event and config name:

dataLayer.push({
    event:  "send_event_and_config",
    source: "source100705",
    medium: "medium100705",
    campaign: "campaign100705",
})

Request payload 1st hit

  • cid: 1263233182.1654441240
  • uid: 100705
  • cm: config_medium100705
  • cs: config_source100705
  • cn: config_campaign100705
  • sid: 1654441239
  • dl: https://ga4000.weebly.com/?site-id=100705&test=100705&utm_source=url_source_100705&utm_medium=url_medium_100705&utm_campaign=url_campaing_100705
  • en: page_view

Request payload 2nd hit

  • cid: 1263233182.1654441240
  • uid: 100705
  • cm: event_medium100705
  • cs: event_source100705
  • cn: event_campaign100705
  • sid: 1654441239
  • dl: https://ga4000.weebly.com/?site-id=100705&test=100705&utm_source=url_source_100705&utm_medium=url_medium_100705&utm_campaign=url_campaing_100705
  • en: test

Comment

For this test I enabled the «Send a page view» checkbox in the GA4 config tag. I also created a special trigger which fires an event tag and a config tag on the same dataLayer push.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • Here the page_view was fired first, and as a result the session has the page_view’s source / medium

Event Tag has Config Tag as a «Setup tag» in the Advanced Settings / Tag Sequencing section. All hits have different source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100706&test=100706&utm_source=url_source_100706&utm_medium=url_medium_100706&utm_campaign=url_campaing_100706

dataLayer push with event:

dataLayer.push({
    event:  "send_event",
    source: "event_source100706",
    medium: "event_medium100706",
    campaign: "event_campaign100706",
})

Request payload 1st hit

  • cid: 1629436540.1654441544
  • uid: 100706
  • cm: config_event_medium100706
  • cs: config_event_source100706
  • cn: config_event_campaign100706
  • sid: 1654441544
  • dl: https://ga4000.weebly.com/?site-id=100706&test=100706&utm_source=url_source_100706&utm_medium=url_medium_100706&utm_campaign=url_campaing_100706
  • en: page_view

Request payload 2nd hit

  • cid: 1629436540.1654441544
  • uid: 100706
  • cm: event_event_medium100706
  • cs: event_event_source100706
  • cn: event_event_campaign100706
  • sid: 1654441544
  • dl: https://ga4000.weebly.com/?site-id=100706&test=100706&utm_source=url_source_100706&utm_medium=url_medium_100706&utm_campaign=url_campaing_100706
  • en: test

Comment

For this test I enabled the «Send a page view» checkbox in the GA4 config tag. I also changed the Event Tag settings: in the Advanced Settings / Tag Sequencing section I selected the GA4 Config Tag as the «Setup tag». First, I wanted to be sure that the page_view was sent before the test event; second, I wanted to see which source / medium would be applied to the session.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • Nothing unexpected here. Again the page_view was fired first and as a result the session has page_view’s source / medium

Config Tag without page_view, and a separate page_view after it. All dataLayer pushes have different source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100707&test=100707&utm_source=url_source_100707&utm_medium=url_medium_100707&utm_campaign=url_campaing_100707

dataLayer push with config:

dataLayer.push({
    event:  "send_config",
    source: "hit1_source100707",
    medium: "hit1_medium100707",
    campaign: "hit1_campaign100707",
})

dataLayer push with page_view:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
    source: "hit2_source100707",
    medium: "hit2_medium100707",
    campaign: "hit2_campaign100707",
})

Request payload

  • cid: 1911180781.1654441897
  • uid: 100707
  • cm: pv_hit2_medium100707
  • cs: pv_hit2_source100707
  • cn: pv_hit2_campaign100707
  • sid: 1654441896
  • dl: https://ga4000.weebly.com/?site-id=100707&test=100707&utm_source=url_source_100707&utm_medium=url_medium_100707&utm_campaign=url_campaing_100707
  • en: page_view

Comment

The first dataLayer push doesn’t send any hit to GA4, because I disabled the «send a page view» checkbox in the GA4 config tag. But the GA4 config tag still initializes gtag with the default source / medium.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • In this experiment we can see that the hit’s (page_view in this case) source / medium has a higher priority than the GA4 Config settings.

Event without source / medium, later page_view with source / medium 🔥

Steps

Go to:

https://ga4000.weebly.com/?site-id=100709&test=100709&utm_source=url_source_100709&utm_medium=url_medium_100709&utm_campaign=url_campaing_100709

dataLayer push with event:

dataLayer.push({
    event:  "send_event",
})

Request payload

  • cid: 1960408921.1654743372
  • uid: 100709
  • sid: 1654743372
  • dl: https://ga4000.weebly.com/?site-id=100709&test=100709&utm_source=url_source_100709&utm_medium=url_medium_100709&utm_campaign=url_campaing_100709
  • en: test

dataLayer push with page_view:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
    source: "source100709",
    medium: "medium100709",
    campaign: "campaign100709",
})

Request payload

  • cid: 1960408921.1654743372
  • uid: 100709
  • cn: pv_campaign100709
  • cs: pv_source100709
  • cm: pv_medium100709
  • sid: 1654743372
  • dl: https://ga4000.weebly.com/?site-id=100709&test=100709&utm_source=url_source_100709&utm_medium=url_medium_100709&utm_campaign=url_campaing_100709
  • en: page_view

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • This is an important one. It shows that only the first hit matters. The first event gets its values from the UTM params. The second hit is a page_view with a different source / medium, but the session gets its source / medium from the first one.

Config with source / medium, then page_view without source / medium 🔥

Steps

Go to:

https://ga4000.weebly.com/?site-id=100711&test=100711&utm_source=url_source_100711&utm_medium=url_medium_100711&utm_campaign=url_campaing_100711

dataLayer push with config:

dataLayer.push({
    event:  "send_config",
    source: "source100711",
    medium: "medium100711",
    campaign: "campaign100711",
})

dataLayer push to clear dataLayer:

dataLayer.push({
  event: "clear_sources",
  source: undefined,
  medium: undefined,
  campaign: undefined,
})

dataLayer push with page_view:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
})

Request payload

  • cid: 385734457.1654744381
  • uid: 100711
  • cm: config_medium100711
  • cs: config_source100711
  • cn: config_campaign100711
  • sid: 1654744381
  • dl: https://ga4000.weebly.com/?site-id=100711&test=100711&utm_source=url_source_100711&utm_medium=url_medium_100711&utm_campaign=url_campaing_100711
  • en: page_view

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • Here we can see that the GA4 config settings take priority over the UTM params in the URL

Two page_views with source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100712&test=100712&utm_source=url_source_100712&utm_medium=url_medium_100712&utm_campaign=url_campaing_100712

dataLayer push with page_view 1:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
    source: "pv1_source100712",
    medium: "pv1_medium100712",
    campaign: "pv1_campaign100712",
})

Request payload

  • cid: 1619507817.1654743937
  • uid: 100712
  • cn: pv_pv1_campaign100712
  • cs: pv_pv1_source100712
  • cm: pv_pv1_medium100712
  • sid: 1654743936
  • dl: https://ga4000.weebly.com/?site-id=100712&test=100712&utm_source=url_source_100712&utm_medium=url_medium_100712&utm_campaign=url_campaing_100712
  • en: page_view

dataLayer push with page_view 2:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
    source: "pv2_source100712",
    medium: "pv2_medium100712",
    campaign: "pv2_campaign100712",
})

Request payload

  • cid: 1619507817.1654743937
  • uid: 100712
  • cn: pv_pv2_campaign100712
  • cs: pv_pv2_source100712
  • cm: pv_pv2_medium100712
  • sid: 1654743936
  • dl: https://ga4000.weebly.com/?site-id=100712&test=100712&utm_source=url_source_100712&utm_medium=url_medium_100712&utm_campaign=url_campaing_100712
  • en: page_view

Comment

This experiment was suggested by Camila (Pereira Martins) Ramos Mori

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • The first hit sets the session source / medium

Event, then page_view both without user_id

Steps

Go to:

https://ga4000.weebly.com/?test=100801

dataLayer push:

dataLayer.push({
    event:  "send_event",
    source: "source100801",
    medium: "medium100801",
    campaign: "campaign100801",
})

Request payload

  • cid: 1028056362.1654752817
  • cn: event_campaign100801
  • cs: event_source100801
  • cm: event_medium100801
  • sid: 1654752816
  • dl: https://ga4000.weebly.com/?test=100801
  • en: test

dataLayer push with page_view:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
    source: "source100801",
    medium: "medium100801",
    campaign: "campaign100801",
})

Request payload

  • cid: 1028056362.1654752817
  • cn: pv_campaign100801
  • cs: pv_source100801
  • cm: pv_medium100801
  • sid: 1654752816
  • dl: https://ga4000.weebly.com/?test=100801
  • en: page_view

Comment

I repeated a few experiments but without user_id, to be sure to get the same results.

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

  • The results are the same as for sessions with user_id: the first hit sets the session source / medium

Experiments with unexpected results

One event with USER_ID but without source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100701&test=100701&utm_source=url_source_100701&utm_medium=url_medium_100701&utm_campaign=url_campaing_100701

dataLayer push:

dataLayer.push({
    event:  "send_event",
})

Request payload

  • cid: 1726334256.1654411896
  • uid: 100701
  • sid: 1654411896
  • dl: https://ga4000.weebly.com/?site-id=100701&test=100701&utm_source=url_source_100701&utm_medium=url_medium_100701&utm_campaign=url_campaing_100701
  • en: test

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

Do you see something paranormal here? The BigQuery event has source and medium, and the UTM params are in the page_location, but the session is direct.

In the next experiment I tried to pass source, medium using campaign_source, campaign_medium but…

One event with USER_ID and with source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100702&test=100702&utm_source=url_source_100702&utm_medium=url_medium_100702&utm_campaign=url_campaing_100702

dataLayer push:

dataLayer.push({
    event:  "send_event",
    source: "source100702",
    medium: "medium100702",
    campaign: "campaign100702",
})

Request payload

  • cid: 408460092.1654411998
  • uid: 100702
  • cn: campaign100702
  • cs: source100702
  • cm: medium100702
  • sid: 1654411997
  • dl: https://ga4000.weebly.com/?site-id=100702&test=100702&utm_source=url_source_100702&utm_medium=url_medium_100702&utm_campaign=url_campaing_100702
  • en: test

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

This experiment is the same as the previous one, but I also tried to pass source / medium through event params. As you can see in the BigQuery screenshot, the test event’s source and medium params are not the same as the UTM params in page_location, but it makes no difference: GA4 still shows (direct) / (none).

At first I decided that sessions without a page_view are always (direct) / (none), so I ran the next experiment.

Two events with USER_ID and with source / medium

Steps

Go to:

https://ga4000.weebly.com/?site-id=100708&test=100708&utm_source=url_source_100708&utm_medium=url_medium_100708&utm_campaign=url_campaing_100708

dataLayer push event 1:

dataLayer.push({
    event:  "send_event",
    source: "hit1_source100708",
    medium: "hit1_medium100708",
    campaign: "hit1_campaign100708",
})

Request payload hit 1

  • cid: 374640802.1654743205
  • uid: 100708
  • cn: event_hit1_campaign100708
  • cs: event_hit1_source100708
  • cm: event_hit1_medium100708
  • sid: 1654743204
  • dl: https://ga4000.weebly.com/?site-id=100708&test=100708&utm_source=url_source_100708&utm_medium=url_medium_100708&utm_campaign=url_campaing_100708
  • en: test

dataLayer push event 2:

dataLayer.push({
    event:  "send_event",
    source: "hit2_source100708",
    medium: "hit2_medium100708",
    campaign: "hit2_campaign100708",
})

Request payload hit 2

  • cid: 374640802.1654743205
  • uid: 100708
  • cn: event_hit2_campaign100708
  • cs: event_hit2_source100708
  • cm: event_hit2_medium100708
  • sid: 1654743204
  • dl: https://ga4000.weebly.com/?site-id=100708&test=100708&utm_source=url_source_100708&utm_medium=url_medium_100708&utm_campaign=url_campaing_100708
  • en: test

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

It breaks the hypothesis that a session needs at least one page_view to not be direct. This looks like the normal, expected behaviour: the first hit sets the session source / medium.

But the strangest thing is the next test: it is like the first one but without user_id, and everything works.

Event without USER_ID but with source / medium

Steps

Go to:

https://ga4000.weebly.com/?test=20220610-2&utm_source=url_source_20220610-2&utm_medium=url_medium_20220610-2&utm_campaign=url_campaing_20220610-2

dataLayer push:

dataLayer.push({
  event:  "send_event",
  source: "source20220610-2",
  medium: "medium20220610-2",
  campaign: "campaign20220610-2",
})

Request payload hit 1

  • cid: 1549664107.1654823742
  • cn: event_campaign20220610-2
  • cs: event_source20220610-2
  • cm: event_medium20220610-2
  • sid: 1654823742
  • dl: https://ga4000.weebly.com/?test=20220610-2&utm_source=url_source_20220610-2&utm_medium=url_medium_20220610-2&utm_campaign=url_campaing_20220610-2
  • dt: Home
  • en: test

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

In this experiment we have one event without a page_view and without user_id, and everything works as expected: no direct session.

This is the magic: I can’t explain why a session with only one event with user_id becomes direct, while the same session without user_id works correctly. Maybe in some edge cases auto-events somehow affect the session source / medium.

Page_view without USER_ID, then page_view with USER_ID. All hits with source / medium 🔥

Steps

Go to:

https://ga4000.weebly.com/?test=100802

dataLayer push page_view 1:

dataLayer.push({
    event:  "send_event",
    name: "page_view",
    source: "pv1_source100802",
    medium: "pv1_medium100802",
    campaign: "pv1_campaign100802",
})

Request payload hit 1

  • cid: 772717869.1654752914
  • cn: pv_pv1_campaign100802
  • cs: pv_pv1_source100802
  • cm: pv_pv1_medium100802
  • sid: 1654752913
  • dl: https://ga4000.weebly.com/?test=100802
  • en: page_view

dataLayer push page_view 2:

dataLayer.push({
  event:  "send_event",
  site_id: "100802",
  name: "page_view",
  source: "pv2_source100802",
  medium: "pv2_medium100802",
  campaign: "pv2_campaign100802",
})

Request payload hit 2

  • cid: 772717869.1654752914
  • uid: 100802
  • cn: pv_pv2_campaign100802
  • cs: pv_pv2_source100802
  • cm: pv_pv2_medium100802
  • sid: 1654752913
  • dl: https://ga4000.weebly.com/?test=100802
  • en: page_view

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

I don’t understand the results of this experiment: all hits have source / medium, but the session is direct. I tried to reproduce this test in 15 more variations (for example, sending an event first, sending an event after a page_view, using UTM params, and so on). If a session has hits both without and with user_id and more than one variant of source / medium, I always get the same direct result in GA4.

Next I will show a few of these tests, but one screenshot says more than 100 words.

Results in GA4 interface

Page_view without USER_ID and with source / medium, then page_view with USER_ID and with source / medium from UTM params

Steps

Go to:

https://ga4000.weebly.com/?test=20220611-2&utm_source=url_source_20220611-2&utm_medium=url_medium_20220611-2&utm_campaign=url_campaing_20220611-2

dataLayer push page_view:

dataLayer.push({
  event:  "send_event",
  name: "page_view",
  source: "source20220611-2",
  medium: "medium20220611-2",
  campaign: "campaign20220611-2",
})

Request payload hit 1

  • cid: 1150268895.1654941077
  • cn: pv_campaign20220611-2
  • cs: pv_source20220611-2
  • cm: pv_medium20220611-2
  • sid: 1654941077
  • dl: https://ga4000.weebly.com/?test=20220611-2&utm_source=url_source_20220611-2&utm_medium=url_medium_20220611-2&utm_campaign=url_campaing_20220611-2
  • en: page_view

dataLayer push clear DLV:

dataLayer.push({
  event: "clear_sources",
  source: undefined,
  medium: undefined,
  campaign: undefined,
})

dataLayer push page_view with user_id:

dataLayer.push({
  event:  "send_event",
  name: "page_view",
  site_id: "20220611-2",
})

Request payload hit 2

  • cid: 1150268895.1654941077
  • uid: 20220611-2
  • sid: 1654941077
  • dl: https://ga4000.weebly.com/?test=20220611-2&utm_source=url_source_20220611-2&utm_medium=url_medium_20220611-2&utm_campaign=url_campaing_20220611-2
  • en: page_view

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

In this case, the first page_view sends a source / medium different from the UTM params, and the second hit doesn’t send source / medium but gets it from the UTM params. As a result, the session has hits with and without user_id and more than one source / medium variant, which leads to direct.

Event without USER_ID and with source / medium, then page_view with USER_ID and with source / medium from UTM params 🔥

Steps

Go to:

https://ga4000.weebly.com/?test=20220611-4&utm_source=url_source_20220611-4&utm_medium=url_medium_20220611-4&utm_campaign=url_campaing_20220611-4

dataLayer push event:

dataLayer.push({
  event:  "send_event",
  source: "source20220611-4",
  medium: "medium20220611-4",
  campaign: "campaign20220611-4",
})

Request payload hit 1

  • cid: 384254225.1654941247
  • cn: event_campaign20220611-4
  • cs: event_source20220611-4
  • cm: event_medium20220611-4
  • sid: 1654941246
  • dl: https://ga4000.weebly.com/?test=20220611-4&utm_source=url_source_20220611-4&utm_medium=url_medium_20220611-4&utm_campaign=url_campaing_20220611-4
  • en: test

dataLayer push clear DLV:

dataLayer.push({
  event: "clear_sources",
  source: undefined,
  medium: undefined,
  campaign: undefined,
})

dataLayer push page_view with user_id:

dataLayer.push({
  event:  "send_event",
  name: "page_view",
  site_id: "20220611-4",
})

Request payload hit 2

  • cid: 384254225.1654941247
  • uid: 20220611-4
  • sid: 1654941246
  • dl: https://ga4000.weebly.com/?test=20220611-4&utm_source=url_source_20220611-4&utm_medium=url_medium_20220611-4&utm_campaign=url_campaing_20220611-4
  • en: page_view

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface
Results in GA4 interface

Interpretation

This experiment crushed me completely. I assumed that if a user has only one session, Session source / medium and First user source / medium should have the same value. But that’s not always true.

page_view without USER_ID, then page_view with USER_ID. Source / medium in UTM params

Steps

Go to:

https://ga4000.weebly.com/?test=20220612-1&utm_source=url_source_20220612-1&utm_medium=url_medium_20220612-1&utm_campaign=url_campaing_20220612-1

dataLayer push page_view 1:

dataLayer.push({
  event:  "send_event",
  name: "page_view",
})

Request payload hit 1

  • cid: 906593365.1655039373
  • sid: 1655039373
  • dl: https://ga4000.weebly.com/?test=20220612-1&utm_source=url_source_20220612-1&utm_medium=url_medium_20220612-1&utm_campaign=url_campaing_20220612-1
  • en: page_view

dataLayer push page_view 2:

dataLayer.push({
  event:  "send_event",
  site_id: "20220612-1",
  name: "page_view",
})

Request payload hit 2

  • cid: 906593365.1655039373
  • uid: 20220612-1
  • sid: 1655039373
  • dl: https://ga4000.weebly.com/?test=20220612-1&utm_source=url_source_20220612-1&utm_medium=url_medium_20220612-1&utm_campaign=url_campaing_20220612-1
  • en: page_view

Results in BigQuery

Results in BigQuery

Results in GA4 interface

Results in GA4 interface

Interpretation

This is the only working approach I found for «registration sessions»: no hit, with or without user_id, sends any source / medium, and all of them get it from the UTM params.

P.S.

If you read the whole post and found it useful or even interesting, congratulations: you are definitely a member of GA4’s ghostbusters team. I encourage you to try to reproduce some of the tests yourself and share your findings with the community. If you have any questions or ideas, please message me on LinkedIn.

GA4 ghostbusters team