Bug 1852098 - Move and update Use Counter documentation r=emilio

Differential Revision: https://phabricator.services.mozilla.com/D193247
Chris H-C 2023-11-15 21:33:09 +00:00
parent adcd2dc7f9
commit 5c4d5f65c2
5 changed files with 166 additions and 108 deletions


@ -16,3 +16,4 @@ These linked pages contain design documents for the DOM implementation in Gecko.
ioutils_migration
fedcm
streams
use-counters

dom/docs/use-counters.rst (new file, 165 lines)

@ -0,0 +1,165 @@
============
Use Counters
============
Use counters report statistics on how often a given web platform feature is used across the Web.
Supported features include:
* WebIDL methods and attributes (getters and setters are reported separately) for pages, documents, and workers,
* CSS properties (including properties that aren't part of the web platform but that we're interested in),
* Deprecated DOM operations,
* Other things like SVG filters and APIs specifically unsupported in Private Browsing Mode,
via custom use counters.
Adding a new Use Counter
========================
How you add a new use counter depends on what kind of web platform feature you're instrumenting.
The one constant is that you must run ``./mach gen-use-counter-metrics``
after adding or removing a use counter.
(Why this is a manual step and not part of the build is explained in
`the implementation bug 1852098 <https://bugzilla.mozilla.org/show_bug.cgi?id=1852098#c11>`_.)
WebIDL Methods and Attributes
-----------------------------
Use counters for WebIDL Methods and Attributes are added manually by editing
:searchfox:`UseCounters.conf <dom/base/UseCounters.conf>` or, for workers,
:searchfox:`UseCountersWorker.conf <dom/base/UseCountersWorker.conf>`, and
by annotating the WebIDL Method or Attribute with the ``[UseCounter]``
extended attribute.
(You must write this in two places because generating things from
bindings codegen and ensuring all the dependencies were correct proved to be
rather difficult.)
Then run ``./mach gen-use-counter-metrics`` and build as normal.
CSS Properties
--------------
Use counters for CSS properties are automatically generated for every property Gecko supports.
To add a use counter for a CSS property that isn't supported by Gecko,
add it to :searchfox:`counted_unknown_properties.py <servo/components/style/properties/counted_unknown_properties.py>`.
Then run ``./mach gen-use-counter-metrics`` and build as normal.
Deprecated DOM operations
-------------------------
Use counters for deprecated DOM operations are declared in
:searchfox:`nsDeprecatedOperationList.h <dom/base/nsDeprecatedOperationList.h>`.
To add a use counter for a deprecated DOM operation, add an invocation of the
``DEPRECATED_OPERATION(DeprecationReference)`` macro.
The provided parameter must have the same value as the deprecation note added to the *IDL* file.
See `bug 1860635 <https://bugzilla.mozilla.org/show_bug.cgi?id=1860635>`_ for a sample
deprecated operation.
Then run ``./mach gen-use-counter-metrics`` and build as normal.
Custom use counters
-------------------
Custom use counters are for counting per-page, per-document, or per-worker
uses of web platform features that can't be handled directly through WebIDL annotation.
For example, the use of specific SVG filters isn't a WebIDL method or attribute,
but is still an aspect of the web platform of interest.
To add a custom use counter, define it in
:searchfox:`UseCounters.conf <dom/base/UseCounters.conf>` or, for workers,
:searchfox:`UseCountersWorker.conf <dom/base/UseCountersWorker.conf>`
by following the instructions in the file.
Broadly, you'll be writing a line like ``custom feBlend uses the feBlend SVG filter``.
Then run ``./mach gen-use-counter-metrics`` and build as normal.
The build generates an enumerator in ``enum class UseCounter``
for your use counter, which you should pass to
``Document::SetUseCounter()`` when the feature is used.
``Document::SetUseCounter()`` is very cheap,
so do not be afraid to call it every time the feature is used.
Take care to craft the description appropriately.
It will be appended to "Whether a document " or "Whether a shared worker ",
so write only the ending.
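One way to check how a description ending will read is to prepend the two prefixes quoted above and look at the resulting sentences. The following is a minimal Python sketch; the helper name ``full_descriptions`` is hypothetical and is not part of the real tooling.

```python
# Hypothetical helper for proofreading a description ending.
# It is NOT part of usecounters.py; the two prefixes are the ones quoted above.
PREFIXES = ("Whether a document ", "Whether a shared worker ")


def full_descriptions(ending):
    """Return the full sentences that a description ending would produce."""
    return [prefix + ending for prefix in PREFIXES]


for sentence in full_descriptions("uses the feBlend SVG filter"):
    print(sentence)
```

If either sentence reads awkwardly, rephrase the ending before adding it to the definition file.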
The processor scripts
=====================
The definition files are processed during the build to generate C++ headers
included by web platform components (e.g. DOM) that own the features to be tracked.
The definition files are also processed during ``./mach gen-use-counter-metrics``
to generate :searchfox:`use_counter_metrics.yaml <dom/base/use_counter_metrics.yaml>`,
which defines the Glean metrics needed for recording and reporting use counter data.
gen-usecounters.py
------------------
This script is called by the build system to generate:
- the ``UseCounterList.h`` header for the WebIDL, out of the definition files.
- the ``UseCounterWorkerList.h`` header for the WebIDL, out of the definition files.
usecounters.py
--------------
Contains methods for parsing and transforming use counter definition files,
as well as the mechanism that outputs the Glean use counter metrics definitions.
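To give a feel for what parsing a definition file involves, here is a rough Python sketch. It is not the code in ``usecounters.py``. It assumes the line formats described in the earlier version of this page (``method``, ``attribute``, and ``custom`` declarations, blank lines, and ``//`` comments), and the instructions in the definition files themselves remain authoritative. The function name and the ``ExampleInterface.exampleOperation`` entry are made up for illustration.

```python
# Illustrative sketch only: this is not the real usecounters.py parser.
# Assumed line formats (taken from the earlier version of this page):
#   method <interface>.<operation>
#   attribute <interface>.<attribute>
#   custom <identifier> <description>
# Blank lines and lines starting with "//" are skipped.
def parse_definition_line(line):
    """Parse one definition line; return None for blank lines and comments."""
    text = line.strip()
    if not text or text.startswith("//"):
        return None
    kind, _, rest = text.partition(" ")
    rest = rest.strip()
    if kind in ("method", "attribute"):
        interface, dot, member = rest.partition(".")
        if not (interface and dot and member):
            raise ValueError(f"expected '<interface>.<member>' in {line!r}")
        return {"type": kind, "interface": interface, "member": member}
    if kind == "custom":
        name, _, description = rest.partition(" ")
        description = description.strip()
        if not (name and description):
            raise ValueError(f"expected '<name> <description>' in {line!r}")
        return {"type": "custom", "name": name, "description": description}
    raise ValueError(f"unrecognized declaration: {line!r}")


# ExampleInterface.exampleOperation is a hypothetical entry.
sample = """\
// A comment line
custom feBlend uses the feBlend SVG filter
method ExampleInterface.exampleOperation
"""
counters = [c for c in map(parse_definition_line, sample.splitlines()) if c]
print(counters)
```

Rejecting unrecognized lines outright, rather than skipping them, means a typo in a definition file fails loudly instead of silently dropping a counter.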
Data Review
===========
A Use Counter data collection records, for a given web platform feature,
the number of pages, documents, workers (of various types),
or other broad categories of web platform API surface that *use* it,
and reports that number through a data collection mechanism (like Glean).
This concept was approved for opt-out collection in all products using Gecko and Glean in
`bug 1852098 <https://bugzilla.mozilla.org/show_bug.cgi?id=1852098>`_.
As a result,
if you are adding new use counter data collections for WebIDL methods or attributes,
deprecated operations, or CSS properties,
you almost certainly don't need a data collection review.
If you are adding a custom use counter, you might need a data collection review.
The criterion is whether the custom use counter you're adding
falls under
`the over-arching data collection review request <https://bugzilla.mozilla.org/show_bug.cgi?id=1852098>`_.
For example: a custom use counter for an SVG filter? Clearly a web platform feature being counted.
A custom use counter that solely increments when you visit a social media website?
Doesn't seem like it'd be covered, no.
If unsure, please ask on
`the #data-stewards channel on Matrix <https://chat.mozilla.org/#/room/#data-stewards:mozilla.org>`_.
The Data
========
Use Counters are, as of Firefox 121, collected using Glean as
``counter`` metrics on the "use-counters" ping.
They are organized into metric categories named ``use.counter.X``,
which you can browse on
`the Glean Dictionary <https://dictionary.telemetry.mozilla.org/apps/firefox_desktop?page=1&search=use.counter>`_.
The dictionary also contains information about how to view the data.
Interpreting the data
---------------------
A use counter on its own is minimally useful, as it is solely a count of how many
pages, documents, workers of a specific type, or other web platform API surfaces
used a given part of the web platform.
Knowing a feature was encountered ``0`` times across all of Firefox would be useful
if you wanted to remove something.
Knowing a feature was encountered *more than* ``0`` times would be useful
if you wanted to argue against removing something.
But any other number of, say, pages using a web platform feature is only useful
in context with how many total pages were viewed.
Thus, each use counter has in its description a name of another counter
-- a denominator -- to convert the use counter into a usage rate.
Using pages as an example, knowing the CSS property ``overflow``
is used on ``1504`` pages is... nice. I guess.
But if you sum up ``use.counter.top_level_content_documents_destroyed``
and find that only ``1506`` pages were loaded?
That's a figure we can do something with.
We can order MDN search results by popularity.
We can prioritize performance efforts in Gecko to focus on the most-encountered features.
We can view the popularity over time and see when we expect we'll be able to deprecate and remove the feature.
This is why you'll more likely encounter use counter data expressed as usage rates.
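The conversion from a raw count to a usage rate is a single division. A minimal Python sketch using the figures from the example above (the function name ``usage_rate`` is an illustrative assumption, not an existing API):

```python
# Illustrative only: `usage_rate` is a made-up helper, not part of Glean or Gecko.
def usage_rate(uses, denominator):
    """Convert a use counter into a usage rate using its denominator counter."""
    if denominator <= 0:
        raise ValueError("the denominator counter must be positive")
    return uses / denominator


# 1504 pages used the `overflow` property, out of 1506 top-level documents destroyed.
rate = usage_rate(1504, 1506)
print(f"{rate:.2%}")  # prints "99.87%"
```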


@ -18,7 +18,6 @@ The current data collection possibilities include:
* :doc:`events` can record richer data on individual occurrences of specific actions
* :doc:`Measuring elapsed time <measuring-time>`
* :doc:`Custom pings <custom-pings>`
* :doc:`Use counters <use-counters>` measure the usage of web platform features
* :doc:`Experiment annotations <experiments>`
* :doc:`Remote content uptake <uptake>`
* :doc:`WebExtension API <webextension-api>` can be used in privileged webextensions


@ -1,105 +0,0 @@
============
Use Counters
============
Use counters are used to report Telemetry statistics on whether individual documents
use a given WebIDL method or attribute (getters and setters are reported separately), CSS
property, or deprecated DOM operation. Custom use counters can also be
defined to test frequency of things that don't fall into one of those
categories.
As of Firefox 65 the collection of Use Counters is enabled on all channels.
The API
=======
The process to add a new use counter is different depending on the type of feature that needs
to be measured. In general, for each defined use counter, two separate boolean histograms are generated:
- one describes the use of the tracked feature for individual documents and has the ``_DOCUMENT`` suffix;
- the other describes the use of the same thing for top-level pages (basically what we think of as a *web page*) and has the ``_PAGE`` suffix.
Using two histograms is particularly suited to measure how many sites would be affected by
removing the tracked feature.
Example scenarios:
- Site *X* triggers use counter *Y*. We report "used" (true) in both the ``_DOCUMENT`` and ``_PAGE`` histograms.
- Site *X* does not trigger use counter *Y*. We report "unused" (false) in both the ``_DOCUMENT`` and ``_PAGE`` histograms.
- Site *X* has an iframe for site *W*. Site *W* triggers use counter *Y*, but site *X* does not. We report one "used" and one "unused" in the individual ``_DOCUMENT`` histogram and one "used" in the top-level ``_PAGE`` histogram.
Deprecated DOM operations
-------------------------
Use counters for deprecated DOM operations are declared in the `nsDeprecatedOperationList.h <https://searchfox.org/mozilla-central/source/dom/base/nsDeprecatedOperationList.h>`_ file. The counters are
registered through the ``DEPRECATED_OPERATION(DeprecationReference)`` macro. The provided
parameter must have the same value as the deprecation note added to the *IDL* file.
See this `changeset <https://hg.mozilla.org/mozilla-central/rev/e30a357b25f1>`_ for a sample
deprecated operation.
CSS Properties
~~~~~~~~~~~~~~
Use counters for CSS properties are generated for every property Gecko supports automatically, and are counted via StyleUseCounters (`Rust code <https://searchfox.org/mozilla-central/rev/7ed8e2d3d1d7a1464ba42763a33fd2e60efcaedc/servo/components/style/use_counters/mod.rs>`_, `C++ code <https://searchfox.org/mozilla-central/rev/7ed8e2d3d1d7a1464ba42763a33fd2e60efcaedc/dom/base/Document.h#5077>`_).
The UseCounters registry
------------------------
Use counters for WebIDL methods/attributes are registered in the `UseCounters.conf <https://searchfox.org/mozilla-central/source/dom/base/UseCounters.conf>`_ file. The format of this file is very strict. Each line can be:
1. a blank line
2. a comment, which is a line that begins with ``//``
3. one of three possible use counter declarations:
* ``method <IDL interface name>.<IDL operation name>``
* ``attribute <IDL interface name>.<IDL attribute name>``
* ``custom <any valid identifier> <description>``
Custom use counters
~~~~~~~~~~~~~~~~~~~
The <description> for custom counters will be appended to "When a document " or "When a page ", so phrase it appropriately. For instance, "constructs a Foo object" or "calls Document.bar('some value')". It may contain any character (including whitespace). Custom counters are incremented when SetUseCounter(eUseCounter_custom_MyName) is called on a Document object.
WebIDL methods and attributes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In addition to having a new entry added to the `UseCounters.conf <https://searchfox.org/mozilla-central/source/dom/base/UseCounters.conf>`_ file, WebIDL methods and attributes must have a ``[UseCounter]`` extended attribute in the Web IDL file in order for the counters to be incremented.
Both additions are required because generating things from bindings codegen and ensuring all the dependencies are correct would have been rather difficult.
The processor script
====================
The definition files are processed twice:
- once to generate two C++ headers files, included by the web platform components (e.g. DOM) that own the features to be tracked;
- the other time by the Telemetry component, to generate the histogram definitions that make the collection system work.
.. note::
The histograms that are generated out of use counters are set to *never* expire and are collected from Firefox release. Note that before Firefox 65 they were only collected on pre-release.
gen-usecounters.py
------------------
This script is called by the build system to generate:
- the ``UseCounterList.h`` header for the WebIDL, out of the definition files.
Interpreting the data
=====================
The histogram as accumulated on the client only puts values into the 1 bucket, meaning that
the use counter directly reports if a feature was used but it does not directly report if
it isn't used.
The values accumulated within a use counter should be considered proportional to
``CONTENT_DOCUMENTS_DESTROYED`` and ``TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED`` (see
`here <https://searchfox.org/mozilla-central/rev/1a973762afcbc5066f73f1508b0c846872fe3952/dom/base/Document.cpp#15059-15081>`__). The difference between the values of these two histograms
and the related use counters below tell us how many pages did *not* use the feature in question.
For instance, if we see that a given session has destroyed 30 content documents, but a
particular use counter shows only a count of 5, we can infer that the use counter was *not*
used in 25 of those 30 documents.
Things are done this way, rather than accumulating a boolean flag for each use counter,
to avoid sending histograms for features that don't get widely used. Doing things in this
fashion means smaller telemetry payloads and faster processing on the server side.
Version History
---------------
- Firefox 65:
- Enable Use Counters on release channel (`bug 1477433 <https://bugzilla.mozilla.org/show_bug.cgi?id=1477433>`_)


@ -74,8 +74,6 @@ Most of our data collection happens through :doc:`scalars <../collection/scalars
Both scalars & histograms allow recording by keys. This allows for more flexible, two-level data collection.
Other collections can build on top of scalars & histograms. An example is :doc:`use counters <../collection/use-counters>`, which submit web feature usage through histograms.
We also collect :doc:`environment data <../data/environment>`. This consists of mostly scalar values that capture the “working environment” a Firefox session lives in, and includes e.g. data on hardware, OS, add-ons and some settings. Any data that is part of the "working environment", or needs to split :doc:`subsessions <../concepts/sessions>`, should go into it.
Rich data