Commit 6c3fce7

Initial doc: installation, quickstart, API doc
Trimmed the readme, and added a bit on re2. Had to update a ton of docstrings to have a decent API doc. Also removed a bunch of leftover references to parsers, and removed the completely useless `Parse` bit from `ParseResult`, `PartialParseResult`, and `DefaultedParseResult`, which further contributed to the churn and every file in the project being touched, as it required editing files without docstrings. Fixes #182
1 parent 4282003 commit 6c3fce7

25 files changed

Lines changed: 724 additions & 184 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@ dist/
 tmp/
 regexes.yaml
 _regexes.py
+doc/_build

README.rst

Lines changed: 29 additions & 57 deletions
@@ -30,8 +30,20 @@ Just add ``ua-parser`` to your project's dependencies, or run
 
 to install in the current environment.
 
-Getting Started
----------------
+Installing `google-re2 <https://pypi.org/project/google-re2/>`_ is
+*strongly* recommended as it leads to *significantly* better
+performances. This can be done directly via the ``re2`` optional
+dependency:
+
+.. code-block:: sh
+
+$ pip install 'ua_parser[re2]'
+
+If ``re2`` is available, ``ua-parser`` will simply use it by default
+instead of the pure-python resolver.
+
+Quick Start
+-----------
 
 Retrieve all data on a user-agent string
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -41,25 +53,25 @@ Retrieve all data on a user-agent string
 >>> from ua_parser import parse
 >>> ua_string = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'
 >>> parse(ua_string) # doctest: +NORMALIZE_WHITESPACE, +ELLIPSIS
-ParseResult(user_agent=UserAgent(family='Chrome',
-major='41',
-minor='0',
-patch='2272',
-patch_minor='104'),
-os=OS(family='Mac OS X',
-major='10',
-minor='9',
-patch='4',
-patch_minor=None),
-device=Device(family='Mac',
-brand='Apple',
-model='Mac'),
-string='Mozilla/5.0 (Macintosh; Intel Mac OS...
+Result(user_agent=UserAgent(family='Chrome',
+major='41',
+minor='0',
+patch='2272',
+patch_minor='104'),
+os=OS(family='Mac OS X',
+major='10',
+minor='9',
+patch='4',
+patch_minor=None),
+device=Device(family='Mac',
+brand='Apple',
+model='Mac'),
+string='Mozilla/5.0 (Macintosh; Intel Mac OS...
 
 Any datum not found in the user agent string is set to ``None``::
 
 >>> parse("")
-ParseResult(user_agent=None, os=None, device=None, string='')
+Result(user_agent=None, os=None, device=None, string='')
 
 Extract only browser data from user-agent string
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -94,43 +106,3 @@ Extract device information from user-agent string
 >>> ua_string = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'
 >>> parse_device(ua_string)
 Device(family='Mac', brand='Apple', model='Mac')
-
-Parser
-~~~~~~
-
-Parsers expose the same functions (``parse``, ``parse_user_agent``,
-``parse_os``, and ``parse_device``) as the top-level of the package,
-however these are all *utility* methods.
-
-The actual protocol of parsers, and the one method which must be
-implemented / overridden is::
-
-def __call__(self, str, Components, /) -> ParseResult:
-
-It's similar to but more flexible than ``parse``:
-
-- The ``str`` is the user agent string.
-- The ``Components`` is a hint, through which the caller requests the
-domain (component) they are looking for, any combination of
-``Components.USER_AGENT``, ``Components.OS``, and
-``Components.DEVICE``. ``Domains.ALL`` exists as a convenience alias
-for the combination of all three.
-
-The parser *must* return at least the requested information, but if
-that's more convenient or no more expensive it *can* return more.
-- The ``ParseResult`` is similar to ``CompleteParseResult``, except
-all the attributes are ``Optional`` and it has a ``components:
-Components`` attribute which specifies whether a component was never
-requested (its value for the user agent string is unknown) or it has
-been requested but could not be resolved (no match was found for the
-user agent).
-
-``ParseResult.complete()`` convert to a ``CompleteParseResult`` if
-all the components are set, and raise an exception otherwise. If
-some of the components are set to ``None``, they'll be swapped for a
-default value.
-
-Calling the parser directly is part of the public API. One of the
-advantage is that it does not return default values, as such it allows
-more easily differentiating between a non-match (= ``None``) and a
-default fallback (``family = "Other"``).
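
Note on the README change: the quick start states that any datum not found in the user agent string is set to ``None``, so calling code has to check for it before reading attributes. Below is a minimal sketch of that check. The ``UserAgent`` and ``Result`` classes here are hypothetical stand-ins that only mirror the field names visible in the doctest output above (they are not the library's classes), and ``browser_label`` is an illustrative helper name, not part of ``ua_parser``.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins: they copy only the field names shown in the
# README doctest output; the real classes are provided by ua_parser.
@dataclass(frozen=True)
class UserAgent:
    family: str
    major: Optional[str] = None


@dataclass(frozen=True)
class Result:
    user_agent: Optional[UserAgent]
    os: Optional[object]
    device: Optional[object]
    string: str


def browser_label(result: Result) -> str:
    """Return 'Family major', or 'unknown' when no user agent matched."""
    ua = result.user_agent
    if ua is None:
        # No match: the README documents this case as None, not a default.
        return "unknown"
    return ua.family if ua.major is None else f"{ua.family} {ua.major}"


matched = Result(UserAgent("Chrome", "41"), None, None, "Mozilla/5.0 ...")
unmatched = Result(None, None, None, "")
print(browser_label(matched))
print(browser_label(unmatched))
```

With the real package, the same check would presumably apply to the value returned by ``parse``, since the doctest above shows ``user_agent=None`` for an empty string.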

doc/Makefile

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
# Minimal makefile for Sphinx documentation
2+
#
3+
4+
# You can set these variables from the command line, and also
5+
# from the environment for the first two.
6+
SPHINXOPTS ?=
7+
SPHINXBUILD ?= sphinx-build
8+
SOURCEDIR = .
9+
BUILDDIR = _build
10+
11+
# Put it first so that "make" without argument is like "make help".
12+
help:
13+
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
14+
15+
.PHONY: help Makefile
16+
17+
# Catch-all target: route all unknown targets to Sphinx using the new
18+
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
19+
%: Makefile
20+
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

doc/_templates/navigation.html

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
<h3>{{ _('Navigation') }}</h3>
2+
{{ toctree(includehidden=theme_sidebar_includehidden, collapse=theme_sidebar_collapse, maxdepth=3) }}
3+
{% if theme_extra_nav_links %}
4+
<hr />
5+
<ul>
6+
{% for text, uri in theme_extra_nav_links.items() %}
7+
<li class="toctree-l1"><a href="{{ uri }}">{{ text }}</a></li>
8+
{% endfor %}
9+
</ul>
10+
{% endif %}

doc/api.rst

Lines changed: 169 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,169 @@
1+
===
2+
API
3+
===
4+
5+
Global Helpers
6+
--------------
7+
8+
.. module:: ua_parser
9+
10+
.. autofunction:: parse
11+
12+
.. autofunction:: parse_user_agent
13+
14+
.. autofunction:: parse_os
15+
16+
.. autofunction:: parse_device
17+
18+
.. autodata:: parser
19+
20+
Core Types
21+
----------
22+
23+
.. autoclass:: Resolver
24+
:members:
25+
:special-members: __call__
26+
27+
.. autoclass:: Domain
28+
:members:
29+
:member-order: bysource
30+
31+
.. autoclass:: Parser
32+
:members:
33+
34+
Data Types
35+
----------
36+
37+
These are the various types produced by successfully resolving a user
38+
agent string. They are guaranteed to be `dataclasses
39+
<https://docs.python.org/3/library/dataclasses.html>`_, and using
40+
dataclass utility functions is officially supported.
41+
42+
.. autoclass:: PartialResult
43+
:members:
44+
45+
.. autoclass:: Result
46+
:members:
47+
48+
.. autoclass:: UserAgent
49+
50+
.. autoclass:: OS
51+
52+
.. autoclass:: Device
53+
54+
.. autoclass:: DefaultedResult
55+
:members:
56+
57+
58+
Base Resolvers
59+
--------------
60+
61+
Base resolvers take sets of :class:`~ua_parser.core.Matchers`
62+
generated by :ref:`loaders <Loading>`, and use them to extract data
63+
from user agent strings.
64+
65+
.. autoclass:: ua_parser.basic.Resolver(Matchers)
66+
67+
.. class:: ua_parser.re2.Resolver(Matchers)
68+
69+
An advanced resolver based around |re2|_'s ``FilteredRE2`` feature,
70+
which efficiently prunes the number of possibly matching matchers
71+
before actually running them.
72+
73+
Sufficiently fast that a cache may not be necessary, and may even
74+
be detrimental at smaller cache sizes
75+
76+
.. warning:: Only available if |re2|_ is installed.
77+
78+
Eager Matchers
79+
''''''''''''''
80+
81+
.. automodule:: ua_parser.matchers
82+
:members:
83+
:member-order: bysource
84+
:show-inheritance:
85+
86+
Lazy Matchers
87+
'''''''''''''
88+
89+
These matchers will lazily compile their
90+
:attr:`~ua_parser.core.Matcher.pattern` to an :class:`re.Pattern`.
91+
92+
93+
While this saves CPU upfront, this is most useful with resolvers which
94+
likely will *not* need to apply most of them, like
95+
:class:`ua_parser.re2.Resolver`. If the resolver will very likely need
96+
to apply (and thus compile) every pattern like
97+
:class:`ua_parser.basic.Resolver`, then lazy compilation has a higher
98+
overhead.
99+
100+
.. automodule:: ua_parser.lazy
101+
:members:
102+
:member-order: bysource
103+
:show-inheritance:
104+
105+
Caching
106+
-------
107+
108+
Web clients commonly have multiple interactions with a given system,
109+
leading to significant repetition in user agents encountered. A cache
110+
allows making use of that to avoid redundant parses, at the cost of
111+
memory. This is most useful for slow base resolvers like :class:`the
112+
basic resolver <ua_parser.basic.Resolver>`.
113+
114+
.. autoclass:: ua_parser.caching.Cache
115+
:members: __getitem__, __setitem__
116+
117+
.. autoclass:: ua_parser.CachingResolver
118+
:members:
119+
120+
.. autoclass:: ua_parser.Cache
121+
122+
.. module:: ua_parser.caching
123+
124+
.. autoclass:: S3Fifo
125+
126+
.. autoclass:: Sieve
127+
128+
.. autoclass:: Lru
129+
130+
.. autoclass:: Local
131+
132+
.. _loading:
133+
134+
Loading
135+
-------
136+
137+
.. autoclass:: ua_parser.core.Matchers
138+
139+
.. autoclass:: ua_parser.core.Matcher
140+
:members:
141+
:special-members: __call__
142+
143+
.. autofunction:: ua_parser.load_builtins() -> Matchers
144+
145+
.. autofunction:: ua_parser.load_lazy_builtins() -> Matchers
146+
147+
Custom `regexes.yaml`_ data
148+
'''''''''''''''''''''''''''
149+
150+
.. module:: ua_parser.loaders
151+
152+
.. autofunction:: load_data(MatchersData) -> Matchers
153+
154+
.. autofunction:: load_lazy(MatchersData) -> Matchers
155+
156+
.. autofunction:: load_json
157+
158+
.. function:: load_yaml(f: PathOrFile, loader: DataLoader = load_data) -> Matchers
159+
160+
Loads YAML data following the ``regexes.yaml`` structure.
161+
162+
The ``loader`` parameter customises which matcher variant is
163+
generated, by default :func:`load_data` is used to generate eager
164+
matchers, :func:`load_lazy` can be used to generate lazy matchers
165+
instead.
166+
167+
.. warning:: Only available if |pyyaml|_ is installed.
168+
169+
.. _regexes.yaml: https://github.com/ua-parser/uap-core/blob/master/docs/specification.md
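
Note on the Caching section of ``doc/api.rst``: the text argues that repeated user agents make a cache worthwhile for slow base resolvers, at the cost of memory. The sketch below illustrates that trade-off with a small least-recently-used cache wrapped around a resolver-like callable. It is not the library's implementation: ``LruSketch``, ``cached``, and ``slow_resolver`` are hypothetical names, and only the ``__getitem__`` / ``__setitem__`` shape is borrowed from the members listed for ``ua_parser.caching.Cache``.

```python
from collections import OrderedDict
from typing import Callable, List, Optional


class LruSketch:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, maxsize: int) -> None:
        self.maxsize = maxsize
        self._data: "OrderedDict[str, str]" = OrderedDict()

    def __getitem__(self, key: str) -> Optional[str]:
        value = self._data.get(key)
        if value is not None:
            # A hit makes the entry the most recently used one.
            self._data.move_to_end(key)
        return value

    def __setitem__(self, key: str, value: str) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            # Drop the oldest entry to keep memory bounded.
            self._data.popitem(last=False)


def cached(resolver: Callable[[str], str], cache: LruSketch) -> Callable[[str], str]:
    """Wrap a resolver so repeated user agents skip the expensive call."""

    def resolve(ua: str) -> str:
        hit = cache[ua]
        if hit is not None:
            return hit
        result = resolver(ua)
        cache[ua] = result
        return result

    return resolve


calls: List[str] = []


def slow_resolver(ua: str) -> str:
    """Stand-in for an expensive parse; records every real invocation."""
    calls.append(ua)
    return ua.split("/", 1)[0]


resolve = cached(slow_resolver, LruSketch(maxsize=2))
resolve("Chrome/41")
resolve("Chrome/41")  # served from the cache, slow_resolver is not called
print(len(calls))
```

A very small ``maxsize`` makes evictions frequent, which is one plausible reading of the remark that a cache "may even be detrimental at smaller cache sizes" for an already fast resolver.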

doc/conf.py

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
# Configuration file for the Sphinx documentation builder.
2+
#
3+
# For the full list of built-in configuration values, see the documentation:
4+
# https://www.sphinx-doc.org/en/master/usage/configuration.html
5+
import os
6+
import sys
7+
8+
sys.path.insert(0, os.path.normpath(os.path.join(__file__, "../..", "src")))
9+
# -- Project information -----------------------------------------------------
10+
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
11+
12+
project = "UA Parser"
13+
copyright = "2024, UA Parser Project"
14+
author = "UA Parser Project"
15+
16+
version = "1.0"
17+
release = "1.0"
18+
19+
rst_epilog = """
20+
.. |pyyaml| replace:: ``PyYaml``
21+
.. |re2| replace:: ``google-re2``
22+
23+
.. _pyyaml: https://pyyaml.org
24+
.. _re2: https://pypi.org/project/google-re2
25+
"""
26+
27+
# -- General configuration ---------------------------------------------------
28+
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
29+
30+
extensions = [
31+
"sphinx.ext.autodoc",
32+
"sphinx.ext.todo",
33+
"sphinx.ext.viewcode",
34+
"sphinx.ext.intersphinx",
35+
]
36+
37+
templates_path = ["_templates"]
38+
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
39+
40+
language = "en"
41+
42+
html_theme = "alabaster"
43+
44+
intersphinx_mapping = {"python": ("https://docs.python.org/3", None)}
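
Note on the ``sys.path.insert`` line in ``doc/conf.py``: it joins ``__file__`` with ``"../.."`` and ``"src"``, then relies on ``normpath`` to collapse the result into the repository's ``src`` directory so autodoc can import the package. A minimal sketch of that resolution follows. The ``/repo`` layout and the ``source_root`` helper are illustrative assumptions, and ``posixpath`` is used instead of ``os.path`` only so the result is the same on every platform.

```python
import posixpath


def source_root(conf_file: str) -> str:
    """Mirror the conf.py expression for a given conf.py location."""
    # join() yields '<conf_file>/../../src'. normpath() then collapses the
    # two '..' segments lexically: the first cancels the file name itself,
    # the second cancels the 'doc' directory.
    return posixpath.normpath(posixpath.join(conf_file, "../..", "src"))


print(source_root("/repo/doc/conf.py"))  # -> /repo/src
```

Because ``normpath`` works purely on the string, the expression does not need ``os.path.dirname``, although it would resolve differently if ``doc`` were reached through a symlink.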
