Commit 7ee2823

add caching and resolver guides
Fixes #183
1 parent 6c3fce7 commit 7ee2823

2 files changed

Lines changed: 139 additions & 13 deletions

File tree

doc/guides.rst

Lines changed: 137 additions & 11 deletions
@@ -72,20 +72,146 @@ in the process which is both the advantage and risk
7272
device=None,
7373
string='foo')
7474
75-
Cache Customisation
76-
===================
75+
Cache And Other Advanced Parser Customisation
76+
=============================================
7777

78-
.. todo::
78+
While loading custom rulesets has built-in support, other forms of
79+
parser customisations don't and require manually instantiating and
80+
composing :class:`~ua_parser.Resolver` objects.
7981

80-
- how to build a custom resolver stack and wrap it in a parser
81-
- minor discussion of caches
82-
- maybe link to an advanced document about the specifics of
83-
individual caches and their memory consumption?
82+
The most basic such customisation is simply configuring caching away
83+
from the default setup.
84+
85+
As an example, in the default configuration if |re2|_ is available the
86+
RE2-based resolver is not cached; a user might consider the memory
87+
investment worth it and want to reconfigure the stack for a cached
88+
base.
89+
90+
The process is uncomplicated as the APIs are designed to compose
91+
together.
92+
93+
The first step is to instantiate a base resolver, passing it
94+
the relevant :class:`Matchers` data::
95+
96+
    import ua_parser.loaders
97+
    import ua_parser.re2
98+
    base = ua_parser.re2.Resolver(
99+
        ua_parser.loaders.load_lazy_builtins())
100+
101+
The next step is to instantiate the cache [#cache]_ suitably
102+
configured::
103+
104+
    cache = ua_parser.Cache(1000)
105+
106+
And compose the base resolver and cache together::
107+
108+
    resolver = ua_parser.caching.CachingResolver(
109+
        base,
110+
        cache
111+
    )
112+
113+
Finally, for convenience a :class:`ua_parser.Parser` can be wrapped
114+
around the resolver, and that can either be used as-is, or set as the
115+
global parser for all the library users to use this new configuration
116+
from here on::
117+
118+
    ua_parser.parser = ua_parser.Parser(resolver)
119+
120+
.. note::
121+
122+
To be honest, aside from configuring the presence, algorithm, and
123+
size of caches there currently isn't much to compose that's built
124+
in. The only remaining member of the cast is
125+
:class:`~ua_parser.caching.Local`, which is also caching-related,
126+
and serves to use thread-local caches rather than a shared cache.
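The composition pattern described above (a base resolver wrapped by a caching layer) can be sketched without the library. This is a minimal illustration only: `DictCachingResolver`, `slow_resolver`, and the single-argument callable shape are hypothetical stand-ins, not the `ua_parser` API, and the eviction policy is deliberately crude.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a resolver: any callable taking a UA string.
# The real ua_parser resolvers also take a domain and return a PartialResult.
Resolve = Callable[[str], Tuple[str, ...]]


class DictCachingResolver:
    """Wrap a resolver and memoise its results in a bounded dict."""

    def __init__(self, resolver: Resolve, maxsize: int) -> None:
        self.resolver = resolver
        self.maxsize = maxsize
        self.cache: Dict[str, Tuple[str, ...]] = {}

    def __call__(self, ua: str) -> Tuple[str, ...]:
        if ua in self.cache:
            return self.cache[ua]
        result = self.resolver(ua)
        if len(self.cache) >= self.maxsize:
            # crude eviction: drop everything once the cache is full
            self.cache.clear()
        self.cache[ua] = result
        return result


calls: List[str] = []


def slow_resolver(ua: str) -> Tuple[str, ...]:
    # records every call so cache hits can be observed
    calls.append(ua)
    return tuple(ua.split('/'))


cached = DictCachingResolver(slow_resolver, maxsize=2)
cached('fooapp/1.2')
cached('fooapp/1.2')  # second call is served from the cache
```

Because the wrapper is itself a callable with the same shape as what it wraps, it can be stacked or swapped without the caller noticing, which is the property the guide relies on.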
84127

85128
Writing Custom Resolvers
86129
========================
87130

88-
.. todo::
89-
90-
- explanation of the resolver protocol
91-
- maybe a fanout resolver as demo?
131+
It is unclear if there would be any fun or profit to it, but an
132+
express goal of the new API is to allow writing and composing
133+
resolvers, so what is a resolver?
134+
135+
:class:`~ua_parser.Resolver` is a structural :py:class:`typing.Protocol` for
136+
implementation convenience (nothing to inherit, and not even a class
137+
to write). Here it is in full::
138+
139+
    class Resolver(Protocol):
140+
        @abc.abstractmethod
141+
        def __call__(self, ua: str, domain: Domain, /) -> PartialResult:
142+
            ...
143+
144+
So a :class:`~ua_parser.Resolver` is just a callable which takes a
145+
string and a :class:`~ua_parser.Domain`, and returns a
146+
:class:`~ua_parser.PartialResult`.
147+
148+
For our first resolver, let's say that we have an API and a mobile
149+
application, and as we expect the mobile application to be the main
150+
caller we want to special-case it. We could do it in many ways, but the
151+
way we're doing it is a bespoke :class:`~ua_parser.Resolver` which
152+
matches the application's user agent and performs trivial parsing::
153+
154+
    def foo_resolver(ua: str, domain: Domain, /) -> PartialResult:
155+
        if not ua.startswith('fooapp/'):
156+
            # not our application, match failure
157+
            return PartialResult(domain, None, None, None, ua)
158+
159+
        # we've defined our UA as $appname/$version/$user-token
160+
        app, version, user = ua.split('/', 2)
161+
        major, minor = version.split('.')
162+
        return PartialResult(
163+
            domain,
164+
            UserAgent(app, major, minor),
165+
            None,
166+
            Device(user),
167+
            ua,
168+
        )
169+
170+
This resolver is not hugely interesting as it resolves a very limited
171+
number of user agent strings and fails everything else, although it
172+
does demonstrate two important requirements of the protocol:
173+
174+
- If a domain is requested, it must be returned, even if ``None``
175+
(signaling a matching failure).
176+
- If it's efficient there is nothing wrong with returning data for
177+
domains which were not requested; at worst they will be ignored.
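Both requirements can be checked on a self-contained version of the resolver. The types below (`Domain`, `UserAgent`, `Device`, `PartialResult`) are simplified stand-ins written for this sketch; the real `ua_parser` classes may differ in fields and defaults.

```python
from enum import Flag, auto
from typing import NamedTuple, Optional


# Stand-ins for illustration only, not the ua_parser definitions.
class Domain(Flag):
    USER_AGENT = auto()
    OS = auto()
    DEVICE = auto()


class UserAgent(NamedTuple):
    family: str
    major: Optional[str] = None
    minor: Optional[str] = None


class Device(NamedTuple):
    family: str


class PartialResult(NamedTuple):
    domains: Domain
    user_agent: Optional[UserAgent]
    os: Optional[str]
    device: Optional[Device]
    string: str


def foo_resolver(ua: str, domain: Domain, /) -> PartialResult:
    if not ua.startswith('fooapp/'):
        # requested domains are still returned, as None
        return PartialResult(domain, None, None, None, ua)
    # split at most twice so a '/' inside the user token is preserved
    app, version, user = ua.split('/', 2)
    major, minor = version.split('.')
    return PartialResult(
        domain, UserAgent(app, major, minor), None, Device(user), ua)


# Only the user agent is requested; the device comes back as a bonus.
hit = foo_resolver('fooapp/1.2/abc123', Domain.USER_AGENT)
miss = foo_resolver('Mozilla/5.0', Domain.USER_AGENT)
```

Here `hit` carries a device even though only the user agent domain was asked for, and `miss` carries `None` for every domain while still echoing the input string.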
178+
179+
For a more interesting resolver, we can write a *fallback* resolver:
180+
it's a higher-order resolver which tries to call multiple
181+
sub-resolvers in sequence until the UA is resolved. This means we
182+
could then use something like::
183+
184+
    Parser(FallbackResolver([
185+
        foo_resolver,
186+
        re2.Resolver(load_lazy_builtins()),
187+
    ]))
188+
189+
to prioritise cheap resolving of our application while still resolving
190+
third party user agents::
191+
192+
    class FallbackResolver:
193+
        def __init__(self, resolvers: List[Resolver]) -> None:
194+
            self.resolvers = resolvers
195+
196+
        def __call__(self, ua: str, domain: Domain, /) -> PartialResult:
197+
            if domain:
198+
                for resolver in self.resolvers:
199+
                    r = resolver(ua, domain)
200+
                    # if any value is non-none the resolver found a match
201+
                    if r.user_agent is not None \
202+
                    or r.os is not None \
203+
                    or r.device is not None:
204+
                        return r
205+
206+
            # if no resolver found a match (or nothing was requested),
207+
            # resolve to failure
208+
            return PartialResult(domain, None, None, None, ua)
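Since the protocol is structural, the same fallback behaviour can also be written as a closure rather than a class. The sketch below is self-contained and uses a simplified stand-in `PartialResult` (field names assumed, domains reduced to an `int`); `fallback`, `app_only`, and `catch_all` are hypothetical names used only for the demonstration.

```python
from typing import Callable, List, NamedTuple, Optional


class PartialResult(NamedTuple):
    # Stand-in for illustration; field names are assumed.
    domains: int
    user_agent: Optional[str]
    os: Optional[str]
    device: Optional[str]
    string: str


Resolver = Callable[[str, int], PartialResult]


def fallback(resolvers: List[Resolver]) -> Resolver:
    """Return a resolver that tries each sub-resolver in order."""
    def resolve(ua: str, domain: int) -> PartialResult:
        if domain:
            for resolver in resolvers:
                r = resolver(ua, domain)
                # any non-None field means this resolver matched
                if (r.user_agent is not None
                        or r.os is not None
                        or r.device is not None):
                    return r
        # nothing matched, or nothing was requested
        return PartialResult(domain, None, None, None, ua)
    return resolve


def app_only(ua: str, domain: int) -> PartialResult:
    family = 'fooapp' if ua.startswith('fooapp/') else None
    return PartialResult(domain, family, None, None, ua)


def catch_all(ua: str, domain: int) -> PartialResult:
    return PartialResult(domain, 'other', None, None, ua)


resolve = fallback([app_only, catch_all])
first = resolve('fooapp/1.2/abc123', 1)   # matched by app_only
second = resolve('Mozilla/5.0', 1)        # falls through to catch_all
nothing = resolve('Mozilla/5.0', 0)       # no domain requested
```

Order matters: the cheap, specific resolver goes first so that the expensive general one only runs when the cheap one fails.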
209+
210+
.. [#cache] If it has been written yet, see :doc:`advanced/caches` for
211+
way too much information you probably don't care about if you just
212+
want to parse user agent strings.
213+
214+
The tl;dr is that a bigger cache increases hit rates, which decreases costs
215+
but uses more memory, and while really easy to write in Python an
216+
:class:`~ua_parser.caching.Lru` is a pretty bad cache all things
217+
considered.

src/ua_parser/caching.py

Lines changed: 2 additions & 2 deletions
@@ -292,8 +292,8 @@ class CachingResolver:
292292
293293
"""
294294

295-
    def __init__(self, parser: Resolver, cache: Cache):
296-
        self.parser: Resolver = parser
295+
    def __init__(self, resolver: Resolver, cache: Cache):
296+
        self.parser: Resolver = resolver
297297
        self.cache: Cache = cache
298298

299299
    def __call__(self, ua: str, domains: Domain, /) -> PartialResult:
