@@ -72,20 +72,146 @@ in the process which is both the advantage and risk
7272 device=None,
7373 string='foo')
7474
75- Cache Customisation
76- ===================
75+ Cache And Other Advanced Parser Customisation
76+ =============================================
7777
78- .. todo::
78+ While loading custom rulesets has built-in support, other forms of
79+ parser customisation do not: they require manually instantiating and
80+ composing :class:`~ua_parser.Resolver` objects.
7981
80- - how to build a custom resolver stack and wrap it in a parser
81- - minor discussion of caches
82- - maybe link to an advanced document about the specifics of
83- individual caches and their memory consumption?
82+ The most basic such customisation is simply configuring caching away
83+ from the default setup.
84+
85+ As an example, in the default configuration the RE2-based resolver
86+ is not cached if |re2|_ is available. A user might consider the
87+ memory investment worth it and want to reconfigure the stack for a
88+ cached base.
89+
90+ The process is uncomplicated as the APIs are designed to compose
91+ together.
92+
93+ The first step is to instantiate a base resolver with the relevant
94+ :class:`Matchers` data::
95+
96+     import ua_parser.loaders
97+     import ua_parser.re2
98+     base = ua_parser.re2.Resolver(
99+         ua_parser.loaders.load_lazy_builtins())
100+
101+ The next step is to instantiate the cache [#cache]_ suitably
102+ configured::
103+
104+     cache = ua_parser.Cache(1000)
105+
106+ And compose the base resolver and cache together::
107+
108+     resolver = ua_parser.caching.CachingResolver(
109+         base,
110+         cache
111+     )
112+
113+ Finally, for convenience a :class:`ua_parser.Parser` can be wrapped
114+ around the resolver. It can either be used as-is, or set as the
115+ global parser so that all users of the library pick up the new
116+ configuration from here on::
117+
118+     ua_parser.parser = ua_parser.Parser(resolver)
119+
120+ .. note::
121+
122+    To be honest, aside from configuring the presence, algorithm, and
123+    size of caches there currently isn't much to compose that's built
124+    in. The only remaining member of the cast is
125+    :class:`~ua_parser.caching.Local`, which is also caching-related,
126+    and serves to use thread-local caches rather than a shared cache.
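
To make the composition above a little more concrete, here is a rough, self-contained sketch of what "wrapping a base resolver in a cache" amounts to. It is not the library's implementation: ``LruCachingResolver`` and ``slow_base`` are hypothetical names, the cache is keyed on the ``(ua, domain)`` pair for simplicity, and eviction is a plain LRU built on :py:class:`collections.OrderedDict`.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable, List, Tuple


class LruCachingResolver:
    """Hypothetical illustration only, not ua_parser's CachingResolver."""

    def __init__(self, base: Callable[[str, Hashable], Any], maxsize: int) -> None:
        self.base = base
        self.maxsize = maxsize
        # insertion order doubles as recency order: oldest entry first
        self.entries: "OrderedDict[Tuple[str, Hashable], Any]" = OrderedDict()

    def __call__(self, ua: str, domain: Hashable, /) -> Any:
        key = (ua, domain)
        if key in self.entries:
            # cache hit: mark the entry as most recently used
            self.entries.move_to_end(key)
            return self.entries[key]
        # cache miss: pay for the base resolver, then remember the result
        result = self.base(ua, domain)
        self.entries[key] = result
        if len(self.entries) > self.maxsize:
            # evict the least recently used entry
            self.entries.popitem(last=False)
        return result


calls: List[str] = []


def slow_base(ua: str, domain: Hashable, /) -> Any:
    # stand-in for an expensive regex-based resolver
    calls.append(ua)
    return (ua.upper(), domain)


cached = LruCachingResolver(slow_base, maxsize=2)
cached("a", "os")
cached("a", "os")  # served from the cache, slow_base is not called again
```

The wrapper has the same call signature as the resolver it wraps, which is why caches, base resolvers, and parsers can be stacked freely.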
84127
85128 Writing Custom Resolvers
86129 ========================
87130
88- .. todo::
89-
90- - explanation of the resolver protocol
91- - maybe a fanout resolver as demo?
131+ It is unclear if there would be any fun or profit to it, but an
132+ express goal of the new API is to allow writing and composing
133+ resolvers, so what is a resolver?
134+
135+ :class:`~ua_parser.Resolver` is a structural :py:class:`typing.Protocol` for
136+ implementation convenience (nothing to inherit, and not even a class
137+ to write). Here it is in full::
138+
139+     class Resolver(Protocol):
140+         @abc.abstractmethod
141+         def __call__(self, ua: str, domain: Domain, /) -> PartialResult:
142+             ...
143+
144+ So a :class:`~ua_parser.Resolver` is just a callable which takes a
145+ string and a :class:`~ua_parser.Domain`, and returns a
146+ :class:`~ua_parser.PartialResult`.
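
As a small illustration of what "structural" means here, the sketch below shows a plain function and an instance of an unrelated class both being accepted as resolvers, with no inheritance involved. Everything in it is a stand-in defined locally so that it runs on its own: ``Domain`` and ``PartialResult`` only mimic the shape of the library's types, and marking the protocol as ``runtime_checkable`` is an assumption made for the demo, not something the library is known to do.

```python
from enum import Flag, auto
from typing import List, NamedTuple, Optional, Protocol, runtime_checkable


class Domain(Flag):  # local stand-in for ua_parser.Domain
    USER_AGENT = auto()
    OS = auto()
    DEVICE = auto()


class PartialResult(NamedTuple):  # local stand-in, fields simplified to strings
    domains: Domain
    user_agent: Optional[str]
    os: Optional[str]
    device: Optional[str]
    string: str


@runtime_checkable  # assumption for this demo only
class Resolver(Protocol):
    def __call__(self, ua: str, domain: Domain, /) -> PartialResult: ...


def never_matches(ua: str, domain: Domain, /) -> PartialResult:
    # a plain function is a valid resolver
    return PartialResult(domain, None, None, None, ua)


class AlwaysLinux:
    # so is any object with a compatible __call__
    def __call__(self, ua: str, domain: Domain, /) -> PartialResult:
        return PartialResult(domain, None, "Linux", None, ua)


resolvers: List[Resolver] = [never_matches, AlwaysLinux()]
results = [resolve("foo", Domain.OS) for resolve in resolvers]
```

Note that the runtime check only verifies that ``__call__`` exists; whether the signature actually matches is a job for a static type checker.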
147+
148+ For our first resolver, let's say that we have an API and a mobile
149+ application, and as we expect the mobile application to be the main
150+ caller we want to special-case it. This could be done in many ways,
151+ but here we use a bespoke :class:`~ua_parser.Resolver` which matches
152+ the application's user agent and performs trivial parsing::
153+
154+     def foo_resolver(ua: str, domain: Domain, /) -> PartialResult:
155+         if not ua.startswith('fooapp/'):
156+             # not our application, match failure
157+             return PartialResult(domain, None, None, None, ua)
158+
159+         # we've defined our UA as $appname/$version/$user-token
160+         app, version, user = ua.split('/', 2)
161+         major, minor = version.split('.')
162+         return PartialResult(
163+             domain,
164+             UserAgent(app, major, minor),
165+             None,
166+             Device(user),
167+             ua,
168+         )
169+
170+ This resolver is not hugely interesting as it resolves a very limited
171+ number of user agent strings and fails everything else, although it
172+ does demonstrate two important requirements of the protocol:
173+
174+ - If a domain is requested, it must be returned, even if ``None``
175+   (signaling a matching failure).
176+ - If it's efficient there is nothing wrong with returning data for
177+   domains which were not requested; at worst they will be ignored.
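
The two requirements can be seen by calling the resolver with a single requested domain. The sketch below is self-contained, so ``Domain``, ``UserAgent``, ``Device`` and ``PartialResult`` are simplified local stand-ins for the library's types (an assumption, their real definitions carry more fields), and the resolver body is restated so the block runs on its own.

```python
from enum import Flag, auto
from typing import NamedTuple, Optional


class Domain(Flag):  # local stand-in for ua_parser.Domain
    USER_AGENT = auto()
    OS = auto()
    DEVICE = auto()


class UserAgent(NamedTuple):  # simplified stand-in
    family: str
    major: Optional[str] = None
    minor: Optional[str] = None


class Device(NamedTuple):  # simplified stand-in
    family: str


class PartialResult(NamedTuple):  # simplified stand-in
    domains: Domain
    user_agent: Optional[UserAgent]
    os: Optional[str]
    device: Optional[Device]
    string: str


def foo_resolver(ua: str, domain: Domain, /) -> PartialResult:
    if not ua.startswith('fooapp/'):
        # not our application, match failure
        return PartialResult(domain, None, None, None, ua)
    # $appname/$version/$user-token, the token keeps any further slashes
    app, version, user = ua.split('/', 2)
    major, minor = version.split('.')
    return PartialResult(domain, UserAgent(app, major, minor), None, Device(user), ua)


# only the OS domain is requested
hit = foo_resolver('fooapp/1.2/abc123', Domain.OS)
miss = foo_resolver('Mozilla/5.0', Domain.OS)
```

In ``hit`` the requested ``os`` field is present but ``None`` (the resolver knows nothing about operating systems), while ``user_agent`` and ``device`` are filled in even though nobody asked for them.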
178+
179+ For a more interesting resolver, we can write a *fallback* resolver:
180+ it's a higher-order resolver which tries to call multiple
181+ sub-resolvers in sequence until the UA is resolved. This means we
182+ could then use something like::
183+
184+     Parser(FallbackResolver([
185+         foo_resolver,
186+         re2.Resolver(load_lazy_builtins()),
187+     ]))
188+
189+ to prioritise cheap resolving of our application while still resolving
190+ third party user agents::
191+
192+     class FallbackResolver:
193+         def __init__(self, resolvers: List[Resolver]) -> None:
194+             self.resolvers = resolvers
195+
196+         def __call__(self, ua: str, domain: Domain, /) -> PartialResult:
197+             if domain:
198+                 for resolver in self.resolvers:
199+                     r = resolver(ua, domain)
200+                     # if any value is non-none the resolver found a match
201+                     if r.user_agent is not None \
202+                     or r.os is not None \
203+                     or r.device is not None:
204+                         return r
205+
206+             # if no resolver found a match (or nothing was requested),
207+             # resolve to failure
208+             return PartialResult(domain, None, None, None, ua)
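
To check the short-circuiting behaviour, the fallback logic can be exercised with two toy sub-resolvers that record when they are called. This is a self-contained sketch: ``Domain`` and ``PartialResult`` are simplified local stand-ins for the library's types, ``app_resolver`` and ``generic_resolver`` are hypothetical, and the match test assumes the result exposes ``user_agent``, ``os`` and ``device`` fields.

```python
from enum import Flag, auto
from typing import Callable, List, NamedTuple, Optional


class Domain(Flag):  # local stand-in for ua_parser.Domain
    USER_AGENT = auto()
    OS = auto()
    DEVICE = auto()


class PartialResult(NamedTuple):  # simplified stand-in, string fields only
    domains: Domain
    user_agent: Optional[str]
    os: Optional[str]
    device: Optional[str]
    string: str


Resolver = Callable[[str, Domain], PartialResult]


class FallbackResolver:
    def __init__(self, resolvers: List[Resolver]) -> None:
        self.resolvers = resolvers

    def __call__(self, ua: str, domain: Domain, /) -> PartialResult:
        if domain:
            for resolver in self.resolvers:
                r = resolver(ua, domain)
                # any non-None field means this resolver found a match
                if r.user_agent is not None or r.os is not None or r.device is not None:
                    return r
        # nothing matched, or nothing was requested
        return PartialResult(domain, None, None, None, ua)


seen: List[str] = []  # records which sub-resolvers were invoked, in order


def app_resolver(ua: str, domain: Domain, /) -> PartialResult:
    seen.append('app')
    family = 'fooapp' if ua.startswith('fooapp/') else None
    return PartialResult(domain, family, None, None, ua)


def generic_resolver(ua: str, domain: Domain, /) -> PartialResult:
    seen.append('generic')
    return PartialResult(domain, 'Other', None, None, ua)


fallback = FallbackResolver([app_resolver, generic_resolver])
first = fallback('fooapp/1.2/abc123', Domain.USER_AGENT)  # stops at app_resolver
second = fallback('curl/8.0', Domain.USER_AGENT)  # falls through to generic_resolver
```

Ordering matters: the cheap, specific resolver goes first so that the expensive general one only runs for user agents it could not handle.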
209+
210+ .. [#cache] If it has been written yet, see :doc:`advanced/caches` for
211+    way too much information you probably don't care about if you just
212+    want to parse user agent strings.
213+
214+    The tldr is that a bigger cache increases hit rates, which decreases
215+    costs but uses more memory, and while really easy to write in Python
216+    an :class:`~ua_parser.caching.Lru` is a pretty bad cache all things
217+    considered.
217+ considered.