Commit bab6e74

Merge branch 'best-match'

* best-match:
  And docs for the arguments.
  And add by_relevance docs.
  Remove __eq__, since it causes hashability issues on Py3 that I don't want to deal with at the moment.
  Update best_match docs.
  Different strategy that's a lot more robust.
  Use ._contents in create_from
  Initial stab at best_match.
  Sort errors based on their paths.
2 parents a73d1ef + 4f171aa commit bab6e74

3 files changed

Lines changed: 336 additions & 21 deletions

docs/errors.rst

Lines changed: 86 additions & 2 deletions
@@ -2,7 +2,7 @@
 Handling Validation Errors
 ==========================
 
-.. currentmodule:: jsonschema
+.. currentmodule:: jsonschema.exceptions
 
 When an invalid instance is encountered, a :exc:`ValidationError` will be
 raised or returned, depending on which method or function is used.
@@ -194,7 +194,7 @@ If you want to programmatically be able to query which properties or validators
 failed when validating a given instance, you probably will want to do so using
 :class:`ErrorTree` objects.
 
-.. autoclass:: ErrorTree
+.. autoclass:: jsonschema.validators.ErrorTree
     :members:
     :special-members:
     :exclude-members: __dict__,__weakref__
@@ -301,3 +301,87 @@ To summarize, each tree contains child trees that can be accessed by indexing
 the tree to get the corresponding child tree for a given index into the
 instance. Each tree and child has a :attr:`~ErrorTree.errors` attribute, a
 dict, that maps the failed validator to the corresponding validation error.
+
+
+best_match and by_relevance
+---------------------------
+
+The :func:`best_match` function is a simple but useful function for attempting
+to guess the most relevant error in a given bunch.
+
+.. autofunction:: best_match
+
+    Try to find an error that appears to be the best match among given errors.
+
+    In general, errors that are higher up in the instance (i.e. for which
+    :attr:`ValidationError.path` is shorter) are considered better matches,
+    since they indicate "more" is wrong with the instance.
+
+    .. doctest::
+
+        >>> from jsonschema import Draft4Validator
+        >>> from jsonschema.exceptions import best_match
+
+        >>> schema = {
+        ...     "type": "array",
+        ...     "minItems": 3,
+        ... }
+        >>> print(best_match(Draft4Validator(schema).iter_errors(11)).message)
+        11 is not of type 'array'
+
+    If the resulting match is either :validator:`oneOf` or :validator:`anyOf`,
+    the *opposite* assumption is made -- i.e. the deepest error is picked,
+    since these validators only need to match once, and any other errors may
+    not be relevant.
+
+    :argument iterable errors: the errors to select from. Do not provide a
+        mixture of errors from different validation attempts (i.e. from
+        different instances or schemas), since it won't produce sensible
+        output.
+    :argument callable key: the key to use when sorting errors. See
+        :func:`by_relevance` for more details (the default is to sort with the
+        defaults of that function).
+    :returns: the best matching error, or ``None`` if the iterable was empty
+
+    .. note::
+
+        This function is a heuristic. Its return value may change for a given
+        set of inputs from version to version if better heuristics are added.
+
+
+.. autofunction:: by_relevance
+
+    Create a key function that can be used to sort errors by relevance.
+
+    If you want to sort a bunch of errors entirely, you can use this function
+    to do so. Using the return value of this function as a key to e.g.
+    :func:`sorted` or :func:`max` will cause more relevant errors to be
+    considered greater than less relevant ones.
+
+    .. doctest::
+
+        >>> schema = {
+        ...     "properties": {
+        ...         "name": {"type": "string"},
+        ...         "phones": {
+        ...             "properties": {
+        ...                 "home": {"type": "string"}
+        ...             },
+        ...         },
+        ...     },
+        ... }
+        >>> instance = {"name": 123, "phones": {"home": [123]}}
+        >>> errors = Draft4Validator(schema).iter_errors(instance)
+        >>> [
+        ...     e.path[-1]
+        ...     for e in sorted(errors, key=exceptions.by_relevance())
+        ... ]
+        ['home', 'name']
+
+    :argument set weak: a collection of validators to consider to be "weak". If
+        there are two errors at the same level of the instance and one is in
+        the set of weak validators, the other error will take priority. By
+        default, :validator:`anyOf` and :validator:`oneOf` are considered weak
+        validators and will be superseded by other same-level validation
+        errors.
+    :argument set strong: a collection of validators to consider to be "strong".
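
The tie-breaking that the ``weak`` and ``strong`` arguments describe falls out of ordinary tuple comparison on the key. The following is a minimal standalone sketch of that behaviour, not the library's test suite: `Err` is a hypothetical stand-in that carries only the `path` and `validator` attributes, and the key function is adapted from `by_relevance` as added in this commit.

```python
from collections import namedtuple

# Hypothetical stand-in for ValidationError; only the two attributes
# that the relevance key reads are modelled here.
Err = namedtuple("Err", ["path", "validator"])


def by_relevance(weak=frozenset(["anyOf", "oneOf"]), strong=frozenset()):
    def relevance(error):
        validator = error.validator
        # Tuples compare element by element: path depth first, then
        # "is not weak", then "is strong".
        return -len(error.path), validator not in weak, validator in strong
    return relevance


errors = [
    Err(["a"], "anyOf"),
    Err(["a"], "required"),
    Err(["a"], "type"),
    Err(["a", "b"], "minLength"),
]

# sorted() is ascending, so the least relevant error comes first.
default_order = [e.validator for e in sorted(errors, key=by_relevance())]
strong_order = [
    e.validator
    for e in sorted(errors, key=by_relevance(strong=frozenset(["required"])))
]

print(default_order)
print(strong_order)
```

With the defaults, the deeper `minLength` error sorts lowest, the weak `anyOf` error loses to its same-level neighbours, and `required` and `type` tie (the sort is stable, so they keep their input order). Marking `required` as strong breaks that tie in its favour.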

jsonschema/exceptions.py

Lines changed: 40 additions & 19 deletions
@@ -1,11 +1,15 @@
 import collections
+import itertools
 import pprint
 import textwrap
 
 from jsonschema import _utils
 from jsonschema.compat import PY3, iteritems
 
 
+WEAK_MATCHES = frozenset(["anyOf", "oneOf"])
+STRONG_MATCHES = frozenset()
+
 _unset = _utils.Unset()
 
 
@@ -24,25 +28,6 @@ def __init__(
         self.instance = instance
         self.schema = schema
 
-    @classmethod
-    def create_from(cls, other):
-        return cls(
-            message=other.message,
-            cause=other.cause,
-            context=other.context,
-            path=other.path,
-            schema_path=other.schema_path,
-            validator=other.validator,
-            validator_value=other.validator_value,
-            instance=other.instance,
-            schema=other.schema,
-        )
-
-    def _set(self, **kwargs):
-        for k, v in iteritems(kwargs):
-            if getattr(self, k) is _unset:
-                setattr(self, k, v)
-
     def __repr__(self):
         return "<%s: %r>" % (self.__class__.__name__, self.message)
 
@@ -79,6 +64,23 @@ def __unicode__(self):
     if PY3:
         __str__ = __unicode__
 
+    @classmethod
+    def create_from(cls, other):
+        return cls(**other._contents())
+
+    def _set(self, **kwargs):
+        for k, v in iteritems(kwargs):
+            if getattr(self, k) is _unset:
+                setattr(self, k, v)
+
+    def _contents(self):
+        return dict(
+            (attr, getattr(self, attr)) for attr in (
+                "message", "cause", "context", "path", "schema_path",
+                "validator", "validator_value", "instance", "schema"
+            )
+        )
+
 
 class ValidationError(_Error):
     pass
@@ -132,3 +134,22 @@ def __unicode__(self):
 
     if PY3:
         __str__ = __unicode__
+
+
+def by_relevance(weak=WEAK_MATCHES, strong=STRONG_MATCHES):
+    def relevance(error):
+        validator = error.validator
+        return -len(error.path), validator not in weak, validator in strong
+    return relevance
+
+
+def best_match(errors, key=by_relevance()):
+    errors = iter(errors)
+    best = next(errors, None)
+    if best is None:
+        return
+    best = max(itertools.chain([best], errors), key=key)
+
+    while best.context:
+        best = min(best.context, key=key)
+    return best
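
To see how `best_match` combines the `max` over top-level errors with the `min` descent into `context`, here is a minimal standalone sketch that runs without the library. `FakeError` is a hypothetical stand-in for `ValidationError` that models only the attributes the two functions read, and the sample errors are invented for illustration; the function bodies are adapted from the diff above.

```python
import itertools
from collections import namedtuple

# Hypothetical stand-in for ValidationError: only the attributes read by
# by_relevance and best_match are modelled.
FakeError = namedtuple("FakeError", ["message", "path", "validator", "context"])

WEAK_MATCHES = frozenset(["anyOf", "oneOf"])
STRONG_MATCHES = frozenset()


def by_relevance(weak=WEAK_MATCHES, strong=STRONG_MATCHES):
    def relevance(error):
        validator = error.validator
        return -len(error.path), validator not in weak, validator in strong
    return relevance


def best_match(errors, key=by_relevance()):
    errors = iter(errors)
    best = next(errors, None)
    if best is None:
        return None
    # Most relevant top-level error: shortest path wins.
    best = max(itertools.chain([best], errors), key=key)
    # While the chosen error has sub-errors (as anyOf/oneOf errors do),
    # switch to the opposite rule and follow the deepest one.
    while best.context:
        best = min(best.context, key=key)
    return best


shallow = FakeError("name is not a string", ["name"], "type", [])
deep = FakeError("home is too short", ["phones", "home"], "minLength", [])
branch_a = FakeError("id is not an integer", ["id"], "type", [])
branch_b = FakeError("item is not a string", ["id", 0], "type", [])
any_of = FakeError("no branch matched", [], "anyOf", [branch_a, branch_b])

top_level = best_match([deep, shallow])
descended = best_match([shallow, any_of])

print(top_level.message)
print(descended.message)
```

In the first call neither error has a `context`, so the shallower one is returned as is. In the second call the `anyOf` error wins at the top level because its path is empty, and the loop then descends into its `context` and returns the deeper branch error.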
