Commit 76b1167

NathanielVolfango authored and jenkins committed
QPR-11679 -- Add JSON file comparison
1 parent 514ff19 commit 76b1167

4 files changed

Lines changed: 492 additions & 187 deletions

Tools/PythonTools/Readme.md

Lines changed: 48 additions & 21 deletions
@@ -2,17 +2,14 @@
22

33
The file *comparison_config.json* holds the default configuration that is used for comparing expected and generated csv, json and text files during the regression and Example test runs.
44

5-
Under the `files` field of `csv_settings`, there is a collection of objects. Each object has a key that is either a file name or a regular expression that should match a file name. This configuration is read into an `OrderedDict` in Python, and each file to be compared during a test run is checked against the keys, in their stored order, to determine the comparison configuration it should use. For this reason, keys with exact names should be placed before regular expression keys in the configuration so that they are found first.
5+
Under the `files` field of `csv_settings` and `json_settings`, there is a collection of objects. Each object has a key that is either a file name or a regular expression that should match a file name. This configuration is read into an `OrderedDict` in Python, and each file to be compared during a test run is checked against the keys, in their stored order, to determine the comparison configuration it should use. For this reason, keys with exact names should be placed before regular expression keys in the configuration so that they are found first.
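
The ordered lookup described above can be sketched as follows. This is only an illustration, not the tool's actual code: the helper name `find_config` is hypothetical, and the use of `re.fullmatch` for regular expression keys is an assumption.

```python
import re
from collections import OrderedDict


def find_config(file_name, files_config):
    """Return the config of the first key that equals or regex-matches file_name.

    Hypothetical helper: the real matching rules may differ.
    """
    for key, config in files_config.items():
        # Keys are tried in insertion order, so the first match wins.
        if key == file_name or re.fullmatch(key, file_name):
            return config
    return None


# Exact name first, catch-all regular expression second.
files_config = OrderedDict([
    ("simm.csv", {"keys": ["a"]}),
    (r".*\.csv", {"keys": ["b"]}),
])
```

With this ordering, `find_config("simm.csv", files_config)` picks the exact entry, while any other `.csv` file falls through to the regular expression entry. If the two entries were swapped, the regular expression would shadow the exact name.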
66

7-
Each object under a given key, `file_name`, has the following format:
7+
Each object under a given key, `file_name`, has the following format (note that the formats of `csv_settings` and `json_settings` are different):
88
```js
99
{
10-
"csv_settings":
11-
{
12-
"files":
13-
{
14-
"file_name":
15-
{
10+
"csv_settings": {
11+
"files": {
12+
"file_name": {
1613
"keys": [
1714
"a",
1815
"b"
@@ -25,8 +22,7 @@ Each object under a given key, `file_name`, has the following format:
2522
"optional_cols": [
2623
"col4"
2724
],
28-
"rename_cols":
29-
{
25+
"rename_cols": {
3026
"A": "a",
3127
"B": "b"
3228
},
@@ -58,20 +54,51 @@ Each object under a given key, `file_name`, has the following format:
5854
]
5955
}
6056
}
57+
},
58+
"json_settings": {
59+
"files": {
60+
"file_name": {
61+
"ignore_keys": [
62+
"key1",
63+
"key2",
64+
"key3/subkey1"
65+
],
66+
"settings": [
67+
{
68+
"names": [
69+
"key1/subkey1",
70+
"key2/subkey1/subkey2"
71+
],
72+
"abs_tol": 0.01,
73+
"rel_tol": 0.001
74+
}
75+
]
76+
},
77+
"all_string_file_name": {}
78+
}
6179
}
6280
}
6381
```
6482

65-
- The `keys` specify which columns will be used as keys for the comparison. The comparison fails unless every one of these keys is present in both files to be compared.
66-
- The `use_cols` array specifies the columns on which the actual comparisons are evaluated.
67-
- The `optional_cols`, as with `use_cols` above, specifies columns on which comparisons are evaluated, but these columns are only included if they are present in both files. If a column is missing from one or both files, the corresponding comparison defined in `column_settings` below will not be evaluated for it.
68-
- The `rename_cols` object specifies columns that should be renamed before the comparison is performed. In the example above, `A` would be renamed to `a` etc.
69-
- The `col_types` object allows you to explicitly specify the type of a given set of columns if necessary.
70-
- The `drop_rows` object allows you to specify a threshold for the values in a given set of columns. If the absolute value for a given row, in one of the specified columns, is below the threshold, the row is dropped from the comparison.
71-
- The `column_settings` object allows you to specify an absolute and/or a relative tolerance that should be used for a group of columns when comparing their values. There can be multiple groupings used in the `column_settings` array with different values of absolute and relative tolerance.
83+
For `csv_settings`:
84+
- The `keys` specify which columns will be used as keys for the comparison. The comparison fails unless every one of these keys is present in both files to be compared.
85+
- The `use_cols` array specifies the columns on which the actual comparisons are evaluated.
86+
- The `optional_cols`, as with `use_cols` above, specifies columns on which comparisons are evaluated, but these columns are only included if they are present in both files. If a column is missing from one or both files, the corresponding comparison defined in `column_settings` below will not be evaluated for it.
87+
- The `rename_cols` object specifies columns that should be renamed before the comparison is performed. In the example above, `A` would be renamed to `a` etc.
88+
- The `col_types` object allows you to explicitly specify the type of a given set of columns if necessary.
89+
- The `drop_rows` object allows you to specify a threshold for the values in a given set of columns. If the absolute value for a given row, in one of the specified columns, is below the threshold, the row is dropped from the comparison.
90+
- The `column_settings` object allows you to specify an absolute and/or a relative tolerance that should be used for a group of columns when comparing their values. There can be multiple groupings used in the `column_settings` array with different values of absolute and relative tolerance.
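
A minimal sketch of a tolerance check of the kind `column_settings` configures. It assumes a value passes if it is within either the absolute or the relative tolerance, which is how Python's `math.isclose` combines them; the actual comparison rule and the function name `values_match` are assumptions.

```python
import math


def values_match(expected, actual, abs_tol=0.0, rel_tol=0.0):
    """Return True if the two numbers agree within abs_tol or rel_tol.

    Hypothetical helper: math.isclose accepts the pair when
    |expected - actual| <= max(rel_tol * max(|expected|, |actual|), abs_tol).
    """
    return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol)
```

For example, with `abs_tol=0.01` and `rel_tol=0.001`, the values 100.0 and 100.05 match because the relative tolerance allows a difference of about 0.1, whereas 1.0 and 1.02 do not match under either tolerance.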
91+
92+
New files that require the same comparison config as another standard file (e.g. "simm_additional.csv" from a SIMM Impact calc would use the same config as the "simm.csv" report from a SIMM calc) can copy that file's comparison config:
93+
94+
For the regression tests under the *RegressionTests* directory and the Example tests under *Examples* and *ExamplesPlus*, each test may have its own comparison configuration file following this format. If a test-specific comparison configuration file is present, it is merged with this default comparison configuration file to give the final comparison configuration used for the test. The merge function is in the file *merge_comparison_configs.py* and it uses the following logic:
95+
- The test-specific file is used as the starting point for the final merged configuration `OrderedDict`.
96+
- Any file names in the default comparison configuration file that are not in the test-specific comparison configuration file are added *at the end* of the merged configuration `OrderedDict`. They will therefore only be used during comparison if there is no match in the test-specific file.
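
The two merge rules above can be sketched as follows. This is not the code in *merge_comparison_configs.py*; the function name `merge_files` is hypothetical, and the sketch only handles a single `files` mapping.

```python
from collections import OrderedDict


def merge_files(test_files, default_files):
    """Merge two `files` mappings, giving the test-specific entries priority.

    Hypothetical helper illustrating the ordering only.
    """
    # Start from the test-specific entries, keeping their order.
    merged = OrderedDict(test_files)
    # Append default entries whose file names are not already present.
    for name, config in default_files.items():
        if name not in merged:
            merged[name] = config
    return merged
```

Because the default entries are appended last, a first-match lookup over the merged `OrderedDict` reaches them only after every test-specific key has been tried.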
7297

73-
New files that require the same comparison config as another standard file (e.g. "simm_additional.csv" from a SIMM Impact calc would use the same config as the "simm.csv" report from a SIMM calc) can copy that file's comparison config:
98+
For `json_settings`:
99+
- Any key value (i.e. in `ignore_keys`, `settings.names`, etc.) must include the parent, if any. Using the sample comparison_config.json template above, we would ignore "key1" and "key2" at the top level in a JSON file comparison, and "subkey1" only if it appears inside of "key3".
100+
- The `ignore_keys` is an array of strings, each string being a key in the JSON file. If the key is found in one or both of the files, any diffs will be ignored for this key and its children (i.e. if the value is itself a nested object).
101+
- The `settings` array works the same way as the `column_settings` array in `csv_settings`, except that the names in the JSON `settings` must include the parent key.
102+
- **NOTE:** In order for a JSON check to be applied, a comparison config must be provided for the file name, even if the config is empty (see e.g. `all_string_file_name` in the template above). Otherwise a direct file comparison will be done.
103+
- **NOTE:** String differences are checked automatically: unless the key is listed in `ignore_keys`, a string difference is a failing difference (see e.g. `all_string_file_name` in the template above). Only numerical differences need to be handled in `settings`.
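
The `json_settings` rules above can be illustrated with a small sketch. It is an assumption-laden illustration rather than the tool's implementation: the function names are hypothetical, nested keys are flattened into `parent/child` paths, lists are compared as whole values, and tolerances are combined the way `math.isclose` combines them.

```python
import math


def flatten(obj, prefix=""):
    """Map every non-dict value of a nested dict to its 'parent/child' path."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def is_ignored(path, ignore_keys):
    """A path is ignored if it is an ignored key or a child of one."""
    return any(path == k or path.startswith(k + "/") for k in ignore_keys)


def json_diffs(expected, actual, config):
    """Return the sorted list of paths whose values differ beyond tolerance."""
    ignore_keys = config.get("ignore_keys", [])
    tolerances = {}
    for group in config.get("settings", []):
        for name in group["names"]:
            tolerances[name] = (group.get("abs_tol", 0.0), group.get("rel_tol", 0.0))

    exp, act = flatten(expected), flatten(actual)
    failing = []
    for path in sorted(set(exp) | set(act)):
        if is_ignored(path, ignore_keys):
            continue
        if path not in exp or path not in act:
            failing.append(path)  # key present in only one file
            continue
        a, b = exp[path], act[path]
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in (a, b)
        )
        if numeric:
            abs_tol, rel_tol = tolerances.get(path, (0.0, 0.0))
            if not math.isclose(a, b, abs_tol=abs_tol, rel_tol=rel_tol):
                failing.append(path)
        elif a != b:
            failing.append(path)  # strings and other values must match exactly
    return failing
```

In this sketch an empty config still triggers the key-by-key check, so every string or numeric difference is reported, which mirrors the role of the empty `all_string_file_name` entry in the template.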
74104

75-
For the regression tests under the *RegressionTests* directory and the Example tests under *Examples* and *ExamplesPlus*, each test may have its own comparison configuration file following this format. If a test-specific comparison configuration file is present, it is merged with this default comparison configuration file to give the final comparison configuration used for the test. The merge function is in the file *merge_comparison_configs.py* and it uses the following logic:
76-
- The test-specific file is used as the starting point for the final merged configuration `OrderedDict`.
77-
- Any file names in the default comparison configuration file that are not in the test-specific comparison configuration file are added *at the end* of the merged configuration `OrderedDict`. They will therefore only be used during comparison if there is no match in the test-specific file.
