* pii-manager v. 0.5.0
* new task list parsing code, adding a "full" format based on dicts, in
  addition to the previous "simplified" format based on tuples
* refactored to allow more than one task for a given PII and country
* added the capability to add task processors programmatically
* TASK_ANY split into LANG_ANY & COUNTRY_ANY
* context validation spec, for all three task implementation types
* PII detectors for international phone numbers, for en-any & es-any
* PII detector for IP addresses, language independent
* added reading task descriptors from a JSON file
* PII detectors for GOV_ID
  - lang pt, countries PT & BR
  - lang es, country MX
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* added missing __init__ files
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
File: cc_pseudo_crawl/sourcing_sheet_seeds/README.md (1 addition & 1 deletion)
@@ -8,4 +8,4 @@ Steps:
 2. do the lookups / table join, see [general instructions](../README.md) using the crawl selector `CC-MAIN-202[01]` to restrict the join for the last 2 years
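Review note: the bracket expression in the crawl selector `CC-MAIN-202[01]` matches crawl names from 2020 and 2021 only. A minimal sketch of that behaviour, assuming the selector is applied as a regular expression anchored at the start of the crawl name (the general instructions may apply it differently, for example inside a SQL query); the function name and the sample crawl names below are illustrative only:

```python
import re

# Assumption: the selector is treated as a regex matched against the
# beginning of a Common Crawl crawl name such as "CC-MAIN-2020-05".
CRAWL_SELECTOR = re.compile(r"CC-MAIN-202[01]")


def select_crawls(crawl_names):
    """Return the crawl names whose prefix matches the selector."""
    return [name for name in crawl_names if CRAWL_SELECTOR.match(name)]


# Sample crawl names, for illustration only.
crawls = [
    "CC-MAIN-2019-51",
    "CC-MAIN-2020-05",
    "CC-MAIN-2021-04",
    "CC-MAIN-2022-05",
]

print(select_crawls(crawls))
```

Only the 2020 and 2021 entries are kept, which is the "last 2 years" restriction the step describes.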
-41,"observatorio de la política china",https://politica-china.org,unknown,"multiple releases",,es,spain,"general news","text (web)",manual,,unknown,unknown,"1, According to the “Provisions of the Supreme People's Court on the People's Court's Publication of Judgment Documents online” (最高人民法院关于人民法院在互联网公布裁判文书的规定), the online publication of judgment documents should be based on the principle of openness, with non-publicity as an exception. Judicial documents involving national security, juvenile delinquency, divorce proceedings, support or guardianship of minor children, etc., shall not be made public. In
-public judgment documents, information concerning personal privacy,
-trade secrets, etc., other than the names of the parties, shall also be
+41,"observatorio de la política china",https://politica-china.org,unknown,"multiple releases",,es,spain,"general news","text (web)",manual,,unknown,unknown,"1, According to the “Provisions of the Supreme People's Court on the People's Court's Publication of Judgment Documents online” (最高人民法院关于人民法院在互联网公布裁判文书的规定), the online publication of judgment documents should be based on the principle of openness, with non-publicity as an exception. Judicial documents involving national security, juvenile delinquency, divorce proceedings, support or guardianship of minor children, etc., shall not be made public. In
+public judgment documents, information concerning personal privacy,
+trade secrets, etc., other than the names of the parties, shall also be
 deleted from the document.",
 42,"icefi - instituto centroamericano de estudios fiscales",http://icefi.org/,unknown,"multiple releases",,es,gt,"general news","text (web)",manual,,unknown,unknown,"icefi - instituto centroamericano de estudios fiscales",
 43,"baja california sur - gobierno del estado",http://www.bcs.gob.mx/,unknown,"multiple releases",,es,mexico,"general news","text (web)",manual,,unknown,unknown,"baja california sur - gobierno del estado",