Commit 0caa366

Updated Scraper Reference (markdown)
1 parent 3c26efb commit 0caa366

1 file changed

Lines changed: 2 additions & 1 deletion

Scraper-Reference.md
@@ -57,7 +57,7 @@ Configuration is done via class attributes and divided into three main categorie
 * `version` [String] **(required)**
 The version of the software at the time the scraper was last run. This is only informational and doesn't affect the scraper's behavior.
 
-* `base_url` [String] **(required)**
+* `base_url` [String] **(required in `UrlScraper`)**
 The documents' location. Only URLs _inside_ the `base_url` will be scraped. "inside" more or less means "starting with" except that `/docs` is outside `/doc` (but `/doc/` is inside).
 Unless `root_path` is set, the root/initial URL is equal to `base_url`.
 
@@ -100,6 +100,7 @@ Default `html_filters`:
 * [`NormalizeUrlsFilter`](https://github.com/Thibaut/devdocs/blob/master/lib/docs/filters/core/normalize_urls.rb) — replaces all URLs with their fully qualified counterpart
 * [`InternalUrlsFilter`](https://github.com/Thibaut/devdocs/blob/master/lib/docs/filters/core/internal_urls.rb) — detects internal URLs (the ones to scrape) and replaces them with their unqualified, relative counterpart
 * [`NormalizePathsFilter`](https://github.com/Thibaut/devdocs/blob/master/lib/docs/filters/core/normalize_paths.rb) — makes the internal paths consistent (e.g. always end with `.html`)
+* [`CleanLocalUrlsFilter`](https://github.com/Thibaut/devdocs/blob/master/lib/docs/filters/core/clean_local_urls.rb) — remove links, iframes and images pointing to localhost (`FileScraper` only)
 
 Default `text_filters`:

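The `base_url` description in the first hunk says that "inside" roughly means "starting with", except that `/docs` is outside `/doc` while `/doc/` is inside. A minimal Ruby sketch of that rule follows. The helper name `inside_base_url?` and the `example.com` URLs are hypothetical and are not taken from the DevDocs codebase; the real check may differ in its details (for instance in how it treats a `base_url` that ends with a slash).

```ruby
require 'uri'

# Hypothetical helper illustrating the "inside base_url" rule described above.
# This is an illustrative assumption, not the DevDocs implementation.
def inside_base_url?(url, base_url)
  base   = URI.parse(base_url)
  target = URI.parse(url)

  # URLs on a different scheme, host, or port are never inside.
  return false unless target.scheme == base.scheme &&
                      target.host == base.host &&
                      target.port == base.port

  base_path   = base.path.empty? ? '/' : base.path
  target_path = target.path.empty? ? '/' : target.path

  # The base URL itself counts as inside.
  return true if target_path == base_path

  # Otherwise the target must continue past a path separator,
  # so "/docs" does not match a base of "/doc".
  prefix = base_path.end_with?('/') ? base_path : "#{base_path}/"
  target_path.start_with?(prefix)
end

base = 'https://example.com/doc'
inside_base_url?('https://example.com/doc/', base)           # => true
inside_base_url?('https://example.com/doc/guide.html', base) # => true
inside_base_url?('https://example.com/docs', base)           # => false
```

Comparing against the base path plus a trailing separator, rather than using a plain string prefix, is what keeps sibling paths such as `/docs` out of scope.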