[![npm version](https://badge.fury.io/js/scrapegraph-js.svg)](https://badge.fury.io/js/scrapegraph-js)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

Official JavaScript/TypeScript SDK for the ScrapeGraph AI API v2.

## Install

```bash
bun add scrapegraph-js
```

## Quick Start

```ts
import { scrapegraphai } from "scrapegraph-js";

const sgai = scrapegraphai({ apiKey: "your-api-key" });

const result = await sgai.scrape("https://example.com", { format: "markdown" });

console.log(result.data);
console.log(result._requestId);
```

Every method returns an `ApiResult<T>`:

3931``` ts
4032type ApiResult <T > = {
41- status: " success" | " error" ;
42- data: T | null ;
43- error? : string ;
44- elapsedMs: number ;
33+ data: T ;
34+ _requestId: string ;
4535};
4636```
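The `_requestId` on every response is handy to log when filing support requests. Here is a minimal sketch of unwrapping results; the `unwrap` helper is illustrative, not part of the SDK, and the type is redeclared locally so the snippet stands alone:

```ts
// Redeclared locally so this snippet is self-contained.
type ApiResult<T> = {
  data: T;
  _requestId: string;
};

// Illustrative helper (not an SDK export): log the request id, return the payload.
function unwrap<T>(result: ApiResult<T>): T {
  console.log(`request id: ${result._requestId}`);
  return result.data;
}

// A hand-built value standing in for an API response.
const sample: ApiResult<string> = { data: "# Example", _requestId: "req_123" };
console.log(unwrap(sample)); // "# Example"
```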

## API

Create a client once, then call the available v2 endpoints:

```ts
const sgai = scrapegraphai({
  apiKey: "your-api-key",
  baseUrl: "https://api.scrapegraphai.com", // optional
  timeout: 30000, // optional
  maxRetries: 2, // optional
});
```
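Hardcoding the key is fine for a quick test, but in real projects you will usually read it from the environment. A small sketch; the variable name `SGAI_API_KEY` is an assumption here, not an SDK convention:

```ts
// Read a required value from the environment, failing loudly if it is unset.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Assumed variable name, not an SDK convention:
// const sgai = scrapegraphai({ apiKey: requireEnv("SGAI_API_KEY") });
```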

### scrape

```ts
await sgai.scrape("https://example.com", {
  format: "markdown",
  fetchConfig: {
    mock: false,
  },
});
```

### extract

Raw JSON schema:

```ts
await sgai.extract("https://example.com", {
  prompt: "Extract the page title",
  schema: {
    type: "object",
    properties: {
      title: { type: "string" },
    },
  },
});
```

Zod schema:

```ts
import { z } from "zod";

await sgai.extract("https://example.com", {
  prompt: "Extract the page title",
  schema: z.object({
    title: z.string(),
  }),
});
```

### search

```ts
await sgai.search("What is the capital of France?", {
  numResults: 5,
});
```

### schema

```ts
await sgai.schema("A product with name and price");
```

### credits

```ts
await sgai.credits();
```

### history

```ts
await sgai.history({
  page: 1,
  limit: 10,
  service: "scrape",
});
```
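To walk more of your history than one page, you can loop until a page comes back short. A hedged sketch of a generic paginator; treating a short page as the end of the data, and the fetcher's return shape, are assumptions rather than documented SDK behavior:

```ts
// Generic page walker: calls fetchPage(page, limit) until a page comes back
// with fewer than `limit` items, which we take to mean there are no more.
async function* eachPage<T>(
  fetchPage: (page: number, limit: number) => Promise<T[]>,
  limit = 10,
): AsyncGenerator<T> {
  for (let page = 1; ; page++) {
    const items = await fetchPage(page, limit);
    for (const item of items) yield item;
    if (items.length < limit) break;
  }
}
```

With the client, the fetcher could wrap `sgai.history({ page, limit, service: "scrape" })` and pluck the item array out of `result.data` (the exact response shape is not shown in this README).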

### crawl

```ts
const crawl = await sgai.crawl.start("https://example.com", {
  maxPages: 10,
  maxDepth: 2,
});

await sgai.crawl.status((crawl.data as { id: string }).id);
```
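`crawl.start` kicks off an async job, so callers typically poll `crawl.status` until it finishes. A hedged sketch of a generic poller; the SDK may offer its own waiting mechanism, and the completion predicate you pass in depends on the status payload, whose shape is not documented here:

```ts
// Generic poller: retry fn() until isDone(value) is true or attempts run out.
async function pollUntil<T>(
  fn: () => Promise<T>,
  isDone: (value: T) => boolean,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const value = await fn();
    if (isDone(value)) return value;
    // Wait between attempts so we do not hammer the API.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Polling gave up after ${maxAttempts} attempts`);
}
```

Pairing it with the client might look like `pollUntil(() => sgai.crawl.status(id), (r) => (r.data as { status?: string }).status === "done")`, assuming the status payload carries a `status` field.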

### monitor

```ts
await sgai.monitor.create({
  url: "https://example.com",
  prompt: "Notify me when the price changes",
  interval: "1h",
});
```

## Breaking Changes in v2

- The SDK now exposes a single `scrapegraphai(config)` client instead of flat top-level functions.
- Requests target the new `/v2/*` API surface.
- Old helpers such as `smartScraper`, `searchScraper`, `markdownify`, `agenticScraper`, `sitemap`, and `generateSchema` are not part of the v2 client.
- `crawl` and `monitor` are now namespaced APIs.

## Development

```bash
bun install
bun test
bun run check
bun run build
```

## License