
feat: v2 documentation with versioned navigation and updated SDKs#39

Open
VinciGit00 wants to merge 28 commits into main from version-2

Conversation

@VinciGit00
Member

Summary

  • Versioned navigation: docs.json now uses navigation.versions with v2 as default and v1 as legacy, matching the structure from doc-private
  • Updated SDK docs: Python and JavaScript SDK pages rewritten for v2 API based on scrapegraph-py#82 and scrapegraph-js#11 — new methods (extract, search, scrape, crawl.*, monitor.*, credits, history), FetchConfig/LlmConfig config objects, factory pattern for JS
  • Updated mocking docs: Replaced built-in mock mode (removed in v2) with standard library testing patterns (unittest.mock, responses, Jest/Vitest, MSW)
  • v1 legacy pages: 24 new pages under v1/ with deprecation banners pointing to v2 equivalents
  • Design updates: New primary color (#AC6DFF), IBM Plex Sans font, updated backgrounds
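Since v2 removes the built-in mock mode, tests stub the client with standard tooling instead. A minimal sketch using `unittest.mock` — the client object and the `extract` call shape here are assumptions based on the method list above, not the SDK's verified API:

```python
from unittest.mock import MagicMock

# Stand-in for the v2 client; in a real test suite you would patch the
# SDK's client class (name assumed) with unittest.mock.patch instead.
client = MagicMock()
client.extract.return_value = {"title": "Example Domain"}

result = client.extract(
    url="https://example.com",
    prompt="Extract the page title",
)

assert result == {"title": "Example Domain"}
client.extract.assert_called_once()
```

The same pattern applies to the other v2 namespaces (`search`, `scrape`, `crawl.*`, `monitor.*`): stub the method, return a canned payload, and assert on the call.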

Test plan

  • Verify Mintlify builds successfully with versioned navigation
  • Check v2 tab loads as default
  • Check v1 tab shows legacy docs with deprecation warnings
  • Verify all navigation links resolve correctly in both versions
  • Review SDK code examples match the v2 PR APIs

🤖 Generated with Claude Code

VinciGit00 and others added 28 commits March 31, 2026 08:00
Migrate documentation to v2 structure with versioned nav (v2 default, v1 legacy).
Update Python and JavaScript SDK docs to reflect v2 API changes (extract, search,
scrape, crawl, monitor namespaces, FetchConfig/LlmConfig). Add v1/ legacy pages
with deprecation banners.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nd navigation

- Add v2 service pages: Extract, Search, Crawl, Monitor
- Update Scrape service for v2 format-based API
- Update LangChain integration for v2 tools (ExtractTool, SearchTool, etc.)
- Update v2 navigation: remove old services (SmartScraper, SearchScraper, Markdownify, SmartCrawler, Sitemap, AgenticScraper, Langflow)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename smart-scraper → extract, search-scraper → search
- Remove commands dropped from v2: agentic-scraper, generate-schema, sitemap, validate
- Update scrape with --format flag (markdown, html, screenshot, branding)
- Update crawl with v2 polling model (max-pages, max-depth, max-links-per-page, allow-external)
- Update history with v2 service names (scrape, extract, search, monitor, crawl)
- Update all examples, JSON mode docs, and AI agent skill docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reorder services: Scrape, Extract, Search, Crawl, Monitor
- Remove Community anchor, Playground, x402, Langflow from v2 nav
- Update Vercel AI integration for v2 SDK

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update all v2 documentation to match the latest SDK changes:
- JS SDK: named import, correct fetchConfig fields (render, wait), maxRetries=2, maxDepth for crawl
- Python SDK: Client.from_env(), context manager, format param, history filter params
- All service/knowledge-base/cookbook pages: migrate JS examples from v1 individual imports to v2 factory pattern
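The Python construction pattern named above (`Client.from_env()` plus context-manager support) can be sketched with a hypothetical stand-in class. Everything here except the two quoted method names is an assumption, including the `SGAI_API_KEY` variable name:

```python
import os


class Client:
    """Hypothetical stand-in mirroring the v2 construction pattern."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.closed = False

    @classmethod
    def from_env(cls) -> "Client":
        # Read the key from the environment, as the commit describes.
        return cls(api_key=os.environ["SGAI_API_KEY"])

    def __enter__(self) -> "Client":
        return self

    def __exit__(self, *exc) -> None:
        self.closed = True  # release the underlying HTTP session


os.environ.setdefault("SGAI_API_KEY", "sgai-test-key")
with Client.from_env() as client:
    assert client.api_key == "sgai-test-key"
assert client.closed
```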

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update v2 documentation links to use scrapegraphai.com/dashboard instead of the dashboard subdomain, and remove the Usage Analytics section from the dashboard overview to match the new content direction.

Made-with: Cursor
Refine the docs IA by removing the integrations drawer section and introducing a dedicated transition guide page for users migrating from v1 to v2.

Made-with: Cursor
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite proxy configuration page to document FetchConfig object with
mode parameter (auto/fast/js/direct+stealth/js+stealth), country-based
geotargeting, and all fetch options. Update knowledge-base proxy guide
and fix FetchConfig examples in both Python and JavaScript SDK pages
to match the actual v2 API surface.

Refs: ScrapeGraphAI/scrapegraph-js#11, ScrapeGraphAI/scrapegraph-py#82

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename smart-scraper → extract, search-scraper → search
- Remove dropped commands: agentic-scraper, generate-schema, sitemap, validate
- Replace --stealth boolean with --mode fetch mode enum
- Update scrape with --format flag (markdown, html, screenshot, branding)
- Update crawl with v2 polling model and new options
- Update env variables to SGAI_API_URL, SGAI_TIMEOUT_S
- Update response field names (remainingCredits, markdown)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the advanced extract accordion snippet to keep the migration-focused section shorter and less confusing.

Made-with: Cursor
Clean up the Python SDK docs by removing the non-rendering header image and outdated llm_config parameter rows in extract/search tables.

Made-with: Cursor
Add a dedicated Plans & Pricing page and update account docs with current credit costs, proxy modifiers, and plan limits, including navigation links across the section.

Made-with: Cursor
Replace legacy v1 code snippets in all use-case pages with actual v2 method names and parameters so examples match the current SDK behavior.

Made-with: Cursor
Remove the "What changed at a glance" section from the transition guide as requested, while keeping the detailed migration mapping and examples.

Made-with: Cursor
Reorder the v1-to-v2 migration mapping and REST endpoint examples to match the Services navigation sequence (scrape, extract, search, crawl, monitor).

Made-with: Cursor
Add concrete Python and JavaScript v2 code examples for the markdownify-to-scrape migration step.

Made-with: Cursor
Delete the invalid top image reference on the Mocking & Testing SDK page to avoid broken rendering.

Made-with: Cursor
Align monitor docs with Python SDK PR #82 (scrapegraph-py v2):
- Rename `cron` parameter to `interval` across all code examples
- Add required `name` parameter to SDK snippets
- Replace Pydantic model schema with JSON Schema dict
- Update FetchConfig usage (mode enum instead of stealth bool)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
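Per the list above, a monitor-create payload would look roughly like this; treat the exact spellings and value formats as assumptions, apart from the renamed `interval`, the required `name`, the JSON Schema dict, and the mode-based fetch config called out in the commit:

```python
# Sketch of a v2 monitor-create payload reflecting the changes above.
payload = {
    "name": "price-watch",              # now required
    "interval": "0 * * * *",            # renamed from `cron`; value format assumed
    "url": "https://example.com/product",
    "schema": {                          # plain JSON Schema dict, not a Pydantic model
        "type": "object",
        "properties": {"price": {"type": "number"}},
    },
    "fetch_config": {"mode": "js"},      # mode enum instead of a `stealth` bool
}

assert "cron" not in payload and payload["name"]
```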
Replace single docs-banner with separate light/dark assets and
switch images via Tailwind dark: classes on introduction pages.

Made-with: Cursor
Align sdks/javascript.mdx and sdks/python.mdx with the current schemas
from scrapegraph-js#11 and scrapegraph-py#82:

- search(): add locationGeoCode/location_geo_code, timeRange/time_range,
  prompt, format, mode; correct numResults default to 3
- extract(): drop llmConfig from params (ignored by v2 route); document
  mode, contentType, html, markdown alternatives to url
- scrape(): document the formats[] array (tagged format entries with
  per-entry config) and add a multi-format example
- crawl.start(): document maxDepth/max_depth, maxPages/max_pages,
  maxLinksPerPage, allowExternal, contentTypes
- monitor.create(): drop prompt (not in v2 schema); add formats and
  webhookUrl/webhook_url
- LlmConfig: clarify it belongs inside scrape json/summary format
  entries, not on extract/search

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
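The `formats[]` shape and the LlmConfig placement described above can be sketched as plain payload dicts; field spellings beyond those quoted in the commit, and the model name, are illustrative assumptions:

```python
# Multi-format scrape request: tagged format entries with per-entry config.
scrape_payload = {
    "url": "https://example.com",
    "formats": [
        {"type": "markdown"},
        {
            "type": "json",
            "schema": {"type": "object", "properties": {"title": {"type": "string"}}},
            # LlmConfig belongs inside the json/summary format entry,
            # not at the top level of extract/search.
            "llm_config": {"model": "gpt-4o-mini"},  # model name is illustrative
        },
    ],
}

assert [f["type"] for f in scrape_payload["formats"]] == ["markdown", "json"]
```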
CLI (just-scrape#13):
- scrape: document 8 formats, multi-format via comma-separated -f,
  and new --html-mode / --scrolls / --prompt / --schema flags
- search: document --location-geo-code, --time-range, --format
- crawl: document -f / --format
- Add the Fetch Modes enum table (auto|fast|js|direct+stealth|js+stealth)
  that replaces the legacy --stealth boolean

MCP server (scrapegraph-mcp#16):
- Replace the stale v1 tool list with the v2 surface: markdownify,
  smartscraper, searchscraper, scrape (formats[]), smartcrawler_*
  (markdown default), crawl_stop/resume, monitor_* lifecycle,
  credits, sgai_history
- Note removal of sitemap and agentic_scrapper
- Document SCRAPEGRAPH_API_BASE_URL override and v2 auth headers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Group the client-side surfaces (CLI, MCP server, Toonify) with the
Python/JavaScript SDKs so they live together in the nav rather than
under Services.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Member

lurenss commented Apr 13, 2026

Rechecked only the v2 side of this PR against the current sgai-stack development branch plus scrapegraph-js#11 and scrapegraph-py#82. Ignoring intentional v1 legacy content, these are the remaining gaps in the v2 docs branch:

  1. The v2 branch still does not provide a real canonical v2 API reference. The API Reference tab still points to old endpoint pages instead of /api/v2 reference pages, so there is no authoritative reference for the current stack routes (/scrape, /extract, /schema, /search, /history, /monitor, /crawl, /events, /credits, /validate).

  2. services/scrape.mdx is still documenting the wrong canonical REST shape. It shows singular format and an old { id, format, content }-style response, while the current stack contract is formats[] and a results + metadata response. The SDKs keep compatibility shortcuts, but the page is not describing the actual v2 API contract.

  3. services/extract.mdx is stale for v2. It still documents output_schema, fetch_config, llm_config, and old fetch flags like render_js, stealth, and wait_ms. The current stack contract is centered on schema, fetchConfig, and the shared v2 fetch config schema.

  4. services/search.mdx is also stale for v2. It still says the default result count is 5 and documents output_schema / llm_config, while the current stack contract is numResults default 3 with format, mode, fetchConfig, prompt, schema, locationGeoCode, and timeRange.

  5. services/crawl.mdx still documents legacy REST fields like depth, max_pages, and format as the main API shape. The current v2 contract is maxDepth, maxPages, maxLinksPerPage, allowExternal, and formats[].

  6. services/monitor.mdx still presents the prompt-driven model as the main v2 API. The canonical current monitor create contract is formats[]-based. The SDKs support legacy-compatible monitor creation, but the docs page is not aligned to the actual stack contract.

  7. sdks/javascript.mdx is still missing part of the current v2 JS SDK surface and has wrong examples for response shapes. The docs still use remainingCredits, totalCreditsUsed, endpoint, offset, and data.items, but the current stack/SDK responses are credits { remaining, used, plan, jobs } and history { data, pagination }. The page also does not document schema() or validate(), even though the JS v2 branch exposes both.

  8. sdks/python.mdx has the same issue. It still documents search default 5, and it still uses remaining_credits, total_credits_used, endpoint, offset, and items-style history examples. It also lacks proper schema() / validate() coverage for the current Python v2 branch.

  9. The MCP docs are internally inconsistent on the v2 branch. They claim full v2 coverage, but the documented tool list is still largely the old tool set rather than the actual v2-aligned surface.

  10. transition-from-v1-to-v2.mdx still points readers to service pages / API reference pages for exact v2 payloads, but those pages are still mixed or stale as noted above.
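The response-shape drift called out in points 7 and 8 can be made concrete with before/after dicts. The top-level structure is taken from the review; the values and the fields inside `pagination` are assumptions:

```python
# Old (v1-style) shapes the docs still show, per the review.
old_credits = {"remainingCredits": 90, "totalCreditsUsed": 10}
old_history = {"data": {"items": []}, "offset": 0}

# Current v2 stack shapes the review says the docs should show instead.
new_credits = {"remaining": 90, "used": 10, "plan": "free", "jobs": 0}
new_history = {"data": [], "pagination": {"page": 1, "total": 0}}

assert set(new_credits) == {"remaining", "used", "plan", "jobs"}
assert set(new_history) == {"data", "pagination"}
```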

