How to use Smart Extract

Simplescraper Smart Extract uses AI to accurately extract data and generate reusable CSS selectors from any website. All that's required is a URL and a data schema that lists the properties you wish to extract.

Smart Extract can be accessed via:

The Simplescraper dashboard - click on 'Get Data' in the sidebar then 'Smart Extract'
At scrape.new
Programmatically via the API. See the docs here: https://simplescraper.io/docs/api-guide#post-smart-extract.

Smart Extract + CSS selectors

Smart Extract returns valid CSS selectors, making it easy to convert AI-powered extractions into regular scrape recipes. Start with AI to quickly identify and validate the data you want, then switch to standard scraping for speed, accuracy, and scale.

This combines the flexibility of AI with the reliability of traditional scraping - meaning no hallucinations or context limits.

Dashboard Usage

Navigate to https://simplescraper.io/new
In the top input field, enter the URL of the page you wish to extract data from
In the bottom input field, enter a data schema (a comma-separated list of the properties you wish to extract)
Click the 'Extract Data' button
After a few seconds, the data will be returned in CSV and JSON format
Click the 'Save as a scrape recipe' button to convert the smart extraction into a regular scrape recipe, allowing you to scrape at scale using Simplescraper

API Usage

API docs can be found here: https://simplescraper.io/docs/api-guide#post-smart-extract.

Tips on writing your data schema

The schema should be a short, accurate list of the visible data points on the page that you wish to extract.
- For example, if extracting data from a jobs board: 'Role, salary, location, job type, company, description, experience required' is a good schema.
Including a hint of what data is being extracted can increase accuracy.
- For example, instead of 'name, location, price, size, bedrooms, bathrooms', use 'property name, location, price, size (sqm), bedrooms, bathrooms'. The reference to the type of data ('property') and the unit ('sqm') can improve results.
A schema is not a prompt.
- This works: "title, old price, current price, discount, review count, description, num capsules, rating".
- This does not: "visit the website and extract everything on the page beginning with A".
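
When a schema is assembled in code, for instance from a list of column names, a small helper can keep it in the short, comma-separated form described above. The following is only a sketch: `buildSchema` is a hypothetical helper, not part of Simplescraper. It trims each property name, drops blanks and case-insensitive duplicates, and joins the rest.

```javascript
// Hypothetical helper (not part of the Simplescraper API): builds a
// comma-separated schema string from a list of property names.
function buildSchema(properties) {
  const seen = new Set();
  const cleaned = [];
  for (const property of properties) {
    const name = String(property).trim();
    const key = name.toLowerCase();
    // Skip empty entries and case-insensitive duplicates
    if (name === '' || seen.has(key)) continue;
    seen.add(key);
    cleaned.push(name);
  }
  if (cleaned.length === 0) {
    throw new Error('A schema needs at least one property');
  }
  return cleaned.join(', ');
}

const exampleSchema = buildSchema([
  'property name', ' location', 'price', 'size (sqm)', 'bedrooms', 'bathrooms', 'Price'
]);
console.log(exampleSchema);
// property name, location, price, size (sqm), bedrooms, bathrooms
```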

Current limitations of Smart Extract

Image URLs are not extracted (will be possible soon)
The URL must be publicly accessible and not behind a login

Examples of using Simplescraper Smart Extract

The following is a list of websites and example schemas that return accurate data. Use a similar style of schema on the sites you wish to extract data from.

Website	Schema
https://carsandbids.com/	car name, details, time remaining, bid price, location
https://www.nike.com/gb/t/air-force-1-07-next-nature-shoes-67bFZC/DV3808-107	price, old price, name, num of colors
https://jobs.careers.microsoft.com/global/en/search	job title, location, remote possible, description
https://www.realestate.com.au/international/id/bali/	price aud, price us, location, size (m2)
https://x.com/emollick	name, @tag, tagline, joined date, link, number of posts, top tweet text

Workflow example: Using Smart Extract to create a scrape recipe

One powerful way to use Smart Extract is as a starting point for creating a regular scrape recipe. This gives you the flexibility of AI to quickly identify the data you want, and then the speed, accuracy, and reliability of standard scraping for production use.

The idea: extract data and selectors → review → save as recipe → scale.

Step 1: Extract data and selectors using Smart Extract

Use Smart Extract to extract the data you want from a page. You'll get both the structured data and a list of CSS selectors for each field.

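// Sends a URL and a schema to Smart Extract and returns the parsed JSON response.
async function runSmartExtract(apiKey, url, schema) {
  const response = await fetch('https://api.simplescraper.io/v1/smart-extract', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    // schema is the comma-separated list of properties to extract
    body: JSON.stringify({ url, schema })
  });

  // fetch does not reject on HTTP error statuses, so check explicitly
  if (!response.ok) {
    throw new Error(`Smart Extract request failed with status ${response.status}`);
  }

  return response.json();
}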

Example usage:

const apiKey = 'YOUR_API_KEY';
const url = 'https://example.com/product-page';
const schema = 'product name, price, description, rating';

// Top-level await requires an ES module; otherwise wrap this call in an async function
const result = await runSmartExtract(apiKey, url, schema);

Step 2: Review the results

You'll receive two important things in the response:

data: an array of structured values (the extracted data)
selectors: a list of CSS selectors for each field you requested

If the selectors look good, you're ready to convert them into a scrape recipe.
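
Before saving a recipe, it can help to check how often each field actually came back with a value. The sketch below assumes that `data` is an array of row objects keyed by field name. That shape is an assumption made for illustration, so compare it against a real response before relying on it. `fieldCoverage` is a hypothetical helper and the sample rows are made up.

```javascript
// Assumption: each entry in `rows` is an object keyed by field name.
// The real Smart Extract response shape may differ.
function fieldCoverage(rows) {
  const counts = {};
  for (const row of rows) {
    for (const [field, value] of Object.entries(row)) {
      const filled = value !== null && value !== undefined && String(value).trim() !== '';
      counts[field] = (counts[field] || 0) + (filled ? 1 : 0);
    }
  }
  // Convert counts to the fraction of rows with a non-empty value
  const coverage = {};
  for (const [field, count] of Object.entries(counts)) {
    coverage[field] = count / rows.length;
  }
  return coverage;
}

// Made-up sample rows, for illustration only
const sampleRows = [
  { 'product name': 'Widget', price: '$10', rating: '' },
  { 'product name': 'Gadget', price: '$12', rating: '4.5' }
];
console.log(fieldCoverage(sampleRows));
// { 'product name': 1, price: 1, rating: 0.5 }
```

A field with low coverage suggests that its selector or its schema wording deserves a second look before the recipe is saved.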

Step 3: Create a recipe using the returned selectors

Once you're happy with the fields and selectors, use them to create a new recipe. This lets you run scalable scrapes using Simplescraper's standard scraping engine.

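// Creates a regular scrape recipe from the selectors returned by Smart Extract.
async function createRecipe(apiKey, name, url, selectors) {
  const response = await fetch('https://api.simplescraper.io/v1/recipes', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ name, url, selectors })
  });

  // fetch does not reject on HTTP error statuses, so check explicitly
  if (!response.ok) {
    throw new Error(`Recipe creation failed with status ${response.status}`);
  }

  const recipe = await response.json();
  console.log('Recipe created:', recipe);
  return recipe;
}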

Example usage (continued):

const recipeName = 'Product Page Scraper';
await createRecipe(apiKey, recipeName, url, result.selectors);

In short: start with Smart Extract to quickly find the right selectors, then save them as a recipe for fast, reliable scraping at scale. If the site changes later, you can run Smart Extract again and update the existing recipe using PUT /recipes/:recipeId, so there is no need to rewrite selectors by hand.
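
The update call might look something like the sketch below. Only the PUT /recipes/:recipeId route comes from the text above. The base URL and bearer header are carried over from the createRecipe example, and the `{ selectors }` request body is an assumption, so confirm the exact payload in the API docs. `buildRecipeUpdateRequest` and `updateRecipe` are hypothetical names.

```javascript
// Assumptions: same base URL and bearer auth as the createRecipe example,
// and a request body of { selectors }. Check the API docs for the real shape.
const API_BASE = 'https://api.simplescraper.io/v1';

// Builds the URL and fetch options without sending anything
function buildRecipeUpdateRequest(apiKey, recipeId, selectors) {
  return {
    url: `${API_BASE}/recipes/${encodeURIComponent(recipeId)}`,
    options: {
      method: 'PUT',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ selectors })
    }
  };
}

// Sends the update and returns the parsed JSON response
async function updateRecipe(apiKey, recipeId, selectors) {
  const { url, options } = buildRecipeUpdateRequest(apiKey, recipeId, selectors);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Recipe update failed with status ${response.status}`);
  }
  return response.json();
}

// Possible usage, after re-running Smart Extract on the changed page:
// const fresh = await runSmartExtract(apiKey, url, schema);
// await updateRecipe(apiKey, 'YOUR_RECIPE_ID', fresh.selectors);
```

Keeping the request construction separate from the network call makes it easy to inspect or log the request before anything is sent.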

Notes:
Smart Extract is in beta and may not be 100% accurate. If you encounter any issues or incorrect data, please contact us via chat.
