Creating an API
This guide is deprecated. Please read the new Simplescraper API guide here: https://simplescraper.io/docs/api-guide.
An API endpoint is created automatically as soon as you run a recipe. Open the API tab of your recipe to get the URL, which you can query for the extracted data.
When you call this URL, you can either return data that was scraped previously or scrape the target website in real time (by adding the &run_now=true parameter) and return live data.
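The difference between cached and live results comes down to a single query parameter. The sketch below shows one way to toggle it in Python. The `RECIPE_API_URL` value is a hypothetical placeholder (copy the real URL from the API tab of your recipe), and the helper names `with_run_now` and `fetch` are illustrative only. The response is returned as raw text because its format is not described here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen

# Hypothetical placeholder: replace with the URL shown in your recipe's API tab.
RECIPE_API_URL = "https://api.example.invalid/recipe?key=YOUR_KEY"


def with_run_now(api_url: str, live: bool) -> str:
    """Return api_url with run_now=true added (live scrape) or removed (cached data)."""
    parts = urlsplit(api_url)
    # Drop any existing run_now value so the parameter is never duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "run_now"
    ]
    if live:
        query.append(("run_now", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch(api_url: str, timeout: float = 60.0) -> str:
    """Call the endpoint and return the response body as text."""
    with urlopen(api_url, timeout=timeout) as response:
        return response.read().decode("utf-8")


cached_url = with_run_now(RECIPE_API_URL, live=False)  # most recent stored results
live_url = with_run_now(RECIPE_API_URL, live=True)     # scrape now, then return
```

Calling `fetch(live_url)` may take noticeably longer than `fetch(cached_url)`, since the recipe runs before results are returned.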
The API tab on your recipe page lists all parameters that you can pass in with the URL in order to modify the recipe, some of which are explained below.
Parameter | Purpose | Example |
---|---|---|
source_url | Change the source URL (the page to be scraped) of the recipe. Useful for sites such as job boards or accommodation websites, where different URLs can be passed in to return different results | &source_url=https://example.com |
run_now | Run the recipe before returning results, i.e. scrape the website now. Without this parameter, the most recent results are returned from the database | &run_now=true |
limit | Limit how many scrape results are returned in a single call to the API. Applies only to cached data already in the database (i.e. when &run_now=false) | &limit=1000 |
run_async | Run the request asynchronously. Ideal for long-running scrape tasks, as it lets you trigger the recipe without risk of a timeout. Requires that you handle results via a webhook or other integration | &run_async=true |
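
The parameters in the table can be combined on a single request URL. The following is a minimal sketch of a URL builder in Python. The function name `build_request_url` and the placeholder base URL are hypothetical. It also assumes the API accepts standard percent-encoding for `source_url`, which is the safer choice when the source URL carries its own query string.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_request_url(
    api_url: str,
    *,
    source_url: Optional[str] = None,
    run_now: Optional[bool] = None,
    limit: Optional[int] = None,
    run_async: Optional[bool] = None,
) -> str:
    """Return api_url with the given recipe parameters set.

    Parameters left as None are not added to the URL.
    """
    overrides = {}
    if source_url is not None:
        overrides["source_url"] = source_url
    if run_now is not None:
        overrides["run_now"] = "true" if run_now else "false"
    if limit is not None:
        # Per the table, limit only affects cached results.
        overrides["limit"] = str(limit)
    if run_async is not None:
        overrides["run_async"] = "true" if run_async else "false"

    parts = urlsplit(api_url)
    # Keep existing parameters (such as an API key) unless they are overridden.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in overrides
    ]
    query.extend(overrides.items())
    # urlencode percent-encodes values, so a source_url containing "?" or "&" stays intact.
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical base URL: use the one from your recipe's API tab.
base = "https://api.example.invalid/recipe?key=YOUR_KEY"
url = build_request_url(base, source_url="https://example.com/jobs?page=2", run_now=True)
```

With `run_async=True`, the call only triggers the run, so results need to be collected through a webhook or other integration rather than read from the response.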
Please note that URLs in the Crawler are ignored when a recipe is triggered via the API. Each API request scrapes only a single URL. To scrape the URLs in the Crawler without visiting the dashboard, set a schedule and use a webhook or other integration to send the data to your desired destination.
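Because each API call handles one URL, scraping several pages through the API means issuing one request per page. A minimal sketch of that pattern, assuming a hypothetical base URL and an illustrative helper named `request_urls`:

```python
from typing import Iterable, Iterator
from urllib.parse import urlencode

# Hypothetical placeholder: replace with the URL shown in your recipe's API tab.
RECIPE_API_URL = "https://api.example.invalid/recipe?key=YOUR_KEY"


def request_urls(api_url: str, pages: Iterable[str]) -> Iterator[str]:
    """Yield one API request URL per page, each with its own source_url."""
    separator = "&" if "?" in api_url else "?"
    for page in pages:
        # run_now=true asks for a fresh scrape of this specific page.
        yield api_url + separator + urlencode({"source_url": page, "run_now": "true"})


pages = ["https://example.com/a", "https://example.com/b"]
urls = list(request_urls(RECIPE_API_URL, pages))
```

Each URL in `urls` can then be requested in turn. For long lists of pages, a scheduled Crawler run with a webhook is likely the better fit.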