How to scrape and save recipes with the extension
A quick preview of how to select data. More advanced usage is explained below.
Install and pin the extension
The first step is to install the Simplescraper extension from the Chrome Web Store. The extension lets you select the data you wish to extract from websites and create scraping recipes.
To keep the extension visible and only a click away, pin it to your extension bar: click the extensions menu (it looks like a jigsaw piece) and then the pin icon next to 'Simplescraper'.
Scraping data
Navigate to the webpage that you wish to scrape and launch Simplescraper by clicking its extension icon and selecting 'Scrape this website'. Then follow these steps, which are also demonstrated in the video at the top of the page:
- Click the + in the top menu, name the element you wish to scrape, and then select that element on the page
- The element and all related elements should be highlighted in green
- Save your selection using the check icon in the top menu and optionally add more elements
- Click 'view results' to download your data and optionally create a cloud scraping recipe and API
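Once the data is downloaded, you may want to post-process it outside the extension. The following is a minimal sketch, assuming the results were downloaded as CSV and that the column headers match the names you gave each element. The 'title' and 'price' columns and the sample rows are hypothetical stand-ins, not real Simplescraper output.

```python
import csv
import io

# Hypothetical export: stands in for a CSV file downloaded from the results page.
# The column names 'title' and 'price' are assumed element names.
DOWNLOADED_CSV = """title,price
Lamp,$20
Desk,$150
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def price_to_float(price):
    """Convert a price string such as '$1,250.00' to a float."""
    return float(price.replace("$", "").replace(",", ""))


rows = load_rows(DOWNLOADED_CSV)
total = sum(price_to_float(row["price"]) for row in rows)
print(f"{len(rows)} rows, total {total:.2f}")
```

To read a real download instead of the inline sample, open the file with open(path, newline="") and pass the file object to csv.DictReader; the path is whatever name you saved the file under.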
By default, Simplescraper attempts to select the elements related to the one you first select. On some websites, however, the extension may highlight more or less data than you require. The guide below explains how to handle each scenario:
Automatic selection of multiple elements
When the extension automatically selects only the correct data
- After you click on an element, the selected element and related elements will turn green. Simplescraper has automatically made a best guess at which data you intend to extract. If that guess is correct, 'lock in' the selection by using the check icon in the top menu, then repeat this process to add other elements. You do not have to do anything else.
Refining your selection
When the extension selects more or less data than intended
- If the initial guess is only partially correct, or if more elements than expected have been selected, you must train Simplescraper to select the correct elements
- First, click the small arrow that appears beside an element that you do want to scrape. Now that the correct element has been confirmed, 'reject' unrelated elements by clicking the small X that appears beside them. The count in the top menu will update to reflect the number of elements currently selected on the webpage
- When you are satisfied with your selection, 'lock in' that element group by using the check icon in the top menu and repeat the process to add other elements
Selecting only a single element
When you only require a single element
If you only wish to select a single element, like an address or a telephone number, use the 'only this' button on the popup menu that appears when you click on that element. This will select that element and no other elements.
Editing the CSS selector
When the extension cannot seem to select the correct data
- In some cases, refining the selection still does not capture all the elements you want to extract, or it captures more than you want
- In such cases, Simplescraper provides an editable field where you can edit the CSS selector so that it points to the data that you wish to extract
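To illustrate why editing a selector changes what gets captured, here is a minimal sketch that compares a broad selector with a narrower one on a small sample page. It is not Simplescraper's selection engine: MiniSelect is a hypothetical helper that supports only tag and class parts joined by spaces (the descendant combinator), and the sample HTML and class names are invented for the example.

```python
from html.parser import HTMLParser


def parse_compound(part):
    """Split 'span.price' into ('span', {'price'}); the tag may be empty."""
    tag, *classes = part.split(".")
    return tag, set(classes)


class MiniSelect(HTMLParser):
    """Collects the text of elements matching a simple selector.

    Supports tag/class parts separated by spaces only. Assumes well-formed
    HTML with no void elements such as <br>. Not a full CSS engine.
    """

    def __init__(self, selector):
        super().__init__()
        self.parts = [parse_compound(p) for p in selector.split()]
        self.stack = []            # (tag, classes) of each open element
        self.capture_depth = None  # stack depth of the element being captured
        self.results = []

    def _matches(self):
        def fits(elem, part):
            tag, classes = part
            return (not tag or tag == elem[0]) and classes <= elem[1]

        # The innermost element must match the last selector part.
        if not fits(self.stack[-1], self.parts[-1]):
            return False
        # Earlier parts must match ancestors, nearest ancestor first.
        remaining = self.parts[:-1]
        for elem in reversed(self.stack[:-1]):
            if remaining and fits(elem, remaining[-1]):
                remaining = remaining[:-1]
        return not remaining

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        self.stack.append((tag, classes))
        if self.capture_depth is None and self._matches():
            self.capture_depth = len(self.stack)
            self.results.append("")

    def handle_data(self, data):
        if self.capture_depth is not None:
            self.results[-1] += data

    def handle_endtag(self, tag):
        if self.capture_depth == len(self.stack):
            self.capture_depth = None
        if self.stack:
            self.stack.pop()


def select(html, selector):
    """Return the stripped text of every element matching the selector."""
    parser = MiniSelect(selector)
    parser.feed(html)
    return [text.strip() for text in parser.results]


# Invented sample page: two product prices plus an unrelated sidebar price.
PAGE = """
<div class="products">
  <div class="card"><h2 class="title">Lamp</h2><span class="price">$20</span></div>
  <div class="card"><h2 class="title">Desk</h2><span class="price">$150</span></div>
</div>
<div class="sidebar"><span class="price">$5 shipping</span></div>
"""

broad = select(PAGE, ".price")             # also picks up the sidebar price
narrow = select(PAGE, ".products .price")  # only prices inside .products
print(broad)
print(narrow)
```

In this sample, prefixing the selector with the container class removes the unwanted sidebar match. The same kind of adjustment, adding an ancestor or a more specific class, is often what is needed when a selection captures too much.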
Saving a recipe
- After you click 'view results', the results page shows a banner with the option to save your recipe
- Click the banner and you will be able to save your recipe. Give the recipe a name and optionally choose to run multiple pages (if you selected a pagination button earlier) and schedule the recipe. Once you're done, click 'create recipe'
- Your recipe will appear in the sidebar. Click the recipe and then click 'run' to scrape in the cloud. Your results will appear after a few seconds