Scraping data behind a login
There are two methods that Simplescraper can use to scrape data located behind a login: credentials (username / password) and cookies.
The cookie method is preferable; however, cookies may expire quickly or not work at all, in which case the credential method makes more sense. This guide covers how to use both methods.
Credentials method
The credentials method uses your username and password to log in on your behalf and extract the data that you need. You can create a credential for a website and use it to log in for all of your scrape recipes that share that domain. For example, if you have five LinkedIn recipes, you can create a single credential for LinkedIn that will work for all of them.
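To illustrate what "share that domain" means, the sketch below compares the hostname of a credential's URL with the hostname of a recipe's URL. It is only an illustration: the function names are hypothetical, ignoring a leading "www." is an assumption, and Simplescraper's actual matching logic is not documented here and may differ.

```python
from urllib.parse import urlparse


def hostname(url):
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def credential_applies(credential_url, recipe_url):
    """Hypothetical check: do the credential and the recipe share a domain?"""
    cred_host = hostname(credential_url)
    return cred_host != "" and cred_host == hostname(recipe_url)


# One LinkedIn credential could cover several LinkedIn recipes
print(credential_applies("https://www.linkedin.com/login",
                         "https://linkedin.com/jobs/search"))
print(credential_applies("https://www.linkedin.com/login",
                         "https://example.com/profile"))
```

The first call compares two LinkedIn URLs and the second compares LinkedIn with an unrelated site, so only the first pair shares a domain under this assumption.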
Once a credential is added, it is available under 'Credentials' in the Manage Recipes section. Your password is encrypted, securely stored and not visible to Simplescraper.
Add credentials while creating or editing a recipe
- Add a new recipe or edit an existing recipe (see this guide for how to) and scroll down to the Credentials section under Advanced Options
- Click the 'add new credential +' button and add your credential details in the popup that appears. Provide a credential name that is clear and obvious, for example 'LinkedIn login'. Then enter the remaining URL, username and password details
- Click save credentials and wait for the 'Credentials added successfully' confirmation text and then close the popup
- The credential that you created will now be visible in the dropdown list - select it and then save your scrape recipe
Add credentials via Credentials menu
- In the sidebar of the Simplescraper dashboard, click Manage Recipes, choose Credentials Manager, and then click the green 'add credential +' button
- In the popup that appears add your credential details. Provide a credential name that is clear and obvious, for example 'LinkedIn login'. Then enter the remaining URL, username and password details
- Click save credentials and wait for the 'Credentials added successfully' confirmation text
- Now when you add a new recipe or edit an existing recipe you will see the credential that you created in a dropdown list in the Credentials section
Cookies method
Simplescraper now allows you to add cookies via the extension and this is the preferred method. The previous method of using the EditThisCookie extension is still included below as it may be useful if adding cookies to an existing recipe.
Adding Cookies via the Simplescraper extension (preferred method)
- Open the Simplescraper extension on the page that you wish to scrape and click on the Cookies icon (it's located two positions to the right of the 'View Results' button). The icon will turn green, indicating that the cookies have been saved
- Continue selecting the properties you wish to extract as usual
- Click 'View Results', then choose to save the recipe
- In the advanced settings, under the Cookies section, you will see that the cookies have been added
- Save your recipe and run it. Simplescraper will log in and your data will be returned
Adding Cookies via the EditThisCookie extension
Scraping a webpage behind a login is achieved by using your login cookies for that site. Retrieving these cookies requires a third-party extension called EditThisCookie. Please review their permissions and policy here.
Here's how to scrape data that is behind a login using the EditThisCookie extension:
- Install the extension that will be used to retrieve the cookies
- Navigate to the website that you wish to scrape and log in to the site if you are not already logged in
- Once logged in, click the EditThisCookie extension icon and a list of cookies will appear in a popup window. In the menu at the top of this popup window, look for the arrow icon pointing outwards (it's the third icon from the right). Click this icon to copy the cookies to your clipboard
- In the Simplescraper dashboard (https://simplescraper.io/dashboard), navigate to your recipe, click edit beside the recipe name and then click 'show advanced options'
- Under the cookies section, paste the cookies from your clipboard into the text area (ctrl + v or right-click > paste)
- Save your recipe and then run it. Simplescraper should log in on your behalf and scrape the data that you need
Note: some websites don't use cookies for user authentication and so this method may not always work
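Because cookies may expire quickly, it can help to check the copied cookies before pasting them into a recipe. The sketch below assumes the clipboard contains a JSON array of cookie objects with a "name" field and an optional "expirationDate" field in Unix seconds, which is the shape EditThisCookie typically exports. The function name and the sample data are hypothetical, and this check is not part of Simplescraper.

```python
import json
import time


def expired_cookies(exported_json, now=None):
    """Return the names of cookies whose expirationDate is in the past."""
    now = time.time() if now is None else now
    cookies = json.loads(exported_json)
    if not isinstance(cookies, list):
        raise ValueError("expected a JSON array of cookie objects")
    expired = []
    for cookie in cookies:
        expires = cookie.get("expirationDate")
        # Session cookies carry no expirationDate, so they are skipped
        if expires is not None and expires <= now:
            expired.append(cookie["name"])
    return expired


# Made-up sample in the assumed export format
sample = json.dumps([
    {"domain": ".example.com", "name": "session_id", "value": "abc",
     "expirationDate": 1000.0},
    {"domain": ".example.com", "name": "prefs", "value": "dark",
     "expirationDate": 4102444800.0},
    {"domain": ".example.com", "name": "temp", "value": "x", "session": True},
])
print(expired_cookies(sample, now=2000.0))  # ['session_id']
```

If the returned list includes the site's login cookie, log in to the site again and copy fresh cookies before updating the recipe.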