Scraping data behind a login

Simplescraper can scrape data located behind a login using one of two methods: credentials (username and password) or cookies.

The cookies method is preferable; however, cookies may expire quickly or not work at all, in which case the credentials method makes more sense. This guide covers how to use both methods.

Credentials method

The credentials method uses your username and password to log in on your behalf and extract the data that you need. You can create a credential for a website and reuse it in all of your scrape recipes that share that domain. For example, if you have five LinkedIn recipes, you can create a single LinkedIn credential that will work for all of them.

Once a credential is added, it is available under 'Credentials' in the Manage Recipes section. Your password is encrypted, securely stored and not visible to Simplescraper.

Add credentials while creating or editing a recipe

  • Add a new recipe or edit an existing recipe (see this guide for how to) and scroll down to the Credentials section under Advanced Options
  • Click the 'add new credential +' button and add your credential details in the popup that appears. Provide a credential name that is clear and obvious, for example 'LinkedIn login'. Then enter the remaining URL, username and password details
  • Click save credentials and wait for the 'Credentials added successfully' confirmation text and then close the popup
  • The credential that you created will now be visible in the dropdown list - select it and then save your scrape recipe

Add credentials via Credentials menu

  • In the sidebar of the Simplescraper dashboard, click Manage Recipes, choose Credentials Manager, and then click the green 'add credential +' button
  • In the popup that appears add your credential details. Provide a credential name that is clear and obvious, for example 'LinkedIn login'. Then enter the remaining URL, username and password details
  • Click save credentials and wait for the 'Credentials added successfully' confirmation text
  • Now when you add a new recipe or edit an existing recipe you will see the credential that you created in a dropdown list in the Credentials section

Cookies method

The cookies method scrapes a webpage behind a login by using your login cookies for that site. Retrieving these cookies requires a third-party extension called EditThisCookie. Please review their permissions and policy here.

Here's how to scrape data that is behind a login using cookies:

  • Install the extension that will be used to retrieve the cookie here

  • Navigate to the website that you wish to scrape and log in to the site if you are not already logged in

  • Once logged in, click the EditThisCookie extension icon and a list of cookies will appear in a popup window. In the menu at the top of this popup window look for the arrow icon pointing outwards (it's the third icon from the right). Click this icon to copy the cookies to your clipboard

  • In the Simplescraper dashboard (https://simplescraper.io/dashboard), navigate to your recipe, click edit beside the recipe name and then click 'show advanced options'

  • Under the cookies section, paste the cookies from your clipboard into the text area (Ctrl + V or right-click > paste)

  • Save the recipe and then run it. Simplescraper should log in on your behalf and scrape the data that you need

  • Note: some websites don't use cookies for user authentication and so this method may not always work
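
Because cookies can expire quickly, it can help to check the copied cookies before pasting them into a recipe. The snippet below is a minimal sketch and not part of Simplescraper. It assumes the clipboard export is a JSON array of cookie objects, each with a `name` field and an optional `expirationDate` field in Unix seconds; the function name `expired_cookies` and the sample cookies are hypothetical and used for illustration only.

```python
import json
import time


def expired_cookies(exported_json, now=None):
    """Return the names of cookies whose expirationDate is in the past.

    Assumes the export is a JSON array of objects with a 'name' field and
    an optional 'expirationDate' field (Unix seconds).
    """
    now = time.time() if now is None else now
    cookies = json.loads(exported_json)
    if not isinstance(cookies, list):
        raise ValueError("expected a JSON array of cookie objects")

    expired = []
    for cookie in cookies:
        # Cookies without an expirationDate (session cookies) are skipped.
        expiry = cookie.get("expirationDate")
        if expiry is not None and expiry <= now:
            expired.append(cookie.get("name", "<unnamed>"))
    return expired


# Hypothetical sample data standing in for the copied cookies.
sample = json.dumps([
    {"name": "session_id", "value": "abc", "expirationDate": 1000.0},
    {"name": "prefs", "value": "x", "expirationDate": 4102444800.0},
    {"name": "tmp", "value": "y", "session": True},
])

print(expired_cookies(sample, now=2000.0))  # prints ['session_id']
```

If the list that is printed is not empty, logging in to the site again and copying a fresh set of cookies is likely to be needed before running the recipe.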