Best way to scrape website without API

Hello, new member alert.

I am looking for the best way to bring in data from a website that does not have an API.
I have been able to find URLs that return the data I need in JSON format, and I have them saved in a table in my database.

I am struggling to figure out the best way to parse the data and save it into my database.

Any help would be appreciated.


There are tons of ways to scrape a website.
More requirements would help narrow down the possibilities.

Do you need a one-off scrape, importing the data manually?
In that case, there are Chrome extensions (Bardeen is good) that let you configure a scraper visually and export a CSV from it.

Or do you need a harvester to perform more automated scraping?
Scrapingbee or BrowseAI offer remote browsers you can program.

Hope this helps.

Hey abusedmedia,

The website I am working with stores a large amount of data in JSON and has accessible .json URLs. I have never needed a plugin in the past to scrape this sort of data, but I am not experienced with any of this by any means.

My question was more focused on best practices and methods within Retool to scrape web data.

Should I be using a workflow or JS?

What are the other ways to scrape?


If the data is available through a public JSON file, you don't need to scrape anything; the data is ready to be consumed.

In Retool you just need a RESTQuery resource and put that URL in it.
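For comparison, here is roughly what that request looks like in plain JavaScript. This is only a sketch: `fetchJson` and its `fetchImpl` parameter are names made up for illustration, and inside Retool the RESTQuery resource does this step for you.

```javascript
// Hypothetical helper: requests a public JSON URL and returns the parsed body.
// fetchImpl defaults to the global fetch (browsers, Node 18+); it is a parameter
// only so the function can be tried out without a real network call.
async function fetchJson(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    // Surface HTTP errors instead of trying to parse an error page as JSON.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

No scraping or HTML parsing is involved; the parsed object can be used directly.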

Depending on your use case, the size of the JSON might be an issue if you need to query it in real time.

Question: what do you need to do with the data? Do you need to put it in a table? What's your use case?

Hope this helps.



I do not need the data in real-time. My plan is to use a workflow to call the script 5-10 times per day.

Currently I have the script run manually based off of the selected row in table1.

I am getting the following response:

  {
    "kits": [
      {
        "component": "Component 1",
        "modelId": 101,
        "id": 10000,
        "modelYear": "2000",
        "platform": "AA"
      },
      {
        "component": "Component 2",
        "modelId": 102,
        "id": 20000,
        "modelYear": "2000",
        "platform": "AB"
      }
    ]
  }

This has been shortened from a list that ranges from ~200-1000 "kits."

Currently I am looking to store an array of the "kits" ids, i.e.:
[10000, 20000]

I would also like to save a copy of the response as JSON.

Glad you solved it!

Nice! You can get the ids array using JavaScript by mapping over the "kits" array and returning each kit's id.
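A minimal sketch of that, assuming the response has the shape you posted above. `getKitIds` and the sample `response` object are made up for illustration; in Retool you would pass your own query's data instead.

```javascript
// Sample response shaped like the one posted above (values are the poster's examples).
const response = {
  kits: [
    { component: "Component 1", modelId: 101, id: 10000, modelYear: "2000", platform: "AA" },
    { component: "Component 2", modelId: 102, id: 20000, modelYear: "2000", platform: "AB" },
  ],
};

// Collect the id of every kit; fall back to an empty list if "kits" is missing.
function getKitIds(data) {
  return (data.kits ?? []).map((kit) => kit.id);
}

const kitIds = getKitIds(response);
console.log(kitIds); // [ 10000, 20000 ]
```

The same one-liner works whether the list has 2 kits or 1000.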

For saving the response, you'll need to connect a resource to save the data. If you're familiar with SQL & don't already have an API or database in mind for this use case, you might consider using Retool's Database feature.
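As a rough sketch of what you might hand to that resource: one row per run holding the ids array, the raw response as a JSON string, and a timestamp. The column names (`kit_ids`, `raw_response`, `fetched_at`) and the `buildRow` helper are assumptions for illustration, not anything Retool requires.

```javascript
// Hypothetical helper: turns one parsed response into a row ready to be saved.
// Column names are assumptions; rename them to match your own table.
function buildRow(data, fetchedAt = new Date()) {
  return {
    // Array of kit ids, e.g. [10000, 20000].
    kit_ids: (data.kits ?? []).map((kit) => kit.id),
    // Full copy of the response, serialized so it fits a text or JSON column.
    raw_response: JSON.stringify(data),
    // ISO 8601 timestamp of when this copy was taken.
    fetched_at: fetchedAt.toISOString(),
  };
}

// Example with a trimmed-down response.
const row = buildRow({ kits: [{ id: 10000 }, { id: 20000 }] });
console.log(row.kit_ids); // [ 10000, 20000 ]
```

Keeping the raw response next to the ids means you can re-derive other fields later without calling the URL again.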