As an enterprise business, you may want to scrape real estate listings. In this blog post, we will show you how to scrape Redfin listings using ParseHub, our free visual web scraping tool! Redfin is a leading real estate company providing searches and brokerage services. Their website and app allow users to buy and sell real estate. Some of the features Redfin provides are Redfin Estimate, Book It Now and 3D Walkthrough. As with any other real estate directory, there is a lot of data you can scrape from each listing or real estate agent!
To begin this tutorial, we recommend you register for a free ParseHub account and download the application for your Windows, Mac or Linux system.
Starting Your Scraping Project
- To begin, make sure you have registered your free ParseHub account and logged in to the application.
- Once you are logged in, click the “New Project” button on the screen.
- In the URL field on the left pane, you can add any Redfin URL you want to scrape from. We will be scraping active New York listings using this URL: https://www.redfin.com/city/30749/NY/New-York
- The Redfin website will now load in the embedded browser on the right!
- To make your first selection, click the first property’s address.
- To train the ParseHub algorithm, click the next property’s address.
- You will now see 11 listings in the data preview.
- Scroll down to the 12th listing and click that listing as well.
- Finally, all 40 properties on this page should be selected.
- Rename this selection to “property”.
Extracting Additional Property Data
Now that you have extracted all the properties on the first page, it’s time to go into each property’s detailed page. This will allow us to extract additional property information, such as the listing price, the MLS number and more!
- Firstly, click the PLUS(+) icon next to the property selection you made earlier.
- Choose the “Click” command so ParseHub clicks into each property.
- A popup window will appear. Choose “No”, as we are not clicking the next page button yet.
- Create a new template called “property_data” and click the green “Create New Template” button.
- You will now be taken to the first property’s detailed page.
- Click the price to make your first selection from this page, and rename the selection to “price”.
- Click the PLUS(+) icon next to the “page” tab and choose “Select”. Now click the home’s description and rename the selection to “description”.
- Repeat this step for other property data, such as the MLS number!
Scraping Multiple Pages
To scrape more than 40 listings, which requires us to visit the next pages, we need to add pagination to ParseHub. Pagination allows ParseHub to click the next page button on Redfin and repeat the scraping process for the next set of properties.
- To begin pagination, click the PLUS(+) button next to the “page” tab and choose “Select”.
- Scroll down until you see the pagination right arrow and click it.
- Rename this selection to “pagination” on the left pane.
- Click the expand icon next to your pagination selection, and delete the extraction, as it adds an unwanted column to your data.
- Click the PLUS(+) icon next to the pagination selection and choose “Click”.
- A popup will appear asking if this is a next page button. Choose “Yes”.
- You can now enter the number of additional pages you want to scrape. If you leave it at zero, ParseHub will scrape every page available! We entered 2, so three pages will be scraped in total.
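To make the page count concrete, here is a minimal sketch of the arithmetic behind that last step. The function name is our own, and the default of 40 listings per page simply reflects what we saw on the New York results page above, so treat it as an assumption that may differ for other searches.

```python
from typing import Tuple


def pages_and_listings(additional_pages: int, listings_per_page: int = 40) -> Tuple[int, int]:
    """Return (total pages, maximum listings) for a pagination setting.

    additional_pages is the number typed into the pagination prompt.
    The first page is always scraped, so the total is one more than that.
    Zero is rejected here because it means "every page available",
    which cannot be computed in advance.
    """
    if additional_pages < 1:
        raise ValueError("use a value of 1 or more; 0 means every available page")
    total_pages = additional_pages + 1
    return total_pages, total_pages * listings_per_page


pages, listings = pages_and_listings(2)
print(pages, listings)  # 3 120
```

With the value 2 from this tutorial, that works out to three pages and at most 120 listings, assuming every page is full.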
Bypassing IP Blocks
Many websites that host large amounts of data have systems in place to prevent scraping. Redfin will give you an empty result if you scrape their website without IP Rotation. Note that IP Rotation is a paid ParseHub feature. Here is how to enable IP Rotation:
- Firstly, click the cog icon at the top left of the screen and click “Settings”.
- Under the settings, you will see “Rotate IP Addresses”.
- Click this checkbox to enable IP Rotation.
- You are now ready to scrape Redfin without blocks!
Starting Your Scrape
Now that you have your selections, additional property data and pagination set up, it is time to scrape! To begin your scrape, click the green “Get Data” button on the left pane. ParseHub allows you to Test, Run or Schedule your scrape. In our case we will be scraping once, so we will choose Run. You can also run the scrape on a schedule to get updated properties on a recurring basis.
Excellent job! ParseHub will now scrape Redfin listings on its servers. Once the scrape is complete, you will be able to download the data as a CSV for a spreadsheet or JSON for applications.
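If you choose the JSON download, a short script can tidy the results before analysis. The sketch below is only an illustration: we assume the export groups records under a “property” key (the selection name used in this tutorial) with “name”, “price” and “description” fields, and the sample values are placeholders rather than real Redfin data. Check your own export and adjust the keys to match.

```python
import csv
import io
import json

# Hypothetical sample shaped like the export we assume above.
# The addresses and prices are placeholders, not real listings.
SAMPLE_EXPORT = """
{
  "property": [
    {"name": "123 Example St", "price": "$1,250,000", "description": "Placeholder text."},
    {"name": "456 Sample Ave", "price": "$899,000"}
  ]
}
"""


def parse_price(text):
    """Turn a whole-dollar price string such as "$1,250,000" into an int.

    Returns None when the string contains no digits.
    """
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


def flatten(export):
    """Build one flat row per property, tolerating missing fields."""
    rows = []
    for item in export.get("property", []):
        rows.append({
            "address": item.get("name", ""),
            "price": parse_price(item.get("price", "")),
            "description": item.get("description", ""),
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["address", "price", "description"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten(json.loads(SAMPLE_EXPORT))
print(to_csv(rows))
```

To use it on a real download, replace the sample string with the contents of your exported file, for example by reading it with `open(...)` and `json.load`.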
Real estate is a lucrative industry and scraping property data can be very useful. Whether you are running analysis or finding real estate agents to contact, real estate scraping can be a game changer for your enterprise business.
We hope you enjoyed this blog post and tutorial on scraping Redfin listings!
If you’re interested in enterprise web scraping and data extraction, book a call with us and get a free sample data export.
Happy Scraping! 🏠