Playground Sign in Start free

Scraping Amazon Reviews

1

Introduction

Amazon product reviews provide valuable insights into customer sentiment, product quality, and user experience. This tutorial shows you how to scrape review data for sentiment analysis, competitive research, and customer feedback monitoring.

What You'll Extract

Reviewer names & profiles
Star ratings (1-5)
Review title & text
Review dates
Verified purchase status
Helpful vote counts
Prerequisites

Make sure you have your API key ready. You can get one from the dashboard.

2

Quick Start

Get up and running in minutes with our streamlined setup process.

Installation

Terminal
pip install requests
# or for Node.js
npm install axios

Basic Request

Python
import requests, json

response = requests.post("https://api.ujeebu.com/scrape",
    headers={
        "ApiKey": "YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://www.amazon.com/product-reviews/B08N5WRWNW/",
        "js": True,
        "wait_for": "[data-hook='review']",
        "extract_rules": extract_rules
    })
print(response.json()["result"])
3

Core Features

Lightning Fast - Optimized performance with automatic JavaScript rendering
Reliable - Full JSON schema support for consistent data
Modern Syntax - Clean CSS selectors with nested extraction
Great DX - Comprehensive error messages and debugging
4

Extract Rules Configuration

Customize your extraction by defining the data structure you need.

CSS Selectors Reference

Data PointCSS SelectorType
Review Container[data-hook="review"]obj
Reviewer Name.a-profile-nametext
Ratingi[data-hook="review-star-rating"]attr
Review Titlea[data-hook="review-title"]text
Review Datespan[data-hook="review-date"]text
Verified Purchasespan[data-hook="avp-badge"]text
Review Textspan[data-hook="review-body"]text

Changes to extract rules require testing with real pages to ensure selectors are accurate.

Complete Extract Rules

JSON - Extract Rules
{
  "reviews": {
    "selector": "[data-hook='review']",
    "type": "obj",
    "multiple": true,
    "children": {
      "reviewer_name": {"selector": ".a-profile-name", "type": "text"},
      "rating": {"selector": "i[data-hook='review-star-rating'] span", "type": "text"},
      "title": {"selector": "a[data-hook='review-title'] span", "type": "text"},
      "date": {"selector": "span[data-hook='review-date']", "type": "text"},
      "verified_purchase": {"selector": "span[data-hook='avp-badge']", "type": "text"},
      "review_text": {"selector": "span[data-hook='review-body'] span", "type": "text"},
      "helpful_count": {"selector": "span[data-hook='helpful-vote-statement']", "type": "text"}
    }
  },
  "average_rating": {"selector": "i[data-hook='average-star-rating'] span", "type": "text"},
  "total_ratings": {"selector": "span[data-hook='total-review-count']", "type": "text"}
}
5

API Examples

Create backend endpoints easily with API routes in your application.

import requests
import json
import time

extract_rules = {
    "reviews": {
        "selector": "[data-hook='review']",
        "type": "obj",
        "multiple": True,
        "children": {
            "reviewer_name": {"selector": ".a-profile-name", "type": "text"},
            "rating": {"selector": "i[data-hook='review-star-rating'] span", "type": "text"},
            "title": {"selector": "a[data-hook='review-title'] span", "type": "text"},
            "review_text": {"selector": "span[data-hook='review-body'] span", "type": "text"}
        }
    }
}

def scrape_amazon_reviews(product_asin, page=1):
    review_url = f"https://www.amazon.com/product-reviews/{product_asin}/"
    if page > 1:
        review_url += f"?pageNumber={page}"

    response = requests.post("https://api.ujeebu.com/scrape",
        headers={
            "ApiKey": "YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "url": review_url,
            "js": True,
            "wait_for": "[data-hook='review']",
            "extract_rules": extract_rules
        })
    return response.json()["result"]

# Scrape multiple pages
all_reviews = []
for page in range(1, 6):
    data = scrape_amazon_reviews("B08N5WRWNW", page)
    all_reviews.extend(data.get("reviews", []))
    time.sleep(2)  # Rate limiting

print(f"Scraped {len(all_reviews)} reviews")
Pro Tip

Use rotating proxies for bulk scraping to avoid rate limits. Add proxy_type: "rotating" to your params.

6

Best Practices

01

Use Rate Limiting

Always implement delays between requests (2-3 seconds minimum) to avoid overwhelming servers and getting blocked.

Essential
02

Enable Rotating Proxies

For bulk scraping, use proxy_type: "rotating" to distribute requests across multiple IP addresses.

Production
03

Handle Missing Data

Not all reviews have images or verified badges. Use .get() method to safely access optional fields.

Recommended
04

Validate Response

Always check for API errors and validate the structure of extracted data before processing.

Best Practice

Ready to Start Scraping?

Try the API in our interactive playground or explore the documentation.