Overview

Amazon product pages contain valuable structured data that can be extracted for price monitoring, competitor analysis, product research, and market intelligence.

What You'll Extract

Product title & description

Current price & pricing

Product images

Star rating & reviews

Availability status

Product features

2

Analyze the HTML Structure

Before creating extract rules, inspect the Amazon product page HTML to identify CSS selectors for each data point:

Data Point	CSS Selector	Type
Product Title	`#productTitle, #title span`	text
Price	`.a-price .a-offscreen, #priceblock_ourprice, #priceblock_dealprice`	text
Main Image	`#landingImage, #imgBlkFront`	image
Rating	`span[data-hook="rating-out-of-text"], #acrPopover span.a-icon-alt`	text
Review Count	`#acrCustomerReviewText, span[data-hook="total-review-count"]`	text
Availability	`#availability span, #outOfStock span`	text
Features	`#feature-bullets ul li span.a-list-item`	text (array)
Brand	`#bylineInfo, a#bylineInfo`	text

Pro Tip

Amazon's HTML structure can vary by product category and region. Always test your selectors with multiple product pages.

3

Build Extract Rules

Extract rules define how to extract data from the page using CSS selectors:

JSON - Extract Rules

{
  "title": { "selector": "#productTitle, #title span", "type": "text" },
  "price": { "selector": ".a-price .a-offscreen, #priceblock_ourprice", "type": "text" },
  "original_price": { "selector": ".a-text-price .a-offscreen", "type": "text" },
  "main_image": { "selector": "#landingImage, #imgBlkFront", "type": "image" },
  "all_images": { "selector": "#altImages img", "type": "image", "multiple": true },
  "rating": { "selector": "span[data-hook='rating-out-of-text'], #acrPopover span.a-icon-alt", "type": "text" },
  "review_count": { "selector": "#acrCustomerReviewText", "type": "text" },
  "availability": { "selector": "#availability span", "type": "text" },
  "brand": { "selector": "#bylineInfo", "type": "text" },
  "features": { "selector": "#feature-bullets ul li span.a-list-item", "type": "text", "multiple": true },
  "description": { "selector": "#productDescription p, #productDescription span", "type": "text" }
}

Understanding Extract Rule Types

text

Extracts the text content of an element

image

Extracts the src attribute of img elements

attr

Extracts a specific attribute (requires "attribute" field)

multiple: true

Returns multiple matches as an array

4

Make the API Request

Use the Scrape API with your extract rules. Enable js for JavaScript rendering:

curl -X POST "https://api.ujeebu.com/scrape" \
  -H "ApiKey: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/",
    "js": true,
    "wait_for": "#productTitle",
    "response_type": "json",
    "extract_rules": {
        "title": {
            "selector": "#productTitle",
            "type": "text"
        },
        "price": {
            "selector": ".a-price .a-offscreen",
            "type": "text"
        },
        "rating": {
            "selector": "span[data-hook=\"rating-out-of-text\"]",
            "type": "text"
        },
        "review_count": {
            "selector": "#acrCustomerReviewText",
            "type": "text"
        },
        "availability": {
            "selector": "#availability span",
            "type": "text"
        },
        "main_image": {
            "selector": "#landingImage",
            "type": "image"
        },
        "brand": {
            "selector": "#bylineInfo",
            "type": "text"
        },
        "features": {
            "selector": "#feature-bullets ul li span.a-list-item",
            "type": "text",
            "multiple": true
        }
    }
}'

from ujeebu_python import UjeebuClient

uj = UjeebuClient(api_key="YOUR_API_KEY")

res = uj.scrape_with_rules(
    url="https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/",
    extract_rules={
    "title": {
        "selector": "#productTitle",
        "type": "text"
    },
    "price": {
        "selector": ".a-price .a-offscreen",
        "type": "text"
    },
    "rating": {
        "selector": "span[data-hook=\"rating-out-of-text\"]",
        "type": "text"
    },
    "review_count": {
        "selector": "#acrCustomerReviewText",
        "type": "text"
    },
    "availability": {
        "selector": "#availability span",
        "type": "text"
    },
    "main_image": {
        "selector": "#landingImage",
        "type": "image"
    },
    "brand": {
        "selector": "#bylineInfo",
        "type": "text"
    },
    "features": {
        "selector": "#feature-bullets ul li span.a-list-item",
        "type": "text",
        "multiple": True
    }
},
    params={
    "js": True,
    "wait_for": "#productTitle",
    "response_type": "json"
},
)
print(res.json())

const { UjeebuClient } = require('@ujeebu-org/ujeebu-sdk');

const client = new UjeebuClient("YOUR_API_KEY");

(async () => {
  const res = await client.scrape("https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/", {
    extract_rules: {
    "title": {
        "selector": "#productTitle",
        "type": "text"
    },
    "price": {
        "selector": ".a-price .a-offscreen",
        "type": "text"
    },
    "rating": {
        "selector": "span[data-hook=\"rating-out-of-text\"]",
        "type": "text"
    },
    "review_count": {
        "selector": "#acrCustomerReviewText",
        "type": "text"
    },
    "availability": {
        "selector": "#availability span",
        "type": "text"
    },
    "main_image": {
        "selector": "#landingImage",
        "type": "image"
    },
    "brand": {
        "selector": "#bylineInfo",
        "type": "text"
    },
    "features": {
        "selector": "#feature-bullets ul li span.a-list-item",
        "type": "text",
        "multiple": true
    }
},
    ...{
    "js": true,
    "wait_for": "#productTitle",
    "response_type": "json"
}  });
  console.log(res.data);
})();

package main

import (
    "fmt"
    "log"

    "github.com/ujeebu/ujeebu-go"
)

func main() {
    client, err := ujeebu.NewClient("YOUR_API_KEY")
    if err != nil {
        log.Fatal(err)
    }

    res, _, err := client.Scrape(ujeebu.ScrapeParams{
        URL: "https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/",
        JS: true,
        WaitFor: "#productTitle",
        ResponseType: "json",
        ExtractRules: map[string]any{
            "title": map[string]any{
                "selector": "#productTitle",
                "type": "text",
            },
            "price": map[string]any{
                "selector": ".a-price .a-offscreen",
                "type": "text",
            },
            "rating": map[string]any{
                "selector": "span[data-hook=\"rating-out-of-text\"]",
                "type": "text",
            },
            "review_count": map[string]any{
                "selector": "#acrCustomerReviewText",
                "type": "text",
            },
            "availability": map[string]any{
                "selector": "#availability span",
                "type": "text",
            },
            "main_image": map[string]any{
                "selector": "#landingImage",
                "type": "image",
            },
            "brand": map[string]any{
                "selector": "#bylineInfo",
                "type": "text",
            },
            "features": map[string]any{
                "selector": "#feature-bullets ul li span.a-list-item",
                "type": "text",
                "multiple": true,
            },
        },
    })
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(res)
}

import requests
import json

# API configuration
api_key = "YOUR_API_KEY"
url = "https://api.ujeebu.com/scrape"

# Request payload
payload = {
    "url": "https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/",
    "js": True,
    "wait_for": "#productTitle",
    "response_type": "json",
    "extract_rules": {
        "title": {
            "selector": "#productTitle",
            "type": "text"
        },
        "price": {
            "selector": ".a-price .a-offscreen",
            "type": "text"
        },
        "rating": {
            "selector": "span[data-hook=\"rating-out-of-text\"]",
            "type": "text"
        },
        "review_count": {
            "selector": "#acrCustomerReviewText",
            "type": "text"
        },
        "availability": {
            "selector": "#availability span",
            "type": "text"
        },
        "main_image": {
            "selector": "#landingImage",
            "type": "image"
        },
        "brand": {
            "selector": "#bylineInfo",
            "type": "text"
        },
        "features": {
            "selector": "#feature-bullets ul li span.a-list-item",
            "type": "text",
            "multiple": True
        }
    }
}

# Make request
response = requests.post(
    url,
    headers={
        "ApiKey": api_key,
        "Content-Type": "application/json"
    },
    json=payload
)

# Handle response
if response.status_code == 200:
    data = response.json()
    print(json.dumps(data, indent=2))
else:
    print(f"Error: {response.status_code} - {response.text}")

const axios = require('axios');

// API configuration
const apiKey = process.env.UJEEBU_API_KEY || 'YOUR_API_KEY';
const apiUrl = 'https://api.ujeebu.com/scrape';

// Request payload
const payload = {
    "url": "https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/",
    "js": true,
    "wait_for": "#productTitle",
    "response_type": "json",
    "extract_rules": {
        "title": {
            "selector": "#productTitle",
            "type": "text"
        },
        "price": {
            "selector": ".a-price .a-offscreen",
            "type": "text"
        },
        "rating": {
            "selector": "span[data-hook=\"rating-out-of-text\"]",
            "type": "text"
        },
        "review_count": {
            "selector": "#acrCustomerReviewText",
            "type": "text"
        },
        "availability": {
            "selector": "#availability span",
            "type": "text"
        },
        "main_image": {
            "selector": "#landingImage",
            "type": "image"
        },
        "brand": {
            "selector": "#bylineInfo",
            "type": "text"
        },
        "features": {
            "selector": "#feature-bullets ul li span.a-list-item",
            "type": "text",
            "multiple": true
        }
    }
};

// Make request
axios.post(apiUrl, payload, {
  headers: {
    'ApiKey': apiKey,
    'Content-Type': 'application/json'
  }
})
.then(response => {
  console.log(response.data);
})
.catch(error => {
  console.error('Error:', error.response?.status, error.response?.data);
});

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
)

func main() {
    // API configuration
    apiKey := "YOUR_API_KEY"
    apiURL := "https://api.ujeebu.com/scrape"

    // Request payload
    payload := {
    "url": "https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/",
    "js": true,
    "wait_for": "#productTitle",
    "response_type": "json",
    "extract_rules": {
        "title": {
            "selector": "#productTitle",
            "type": "text"
        },
        "price": {
            "selector": ".a-price .a-offscreen",
            "type": "text"
        },
        "rating": {
            "selector": "span[data-hook=\"rating-out-of-text\"]",
            "type": "text"
        },
        "review_count": {
            "selector": "#acrCustomerReviewText",
            "type": "text"
        },
        "availability": {
            "selector": "#availability span",
            "type": "text"
        },
        "main_image": {
            "selector": "#landingImage",
            "type": "image"
        },
        "brand": {
            "selector": "#bylineInfo",
            "type": "text"
        },
        "features": {
            "selector": "#feature-bullets ul li span.a-list-item",
            "type": "text",
            "multiple": true
        }
    }
}

    // Marshal payload
    jsonData, err := json.Marshal(payload)
    if err != nil {
        log.Fatal(err)
    }

    // Create request
    req, err := http.NewRequest("POST", apiURL, bytes.NewBuffer(jsonData))
    if err != nil {
        log.Fatal(err)
    }

    // Set headers
    req.Header.Set("ApiKey", apiKey)
    req.Header.Set("Content-Type", "application/json")

    // Make request
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    // Read response
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(string(body))
}

<?php
$apiKey = getenv('UJEEBU_API_KEY') ?: 'YOUR_API_KEY';
$apiUrl = 'https://api.ujeebu.com/scrape';

// Request payload
$payload = array (
  'url' => 'https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/',
  'js' => true,
  'wait_for' => '#productTitle',
  'response_type' => 'json',
  'extract_rules' => 
  array (
    'title' => 
    array (
      'selector' => '#productTitle',
      'type' => 'text',
    ),
    'price' => 
    array (
      'selector' => '.a-price .a-offscreen',
      'type' => 'text',
    ),
    'rating' => 
    array (
      'selector' => 'span[data-hook="rating-out-of-text"]',
      'type' => 'text',
    ),
    'review_count' => 
    array (
      'selector' => '#acrCustomerReviewText',
      'type' => 'text',
    ),
    'availability' => 
    array (
      'selector' => '#availability span',
      'type' => 'text',
    ),
    'main_image' => 
    array (
      'selector' => '#landingImage',
      'type' => 'image',
    ),
    'brand' => 
    array (
      'selector' => '#bylineInfo',
      'type' => 'text',
    ),
    'features' => 
    array (
      'selector' => '#feature-bullets ul li span.a-list-item',
      'type' => 'text',
      'multiple' => true,
    ),
  ),
);

// Initialize cURL
$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_URL => $apiUrl,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_HTTPHEADER => [
        'ApiKey: ' . $apiKey,
        'Content-Type: application/json'
    ],
    CURLOPT_POSTFIELDS => json_encode($payload)
]);

// Execute request
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// Handle response
if ($httpCode === 200) {
    $data = json_decode($response, true);
    print_r($data);
} else {
    echo "Error: $httpCode - $response";
}
?>

use reqwest;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box> {
    // API configuration
    let api_key = "YOUR_API_KEY";
    let api_url = "https://api.ujeebu.com/scrape";

    // Request payload
    let payload = json!({"url":"https://www.amazon.com/Atomic-Habits-Proven-Build-Break/dp/0735211299/","js":true,"wait_for":"#productTitle","response_type":"json","extract_rules":{"title":{"selector":"#productTitle","type":"text"},"price":{"selector":".a-price .a-offscreen","type":"text"},"rating":{"selector":"span[data-hook=\"rating-out-of-text\"]","type":"text"},"review_count":{"selector":"#acrCustomerReviewText","type":"text"},"availability":{"selector":"#availability span","type":"text"},"main_image":{"selector":"#landingImage","type":"image"},"brand":{"selector":"#bylineInfo","type":"text"},"features":{"selector":"#feature-bullets ul li span.a-list-item","type":"text","multiple":true}}});

    // Create client
    let client = reqwest::Client::new();

    // Make request
    let response = client
        .post(api_url)
        .header("ApiKey", api_key)
        .header("Content-Type", "application/json")
        .json(&payload)
        .send()
        .await?;

    // Handle response
    if response.status().is_success() {
        let data = response.json::().await?;
        println!("{:#}", data);
    } else {
        eprintln!("Error: {} - {}", response.status(), response.text().await?);
    }

    Ok(())
}

5

Handle the Response

The API returns extracted data in JSON format:

JSON - API Response

{
  "success": true,
  "result": {
    "title": "Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones",
    "price": "$18.88",
    "rating": "4.8 out of 5",
    "review_count": "(146,898)",
    "availability": "In Stock",
    "main_image": "https://m.media-amazon.com/images/I/81kg51XRc1L._SY466_.jpg",
    "brand": "James Clear (Author)",
    "features": [
      "An longest-running bestselling book on the science of habit formation",
      "Practical strategies for forming good habits and breaking bad ones",
      "An easy and proven framework for improving every day"
    ]
  }
}

Success!

The response includes all structured data from your extract rules, ready for processing.

6

Best Practices

01

Use JavaScript Rendering

Amazon heavily uses JavaScript. Always set js=true and use wait_for to ensure content loads.

Essential

02

Implement Rate Limiting

Respect Amazon's servers with 2-3 second delays between requests to avoid being blocked and maintain good scraping etiquette.

Recommended

03

Use Rotating Proxies

Enable proxy rotation for production use to distribute requests across multiple IP addresses and avoid blocks.

Production

04

Handle Variations

Amazon's HTML varies by product category, region, and session state. Always test selectors across different products.

Testing

Legal Notice

Review Amazon's Terms of Service and robots.txt before scraping. Use scraped data responsibly and in compliance with applicable laws. This tutorial is for educational purposes only.

Scraping Amazon Product Data