
Scraping Job Listings

1

Overview

Job boards contain valuable structured data about employment opportunities. This tutorial demonstrates how to scrape job listings including titles, companies, locations, salaries, descriptions, and application links.

What You'll Extract

Job title & position
Company name & logo
Location & remote status
Salary range
Job description
Posting date
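One way to hold these fields in code is a small record type. The sketch below is an illustration only: the `JobListing` class, the `from_scraped` helper, and all field names are assumptions chosen for this example, not part of the API. Every field except the title is optional, since many boards omit the salary or logo.

```python
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class JobListing:
    """Hypothetical container for one scraped listing (field names are illustrative)."""
    title: str
    company: Optional[str] = None
    logo_url: Optional[str] = None
    location: Optional[str] = None
    remote: Optional[bool] = None
    salary: Optional[str] = None
    description: Optional[str] = None
    posted: Optional[str] = None


def from_scraped(raw: dict) -> JobListing:
    """Build a JobListing from a scraped dict, dropping unknown keys and trimming strings."""
    known = {f.name for f in fields(JobListing)}
    cleaned = {}
    for key, value in raw.items():
        if key not in known:
            continue  # ignore fields this record does not model
        cleaned[key] = value.strip() if isinstance(value, str) else value
    return JobListing(**cleaned)


# Example input: the "badge" key is ignored and the title is trimmed
job = from_scraped({"title": "  Python Developer ", "company": "Acme", "badge": "new"})
print(asdict(job))
```

Keeping the salary and posting date as raw strings at this stage is a deliberate choice in this sketch: formats vary widely between boards, so parsing is easier to handle in a separate step.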

Use Cases

Job aggregation, recruitment automation, market analysis, salary research, skill demand tracking, and personalized job alerts.

2

Generic Job Board Structure

Most job boards use similar markup patterns. The selectors below are common starting points that match many platforms; check them against each target site's HTML before relying on them:

Data Point     Common Selectors                   Type
Job Container  .job-card, .job-listing, article   obj
Job Title      h2, h3, .job-title                 text
Company        .company-name, .employer           text
Location       .location, .job-location           text
Salary         .salary, .compensation             text
Posted Date    .date, time, .posted               text
Apply Link     a.apply-button, a[href*='apply']   link
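The table can be turned into extract rules mechanically by joining each row's candidate selectors with commas. The sketch below assumes the rule shape used in this tutorial (`selector`, `type`, `multiple`, `children`); the `COMMON_SELECTORS` mapping and the `build_rules` helper are hypothetical names introduced for illustration.

```python
import json

# Candidate selectors per data point, copied from the table above.
# Each entry is (list of candidate selectors, rule type).
COMMON_SELECTORS = {
    "title": (["h2", "h3", ".job-title"], "text"),
    "company": ([".company-name", ".employer"], "text"),
    "location": ([".location", ".job-location"], "text"),
    "salary": ([".salary", ".compensation"], "text"),
    "posted": ([".date", "time", ".posted"], "text"),
    "apply_url": (["a.apply-button", "a[href*='apply']"], "link"),
}

CONTAINER_SELECTORS = [".job-card", ".job-listing", "article"]


def build_rules(container_selectors, selector_map):
    """Combine candidate selectors into one comma-separated CSS selector per field."""
    children = {
        name: {"selector": ", ".join(candidates), "type": kind}
        for name, (candidates, kind) in selector_map.items()
    }
    return {
        "jobs": {
            "selector": ", ".join(container_selectors),
            "type": "obj",
            "multiple": True,
            "children": children,
        }
    }


rules = build_rules(CONTAINER_SELECTORS, COMMON_SELECTORS)
print(json.dumps(rules, indent=2))
```

In standard CSS a comma-separated list matches any of its alternatives, so which element is picked up first depends on document order, not on the order the selectors are written in. Broad candidates such as `h2` or `article` may therefore need to be narrowed for a specific site.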
3

Build Extract Rules

JSON - Extract Rules
{
  "jobs": {
    "selector": ".job-card",
    "type": "obj",
    "multiple": true,
    "children": {
      "title": { "selector": "h2, h3, .job-title", "type": "text" },
      "company": { "selector": ".company-name", "type": "text" },
      "location": { "selector": ".location", "type": "text" },
      "salary": { "selector": ".salary", "type": "text" },
      "description": { "selector": ".job-snippet, .description", "type": "text" },
      "posted": { "selector": ".date, time", "type": "text" },
      "job_type": { "selector": ".job-type, .employment-type", "type": "text" },
      "url": { "selector": "a", "type": "link" }
    }
  },
  "total_results": { "selector": ".results-count, .job-count", "type": "text" },
  "next_page": { "selector": "a.next, a[rel='next']", "type": "link" }
}
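Before sending rules to the API, it can help to check them locally for typos such as a missing selector or an `obj` rule without children. The following is a minimal sketch, not an official validator: `validate_rules` and `KNOWN_TYPES` are hypothetical names, and the type list only covers the types used in this tutorial (the API may accept others).

```python
import json

# Only the rule types that appear in this tutorial; the API may support more.
KNOWN_TYPES = {"text", "link", "obj"}


def validate_rules(rules, path=""):
    """Return a list of human-readable problems found in an extract-rules dict."""
    problems = []
    for name, rule in rules.items():
        where = path + name
        if not isinstance(rule, dict):
            problems.append(f"{where}: rule must be an object")
            continue
        if not rule.get("selector"):
            problems.append(f"{where}: missing 'selector'")
        kind = rule.get("type")
        if kind not in KNOWN_TYPES:
            problems.append(f"{where}: unexpected type {kind!r}")
        if kind == "obj":
            children = rule.get("children")
            if isinstance(children, dict) and children:
                # Recurse so nested fields are reported with a dotted path
                problems.extend(validate_rules(children, where + "."))
            else:
                problems.append(f"{where}: 'obj' rule needs non-empty 'children'")
    return problems


# A shortened copy of the rules above, loaded from JSON text
rules = json.loads("""
{
  "jobs": {
    "selector": ".job-card",
    "type": "obj",
    "multiple": true,
    "children": {
      "title": { "selector": "h2, h3, .job-title", "type": "text" },
      "url": { "selector": "a", "type": "link" }
    }
  },
  "next_page": { "selector": "a.next, a[rel='next']", "type": "link" }
}
""")
print(validate_rules(rules))
```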
4

Make the API Request

import requests
import json
import time

extract_rules = {
    "jobs": {
        "selector": ".job-card",
        "type": "obj",
        "multiple": True,
        "children": {
            "title": {"selector": "h2, h3", "type": "text"},
            "company": {"selector": ".company-name", "type": "text"},
            "location": {"selector": ".location", "type": "text"},
            "salary": {"selector": ".salary", "type": "text"},
            "url": {"selector": "a", "type": "link"}
        }
    },
    "next_page": {"selector": "a[rel='next']", "type": "link"}
}

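def scrape_jobs(search_url):
    """Send one scrape request and return the extracted data."""
    response = requests.post(
        "https://api.ujeebu.com/scrape",
        headers={
            "ApiKey": "YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "url": search_url,
            "js": True,
            "wait_for": ".job-card",
            "extract_rules": extract_rules
        },
        timeout=120,  # generous limit, since JavaScript rendering can be slow
    )
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    return response.json()["result"]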

# Search for Python developer jobs
data = scrape_jobs("https://example-jobboard.com/search?q=python+developer")
jobs = data.get('jobs') or []
print(f"Found {len(jobs)} jobs")

# Show the first five results; a field can be missing or empty when its selector does not match
for job in jobs[:5]:
    print(f"\n{job.get('title') or 'Untitled'}")
    print(f"  Company: {job.get('company') or 'Unknown'}")
    print(f"  Location: {job.get('location') or 'Not listed'}")
    print(f"  Salary: {job.get('salary') or 'Not listed'}")
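Network calls fail occasionally, so a request like the one above is often wrapped in a retry with exponential backoff. The sketch below is one possible approach, not part of the API: `with_retries` is a hypothetical helper, and the attempt count and delays are arbitrary defaults. The demo uses a fake function and a fake sleep so it runs without network access.

```python
import time


def with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as exc:  # in real code, narrow this to network/HTTP errors
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Demo: a stand-in for a scrape call that fails twice, then succeeds
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"jobs": []}


delays = []  # record the requested delays instead of actually sleeping
result = with_retries(flaky_fetch, sleep=delays.append)
print(result, delays)
```

With the function from this step, usage would look like `with_retries(lambda: scrape_jobs(search_url))`.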
5

Handling Pagination

Python - Paginated Scraping
from urllib.parse import urljoin

def scrape_all_jobs(base_url, max_pages=10):
    """Scrape jobs across multiple pages."""
    all_jobs = []
    current_url = base_url
    page = 1

    while current_url and page <= max_pages:
        print(f"Scraping page {page}...")
        data = scrape_jobs(current_url)

        jobs = data.get('jobs', [])
        if not jobs:
            break

        all_jobs.extend(jobs)

        # Get next page URL; urljoin resolves relative links against the
        # current page and leaves absolute links unchanged
        next_page = data.get('next_page')
        if next_page:
            current_url = urljoin(current_url, next_page)
        else:
            break

        page += 1
        time.sleep(2)  # Rate limiting

    return all_jobs

# Scrape up to 10 pages
all_jobs = scrape_all_jobs("https://example-jobboard.com/search?q=developer", max_pages=10)
print(f"Total jobs collected: {len(all_jobs)}")
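Next-page links are often relative (for example `/search?q=developer&page=2`), and they need to be resolved against the URL of the page they were found on rather than appended to it. A small sketch using the standard library's `urljoin`; the `resolve_next` helper name is an assumption made for this example.

```python
from urllib.parse import urljoin


def resolve_next(current_url, next_link):
    """Return an absolute URL for the next page, or None when there is no link."""
    if not next_link:
        return None
    # urljoin handles path-relative, query-only, and already-absolute links
    return urljoin(current_url, next_link)


page = "https://example-jobboard.com/search?q=developer"

root_relative = resolve_next(page, "/search?q=developer&page=2")
query_only = resolve_next(page, "?q=developer&page=2")
absolute = resolve_next(page, "https://example-jobboard.com/search?q=developer&page=3")
missing = resolve_next(page, None)

print(root_relative)
print(query_only)
print(absolute)
print(missing)
```

Simple string concatenation of the search URL and a relative link would produce a malformed address here, because the search URL already carries a path and a query string.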
6

Best Practices

01

Schedule Regular Runs

Job listings change frequently. Schedule daily or hourly scrapes to catch new postings quickly.

Recommended
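For quick experiments, a scheduled run can be approximated with a plain loop. This is only a sketch under simple assumptions: `run_on_schedule` is a hypothetical helper, and a production setup would more likely rely on cron or a task scheduler. The sleep function is injectable so the demo finishes instantly.

```python
import time


def run_on_schedule(task, interval_seconds, max_runs, sleep=time.sleep):
    """Run task() max_runs times, pausing interval_seconds between runs."""
    results = []
    for run in range(max_runs):
        results.append(task())
        if run < max_runs - 1:  # no pause needed after the final run
            sleep(interval_seconds)
    return results


# Demo with a stand-in task and a fake sleep that records the pauses
pauses = []
counter = {"runs": 0}


def fake_scrape():
    counter["runs"] += 1
    return f"run {counter['runs']}"


results = run_on_schedule(fake_scrape, interval_seconds=3600, max_runs=3, sleep=pauses.append)
print(results, pauses)
```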
02

Deduplicate Results

Track job IDs or URLs to avoid storing duplicate listings across scraping runs.

Essential
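A minimal sketch of URL-based deduplication follows. The `job_key` and `dedupe` helpers are hypothetical, and the fallback key (title, company, location) is an assumption for boards that expose no stable link. The `seen` set would need to be persisted between runs, for example in a file or database.

```python
def job_key(job):
    """Build a key that identifies a listing: its URL if present, else a field tuple."""
    url = (job.get("url") or "").strip()
    if url:
        return url.split("#")[0]  # drop the fragment, which does not change the page
    return (
        (job.get("title") or "").strip().lower(),
        (job.get("company") or "").strip().lower(),
        (job.get("location") or "").strip().lower(),
    )


def dedupe(jobs, seen=None):
    """Return only listings whose key is not in seen, and record the new keys."""
    seen = set() if seen is None else seen
    fresh = []
    for job in jobs:
        key = job_key(job)
        if key in seen:
            continue
        seen.add(key)
        fresh.append(job)
    return fresh


seen_keys = set()
first_run = dedupe(
    [
        {"title": "Python Developer", "url": "https://example-jobboard.com/jobs/1"},
        {"title": "Python Developer", "url": "https://example-jobboard.com/jobs/1#apply"},
    ],
    seen_keys,
)
second_run = dedupe(
    [
        {"title": "Python Developer", "url": "https://example-jobboard.com/jobs/1"},
        {"title": "Data Engineer", "url": "https://example-jobboard.com/jobs/2"},
    ],
    seen_keys,
)
print(len(first_run), len(second_run))
```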
03

Store Full Details

Scrape job detail pages for complete descriptions, requirements, and benefits.

Advanced
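Detail pages can be scraped with a second set of rules and merged into the listing record. In the sketch below, the selectors in `DETAIL_RULES` are hypothetical placeholders, and using `"multiple": True` with a `text` rule is an assumption based on how it is used with `obj` rules in this tutorial. `merge_details` is an illustrative helper that only overwrites a field when the detail page supplied a non-empty value.

```python
# Hypothetical rules for a job detail page; adjust the selectors per site.
DETAIL_RULES = {
    "description": {"selector": ".job-description", "type": "text"},
    "requirements": {"selector": ".requirements li", "type": "text", "multiple": True},
    "benefits": {"selector": ".benefits li", "type": "text", "multiple": True},
}


def merge_details(job, details):
    """Return a copy of job enriched with non-empty fields from the detail page."""
    merged = dict(job)
    for key, value in details.items():
        if value:  # skip None, empty strings, and empty lists
            merged[key] = value
    return merged


listing = {"title": "Python Developer", "description": "Short snippet..."}
detail_result = {
    "description": "Full description of the role.",
    "requirements": ["3+ years of Python", "SQL"],
    "benefits": [],
}
full_job = merge_details(listing, detail_result)
print(full_job)
```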
04

Respect Rate Limits

Add delays between requests and use rotating proxies for large-scale scraping.

Important
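A fixed `time.sleep` between requests works, but a small limiter that accounts for the time each request already took keeps the pace steadier. The class below is a sketch, not a library API: `MinIntervalLimiter` is a hypothetical name, and the clock and sleep functions are injectable so the demo runs without real waiting. Proxy rotation is out of scope for this example.

```python
import time


class MinIntervalLimiter:
    """Ensure at least min_interval seconds pass between consecutive wait() calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demo with a fake clock: sleeping simply advances the fake time
fake_now = [0.0]


def fake_sleep(seconds):
    fake_now[0] += seconds


limiter = MinIntervalLimiter(2.0, clock=lambda: fake_now[0], sleep=fake_sleep)
limiter.wait()        # first call never waits
fake_now[0] += 0.5    # pretend the request itself took 0.5 seconds
limiter.wait()        # sleeps only for the remaining 1.5 seconds
print(fake_now[0])
```

In a scraping loop, `limiter.wait()` would be called immediately before each request.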
