Markdown

HTML in. LLM-ready Markdown out.

Convert any URL into clean Markdown: chunked, embed-ready, deduplicated. The fastest path from "the live web" to a vector store. Built specifically for RAG and LLM training pipelines.


5 credits / call

1.1s p50 latency

4 chunk strategies

POST /markdown
{
  "url":      "https://docs.stripe.com/api/charges",
  "chunk":    "semantic",
  "max_tokens": 800,
  "include":  "main, article, .content"
}

→ 200 OK · 5 credits · 1.1s
{
  "chunks": [
    { "id":"c0", "tokens": 412, "text":"# Charges\n\nA Charge..." },
    { "id":"c1", "tokens": 786, "text":"## Create a Charge\n\nPOST /..." },
    ...
  ],
  "title": "Stripe API · Charges",
  "url":   "https://docs.stripe.com/api/charges"
}
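As a rough illustration of consuming a response like the one above, the following Python sketch parses the sample body and lists any chunk whose token count exceeds the requested max_tokens. The `check_chunks` helper is hypothetical, and the response shape is assumed to match the example exactly.

```python
import json

# Sample response body, copied from the example above (the elided chunks are omitted).
SAMPLE = """
{
  "chunks": [
    {"id": "c0", "tokens": 412, "text": "# Charges\\n\\nA Charge..."},
    {"id": "c1", "tokens": 786, "text": "## Create a Charge\\n\\nPOST /..."}
  ],
  "title": "Stripe API · Charges",
  "url": "https://docs.stripe.com/api/charges"
}
"""

def check_chunks(body: str, max_tokens: int) -> list[str]:
    """Return the ids of chunks whose token count exceeds max_tokens."""
    data = json.loads(body)
    return [c["id"] for c in data["chunks"] if c["tokens"] > max_tokens]

# With the max_tokens of 800 used in the request, no chunk is oversized.
oversized = check_chunks(SAMPLE, max_tokens=800)
```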

Live playground

See it work, before you sign up.

Drop in a URL, run a real call against the live API, and watch the JSON come back in about a second. No API key required.

Real Markdown

Headings stay headings. Tables stay tables. Code blocks keep their language hints. Links keep their anchors. None of the GPT-flattened slop.

Smart chunking

Pick semantic (header-aware), fixed (token-bounded), sentence, or none. We respect natural document boundaries.
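To make "header-aware" concrete, here is a minimal Python sketch of splitting Markdown at headings. It is an illustration of the general idea only, not the service's implementation; `split_by_headings` is a hypothetical name, and the sketch ignores edge cases such as `#` lines inside code blocks.

```python
import re

def split_by_headings(markdown: str) -> list[str]:
    """Split Markdown into sections, starting a new one at each ATX heading."""
    sections, current = [], []
    for line in markdown.splitlines():
        # A line such as "# Title" or "## Section" opens a new section.
        if re.match(r"#{1,6} ", line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]

doc = "# Charges\n\nA Charge...\n\n## Create a Charge\n\nPOST /..."
sections = split_by_headings(doc)  # two sections, one per heading
```

A fixed (token-bounded) strategy would instead cut at a token count regardless of headings, which is why the semantic option tends to keep each section's context together.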

Token-counted

Every chunk comes with tokens pre-counted for the tokenizer of your choice (cl100k, o200k, claude). No surprises at embedding time.
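One way to use the pre-counted tokens is to pack chunks into embedding batches without re-tokenizing. The sketch below is an assumption-laden example: `batch_by_tokens` is a hypothetical helper, the 1,200-token budget is arbitrary, and the third chunk is invented for illustration.

```python
def batch_by_tokens(chunks: list[dict], budget: int) -> list[list[dict]]:
    """Greedily pack chunks into batches whose token totals stay within budget."""
    batches, current, used = [], [], 0
    for chunk in chunks:
        # Start a new batch when adding this chunk would exceed the budget.
        if current and used + chunk["tokens"] > budget:
            batches.append(current)
            current, used = [], 0
        current.append(chunk)
        used += chunk["tokens"]
    if current:
        batches.append(current)
    return batches

chunks = [
    {"id": "c0", "tokens": 412},
    {"id": "c1", "tokens": 786},
    {"id": "c2", "tokens": 300},  # hypothetical extra chunk
]
batches = batch_by_tokens(chunks, budget=1200)
```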

Boilerplate stripped

Nav, footers, ads, "related articles": all gone. Pass include selectors to keep specific zones, or exclude selectors to drop zones you don't want.
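A request body with selector filters might be assembled along these lines. This is a sketch under assumptions: `build_payload` is a hypothetical helper, the comma-separated format mirrors the "include" string in the sample request, and "exclude" is assumed to accept the same format.

```python
import json
from typing import Sequence

def build_payload(url: str, include: Sequence[str] = (), exclude: Sequence[str] = ()) -> dict:
    """Build a /markdown request body with optional CSS selector filters."""
    payload = {"url": url}
    # Selectors are joined with ", " to mirror the "include" string in the sample request.
    if include:
        payload["include"] = ", ".join(include)
    if exclude:
        payload["exclude"] = ", ".join(exclude)  # assumed to take the same format
    return payload

payload = build_payload(
    "https://docs.stripe.com/api/charges",
    include=["main", "article", ".content"],
    exclude=[".sidebar"],  # hypothetical selector
)
body = json.dumps(payload)  # JSON string ready to send as the request body
```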

Embed-ready

Pass embed: "openai" or embed: "voyage" and we return chunks with embedding vectors attached. One round-trip, vector-store-ready.

Site-crawl mode

Pass a sitemap URL with crawl: "site" and we Markdown-ify the entire docs site, deduplicate the chunks, and stream them to your webhook.
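On the receiving side, a webhook handler usually needs to tolerate repeated deliveries. The Python sketch below shows one possible approach: it hashes each chunk's text and skips anything already stored. The payload shape (a "chunks" list with "id" and "text") is an assumption based on the sample response, and `ChunkSink` is a hypothetical class, not part of any SDK.

```python
import hashlib

class ChunkSink:
    """Collects chunks delivered in webhook batches, skipping repeated text."""

    def __init__(self) -> None:
        self.seen: set[str] = set()
        self.stored: list[dict] = []

    def handle(self, payload: dict) -> int:
        """Store unseen chunks from one delivery; return how many were new."""
        added = 0
        for chunk in payload.get("chunks", []):
            digest = hashlib.sha256(chunk["text"].encode("utf-8")).hexdigest()
            if digest in self.seen:
                continue  # same text already ingested, e.g. after a retry
            self.seen.add(digest)
            self.stored.append(chunk)
            added += 1
        return added

sink = ChunkSink()
first = sink.handle({"chunks": [{"id": "c0", "text": "# Charges"},
                                {"id": "c1", "text": "## Refunds"}]})
retry = sink.handle({"chunks": [{"id": "c0", "text": "# Charges"}]})
```

In a real service, `handle` would be called from your HTTP framework's route for the webhook URL, and `stored` would be replaced by a write to your vector store.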

Drop-in code

Copy. Paste. Ship.

import { Ujeebu } from "ujeebu";
const uj = new Ujeebu(process.env.UJEEBU_KEY);

const { chunks } = await uj.markdown({
  url:        "https://docs.stripe.com/api/charges",
  chunk:      "semantic",
  max_tokens: 800,
});

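// `embed` and `pinecone` are your own embedding helper and vector-store client.
// map() needs an async callback plus Promise.all so each embedding is awaited.
const vectors = await Promise.all(chunks.map(async c => ({
  id: c.id, values: await embed(c.text)
})));
await pinecone.upsert(vectors);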

import os

from ujeebu import Ujeebu
uj = Ujeebu(api_key=os.environ["UJEEBU_KEY"])

result = uj.markdown(
    url="https://docs.stripe.com/api/charges",
    chunk="semantic", max_tokens=800,
    embed="openai",   # vectors included
)

pinecone.upsert([
    (c["id"], c["embedding"], {"text": c["text"]})
    for c in result["chunks"]
])

curl -X POST https://api.ujeebu.com/markdown \
  -H "ApiKey: $UJEEBU_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url":    "https://docs.stripe.com/api/charges",
    "filter": "fit"
  }'

// Markdown-ify a whole docs site
const job = await uj.markdown({
  url:     "https://docs.stripe.com/sitemap.xml",
  crawl:   "site",
  chunk:   "semantic",
  webhook: "https://your-app.com/ingest"
});

// Chunks stream to your webhook as they finish.
console.log(job.id, job.estimated_pages); // estimated_pages, e.g. 4200

What people build with it

Real things real teams shipped this quarter.

RAG ingest at scale

A devtools company keeps embeddings of 18k docs sites fresh. Site-crawl mode + nightly delta. No crawler code. No HTML stripping. No reprocessing.

LLM training corpora

Build domain-specific Markdown datasets. Filter by token count and language at extraction time, not in post-processing.

Doc-search products

A customer points at their docs URL. 90 seconds later they have a working "ask my docs" widget. Zero-config ingest, real-time freshness.

Ground-truth chunks for evals

Build LLM eval sets from real public content. Same chunks day-over-day means stable eval scores; no "the doc moved" noise.
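One way to check that chunks really are stable between runs is to fingerprint them and compare. The sketch below is a hypothetical client-side check, assuming chunks carry the "id" and "text" fields shown in the sample response; the example chunk texts are made up.

```python
import hashlib

def fingerprint(chunks: list[dict]) -> dict[str, str]:
    """Map each chunk id to a short SHA-256 digest of its text."""
    return {
        c["id"]: hashlib.sha256(c["text"].encode("utf-8")).hexdigest()[:12]
        for c in chunks
    }

def changed_ids(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Ids that were added, removed, or whose text changed between two runs."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))

# Hypothetical chunk sets from two consecutive days.
monday = fingerprint([{"id": "c0", "text": "# Charges"},
                      {"id": "c1", "text": "## Create"}])
tuesday = fingerprint([{"id": "c0", "text": "# Charges"},
                       {"id": "c1", "text": "## Create a Charge"}])
drift = changed_ids(monday, tuesday)
```

An empty `drift` list means the eval set can be reused as-is; a non-empty one points at exactly which chunks to re-label.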

Pricing

5 credits per Markdown call. Here's what that buys.

One credit pool covers every endpoint. Failed calls cost 0 credits. No per-feature upcharges, no premium-proxy tax.

Starter · $49 per month

60k Markdown calls / month (300k credits, used flexibly across all endpoints)

Ship Markdown tonight.

5,000 credits free. No card. Real residential proxies on the free tier.


Pack        Credits     ~ Markdown calls   Price     Per 1K credits
Mini        50,000      10k                $10.00    $0.20
Starter     100,000     20k                $15.00    $0.15
Standard    500,000     100k               $60.00    $0.12
Pro         1,000,000   200k               $100.00   $0.10
Enterprise  5,000,000   1M                 $400.00   $0.08
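The arithmetic behind the table is simple enough to check yourself. This short Python sketch recomputes the call counts and per-1K prices from the pack sizes and prices listed above, using the 5 credits per Markdown call stated on this page; the helper names are illustrative.

```python
CREDITS_PER_CALL = 5  # stated on this page for Markdown calls

def calls_for(credits: int) -> int:
    """Number of Markdown calls a credit balance covers."""
    return credits // CREDITS_PER_CALL

def price_per_1k(price_usd: float, credits: int) -> float:
    """Price in USD per 1,000 credits, rounded to cents."""
    return round(price_usd * 1000 / credits, 2)

# (credits, price in USD) for each pack, taken from the table above.
packs = {
    "Mini": (50_000, 10.00),
    "Starter": (100_000, 15.00),
    "Standard": (500_000, 60.00),
    "Pro": (1_000_000, 100.00),
    "Enterprise": (5_000_000, 400.00),
}

summary = {
    name: (calls_for(credits), price_per_1k(price, credits))
    for name, (credits, price) in packs.items()
}
```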