BLOG

Notes from the team behind Runo.

Guides, deep-dives, and engineering notes on web scraping, anti-bot bypass, and structured extraction.

Ecommerce · 10 min read

Schema design patterns for e-commerce extraction

Battle-tested schema patterns for product pages, category pages, reviews, and inventory. Edge cases, type choices, and the fields people forget.

Tutorial · 8 min read

Building a news aggregator with the /crawl endpoint

Walkthrough of a working news aggregator: source discovery, crawl configuration, dedup across sources, and a 24-hour ingest cadence that scales.

Engineering · 8 min read

Headless browser fingerprinting in 2026: how detection works and what to do

A technical breakdown of the signals anti-bot services use to detect headless browsers, and the patches that close the gap.

SEO · 9 min read

Scraping Google SERP results in 2026: what works and what doesn't

Direct Google scraping is a losing battle in 2026. Here's the realistic landscape, the alternatives that work, and how to extract structured data from search results.

Tutorial · 9 min read

Sentiment analysis from product reviews: a practical pipeline

How to scrape product reviews at scale and turn them into actionable sentiment data. Schema design, aspect-based sentiment, and avoiding the common pitfalls.

Lead-Generation · 10 min read

Lead generation from public web data: a builder's guide

How to extract qualified leads from company websites, public directories, and structured registries without violating terms of service or privacy law.

Tutorial · 9 min read

Building a real-estate data pipeline with a scraping API

An end-to-end walkthrough of pulling listings from multiple real-estate portals into a normalized database. Schema design, dedup, refresh cadence, and cost.

Legal · 9 min read

Is web scraping legal in 2026? A practical guide for builders

What courts, regulators, and contracts actually say about scraping public web data, with the case law that shaped the current landscape and a working playbook.

Engineering · 8 min read

Scraping JavaScript-heavy SPAs: Next.js, Nuxt, and React in 2026

Why plain HTTP fetching returns empty pages on modern frontends, what render targets work, and how to recover server-shipped data without a headless browser.

Guide · 9 min read

The complete guide to web scraping APIs in 2026

What a modern web scraping API actually does, how to evaluate one, and where each category (proxies, browsers, extractors) fits into a real pipeline.

Comparison · 8 min read

Firecrawl vs Apify vs Runo: which scraping API to pick in 2026

An honest, side-by-side look at three popular scraping APIs. What each is built for, where each shines, and where each costs you time and money.

Engineering · 7 min read

How to scrape Cloudflare-protected sites without getting blocked

A practical, layered approach to defeating Cloudflare's bot challenges in 2026. TLS fingerprints, hardened headless, cookie persistence, and when to escalate.

Engineering · 7 min read

LLM extraction vs CSS selectors: why selector-based scraping is dead at scale

Selectors break when sites redesign. LLMs extract by semantic meaning. Here's why the tradeoff has flipped, with cost numbers from real workloads.

Guide · 8 min read

Extracting structured JSON from any HTML: a developer's guide

How to turn arbitrary web pages into typed JSON shaped to your schema. Covers schema design, type coercion, null handling, and edge cases.

AI-Agents · 8 min read

Web scraping for AI agents: building the data layer for LLM apps

How to architect the scraping stack behind autonomous agents. Schema-typed data, low-latency tool calls, cost control, and error semantics that don't break the loop.

Guide · 9 min read

How to monitor competitor prices with a scraping API

A practical guide to building a competitor price monitoring pipeline. Schema design, change detection, alerting, and the legal and operational pitfalls.