Headless Browser Automation (Playwright, Puppeteer, Selenium)

Browser Automation Frameworks

The foundation technique used by most scrapers. A headless browser renders Google Maps and executes its JavaScript; the scraper then extracts data from the rendered DOM.

Playwright (Most Popular)

Microsoft's browser automation library. Used by gosom/google-maps-scraper, HasData, and many commercial tools. Supports Chromium, Firefox, and WebKit.
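The render-then-extract flow described above can be sketched with Playwright's Python sync API. This is a minimal illustration rather than how any of the tools named here actually work: the function names `scrape_texts` and `clean_texts`, the timeout value, and whatever URL and selector you pass in are all assumptions.

```python
from typing import Iterable, List


def clean_texts(texts: Iterable[str]) -> List[str]:
    """Collapse whitespace, drop empty strings, and remove duplicates (order kept)."""
    seen = set()
    cleaned = []
    for text in texts:
        normalized = " ".join(text.split())
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


def scrape_texts(url: str, selector: str, timeout_ms: int = 15000) -> List[str]:
    """Render `url` in headless Chromium and return the text of elements matching `selector`.

    Hypothetical helper: the caller supplies the URL and CSS selector.
    """
    # Imported lazily so clean_texts() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            # Wait until client-side JS has rendered at least one matching element.
            page.wait_for_selector(selector, timeout=timeout_ms)
            raw = page.locator(selector).all_inner_texts()
        finally:
            browser.close()
    return clean_texts(raw)
```

Calling `scrape_texts("https://example.com", "h1")` would return the cleaned heading text, assuming Playwright and its Chromium build are installed (`pip install playwright`, then `playwright install chromium`). Swapping `p.chromium` for `p.firefox` or `p.webkit` targets the other supported engines.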

Puppeteer (Node.js)

Google's Node.js browser automation library. puppeteer-extra-plugin-stealth provides 17 evasion modules.

Selenium (Legacy)

The original browser automation framework. undetected-chromedriver patches ChromeDriver to evade bot detection.

Extraction Strategies

| Strategy | Token Cost | When to Use |
|---|---|---|
| CSS selectors (querySelectorAll) | ~52/item | Known structure; default choice |
| aria-label attributes | ~52/item | More resilient; accessibility attrs are stabler than CSS classes |
| body.innerText | ~5K/page | Discovery; learn structure once, then switch |
| Network/XHR interception | Minimal | Capture protobuf responses directly; best approach |
| Accessibility tree (filtered) | ~28K/page | Find buttons, forms, interactive elements |
| Screenshot | ~132K | CAPTCHA solving, visual debugging only |
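To illustrate the aria-label strategy from the table, the sketch below pulls labels out of an HTML snippet using only Python's standard library. The markup in `SAMPLE_HTML`, the `role="article"` filter, and the names `AriaLabelCollector` and `extract_labels` are invented for the example; real pages will differ, so treat this as a pattern rather than a working Maps extractor.

```python
from html.parser import HTMLParser
from typing import List


class AriaLabelCollector(HTMLParser):
    """Collect aria-label values from elements that carry a given ARIA role."""

    def __init__(self, role: str):
        super().__init__()
        self.role = role
        self.labels: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        label = attr_map.get("aria-label")
        # Match on role and aria-label only; CSS class names are ignored.
        if attr_map.get("role") == self.role and label:
            self.labels.append(label.strip())


def extract_labels(html: str, role: str = "article") -> List[str]:
    """Return aria-label values of all elements whose role equals `role`."""
    collector = AriaLabelCollector(role)
    collector.feed(html)
    collector.close()
    return collector.labels


# Hypothetical markup with obfuscated class names, as often seen on generated pages.
SAMPLE_HTML = """
<div class="x9f3a">
  <div role="article" aria-label="Cafe Uno" class="k2b7"><span class="q1">4.6</span></div>
  <div role="article" aria-label="Bar Dos" class="zz81"><span class="q1">4.1</span></div>
  <button aria-label="Next page" class="k2b7">Next</button>
</div>
"""

labels = extract_labels(SAMPLE_HTML)
print(labels)
```

Because the match depends on the role and label rather than on class names such as `k2b7`, the extraction keeps working if those classes are regenerated. In a live browser session the equivalent is an attribute selector such as `[role="article"][aria-label]`.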

Revision #1
Created 2026-06-05 23:48:41 UTC by Ben Adrian Sarmiento
Updated 2026-06-05 23:48:41 UTC by Ben Adrian Sarmiento