Converts HTML content from a Playwright Page or Locator to semantic markdown format.
async def extract_markdown(
    source: Page | Locator,
) -> str

Examples

from typing import TypedDict

from playwright.async_api import Page
from intuned_browser import extract_markdown


class Params(TypedDict):
    pass


async def automation(page: Page, params: Params, **_kwargs):
    await page.goto("https://books.toscrape.com/")
    header_locator = page.locator('h1').first  # First <h1> heading on the page
    markdown = await extract_markdown(header_locator)  # Convert only that heading to markdown
    print(markdown)
    return markdown

Arguments

source
Page | Locator
required
The source of the HTML content. When a Page is provided, extracts from the entire page. When a Locator is provided, extracts from that specific element.
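
To make the difference between the two source types concrete, here is a minimal sketch that calls `extract_markdown` once with the Page and once with a Locator. The helper name `page_and_section_markdown` and its `selector` parameter are hypothetical and only serve the illustration; the lazy import is a convenience for this sketch, not something the library requires.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed for type hints; avoids importing Playwright at module load.
    from playwright.async_api import Page


async def page_and_section_markdown(page: Page, selector: str) -> dict[str, str]:
    """Hypothetical helper: convert the whole page and one element to markdown."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from intuned_browser import extract_markdown

    # Passing the Page converts the entire document.
    full_page = await extract_markdown(page)

    # Passing a Locator converts only the matched element.
    # `.first` narrows the locator to a single element if several match.
    section = await extract_markdown(page.locator(selector).first)

    return {"page": full_page, "section": section}
```

Inside an automation this could be called as `await page_and_section_markdown(page, "h1")` after `page.goto(...)`.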

Returns: str

The markdown representation of the HTML content.
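
Because the return value is a plain `str`, it can be post-processed with ordinary string tools. The sketch below lists the headings in a markdown string. It assumes the output uses ATX-style `#` headings, which this page does not confirm; `markdown_headings` is a hypothetical helper and not part of `intuned_browser`.

```python
from __future__ import annotations

import re

# Matches ATX headings such as "## Title" (1 to 6 leading '#' characters).
_HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def markdown_headings(markdown: str) -> list[tuple[int, str]]:
    """Return (level, text) for each ATX heading line in the markdown string."""
    headings = []
    for line in markdown.splitlines():
        match = _HEADING.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings


# Stand-in for a string returned by extract_markdown (made-up sample text).
sample = "# All products\n\nSome text\n\n## Books\n"
print(markdown_headings(sample))  # [(1, 'All products'), (2, 'Books')]
```

This simple line scan also matches `#` lines inside fenced code blocks, so a real markdown parser would be a safer choice for pages that contain code samples.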