extract_markdown

Converts HTML content from a Playwright Page or Locator to semantic markdown format.

async def extract_markdown(
    source: Page | Locator,
) -> str

Examples

from typing import TypedDict

from playwright.async_api import Page
from intuned_browser import extract_markdown


class Params(TypedDict):
    pass


async def automation(page: Page, params: Params, **_kwargs):
    await page.goto("https://books.toscrape.com/")
    header_locator = page.locator("h1").first  # First <h1> on the page
    markdown = await extract_markdown(header_locator)  # Convert only that element to markdown
    print(markdown)
    return markdown

Arguments

source

Page | Locator

required

The source of the HTML content. When a Page is provided, extracts from the entire page. When a Locator is provided, extracts from that specific element.
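To make the difference between the two source types concrete, here is a minimal sketch that calls `extract_markdown` once with the whole `Page` and once with a `Locator`. The helper name `extract_page_and_section` and its `selector` parameter are illustrative assumptions, not part of the SDK. The import is placed inside the function only so the helper can be defined without loading the SDK up front.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed for type hints; nothing is imported at runtime here.
    from playwright.async_api import Page


async def extract_page_and_section(page: "Page", selector: str) -> dict[str, str]:
    """Hypothetical helper: return markdown for the full page and for one element."""
    # Deferred import (a choice made for this sketch, not an SDK requirement).
    from intuned_browser import extract_markdown

    # Passing the Page converts the entire document.
    full_page = await extract_markdown(page)

    # Passing a Locator converts only the matched element.
    section = await extract_markdown(page.locator(selector).first)

    return {"page": full_page, "section": section}
```

Inside an automation, this could be called as `await extract_page_and_section(page, "h1")` after `page.goto(...)`. The `"section"` value would then hold only the heading, while `"page"` holds the markdown for everything on the page.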

Returns: `str`

The markdown representation of the HTML content.
