extractArrayFromPage
Extracts an array of structured data from a web page in an optimized way. The function uses AI for the first n calls, until it has collected multiple examples. It then builds reliable selectors in the background to make later extractions more efficient.
Examples
Parameters
• page: Page
The Playwright Page object from which to extract the data.
• options
• options.itemEntityName: string
The name of the entity items being extracted. It must be between 1 and 50 characters long and can contain only letters, digits, periods, underscores, and hyphens.
• options.itemEntitySchema: SimpleArrayItemSchema
The schema of the entity items being extracted.
• options.label: string
A label for this extraction process, used for billing and monitoring.
• options.optionalPropertiesInvalidator?
Optional. A function to invalidate optional properties.
• options.prompt?: string
Optional. A prompt to guide the extraction process.
• options.strategy?: ImageStrategy | HtmlStrategy
Optional. The strategy to use for extraction. If not provided, the HTML strategy with Claude Haiku is used.
• options.variantKey?: string
Optional. A variant key for the extraction process. Use this when the page has multiple variants/shapes.
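The naming rule for options.itemEntityName can be checked before calling the function. The following is a minimal sketch, not part of the documented API: the helper name isValidItemEntityName and the pattern constant are hypothetical, and it assumes "letters" means ASCII letters, which the description above does not confirm.

```typescript
// Hypothetical helper (not provided by the SDK): checks a candidate
// itemEntityName against the constraints described for that option.
// Assumption: "letters" is read as ASCII A-Z and a-z only.
const ITEM_ENTITY_NAME_PATTERN = /^[A-Za-z0-9._-]{1,50}$/;

function isValidItemEntityName(name: string): boolean {
  // 1 to 50 characters, each a letter, digit, period, underscore, or hyphen.
  return ITEM_ENTITY_NAME_PATTERN.test(name);
}

console.log(isValidItemEntityName("product_listing")); // true
console.log(isValidItemEntityName("product listing")); // false (contains a space)
console.log(isValidItemEntityName(""));                // false (empty)
console.log(isValidItemEntityName("a".repeat(51)));    // false (51 characters)
```

Validating the name locally gives an early, readable error instead of relying on whatever the extraction call does with an invalid name.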
Returns
Promise<Record<string, string>[]>
A promise that resolves to a list of extracted data.
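Because the resolved value is typed as Record<string, string>[], every extracted field arrives as a string, so numeric fields need converting by the caller. The sketch below illustrates one way to do that. The sample data, the field names title and price, and the helper parseNumericField are all hypothetical stand-ins, and the sample array merely imitates the shape of a resolved result rather than calling extractArrayFromPage.

```typescript
type ExtractedItem = Record<string, string>;

// Hypothetical sample with the same shape as the resolved value.
const sampleItems: ExtractedItem[] = [
  { title: "Blue mug", price: "12.50" },
  { title: "Red mug", price: "n/a" },
];

// Hypothetical helper: parses one string field as a number and skips
// items where the field is missing, blank, or not a finite number.
function parseNumericField(
  items: ExtractedItem[],
  field: string,
): { item: ExtractedItem; value: number }[] {
  const parsed: { item: ExtractedItem; value: number }[] = [];
  for (const item of items) {
    const raw = item[field];
    // Number("") is 0, so blank strings are rejected explicitly.
    if (raw === undefined || raw.trim() === "") continue;
    const value = Number(raw);
    if (Number.isFinite(value)) parsed.push({ item, value });
  }
  return parsed;
}

const priced = parseNumericField(sampleItems, "price");
console.log(priced.map((p) => p.value)); // [ 12.5 ]
```

Skipping unparseable items rather than throwing is a design choice for this sketch. A stricter pipeline might collect them for review instead.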