Recipe

This recipe shows how to extract structured data from web pages using AI without writing CSS or XPath selectors. Use extractStructuredData (TypeScript) or extract_structured_data (Python).
TypeScript
import { BrowserContext, Page } from "playwright";
import { extractStructuredData } from "@intuned/browser/ai";
import { z } from "zod";

// Define the schema for a single product
const ProductSchema = z.object({
  name: z.string().describe("Product name"),
  price: z.string().describe("Product price"),
  stock: z.string().describe("Stock status"),
  category: z.string().describe("Product category"),
});

// Define the schema for the list of products
const ProductsSchema = z.object({
  products: z.array(ProductSchema).describe("List of products from the table"),
});

export default async function handler(
  params: any,
  page: Page,
  context: BrowserContext
) {
  await page.goto("https://www.scrapingcourse.com/table-parsing");

  // Extract products using AI - no selectors needed
  const result = await extractStructuredData({
    source: page,
    dataSchema: ProductsSchema,
    prompt: "Extract all products from the table",
  });

  console.log(`Extracted ${result.products.length} products`);
  return result.products;
}

How it works

  1. Define a schema - Use Zod (TypeScript) or Pydantic (Python) to describe the data structure you want to extract
  2. Call the AI extractor - Pass the page and schema to extractStructuredData / extract_structured_data
  3. Get structured data - The AI analyzes the page and returns data matching your schema
There is no need to inspect the DOM or to write and maintain selectors: the AI maps the page content onto your schema.
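
Because the schema above declares every field as a string, values such as price usually need a small normalization step before any numeric work. The following is a minimal sketch of that step, not part of the SDK: `parsePrice` and `summarize` are hypothetical helpers, the `Product` interface simply mirrors the fields of `ProductSchema`, and the sample rows are made up for illustration. It also assumes US-style price formatting (comma as thousands separator, dot as decimal point).

```typescript
// Plain TypeScript mirror of the fields declared in ProductSchema.
interface Product {
  name: string;
  price: string;
  stock: string;
  category: string;
}

// Hypothetical helper (not part of the SDK): turn a price string such as
// "$1,299.50" into a number, or return null when no number can be read.
function parsePrice(price: string): number | null {
  // Keep only digits, the decimal point, and a minus sign.
  const cleaned = price.replace(/[^0-9.-]/g, "");
  if (cleaned === "") return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

// Hypothetical helper: sum the prices that parse and count the ones that do not.
function summarize(products: Product[]): { total: number; unparsed: number } {
  let total = 0;
  let unparsed = 0;
  for (const product of products) {
    const value = parsePrice(product.price);
    if (value === null) {
      unparsed += 1;
    } else {
      total += value;
    }
  }
  return { total, unparsed };
}

// Made-up rows for illustration only; they are not taken from the real page.
const sample: Product[] = [
  { name: "Example A", price: "$1,299.50", stock: "In stock", category: "Demo" },
  { name: "Example B", price: "N/A", stock: "Out of stock", category: "Demo" },
];

// total is 1299.5, unparsed is 1
console.log(summarize(sample));
```

Inside the handler, the same helper could be applied to `result.products` before returning. Counting unparsed rows rather than silently dropping them makes it easier to notice when the extracted values do not match the expected format.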

extractStructuredData (TypeScript)

TypeScript SDK helper for AI data extraction

extract_structured_data (Python)

Python SDK helper for AI data extraction