    await page.goto("https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html");

    const siteMarkdown = await extractMarkdownFromPage(page);
    // [Books to Scrape](../../index.html) We love being scraped!
    // - [Home](../../index.html)
    // - [Books](../category/books_1/index.html)
    // - [Poetry](../category/books/poetry_23/index.html)
    // - A Light in the Attic
    //
    // # A Light in the Attic
    // £51.77
    // \_\_ In stock (22 available)
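Once a page has been converted to markdown, the result is an ordinary string that can be post-processed without touching the DOM. The following is a minimal sketch, assuming the markdown looks like the sample output for this book page. `parseBookMarkdown` and `BookSummary` are hypothetical names used for illustration and are not part of the Intuned SDK. The regular expressions are tuned to this one page layout only.

```typescript
// Sample markdown copied from the example output for the book page.
const siteMarkdown = [
  "# A Light in the Attic",
  "£51.77",
  "\\_\\_ In stock (22 available)",
].join("\n");

interface BookSummary {
  title: string | null;
  price: number | null;
  available: number | null;
}

// Hypothetical helper (not an Intuned API): pulls a few fields out of the markdown text.
function parseBookMarkdown(markdown: string): BookSummary {
  // First level-1 heading is treated as the title.
  const title = markdown.match(/^# (.+)$/m)?.[1]?.trim() ?? null;
  // Price is assumed to be written in pounds, e.g. "£51.77".
  const price = markdown.match(/£(\d+(?:\.\d+)?)/)?.[1];
  // Stock count is assumed to appear as "(N available)".
  const available = markdown.match(/\((\d+) available\)/)?.[1];

  return {
    title,
    price: price !== undefined ? Number(price) : null,
    available: available !== undefined ? Number(available) : null,
  };
}

const summary = parseBookMarkdown(siteMarkdown);
console.log(summary);
```

Fields that cannot be found come back as `null` rather than throwing, which keeps the sketch usable on pages that omit a price or stock line.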
You can also use File Markdown Conversion as a standalone API. Check out Standalone File APIs for more info.
Intuned provides utilities to convert files to markdown, a format that works particularly well with LLMs. For more information, see the extractMarkdownFromFile reference.
    const specMarkdown = await extractMarkdownFromFile(
      {
        type: "pdf",
        source: {
          type: "url",
          data: "https://intuned-docs-public-images.s3.amazonaws.com/27UP600_27UP650_ENG_US.pdf",
        },
      },
      { label: "pdf_markdown" }
    );
    // LG
    // Life's Good
    // # OWNER'S MANUAL
    // LED LCD MONITOR
    // \(LED Monitor\*\)
    // \* LG LED Monitor applies LCD screen with LED backlights. Please read this manual carefully before operating your set and retain it for future reference.
    // 27UP600
    // 27UP650
    // ....
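Because the converted file is plain markdown, it can be split into heading-delimited sections before being passed to an LLM. The sketch below is one possible approach, not something the Intuned SDK provides. `splitMarkdownByHeading` and `MarkdownSection` are hypothetical names, and the sample string is a shortened copy of the manual output rather than a live conversion result.

```typescript
interface MarkdownSection {
  heading: string | null; // null for text that appears before the first heading
  body: string;
}

// Hypothetical helper (not an Intuned API): groups lines under the nearest preceding heading.
function splitMarkdownByHeading(markdown: string): MarkdownSection[] {
  const sections: MarkdownSection[] = [];
  let current: { heading: string | null; lines: string[] } = { heading: null, lines: [] };

  // Push the section collected so far, skipping an empty untitled preamble.
  const flush = (): void => {
    const body = current.lines.join("\n").trim();
    if (current.heading !== null || body.length > 0) {
      sections.push({ heading: current.heading, body });
    }
  };

  for (const line of markdown.split("\n")) {
    // ATX headings only: one to six '#' characters followed by whitespace.
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      current = { heading: (match[1] ?? "").trim(), lines: [] };
    } else {
      current.lines.push(line);
    }
  }
  flush();
  return sections;
}

// Shortened sample based on the converted owner's manual.
const specMarkdown = [
  "LG",
  "Life's Good",
  "# OWNER'S MANUAL",
  "LED LCD MONITOR",
  "27UP600",
  "27UP650",
].join("\n");

const sections = splitMarkdownByHeading(specMarkdown);
console.log(sections.map((s) => s.heading));
```

This sketch treats every line that starts with `#` as a heading, so a `#` line inside a fenced code block would be split incorrectly; a markdown parser would be a safer choice for such documents.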
You can also use Table Extraction as a standalone API. Check out Standalone File APIs for more info.
Intuned provides utilities to extract tables from files. Tables are a common element of data-rich files. For details on usage, see the extractTablesFromFile reference.