Learn how to use the Standalone File API to process files without a project.
The standalone APIs allow you to process PDF and image files without requiring a project. We currently provide three operations: structured data extraction following a JSON Schema, Markdown extraction, and table extraction.
There are two ways to consume these APIs: synchronously and asynchronously. In synchronous calls, the result is returned in the same call. In asynchronous calls, the result is returned in a separate call using an operationId obtained in the initial call.
Each of the operations listed above is available via a Sync API and an Async API. In Sync APIs, you make a single call which triggers the operation and returns the result. In Async APIs, you make two calls: the first call triggers the operation and returns an operationId, and the second call uses the operationId to check the status and get the result.
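The two-call asynchronous flow usually ends up as a polling loop on the client side. The following is a minimal sketch of that loop, not the official client: `fetch_operation` is a hypothetical caller-supplied function that performs the second call for a given operationId, and the `status` field with its values `"succeeded"` and `"failed"` is assumed for illustration. Check the API reference for the actual response shape.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_operation: Callable[[str], Dict[str, Any]],
    operation_id: str,
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll an async operation until it finishes or the timeout elapses.

    `fetch_operation` performs the second call (status and result lookup)
    for the given operationId. The "status" key and its values are
    assumptions made for this sketch, not names from the API reference.
    """
    waited = 0.0
    while True:
        operation = fetch_operation(operation_id)
        status = operation.get("status")
        if status == "succeeded":
            return operation
        if status == "failed":
            raise RuntimeError(f"Operation {operation_id} failed: {operation}")
        if waited >= timeout:
            raise TimeoutError(
                f"Operation {operation_id} still pending after {timeout} seconds"
            )
        # Not finished yet: wait before asking again.
        sleep(poll_interval)
        waited += poll_interval
```

Passing the lookup function in as an argument keeps the loop independent of any particular HTTP library, and the injectable `sleep` makes it easy to exercise without real delays.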
Depending on the input, a call might take a long time to complete, especially if the file is large or the operation is complex. If processing takes too long, a synchronous request might time out before the file processing is finished. For this reason, we recommend using the Async API for most use cases.
The Sync API is limited to 10 requests per minute per operation. If you need a higher rate limit, contact us.
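One way to stay under the limit of 10 requests per minute is to throttle calls on the client. The sketch below is an illustrative sliding-window limiter, assuming the limit is counted over a rolling 60-second window (the source does not say how the window is measured). The class name and its parameters are hypothetical and not part of the API.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period` seconds."""

    def __init__(
        self,
        max_calls: int = 10,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def acquire(self) -> float:
        """Block until a call is allowed and return the seconds waited."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest recorded call leaves the window.
            waited = max(0.0, self._calls[0] + self.period - now)
            self._sleep(waited)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return waited
```

Since the limit applies per operation, a client would keep one limiter per operation and call `acquire()` immediately before each Sync API request.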
We currently support PDF files and image files. Support for other formats is planned. Contact us if you have any specific requirements.
For PDF files, you can specify the page numbers to run processing on. If no page numbers are specified, the operation runs on all pages. Check out the API reference for more information.
The structured data extraction API allows you to extract data from a file following a JSON Schema. This is useful when you have a document with a known data structure that you want to extract, such as a contract document.
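To make the idea concrete, here is a sketch of what a JSON Schema for a contract document might look like, written as a Python dictionary and serialized to JSON. The field names (`parties`, `effective_date`, `total_value`, `governing_law`) are invented for illustration, and how the schema is passed to the API is not shown because the request format is defined in the API reference. The helper `missing_required` is likewise hypothetical: it only checks an extracted result for absent required fields.

```python
import json
from typing import Any, Dict, List

# Illustrative schema: every field name here is an assumption.
contract_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "parties": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Names of the contracting parties",
        },
        "effective_date": {"type": "string", "format": "date"},
        "total_value": {"type": "number"},
        "governing_law": {"type": "string"},
    },
    "required": ["parties", "effective_date"],
}

# JSON text form of the schema, ready to embed in a request body.
schema_json = json.dumps(contract_schema, indent=2)


def missing_required(result: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return the required top-level fields that are absent from a result."""
    return [name for name in schema.get("required", []) if name not in result]
```

Adding a `description` to a property is standard JSON Schema and gives readers of the schema a hint about what the field should contain.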
The Markdown extraction API allows you to extract Markdown from the file, including headers, paragraphs, lists, tables and links. The output is human-readable and can be used for further processing or display.
The table extraction API allows you to extract tables from the file in JSON format. This is useful when you have a document with tabular data that you want to extract and process further. The result is an array of tables, with each table including the page number, the title (if any), and the table data.
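As an example of further processing, the sketch below walks an array of tables and converts each one to CSV. The key names `page`, `title`, and `rows`, and the representation of table data as a list of rows, are assumptions made for this example. The actual field names and layout are defined in the API reference.

```python
import csv
import io
from typing import Any, Dict, List


def table_to_csv(table: Dict[str, Any]) -> str:
    """Render one table's rows as CSV text (assumes a "rows" list of lists)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table["rows"])
    return buffer.getvalue()


def describe_tables(tables: List[Dict[str, Any]]) -> List[str]:
    """Build a one-line summary per table, with a fallback for missing titles."""
    lines = []
    for index, table in enumerate(tables, start=1):
        title = table.get("title") or f"Untitled table {index}"
        lines.append(f"{title} (page {table['page']}): {len(table['rows'])} rows")
    return lines


# Hand-written sample data in the assumed shape, not real API output.
sample_tables = [
    {"page": 1, "title": "Fees", "rows": [["Item", "Amount"], ["Setup", 100]]},
    {"page": 3, "title": None, "rows": [["Year", "Total"]]},
]
```

Because the title is optional, the summary falls back to a generated label rather than failing when it is missing.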