Overview
By the end of this guide, you’ll have an Intuned project with a scraping Job that delivers scraped data directly to AWS S3 through an S3 sink. You’ll:
- Create an S3 bucket and configure AWS credentials for Intuned.
- Configure a Job with an S3 sink.
- Trigger a Job and verify data lands in S3.
Prerequisites
Before you begin, ensure you have the following:
- An AWS account with S3 access.
- An Intuned account.
This guide assumes you have a basic understanding of Intuned Projects and Jobs. If you’re new to Intuned, start with the getting started guide.
When to use S3 integration
Scrapers built on Intuned typically run via Jobs on a schedule. When a JobRun completes, you want that data sent somewhere for processing or persistence. S3 integration automatically delivers scraped data to your S3 bucket as JSON files. From there, you can process results using AWS tools like Lambda, or connect to other services.

While this guide focuses on scraping, S3 integration works for any Intuned Job: the files sent to S3 are Run results from any automation.
Guide
1. Create an S3 bucket and access credentials
Create an S3 bucket and IAM credentials that Intuned can use to write data.

Create an S3 bucket
- Log in to the AWS Management Console
- Navigate to the S3 service
- Select Create bucket
- Enter a unique bucket name (e.g., my-intuned-data)
Configure bucket settings
When creating your bucket:
- Object Ownership: Set to “Access Control Lists (ACLs) disabled”
- Block Public Access: Keep all public access blocked (recommended for security)
- Bucket Versioning: Optional - enable if you want to keep historical versions of files
- Encryption: Optional - enable default encryption for data at rest
- Select Create bucket to finish
Intuned only needs write access to your bucket, so keeping public access blocked is safe and recommended.
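If you prefer the AWS CLI to the console, the same bucket can be created with one command (the bucket name and region here are examples):

```bash
# Create the bucket; buckets created after April 2023 block public access by default.
aws s3 mb s3://my-intuned-data --region us-west-2
```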
Create an IAM user for Intuned
Create a dedicated IAM user with limited permissions for Intuned:
- Navigate to IAM in the AWS Console
- Select Users in the left sidebar, then Create user
- Enter a username (e.g., intuned-s3-writer)
- Select Next, which takes you to the permissions page
- Select Attach existing policies directly
- Select Create policy (opens in new tab)
- Select the JSON tab and paste a minimal write-only policy like the one below, which grants only s3:PutObject on objects in your bucket:
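```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::YOUR-BUCKET-NAME/*"
    }
  ]
}
```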
- Replace YOUR-BUCKET-NAME with your actual bucket name
- Select Next, which takes you to the Review page
- Name the policy IntunedS3WritePolicy
- Select Create policy
Attach policy and generate access keys
Back in the user creation flow:
- Refresh the policies list
- Search for IntunedS3WritePolicy
- Select the checkbox next to the policy
- Select Next to go to the Review page
- Select Create user
- Go to the Security credentials tab
- Select Create access key
- Choose Application running outside AWS and select Next
- Select Create access key
- Copy the Access key ID - you’ll need this for Intuned
- Copy the Secret access key - you’ll need this for Intuned (only shown once)
- Download the CSV or save these credentials securely
Note your configuration details
You now have everything needed to configure S3 in Intuned. Save these details:
- Bucket name: Your S3 bucket name
- Region: Your AWS region (e.g., us-west-2)
- Access key ID: From the IAM user
- Secret access key: From the IAM user
2. Configure a Job with an S3 sink
Now that your S3 bucket is ready, add an S3 sink to a Job so Run results are delivered to your bucket.

Prepare a project

You can use an existing project or create a new one. For this example, we’ll use the ecommerce-scraper-quickstart project that you can deploy using the Deploy your first scraper quickstart tutorial.

Create a Job with S3 sink
The steps below use the Dashboard; the same Job can also be created with the TypeScript or Python SDK (an SDK sketch follows these steps).
- Go to app.intuned.io
- Open your ecommerce-scraper-quickstart project
- Select the Jobs tab
- Select Create Job
- Fill in the Job details:
  - Job ID: default-with-s3
  - Payload API: list
  - Payload Parameters: {}
- Enable sink configuration and add your S3 details:
  - Type: s3
  - Bucket: Your S3 bucket name (e.g., my-intuned-scraper-data)
  - Region: Your AWS region (e.g., us-west-2)
  - Access Key ID: Your IAM user access key
  - Secret Access Key: Your IAM user secret key
  - Prefix (optional): A path prefix to organize files (e.g., ecommerce-data/)
  - Skip On Fail (optional): Check to skip writing failed Runs to S3
- Select Save to create the Job.
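If you prefer code over the Dashboard, the same Job can be created programmatically. The TypeScript sketch below is purely illustrative: IntunedClient, the package name, and jobs.create are hypothetical placeholders, not confirmed Intuned SDK APIs, so consult the SDK reference for the real signatures. Only the sink fields (which mirror the configuration table later in this guide) come from this guide.

```typescript
// Illustrative sketch only: "IntunedClient" and "jobs.create" are hypothetical
// placeholders for the real Intuned SDK. The sink object mirrors the fields
// described in this guide (type, bucket, region, credentials, prefix).
import { IntunedClient } from "@intuned/client"; // hypothetical package name

const client = new IntunedClient({ apiKey: process.env.INTUNED_API_KEY! });

await client.jobs.create("ecommerce-scraper-quickstart", {
  id: "default-with-s3",
  payload: [{ apiName: "list", parameters: {} }],
  sink: {
    type: "s3",
    bucket: "my-intuned-scraper-data",
    region: "us-west-2",
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    prefix: "ecommerce-data/",
  },
});
```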
Trigger the Job
The trigger steps below again use the Dashboard; Jobs can also be triggered from the TypeScript or Python SDK.
- In the Jobs tab, find your new Job (default-with-s3)
- Select … next to the Job
- Select Trigger
What happens next:
- JobRun starts immediately - visible in the Intuned dashboard
- API Runs execute - the list API runs first, then details APIs run for each product
- Files written to S3 - when each API Run completes, Intuned writes a JSON file to your bucket
Inspect data in S3
After the Job completes, view your data in S3:
- Navigate to the S3 Console
- Open your bucket (e.g., my-intuned-scraper-data)
- Navigate to your prefix path if you specified one (e.g., ecommerce-data/)
Files follow these path patterns:
- Job sink: {prefix}/{jobId}/run-{jobRunId}/{apiRunId}.json
- Run sink: {prefix}/runs/{apiRunId}.json

For example, with the ecommerce-data/ prefix, a Job-sink file lands at ecommerce-data/default-with-s3/run-{jobRunId}/{apiRunId}.json. You should see:
- One JSON file per API Run
- The initial list API Run has one file
- Each details API Run (created by extendPayload) has its own file
Configuration options
For full details on S3 sink configuration and available options, see the S3 Sink API Reference. Key configuration fields:

| Field | Required | Description |
|---|---|---|
| bucket | Yes | S3 bucket name |
| region | Yes | AWS region (e.g., us-west-2) |
| accessKeyId | Yes | AWS access key ID |
| secretAccessKey | Yes | AWS secret access key |
| prefix | No | Path prefix for organizing files |
| skipOnFail | No | Skip writing failed Runs to S3 (default: false) |
| apisToSend | No | List of specific API names to send (default: all APIs) |
| endpoint | No | Custom endpoint for S3-compatible services |
| forcePathStyle | No | Use path-style URLs for S3-compatible services |
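To see these fields together, here is an illustrative sink configuration pointing at an S3-compatible service such as MinIO; the endpoint, credentials, and bucket are placeholders, and the exact accepted schema is defined in the S3 Sink API Reference:

```json
{
  "type": "s3",
  "bucket": "my-intuned-scraper-data",
  "region": "us-west-2",
  "accessKeyId": "AKIAXXXXXXXXXXXXXXXX",
  "secretAccessKey": "<your-secret-access-key>",
  "prefix": "ecommerce-data/",
  "skipOnFail": false,
  "endpoint": "https://minio.example.com",
  "forcePathStyle": true
}
```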
Processing data from S3
Once data lands in S3, you can process it in various ways depending on your needs. A common pattern is an AWS Lambda function that triggers automatically when a new file arrives (a handler sketch follows this list). Typical processing steps include:
- Normalizing the data structure
- Removing empty fields
- Validating against a schema
- Persisting to a database or data warehouse
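A minimal sketch of such a handler, assuming a Node.js 18+ Lambda runtime (which bundles AWS SDK v3) and an s3:ObjectCreated trigger on the bucket; the field-cleanup logic is illustrative and stands in for your own normalization, validation, and persistence:

```typescript
// Minimal sketch of a Lambda triggered by s3:ObjectCreated events.
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import type { S3Event } from "aws-lambda";

const s3 = new S3Client({});

export const handler = async (event: S3Event): Promise<void> => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // Object keys in S3 events are URL-encoded (spaces arrive as "+").
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    // Fetch and parse the JSON file Intuned wrote for this API Run.
    const response = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
    const body = await response.Body?.transformToString();
    if (!body) continue;

    const result = JSON.parse(body);
    // Illustrative processing: drop empty fields before persisting elsewhere.
    const cleaned = Object.fromEntries(
      Object.entries(result).filter(([, value]) => value !== null && value !== "")
    );
    console.log(`Processed ${key}:`, cleaned);
  }
};
```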
Best practices
- Use least-privilege IAM policies: Create a dedicated IAM user for Intuned with only s3:PutObject permission. Restrict access to specific bucket paths using resource ARNs. Never use root account credentials.
- Organize data with prefixes: Use meaningful prefix structures like {environment}/{project-name}/{date}/ to make data easier to find, manage, and set lifecycle policies on.
- Set up lifecycle policies: Reduce storage costs by transitioning older data to S3 Glacier and deleting data you no longer need. This can reduce costs significantly for infrequently accessed data.
- Monitor usage and costs: Enable S3 Storage Lens for bucket-level insights, set up CloudWatch alarms for unexpected growth, and use Cost Explorer to track costs by bucket.
Troubleshooting
Job paused: “Failed to write to S3 sink”
Cause: Intuned automatically pauses the Job when it fails to write data to S3. Common reasons include invalid or expired AWS credentials, insufficient IAM permissions (missing s3:PutObject), an incorrect bucket name or region, or a bucket that doesn’t exist.
Solution: Check the Job status in the Intuned dashboard (it shows as “Paused”). Fix the underlying issue by verifying AWS credentials, ensuring the IAM policy includes the s3:PutObject permission, and confirming the bucket name and region match your configuration. Test write access with aws s3 cp (for example, aws s3 cp test.txt s3://your-bucket-name/) using the IAM user’s credentials; note that the write-only policy above does not allow aws s3 ls. Update the Job configuration if needed, then select Resume from the dashboard. The Job continues from where it paused.
Related resources
S3 Sink API Reference
Complete API documentation for S3 sink configuration and options
Jobs
Learn more about creating and managing batched Job executions
Runs (Single API executions)
Learn about running single API executions outside of Jobs
Monitoring and traces
Debug and monitor your automation runs with traces and logs