
Overview

By the end of this guide, you’ll have an Intuned project with a scraping Job configured with an R2 sink, sending scraped data directly to Cloudflare R2. You’ll:
  1. Create an R2 bucket and configure credentials for Intuned.
  2. Configure a Job with an R2 sink.
  3. Trigger a Job and verify data lands in R2.

Prerequisites

Before you begin, ensure you have the following:
  • A Cloudflare account with R2 access.
  • An Intuned account.
This guide assumes you have a basic understanding of Intuned Projects and Jobs. If you’re new to Intuned, start with the getting started guide.

When to use R2 integration

Scrapers built on Intuned typically run via Jobs on a schedule. When a JobRun completes, you want that data sent somewhere for processing or persistence. R2 integration automatically delivers scraped data to your R2 bucket as JSON files. From there, you can process results using Cloudflare Workers at the edge—or connect to other services. R2’s zero egress fees make it ideal for frequent data access.
While this guide focuses on scraping, R2 integration works for any Intuned Job—the files sent to R2 are Run results from any automation.

Guide

1. Create an R2 bucket and access credentials

Create an R2 bucket and API credentials that Intuned can use to write data:

Create an R2 bucket

  1. Log in to the Cloudflare Dashboard
  2. Navigate to Storage & databases > R2 object storage > Overview from the sidebar
  3. Select Create bucket
  4. Enter a bucket name (e.g., my-intuned-scraper-data)
  5. Select Create bucket
R2 buckets are globally accessible with automatic redundancy, so unlike AWS S3 there is no region to select.

Get your Cloudflare Account ID

Your Account ID is needed to construct the R2 endpoint URL:
  1. In the Cloudflare dashboard, look at the URL - it contains your Account ID
  2. The URL format is: https://dash.cloudflare.com/<account-id>/r2
  3. Copy your Account ID (the alphanumeric string)
  4. Your R2 endpoint will be: https://<account-id>.r2.cloudflarestorage.com
Your Account ID is also visible in the sidebar of the Cloudflare dashboard under your account name.

Generate R2 API token

Create API credentials for Intuned to access your R2 bucket:
  1. In the Storage & databases > R2 object storage > Overview section, select Manage Tokens
  2. Select Create Account API Token
  3. Configure the token:
    • Token name: Enter intuned-r2-writer
    • Permissions: Select Object Read & Write
    • TTL: Leave as default or set a custom expiration
    • Bucket scope: Select your specific bucket or choose All buckets
  4. Select Create Account API Token
  5. Copy both the Access Key ID and Secret Access Key immediately
The Secret Access Key is only shown once and cannot be retrieved later. Save both credentials securely. Never commit credentials to version control.

Note your configuration details

You now have everything needed to connect Intuned to R2. Save these details:
  • Bucket name: Your R2 bucket name
  • Account ID: From the Cloudflare dashboard URL
  • Endpoint: https://<account-id>.r2.cloudflarestorage.com
  • Access Key ID: From the API token
  • Secret Access Key: From the API token
You’ll use these in the next section to configure your Intuned Job.

2. Configure a Job with an R2 sink

Prepare a project

You can use an existing project or create a new one. For this example, we’ll use the ecommerce-scraper-quickstart project, which you can deploy by following the Deploy your first scraper quickstart tutorial.

Create a Job with an R2 sink

  1. Go to app.intuned.io
  2. Open your ecommerce-scraper-quickstart project
  3. Select the Jobs tab
  4. Select Create Job
  5. Fill in the Job ID and payloads:
    • Set Job ID to: default-with-r2
    • Set the payload api to list with an empty parameters object {}
  6. Enable sink configuration and enter the R2 details from the previous section (a sample configuration sketch follows this list):
    • Type: s3 (R2 uses the S3-compatible API)
    • Bucket: Your R2 bucket name (e.g., my-intuned-scraper-data)
    • Region: Any value works, since R2 is zone-agnostic (e.g., auto)
    • Endpoint: https://<your-account-id>.r2.cloudflarestorage.com
    • Access Key ID: Your R2 API token access key
    • Secret Access Key: Your R2 API token secret key
    • Force Path Style: true (required for R2)
    • Prefix (optional): A path prefix to organize files (e.g., ecommerce-data/)
    • Skip On Fail (optional): Check to skip writing failed Runs to R2
  7. Select Save to create the Job.
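For reference, the sink values from step 6 might look like the following sketch when written out as a single configuration. Field names follow the S3-compatible sink fields described in this guide; the bucket name, prefix, and credentials are placeholders, so check the S3 Sink API Reference for the exact schema.

// Illustrative R2 sink configuration (placeholder values, not an official schema)
const r2Sink = {
  type: "s3",                // R2 uses the S3-compatible API
  bucket: "my-intuned-scraper-data",
  region: "auto",            // any value works; R2 is zone-agnostic
  endpoint: "https://<your-account-id>.r2.cloudflarestorage.com",
  accessKeyId: "<R2_ACCESS_KEY_ID>",         // from the R2 API token
  secretAccessKey: "<R2_SECRET_ACCESS_KEY>", // from the R2 API token
  forcePathStyle: true,      // required for R2
  prefix: "ecommerce-data/", // optional: organizes files under a path
  skipOnFail: false,         // optional: skip writing failed Runs
};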

Trigger the Job

  1. In the Jobs tab, find the Job you created
  2. Select the menu next to the Job
  3. Select Trigger
The Job starts running immediately. You’ll see the JobRun appear in the dashboard with status updates.

Inspect data in R2

After the Job completes, view your scraped data in R2:
  1. Navigate to the Cloudflare Dashboard → Storage & databases > R2 object storage > Overview
  2. Open your bucket (e.g., my-intuned-scraper-data)
  3. Navigate to your prefix path if you specified one (e.g., ecommerce-data/)
R2 file structure: Files are organized differently depending on whether you’re using a Job sink or a Run sink:
  • Job sink: {prefix}/{jobId}/run-{jobRunId}/{apiRunId}.json
  • Run sink: {prefix}/runs/{apiRunId}.json
Since we’re using a Job sink in this example, your files follow the Job sink pattern.
What to expect:
  • One JSON file per API Run
  • The initial list API Run has one file
  • Each details API Run (created by extendPayload) has its own file
The ecommerce scraper uses extendPayload to create detail tasks for each discovered product. You’ll see multiple files: one for the initial list Run, then one for each details Run.
Example R2 payload:
{
  "workspaceId": "e95cb8d1-f212-4c04-ace1-c0f77e8708c7",
  "apiInfo": {
      "name": "details",
      "runId": "656CxOdANRlR5lWUAt_eC",
      "parameters": {
          "detailsUrl": "https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/",
          "name": "Abominable Hoodie"
      },
      "result": {
          "status": "completed",
          "result": [
            {
              "id": "prod-1",
              "name": "Wireless Headphones",
              "price": "$79.99"
            },
            {
              "id": "prod-2",
              "name": "Smart Watch",
              "price": "$199.99"
            }
          ],
          "statusCode": 200
      }
  },
  "project": {
      "id": "482bf507-5fcc-43ed-9443-d8fff86015c4",
      "name": "ecommerce-scraper-quickstart"
  },
  "projectJob": {
      "id": "default"
  },
  "projectJobRun": {
      "id": "08523ea6-5c6b-413e-995a-40e4f6fd7846"
  }
}
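If you plan to process these files programmatically, it can help to model the payload shape. The following TypeScript interface is a rough sketch inferred from the example above; it is not an official schema, so verify it against your own files.

// Rough shape of an R2 sink payload, inferred from the example payload above.
interface R2SinkPayload {
  workspaceId: string;
  apiInfo: {
    name: string;                        // API name, e.g. "list" or "details"
    runId: string;
    parameters: Record<string, unknown>; // the payload parameters for this Run
    result: {
      status: string;                    // e.g. "completed"
      result: unknown;                   // whatever your API returned
      statusCode: number;
    };
  };
  project: { id: string; name: string };
  projectJob: { id: string };
  projectJobRun: { id: string };
}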
After triggering the Job:
  1. Job starts immediately - visible in the Intuned dashboard
  2. API Runs execute - The list API runs first, then details APIs for each product
  3. Files written to R2 - When each API Run completes, Intuned writes a JSON file to your bucket
Check your R2 bucket - you should see the JSON files with the structure documented in the previous step.
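If you prefer to verify programmatically, any S3-compatible client works. The sketch below uses the AWS SDK for JavaScript v3; the account ID, bucket name, prefix, and environment variable names are placeholders.

import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

// Placeholder values - substitute your own account ID, bucket, credentials, and prefix.
const s3 = new S3Client({
  region: "auto",                          // any value; R2 is zone-agnostic
  endpoint: "https://<your-account-id>.r2.cloudflarestorage.com",
  forcePathStyle: true,                    // required for R2
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

const { Contents } = await s3.send(
  new ListObjectsV2Command({
    Bucket: "my-intuned-scraper-data",
    Prefix: "ecommerce-data/",             // match your sink prefix, if you set one
  })
);
console.log((Contents ?? []).map((obj) => obj.Key));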
If writing to R2 fails (e.g., due to incorrect credentials, wrong endpoint format, or missing forcePathStyle), Intuned pauses the Job automatically. The pause reason is “Failed to write to S3 sink”. Verify your R2 endpoint and credentials, fix the issue, and resume the Job from the dashboard.

Configuration options

For full details on S3-compatible sink configuration, see the S3 Sink API Reference. Key configuration fields for R2:
Field           | Required | Description
type            | Yes      | Must be "s3" (R2 uses the S3-compatible API)
bucket          | Yes      | R2 bucket name
endpoint        | Yes      | R2 endpoint: https://<account_id>.r2.cloudflarestorage.com
accessKeyId     | Yes      | R2 API token access key
secretAccessKey | Yes      | R2 API token secret key
forcePathStyle  | Yes      | Must be true for R2
prefix          | No       | Path prefix for organizing files
skipOnFail      | No       | Skip writing failed Runs (default: false)
apisToSend      | No       | List of specific API names to send
R2 doesn’t require a region field since it’s zone-agnostic (globally distributed). The endpoint and forcePathStyle fields are essential for R2 to work correctly.

Processing data from R2

Once data lands in R2, process it using Cloudflare Workers—a common pattern for serverless data pipelines. Workers run at the edge with zero-latency access to R2 and no egress fees. Alternatively, use R2 Event Notifications to trigger workflows, or access data via the S3-compatible API from any programming language or tool (AWS CLI, boto3, aws-sdk).
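As a sketch of that pattern, a Worker with an R2 bucket binding could read one of the sink files and return the API result it contains. The binding name MY_BUCKET and the request handling below are illustrative assumptions, not an Intuned-provided integration.

// Illustrative Worker sketch: reads one Intuned sink file from a bound R2 bucket.
// "MY_BUCKET" is a hypothetical binding name configured in wrangler.toml.
interface Env {
  MY_BUCKET: R2Bucket;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Expect an object key such as:
    // ecommerce-data/default-with-r2/run-<jobRunId>/<apiRunId>.json
    const key = new URL(request.url).searchParams.get("key");
    if (!key) return new Response("missing ?key=", { status: 400 });

    const object = await env.MY_BUCKET.get(key);
    if (!object) return new Response("not found", { status: 404 });

    // Pull the API result out of the Intuned payload (shape sketched earlier).
    const payload = (await object.json()) as {
      apiInfo: { name: string; result: { status: string; result: unknown } };
    };
    return Response.json({
      api: payload.apiInfo.name,
      status: payload.apiInfo.result.status,
      items: payload.apiInfo.result.result,
    });
  },
};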

Best practices

  • Organize data with prefixes: Use meaningful prefix structures like {environment}/{project-name}/{date}/ to make data easier to find, manage, and process.
  • Use Cloudflare Workers for processing: Workers have zero-latency access to R2 data with no egress fees. Use them to transform, filter, or aggregate data at the edge before storing or forwarding.
  • Monitor R2 usage: Check the R2 dashboard for storage usage and request counts. Set up alerts for storage thresholds and track API request patterns.

Troubleshooting

Endpoint connection errors

Cause: Incorrect Account ID in endpoint URL, missing or malformed endpoint field, or typo in URL format.
Solution: Find your Account ID in the Cloudflare dashboard URL (https://dash.cloudflare.com/<account-id>/r2). Verify endpoint format is exactly https://<account-id>.r2.cloudflarestorage.com with no trailing slash or extra paths.

forcePathStyle errors

Cause: Missing or incorrect forcePathStyle setting. R2 requires path-style URLs, not the virtual-hosted-style URLs that AWS S3 uses by default.
Solution: Ensure forcePathStyle is set to true in your sink configuration. This field is required for R2 to work correctly.

Job paused: “Failed to write to S3 sink”

Cause: Intuned automatically pauses the Job when it fails to write data to R2. Common reasons include an invalid or expired API token, an incorrect R2 endpoint format, a missing forcePathStyle: true setting, insufficient token permissions, or a bucket that doesn’t exist.
Solution: Check the Job status in the Intuned dashboard (it shows as “Paused”). Fix the underlying issue by verifying R2 credentials, confirming the endpoint format, ensuring forcePathStyle is true, and checking that the token permissions include Object Read & Write. Test credentials with aws s3 ls s3://your-bucket --endpoint-url=https://your-account-id.r2.cloudflarestorage.com. Update the Job configuration if needed, then select Resume from the dashboard. The Job continues from where it paused.