API Reference
The HTTP API for submitting documents directly, without the Laravel SDK. Use it to integrate from any language or runtime. The SDK is the recommended path for Laravel applications and wraps everything described here.
All parsing through the API uses bring-your-own-storage (BYO): your files stay in your own bucket and no document bytes pass through our servers. You hand us a presigned URL to read the source and a presigned URL to write the result. The managed storage mode used by the SDK in local development is not available on the API.
The base URL is https://parseforartisans.com/api/v1. All requests and responses
are JSON unless noted.
Authentication
Every request is authenticated with a bearer token. Create an API key in the
dashboard under API Keys, then send it in the Authorization header:
Authorization: Bearer <your-api-key>
A key is scoped to the team it was created in. Requests with a missing or invalid
key return 401 with the error type invalid_api_key.
GET /ping is a lightweight authenticated endpoint for verifying a key. It
returns 200 when the key is valid.
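As a minimal sketch, the key check can be done with the Python standard library alone. The helper names `build_request` and `key_is_valid` are illustrative assumptions, not part of the API; only the base URL, the `Authorization` header, and the `/ping` path come from this reference.

```python
import urllib.error
import urllib.request

BASE_URL = "https://parseforartisans.com/api/v1"


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the bearer token (hypothetical helper)."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


def key_is_valid(api_key: str) -> bool:
    """Return True if GET /ping answers 200, False if it answers 401."""
    try:
        with urllib.request.urlopen(build_request("/ping", api_key), timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as exc:
        if exc.code == 401:
            return False  # missing or invalid key
        raise  # any other status is unexpected here; surface it to the caller
```

Calling `key_is_valid("<your-api-key>")` at deploy time is one way to catch a mistyped or revoked key before the first document is submitted.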
Core concepts
Asynchronous. Submitting a document returns immediately with a job in the
pending state. The work happens in the background. You learn the outcome by
polling the status endpoint or by receiving a webhook.
Client-generated id. You generate the job id (a UUID) and send it with the
submission. This is the idempotency key: re-submitting the same id does not
create a second job, it returns the state of the existing one. Generate a fresh
UUID per document and persist it so you can correlate the result later.
Bring-your-own-storage. You provide two presigned URLs for each job:
- file_url: a presigned GET URL we use to download the source document.
- upload_url: a presigned PUT URL we use to upload the resulting Markdown.
The parsed Markdown is written to your upload_url location. It never lands on
our storage, so there is no result-download endpoint in BYO mode. Once the job is
completed, read the Markdown from your own bucket.
Presigned URL lifetime. Both URLs must stay valid long enough to cover queue time plus processing. Sign them with a generous expiry; one hour is a safe default for typical documents and well beyond the processing time for most files. If a URL has expired by the time we use it, the job fails.
Submit a document
POST /parse
Submit one document for parsing. Returns 202 with the job id and status.
| Field | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | yes | The job id you generate. Idempotency key. |
| extension | string | yes | The source file extension, lowercase, without a dot (for example pdf). Must be a supported type. |
| filename | string | no | An optional label for the source file. Does not affect routing; the type is determined by extension. |
| source | object | yes | The BYO storage descriptor. |
| source.mode | string | yes | Must be byo. |
| source.file_url | string (URL) | yes | Presigned GET URL for the source document. |
| source.upload_url | string (URL) | yes | Presigned PUT URL where the Markdown result is written. |
| delivery | object | no | How you want to be notified of completion. Defaults to polling. |
| delivery.mode | string | no | poll or webhook. Defaults to poll. |
| delivery.callback_url | string (URL) | no | Required when delivery.mode is webhook. Must be https. See Webhooks. |
| options | object | no | Parsing options. |
| options.force_ocr | boolean | no | Force OCR. OCR is auto-detected for scanned PDFs by default. |
| options.ocr_language | string | no | OCR language hint, for example eng or eng+fra. |
| options.pages | string | no | Restrict to a page range, for example 1-20. Only valid for paginated formats. |
| options.frontmatter | boolean | no | Prepend YAML frontmatter (author, dates, page count) to the Markdown. |
Response
202 Accepted
| Field | Type | Description |
|---|---|---|
| id | string | The job id. |
| status | string | The job status, pending on a new submission. |
Re-submitting an id that already exists returns 202 with the existing job's
current status rather than creating a new job.
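A submission might look like the following sketch, which assumes the dotted field names in the request table map to nested JSON objects. `build_parse_body` and `submit` are hypothetical helper names. The job id is generated client-side with `uuid.uuid4()` unless the caller passes a persisted one, so a retry with the same id stays idempotent.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://parseforartisans.com/api/v1"


def build_parse_body(file_url, upload_url, extension,
                     job_id=None, callback_url=None, options=None):
    """Assemble the JSON body for POST /parse (hypothetical helper)."""
    body = {
        # Pass a persisted job_id when retrying so no second job is created.
        "id": job_id or str(uuid.uuid4()),
        "extension": extension,
        "source": {"mode": "byo", "file_url": file_url, "upload_url": upload_url},
    }
    if callback_url is not None:
        # Omitting delivery falls back to the documented default, poll.
        body["delivery"] = {"mode": "webhook", "callback_url": callback_url}
    if options:
        body["options"] = dict(options)
    return body


def submit(api_key: str, body: dict) -> dict:
    """POST the body and return the decoded response with id and status."""
    req = urllib.request.Request(
        BASE_URL + "/parse",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())
```

Store `body["id"]` before calling `submit`, so that a network failure between request and response can be resolved by re-submitting the same id.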
Supported extensions
pdf, docx, pptx, xlsx, csv, doc, ppt, xls, eml, msg. An
unsupported value returns 422 with the type unsupported_type.
The pages option is only meaningful for paginated formats: pdf, pptx,
xlsx, csv, ppt, xls. Sending it for any other extension returns 422
with the type unsupported_option.
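These two rules can be mirrored client-side to catch mistakes before a request is sent. The sketch below is an assumption about how a caller might pre-validate, not a description of the server's implementation; `normalize_extension` and `check_submission` are hypothetical names, and the two sets are copied from the lists above.

```python
import os

SUPPORTED = frozenset({"pdf", "docx", "pptx", "xlsx", "csv",
                       "doc", "ppt", "xls", "eml", "msg"})
PAGINATED = frozenset({"pdf", "pptx", "xlsx", "csv", "ppt", "xls"})


def normalize_extension(filename: str) -> str:
    """Derive the lowercase, dot-free extension the API expects."""
    return os.path.splitext(filename)[1].lstrip(".").lower()


def check_submission(extension: str, pages=None):
    """Return the error type the API is expected to answer with, or None."""
    if extension not in SUPPORTED:
        return "unsupported_type"
    if pages is not None and extension not in PAGINATED:
        return "unsupported_option"
    return None
```

The server remains the source of truth; a local check like this only saves a round trip.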
Check status
GET /parse/{id}
Returns the current state of a job. Scoped to the authenticated team; an unknown
id returns 404.
| Field | Type | Description |
|---|---|---|
| id | string | The job id. |
| status | string | One of pending, processing, completed, failed. |
| page_count | integer or null | Number of pages parsed, once known. |
| credits_used | integer or null | Credits consumed by the job, once known. |
| started_at | string or null | ISO 8601 timestamp when processing began. |
| completed_at | string or null | ISO 8601 timestamp when the job reached a terminal state. |
| duration_ms | integer or null | Processing duration in milliseconds, once complete. |
| error | object or null | Present only when status is failed. Contains type and message. |
The response is returned at the top level, with no data envelope. Additional
fields may be added over time, so parse defensively and ignore unknown fields.
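One way to parse defensively is to copy only the documented fields into a typed object and silently drop the rest. The `JobStatus` class below is a hypothetical client-side model built from the table above, shown as a sketch rather than a prescribed schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class JobStatus:
    id: str
    status: str
    page_count: Optional[int] = None
    credits_used: Optional[int] = None
    started_at: Optional[str] = None
    completed_at: Optional[str] = None
    duration_ms: Optional[int] = None
    error: Optional[dict] = None

    @classmethod
    def from_json(cls, payload: dict) -> "JobStatus":
        """Keep the documented fields; ignore anything added later."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})

    @property
    def is_terminal(self) -> bool:
        return self.status in ("completed", "failed")
```

Because every field other than id and status defaults to None, the same model also accepts the two-field response returned by the submission endpoint.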
Status lifecycle
A job moves from pending to processing, and then to a terminal state of either
completed or failed. Poll until status is terminal, or use a webhook to avoid polling.
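A polling loop over this lifecycle could be sketched as follows. The starting interval, the 1.5x backoff, the 30-second cap, and the one-hour timeout are all assumptions chosen for illustration; this reference does not prescribe a polling cadence. `fetch_status` is a caller-supplied function that performs GET /parse/{id} and returns the decoded JSON.

```python
import time
from typing import Callable

TERMINAL = {"completed", "failed"}


def wait_for_job(fetch_status: Callable[[], dict],
                 interval: float = 2.0,
                 max_interval: float = 30.0,
                 timeout: float = 3600.0,
                 sleep=time.sleep,
                 clock=time.monotonic) -> dict:
    """Poll until the job is completed or failed, backing off between calls."""
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL:
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(interval)
        # Grow the delay so long-running jobs are polled less often.
        interval = min(interval * 1.5, max_interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real delays.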
Retrieve the result
In BYO mode the Markdown is written to the upload_url you supplied at
submission. Once status is completed, read the object from your own bucket.
There is no result-download endpoint, because the bytes never reach our storage.
Webhooks
Set delivery.mode to webhook and provide an https delivery.callback_url
to be notified when a job reaches a terminal state instead of polling. The
callback host must be publicly resolvable; URLs that resolve to private,
loopback, link-local, or cloud-metadata addresses are rejected at submission.
When the job finishes we send a POST to your callback URL. The request body is
the same JSON object returned by the status endpoint. Delivery is retried with
backoff (up to five attempts) if your endpoint is unavailable.
Verifying the signature
Each webhook carries an X-Parse-Signature header so you can confirm it came
from us and was not modified. The header has the form:
X-Parse-Signature: t=<timestamp>,v1=<signature>
t is a Unix timestamp and v1 is a hex-encoded HMAC-SHA256 signature. To
verify:
- Read t and v1 from the header.
- Build the signed payload by joining the timestamp and the raw request body with a single period: <t>.<raw-body>. Use the exact bytes of the body, not a re-serialized copy.
- Compute HMAC-SHA256 over that string using your team's webhook signing secret (the whsec_ value from the dashboard) as the key, and hex-encode the result.
- Compare the result to v1 using a constant-time comparison. Reject the request if they do not match.
Optionally, reject requests whose t is too far from the current time to limit
replay. After you rotate the signing secret in the dashboard, signatures no
longer verify against the previous value.
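The verification steps can be sketched in Python with the standard `hmac` and `hashlib` modules. Two details here are assumptions rather than documented behavior: the signing secret is used as a UTF-8 string exactly as shown in the dashboard, and the replay window is 300 seconds. `verify_signature` is a hypothetical function name.

```python
import hashlib
import hmac
import time


def verify_signature(header: str, raw_body: bytes, secret: str,
                     tolerance: int = 300, now=None) -> bool:
    """Check an X-Parse-Signature header against the raw request body."""
    # Split "t=<timestamp>,v1=<signature>" into its named parts.
    parts = dict(
        item.strip().split("=", 1) for item in header.split(",") if "=" in item
    )
    t, v1 = parts.get("t"), parts.get("v1")
    if not t or not v1 or not (t.isascii() and t.isdigit()):
        return False

    # Optional replay protection; the 300 s default is an assumed window.
    current = time.time() if now is None else now
    if abs(current - int(t)) > tolerance:
        return False

    # Signed payload is "<t>.<raw-body>" over the exact received bytes.
    signed = t.encode("ascii") + b"." + raw_body
    # Assumption: the whsec_ value is used as-is, UTF-8 encoded, as the key.
    expected = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), v1.encode("utf-8"))
```

Pass the body exactly as received from the socket. Decoding it to JSON and re-encoding can change whitespace or key order, which breaks the signature.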
Errors
Errors return a non-2xx status and a JSON body with an error object:
| Field | Type | Description |
|---|---|---|
| error.type | string | A stable, machine-readable error code. |
| error.message | string | A human-readable description. |

| Type | Status | Meaning |
|---|---|---|
| invalid_api_key | 401 | Missing or invalid API key. |
| invalid_request | 400 | The request body failed validation. |
| unsupported_type | 422 | The extension is not a supported file type. |
| unsupported_option | 422 | An option is not valid for the given file type, such as pages on a non-paginated format. |
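Client code can turn an error response into a typed exception and branch on error.type rather than on the message text. The sketch below uses hypothetical names (`ApiError`, `parse_error`) and falls back to the type "unknown", which is a placeholder invented here for bodies that are not the documented JSON shape, for example an HTML page from an intermediate proxy.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status plus the documented error.type and error.message."""

    def __init__(self, status: int, type_: str, message: str):
        super().__init__(f"{status} {type_}: {message}")
        self.status = status
        self.type = type_
        self.message = message


def parse_error(status: int, body: bytes) -> ApiError:
    """Build an ApiError from a non-2xx response body."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # not JSON at all
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}
    # "unknown" is a local placeholder, not an API error type.
    return ApiError(status, err.get("type", "unknown"), err.get("message", ""))
```

With `urllib`, the body of a failed request is available from `HTTPError.read()`, so a caller can raise `parse_error(exc.code, exc.read())` inside the `except` block.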