Error Handling
All API errors return a consistent JSON structure for predictable client-side handling.
Error Response Structure
ParseSphere APIs return errors in a standardized format:
{
"error": "ErrorType",
"message": "Human-readable error description",
"details": { }
}
Response Fields:
- error: Machine-readable error type suitable for programmatic handling
- message: Human-readable description for display to users
- details: Additional context when available (e.g., which field failed validation, which limit was exceeded)
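As a minimal sketch of client-side handling in Python (assuming the requests library; the endpoint URL and API key are placeholders, and note that some endpoints return a single detail field instead, as shown in the examples below):
import requests

API_URL = "https://api.parsesphere.example/v1/parse"  # hypothetical endpoint
API_KEY = "your-api-key"

def upload_document(path: str) -> dict:
    with open(path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
        )
    if response.ok:
        return response.json()

    # Read the standardized error body; fall back to the single "detail"
    # field used in the example responses below.
    body = response.json()
    error_type = body.get("error", "UnknownError")
    message = body.get("message") or body.get("detail", "No message provided")
    details = body.get("details", {})
    raise RuntimeError(f"{response.status_code} {error_type}: {message} {details}")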
HTTP Status Codes
ParseSphere uses standard HTTP status codes to indicate the result of API requests.
Success Codes (2xx)
200 OK: Standard successful request
201 Created: Resource successfully created
202 Accepted: Asynchronous processing has begun
204 No Content: Successful deletion or operation with no response body
Client Error Codes (4xx)
Warning
Client errors indicate problems with the request that need to be fixed before retrying.
400 Bad Request: Request parameters are malformed or invalid
401 Unauthorized: Missing or invalid authentication credentials
403 Forbidden: Authenticated user lacks permission for the requested operation
404 Not Found: The requested resource doesn't exist
413 Payload Too Large: Uploaded file exceeds size limits
422 Unprocessable Entity: Request is well-formed but fails validation rules
429 Too Many Requests: Rate limit exceeded. Retry after the duration specified in the Retry-After header
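Since the Retry-After header tells you how long to wait, rate-limit handling can be automated. A minimal sketch in Python (assuming the requests library and that Retry-After carries a value in seconds):
import time
import requests

def request_with_rate_limit_retry(method: str, url: str, max_attempts: int = 3, **kwargs):
    """Retry a request on 429 responses, honoring the Retry-After header."""
    for attempt in range(max_attempts):
        response = requests.request(method, url, **kwargs)
        if response.status_code != 429:
            return response
        # Retry-After is assumed to be seconds; default to 1 second if absent.
        time.sleep(float(response.headers.get("Retry-After", 1)))
    return response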
Server Error Codes (5xx)
500 Internal Server Error: An unexpected error occurred on the server while processing the request
Information
Server errors are logged automatically. If you encounter them repeatedly, contact support.
Document Parsing Errors
File Size Limit (413)
Files exceeding 50 MB are rejected before processing:
{
"detail": "File too large (55.0MB). Maximum allowed: 50MB"
}
Solution: Compress or split the document before uploading.
Unsupported File Type (422)
Files with unsupported extensions are rejected:
{
"detail": "Unsupported file type: .xyz. Supported formats: pdf, docx, pptx, xlsx, csv, txt"
}
Supported Formats:
- PDF (.pdf)
- Word (.docx)
- PowerPoint (.pptx)
- Excel (.xlsx)
- CSV (.csv)
- Plain Text (.txt)
Solution: Convert the document to a supported format.
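Both the 413 and 422 rejections above can be avoided by validating files locally before upload. A sketch of such a pre-flight check (the 50 MB limit and extension list come from this page; the helper itself is illustrative):
from pathlib import Path

MAX_DOCUMENT_MB = 50
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".csv", ".txt"}

def validate_document(path: str) -> None:
    """Raise ValueError if the file would be rejected by the parse endpoint."""
    file = Path(path)
    size_mb = file.stat().st_size / (1024 * 1024)
    if size_mb > MAX_DOCUMENT_MB:
        raise ValueError(f"File too large ({size_mb:.1f}MB). Maximum allowed: {MAX_DOCUMENT_MB}MB")
    if file.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {file.suffix or '(none)'}")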
Parse Not Found (404)
Requesting a parse that doesn't exist or has expired:
{
"detail": "Parse 550e8400-e29b-41d4-a716-446655440000 not found. Parses expire after 24 hours."
}
Information
Parse results expire based on the session_ttl parameter (default: 24 hours, min: 60 seconds). Adjust the TTL when creating the parse if needed.
Solution: Increase the session_ttl when creating the parse, or save results immediately if you need them longer.
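For example, a longer TTL can be requested at creation time. A sketch (the endpoint path and request layout are assumptions; session_ttl is documented above and, given the 60-second minimum, is assumed to be expressed in seconds):
import requests

# Hypothetical endpoint; session_ttl is assumed to be a value in seconds.
with open("report.pdf", "rb") as f:
    response = requests.post(
        "https://api.parsesphere.example/v1/parse",
        headers={"Authorization": "Bearer your-api-key"},
        files={"file": f},
        data={"session_ttl": 72 * 3600},  # keep results for 72 hours instead of 24
    )
response.raise_for_status()
parse_id = response.json()["parse_id"]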
Parse Failed
When a parse job fails, the status endpoint returns details about the failure:
{
"parse_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "failed",
"error": "Failed to extract PDF: Document is corrupted or encrypted",
"created_at": "2025-11-30T12:00:00Z",
"completed_at": "2025-11-30T12:00:05Z"
}
Common Failure Reasons:
- Corrupted files: Document structure is damaged
- Password-protected documents: Encrypted PDFs require decryption
- Invalid file formats: Files disguised with supported extensions
- Processing timeout: Document too complex (exceeds 10 minute limit)
Tip
For password-protected PDFs, decrypt them before upload. For corrupted files, try re-exporting from the source application.
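In practice a client polls the status endpoint and branches on the status field, surfacing the error message when a job fails. A minimal polling sketch (the endpoint path and the "completed" status value are assumptions; the other field names follow the example response above):
import time
import requests

def wait_for_parse(parse_id: str, api_key: str, poll_seconds: int = 5) -> dict:
    """Poll the parse status endpoint until the job finishes or fails."""
    url = f"https://api.parsesphere.example/v1/parse/{parse_id}"  # hypothetical path
    while True:
        status = requests.get(url, headers={"Authorization": f"Bearer {api_key}"}).json()
        if status["status"] == "failed":
            # e.g. "Failed to extract PDF: Document is corrupted or encrypted"
            raise RuntimeError(status.get("error", "Parse failed"))
        if status["status"] == "completed":  # assumed terminal success value
            return status
        time.sleep(poll_seconds)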
Tabular Data & Querying Errors
Dataset File Size Limit (413)
Dataset files exceeding 100 MB are rejected:
{
"detail": "File too large (150.0MB). Maximum allowed: 100MB"
}
Solution: Split large datasets into smaller files or aggregate data before upload.
Unsupported Dataset Format (422)
Only CSV and Excel files are supported for datasets:
{
"detail": "Unsupported file type: .json. Supported formats: csv, xlsx, xls"
}
Supported Formats:
- CSV (.csv)
- Excel (.xlsx, .xls)
Workspace Not Found (404)
Accessing a workspace that doesn't exist or you don't have access to:
{
"detail": "Workspace 550e8400-e29b-41d4-a716-446655440000 not found or access denied"
}
Warning
Workspaces can only be accessed by their owner. Check that you're using the correct API key or user authentication.
Dataset Not Found (404)
Accessing a dataset that doesn't exist in the workspace:
{
"detail": "Dataset 880e8400-e29b-41d4-a716-446655440000 not found in workspace"
}
No Datasets in Workspace (400)
Attempting to query a workspace with no completed datasets:
{
"detail": "No completed datasets found in workspace. Please upload and process datasets first."
}
Solution: Upload and wait for dataset processing to complete before querying.
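One way to avoid this error is to confirm that a dataset has finished processing before issuing a query. A sketch (the status endpoint path is an assumption; the status and error_message fields match the dataset status example later on this page):
import requests

def dataset_ready(workspace_id: str, dataset_id: str, api_key: str) -> bool:
    """Return True once the dataset has finished processing and can be queried."""
    # Hypothetical status endpoint for a dataset within a workspace.
    url = f"https://api.parsesphere.example/v1/workspaces/{workspace_id}/datasets/{dataset_id}"
    status = requests.get(url, headers={"Authorization": f"Bearer {api_key}"}).json()
    if status["status"] == "failed":
        raise RuntimeError(status.get("error_message", "Dataset processing failed"))
    return status["status"] == "completed"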
Query Execution Failed (500)
When a natural language query generates invalid SQL or encounters a database error:
{
"detail": "Query execution failed: DuckDB error: Binder Error: column 'invalid_column' not found"
}
Information
The AI agent will automatically retry with corrected SQL if possible. Repeated failures are logged for system improvement.
Query Timeout (500)
Queries that exceed 60 seconds are terminated:
{
"detail": "Query execution timeout after 60 seconds"
}
Solution: Simplify your question, add filters to reduce data scope, or query specific datasets instead of all workspace data.
Dataset Processing Errors
Dataset Transformation Failed
When CSV/Excel to Parquet conversion fails, the dataset status returns error details:
{
"job_id": "770e8400-e29b-41d4-a716-446655440000",
"dataset_id": "880e8400-e29b-41d4-a716-446655440000",
"workspace_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "failed",
"error_message": "Failed to parse CSV: Invalid delimiter or malformed data",
"created_at": "2025-12-06T12:00:00Z",
"completed_at": "2025-12-06T12:00:15Z"
}
Common Causes:
- Malformed CSV: Inconsistent delimiters, unescaped quotes
- Empty file: File contains no data rows
- Encoding issues: Non-UTF-8 encoding
- Excel corruption: Workbook structure is damaged
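Most of these causes can be caught locally before upload. A sketch using Python's standard csv module (these checks are illustrative and are not the service's own validation logic):
import csv

def check_csv(path: str) -> None:
    """Catch common causes of dataset transformation failures before uploading."""
    try:
        with open(path, encoding="utf-8", newline="") as f:
            sample = f.read(64 * 1024)
            if not sample.strip():
                raise ValueError("Empty file: no data rows")
            # Sniffer raises csv.Error when it cannot detect a consistent delimiter.
            dialect = csv.Sniffer().sniff(sample)
            f.seek(0)
            row_count = sum(1 for _ in csv.reader(f, dialect))
    except UnicodeDecodeError as exc:
        raise ValueError("Encoding issue: file is not valid UTF-8") from exc
    except csv.Error as exc:
        raise ValueError(f"Malformed CSV: {exc}") from exc
    if row_count < 2:
        raise ValueError("Empty file: header present but no data rows")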
Exception Hierarchy
ParseSphere uses specific exception types for different error categories:
Document Parsing:
- ExtractionError: General extraction failure
- UnsupportedFileError: File type not supported
- FileCorruptedError: Document structure is invalid
Tabular Data:
- DataTransformError: CSV/Excel transformation failed
- SchemaAnalysisError: Unable to analyze dataset schema
- BlobStorageError: Azure storage operation failed
- DuckDBConnectionError: Query execution error
- SQLValidationError: Generated SQL failed validation
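On the client side, one option is to mirror these categories as local exception classes and raise the one named in a response's error field. A sketch (the class names follow the lists above; the mapping itself is illustrative):
class ParseSphereError(Exception):
    """Base class for errors reported by the API."""

class ExtractionError(ParseSphereError): ...
class UnsupportedFileError(ParseSphereError): ...
class FileCorruptedError(ParseSphereError): ...
class DataTransformError(ParseSphereError): ...
class SQLValidationError(ParseSphereError): ...

_ERROR_CLASSES = {cls.__name__: cls for cls in ParseSphereError.__subclasses__()}

def raise_for_error(body: dict) -> None:
    """Turn a standardized error body into a typed client-side exception."""
    exc_class = _ERROR_CLASSES.get(body.get("error"), ParseSphereError)
    raise exc_class(body.get("message", "Unknown error"))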
Error Recovery Strategies
Automatic Retries
Information
ParseSphere automatically retries certain operations with exponential backoff.
Celery Task Retries:
- Document parsing: Max 3 retries
- Dataset processing: Max 3 retries
- Retry delay: Exponential backoff
Webhook Delivery Retries:
- Max 3 delivery attempts
- Initial delay: 1 second
- Exponential backoff: delay × 2^attempt
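The same schedule can be reproduced client-side when wrapping your own calls: with a 1-second initial delay and delay × 2^attempt, the waits grow as 1 s, 2 s, 4 s, and so on. A minimal sketch (the operation being retried is a placeholder):
import time

def retry_with_backoff(operation, max_attempts: int = 3, initial_delay: float = 1.0):
    """Retry a callable with exponential backoff: delay = initial_delay * 2 ** attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(initial_delay * (2 ** attempt))  # waits 1 s, then 2 s, ...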
Handling Failed Operations
For Parse Jobs:
- Check the parse status endpoint for the detailed error message
- Review common failure reasons
- Fix the issue (e.g., decrypt PDF, repair file)
- Create a new parse job
For Dataset Jobs:
- Check the dataset status endpoint for error details
- Validate CSV format and encoding
- Verify Excel file isn't corrupted
- Delete failed dataset and re-upload
For Queries:
- Review the query log for SQL execution details
- Simplify your natural language question
- Ensure dataset column names match your query intent
- Try querying fewer datasets at once
Best Practices
Preventing Errors
File Validation
Validate file size and format before upload to avoid rejected requests.
Before Uploading Documents:
- Check file size is under 50 MB
- Verify file extension matches actual format
- Test that file opens in native application
- Remove password protection from PDFs
Before Uploading Datasets:
- Check file size is under 100 MB
- Validate CSV has consistent delimiter
- Ensure Excel has data in first sheet
- Use UTF-8 encoding for CSV files
Error Monitoring
Information
All operations are logged with timestamps and error details for debugging and monitoring.
Key Metrics to Track:
- Parse success rate by file type
- Dataset processing time and failures
- Query execution time and errors
- LLM token usage and costs
Getting Help
If you encounter persistent errors:
- Check Status Endpoints: Always review detailed error messages from status endpoints
- Review Logs: Query logs include SQL execution details and LLM reasoning
- Verify Authentication: Ensure API keys are valid and have correct permissions
- Contact Support: For repeated 500 errors, contact support with the parse_id, dataset_id, or query_id
What's Next?
Learn more about ParseSphere:
- Quick Start - Get started with document parsing
- Authentication - Manage API keys
- Rate Limits - Understand usage limits
- Document Parsing - Deep dive into extraction
- Tabular Data & Querying - Natural language queries
