🔧 Diagnose · Recover · Rebuild · Zero File Uploads

PDF Repair Tool

Diagnose and repair corrupted, damaged, or broken PDF files — analyse structure integrity, detect errors in cross-reference tables and object streams, recover readable pages, and download a clean rebuilt PDF. Fully browser-based, zero uploads, instant results.

Repair Options

Recover readable pages
Rebuild from valid objects
Skip unreadable pages
High quality render (2×)
Include repair report
Force white background

The Complete Guide to PDF Repair & Recovery

Everything you need to know about PDF corruption — what causes it, what types of damage are repairable, how browser-based PDF repair works, and what to do when recovery fails.

What Is PDF Corruption?

PDF corruption occurs when a PDF file's binary data is damaged, incomplete, or structurally invalid in a way that prevents standard PDF readers from parsing and rendering its contents correctly. A corrupted PDF may refuse to open entirely, displaying an error message; open but show garbled or missing pages; display some pages correctly while failing on others; show a blank white document with no visible content; or open with warning messages about incorrect structure.

PDF corruption is surprisingly common — PDFs are complex binary documents with strict structural requirements, and any disruption to the data stream during creation, transfer, or storage can produce a file that violates those requirements. The good news is that many common types of PDF corruption are at least partially recoverable, because the underlying page content (fonts, graphics, text) is often intact even when the file's navigation structures are damaged.

Key insight: Many PDF readers give up on an entire file when they hit a single structural error in the file header, cross-reference table, or page tree, even if 95% of the page content is perfectly intact. PDF repair tools exploit this by using more permissive parsers (like PDF.js) that can extract individual pages despite damage to these navigation structures, then rebuild a clean file from the recoverable content.

Causes of PDF Damage

Understanding what causes PDF corruption helps identify which recovery approach is most likely to succeed:

⚡ Incomplete Transfer / Download Interruption

The most common cause. When a PDF download is interrupted — by a network drop, browser crash, or user cancellation — the file is truncated. The beginning of the PDF may be intact while the end (containing the cross-reference table and file trailer) is missing or incomplete. PDF.js can often recover most pages from these files despite the missing trailer.

💾 Storage Media Failure

Bad sectors on a hard drive, SSD bit-rot, failing USB drives, and corrupted flash storage can damage individual bytes or entire blocks of a PDF file. This type of corruption tends to affect specific page content rather than the file structure, often leaving a PDF that can be partially opened with individual page rendering errors.

✉️ Email Encoding Issues

PDFs transmitted via email are binary files encoded in Base64 for MIME compatibility. If the email client incorrectly encodes or decodes the attachment, or if line-ending conversions are applied to a binary file, the resulting PDF will have corrupted byte sequences throughout. This produces consistent garbling that is difficult to recover.

🔧 PDF Generator Bugs

Older or poorly implemented PDF generators (certain versions of Word "Save as PDF", some printer drivers, legacy scanning software) produce non-standard PDF structures — incorrect cross-reference offsets, malformed object headers, or invalid content streams — that trigger parsing failures in strict readers. PDF.js's permissive parser handles many of these cases.

📁 File System Corruption

Operating system crashes during file writes, abrupt power loss, or NTFS/ext4 journal errors can leave a PDF file partially written with corrupted metadata. The file system may report the file as intact (with its original size) while the contents are zeroed-out or garbled from a certain byte offset onward.
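A zeroed-out tail of this kind is easy to spot from the raw bytes. The sketch below is only an illustration, not the tool's actual code: the names zeroTailStart and describeZeroTail and the 512-byte threshold are assumptions chosen for the example.

```typescript
// Hypothetical report shape for this illustration.
interface ZeroTailReport {
  zeroTailStart: number;     // offset where the trailing run of 0x00 bytes begins
  zeroTailFraction: number;  // share of the file occupied by that run (0..1)
  likelyZeroFilled: boolean; // true when the run is at least minRun bytes long
}

// Walk backwards from the end of the file until a non-zero byte is found.
function zeroTailStart(bytes: Uint8Array): number {
  let i = bytes.length;
  while (i > 0 && bytes[i - 1] === 0) i--;
  return i;
}

// minRun is an assumed threshold; a healthy PDF ends with "%%EOF", so any
// long run of trailing zeros is suspicious.
function describeZeroTail(bytes: Uint8Array, minRun: number = 512): ZeroTailReport {
  const start = zeroTailStart(bytes);
  const run = bytes.length - start;
  return {
    zeroTailStart: start,
    zeroTailFraction: bytes.length === 0 ? 0 : run / bytes.length,
    likelyZeroFilled: run >= minRun,
  };
}
```

A report with likelyZeroFilled set to true suggests the file system kept the original size but lost the data after zeroTailStart, so only objects stored before that offset have a chance of being recovered.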

🔐 Version / Encryption Mismatch

PDFs opened in a viewer that doesn't support their PDF version number (e.g., a PDF/A-4 or PDF 2.0 file opened in an old Reader), or encrypted PDFs where the encryption dictionary references unsupported cipher suites, may fail to open even though the file is structurally intact. Recovery involves isolating the page content streams.

Inside a PDF File — What the Repair Tool Examines

A PDF file has a precise binary structure. When our repair tool analyses a PDF, it inspects each of these structural components and reports the status of each:

1. File Header: The file should begin with %PDF- followed by a version number (1.0–2.0); many readers also accept a header that starts anywhere within the first 1024 bytes. The tool checks for valid PDF header bytes and reports the detected version. A missing or corrupted header is among the most severe damage types, because strict parsers reject the file outright.
2. Cross-Reference Table (xref): The xref section maps object numbers to their byte offsets in the file, allowing the reader to jump directly to any object. Damaged xref tables (incorrect offsets, missing entries, truncated table) are one of the most common causes of "cannot open PDF" errors. The tool scans for xref integrity and reports mismatches.
3. File Trailer & EOF Marker: The file trailer points to the xref table and root catalogue. The EOF marker (%%EOF) signals the end of the file. Truncated files may be missing both, so the tool checks for valid trailer and EOF markers and diagnoses truncation.
4. Page Tree & Catalogue: The document catalogue contains the page tree root, which lists all pages and their resources. A corrupted page tree means the reader can't enumerate the document's pages. The tool attempts to traverse the page tree and reports accessible vs. missing pages.
5. Content Streams: Each page's visual content (text operators, graphics commands, image data) is stored in content streams, which are usually compressed. The tool tests whether each page's content stream can be decoded and rendered, reporting which pages are fully recoverable, partially recoverable, or unreadable.
6. Encryption & Permissions: The encryption dictionary specifies whether the PDF is password-protected and what permissions are granted. The tool detects encryption flags and reports whether the file requires a password, distinguishing between files that open without a password (restricted only by an owner password) and files that need a user password to open.
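To make the xref integrity check concrete, here is a minimal sketch that parses a classic (uncompressed) cross-reference table and verifies that every in-use entry points at the matching "N G obj" header. It is an illustration under stated assumptions, not the tool's implementation: checkXrefOffsets and XrefCheck are hypothetical names, only the first xref table in the file is examined, and cross-reference streams (PDF 1.5+) are not handled.

```typescript
// Hypothetical result shape: one record per in-use xref entry.
interface XrefCheck {
  objectNumber: number;
  offset: number;
  ok: boolean; // true when "N G obj" really starts at the recorded offset
}

// Decode bytes one-to-one so that string indices equal byte offsets.
function toBinaryString(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

function checkXrefOffsets(bytes: Uint8Array): XrefCheck[] {
  const text = toBinaryString(bytes);
  const results: XrefCheck[] = [];
  // "xref" must start a line, which also excludes the "startxref" keyword.
  const start = /(?:^|[\r\n])xref[ \t]*[\r\n]+/.exec(text);
  if (!start) return results;
  const lines = text.slice(start.index + start[0].length).split(/[\r\n]+/);
  let nextObject = 0;
  for (const raw of lines) {
    const line = raw.trim();
    if (line === "") continue;
    if (line.indexOf("trailer") === 0) break;
    // Subsection header: "<first object number> <entry count>".
    const header = /^(\d+) (\d+)$/.exec(line);
    if (header) {
      nextObject = Number(header[1]);
      continue;
    }
    // Entry: 10-digit offset, 5-digit generation, "n" (in use) or "f" (free).
    const entry = /^(\d{10}) (\d{5}) ([nf])$/.exec(line);
    if (!entry) break;
    if (entry[3] === "n") {
      const offset = Number(entry[1]);
      const expected = `${nextObject} ${Number(entry[2])} obj`;
      const ok = text.slice(offset, offset + expected.length) === expected;
      results.push({ objectNumber: nextObject, offset, ok });
    }
    nextObject++;
  }
  return results;
}
```

Entries with ok set to false correspond to the "xref offsets incorrect" diagnosis: the objects may still be present in the file, but a reader that trusts the table will jump to the wrong place.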

How the Repair Process Works

Our browser-based PDF repair uses PDF.js's permissive parsing mode to attempt reading the file even when structural errors would stop standard readers, then rebuilds a clean PDF from the recoverable content using jsPDF.

Phase 1: Binary Analysis

The raw file bytes are read and inspected for: PDF header signature, version number, xref keyword presence, trailer dictionary, and EOF marker. This gives an initial structural health score before any page rendering is attempted. Issues found here are flagged in the diagnostic report.
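As a rough illustration of this phase, the sketch below scans raw bytes for the markers listed above. The function and field names (scanPdfBytes, BinaryScan) are assumptions made for this example, and the 1024-byte windows for the header and the end-of-file markers are a common convention rather than something confirmed about this tool.

```typescript
// Illustrative result shape; field names are assumptions for this sketch.
interface BinaryScan {
  headerOffset: number;   // byte offset of "%PDF-", or -1 if not found
  version: string | null; // e.g. "1.7"
  hasXref: boolean;       // classic "xref" keyword or a /Type /XRef stream
  hasTrailer: boolean;    // "trailer" keyword present
  hasStartxref: boolean;  // "startxref" pointer near the end of the file
  hasEof: boolean;        // "%%EOF" near the end of the file
}

// Decode bytes one-to-one so that string indices equal byte offsets.
// Simple but slow for very large files; fine for a sketch.
function toBinaryString(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

function scanPdfBytes(bytes: Uint8Array): BinaryScan {
  const text = toBinaryString(bytes);
  const head = text.slice(0, 1024);
  const tail = text.slice(-1024);
  const headerMatch = /%PDF-(\d\.\d)/.exec(head);
  return {
    headerOffset: headerMatch ? headerMatch.index : -1,
    version: headerMatch ? headerMatch[1] : null,
    // \b keeps "xref" from matching inside "startxref".
    hasXref: /\bxref\b/.test(text) || /\/Type\s*\/XRef\b/.test(text),
    hasTrailer: /\btrailer\b/.test(text),
    hasStartxref: tail.indexOf("startxref") >= 0,
    hasEof: tail.indexOf("%%EOF") >= 0,
  };
}
```

A file with a valid header but hasStartxref and hasEof both false is the typical signature of a truncated download, which can be reported before any page rendering is attempted.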

Phase 2: PDF.js Permissive Parse

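PDF.js is initialised with disableAutoFetch: true and tolerant error handling (its stopAtErrors option defaults to false, so PDF.js tries to recover whatever it can instead of rejecting on the first parse error). It attempts to open the file and enumerate pages, using its internal xref repair routine for damaged cross-reference tables. Pages that PDF.js can access are queued for recovery.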

Phase 3: Page-by-Page Recovery

Each accessible page is rendered to an HTML5 Canvas at 2× resolution. Pages that render successfully are captured as JPEG images. Pages that throw errors during rendering are marked as failed and skipped (if "Skip unreadable pages" is enabled), so all recoverable pages are preserved without blocking on bad ones.
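The per-page error isolation described here can be sketched as a loop with a try/catch around each page. This is a minimal illustration, not the tool's source: DocumentLike mirrors only the numPages property and 1-based getPage method of a PDF.js document, and recoverPages, PageResult, and the renderToJpeg callback are hypothetical names. In a browser, the callback would typically render the page to a canvas with a scale-2 viewport and return the canvas as a JPEG data URL.

```typescript
// Minimal structural stand-in for the parts of a PDF.js document used here.
interface DocumentLike<P> {
  numPages: number;
  getPage(pageNumber: number): Promise<P>; // page numbers are 1-based
}

type PageResult =
  | { pageNumber: number; status: "recovered"; image: string }
  | { pageNumber: number; status: "failed"; reason: string };

// Render every page independently so one bad page cannot block the others.
async function recoverPages<P>(
  doc: DocumentLike<P>,
  renderToJpeg: (page: P) => Promise<string>, // hypothetical render callback
  skipErrors: boolean = true
): Promise<PageResult[]> {
  const results: PageResult[] = [];
  for (let n = 1; n <= doc.numPages; n++) {
    try {
      const page = await doc.getPage(n);
      const image = await renderToJpeg(page);
      results.push({ pageNumber: n, status: "recovered", image });
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      // With skipErrors off, the first failure aborts the whole recovery.
      if (!skipErrors) throw new Error(`page ${n} failed: ${reason}`);
      results.push({ pageNumber: n, status: "failed", reason });
    }
  }
  return results;
}
```

Pages are processed sequentially rather than in parallel, which keeps memory use predictable when each page is rendered to a large canvas.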

Phase 4: Clean PDF Rebuild

jsPDF creates a fresh, structurally valid PDF document. Each recovered page canvas is embedded as a full-page JPEG image. An optional repair report page summarising all diagnosed issues, recovered pages, and failed pages is appended. The result is a clean, fully valid PDF with no structural errors.
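The rebuild step mostly comes down to converting canvas pixels back into PDF points and adding one image per page. The sketch below is an assumption-laden illustration: PdfWriterLike is a hypothetical interface that mirrors the addPage(format, orientation) and addImage(data, format, x, y, width, height) calls of a jsPDF document created with unit "pt", and RecoveredPage, placementFor, and appendPages are names invented for the example.

```typescript
// A recovered page as captured from the canvas (hypothetical shape).
interface RecoveredPage {
  jpegDataUrl: string;
  widthPx: number;
  heightPx: number;
}

interface PagePlacement {
  widthPt: number;
  heightPt: number;
  orientation: "p" | "l";
}

// Hypothetical stand-in for the subset of a jsPDF document used here.
interface PdfWriterLike {
  addPage(format: number[], orientation: "p" | "l"): unknown;
  addImage(data: string, format: string, x: number, y: number, w: number, h: number): unknown;
}

// A canvas rendered at scale s is s times the page size in points,
// so dividing by the render scale restores the original dimensions.
function placementFor(page: RecoveredPage, renderScale: number): PagePlacement {
  const widthPt = page.widthPx / renderScale;
  const heightPt = page.heightPx / renderScale;
  return { widthPt, heightPt, orientation: widthPt > heightPt ? "l" : "p" };
}

// Add one full-page JPEG per recovered page; returns the number of pages added.
function appendPages(writer: PdfWriterLike, pages: RecoveredPage[], renderScale: number): number {
  for (const page of pages) {
    const p = placementFor(page, renderScale);
    // Orientation is passed explicitly so landscape pages keep their shape.
    writer.addPage([p.widthPt, p.heightPt], p.orientation);
    writer.addImage(page.jpegDataUrl, "JPEG", 0, 0, p.widthPt, p.heightPt);
  }
  return pages.length;
}
```

One detail this sketch leaves out: a new jsPDF document already contains a first page, so a real caller would need to size that page for the first recovered image or drop it after the loop.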

Repair Limitations — What This Tool Cannot Fix

Honesty about limitations is important for a repair tool. Here are the cases where browser-based repair cannot fully restore a PDF:

Password-Protected PDFs (locked content)

If a PDF requires a password to open and you don't have that password, neither this tool nor any other browser-based tool can decrypt the content. Current PDFs encrypt their content streams with 128-bit or 256-bit AES (older files may use RC4), and a file protected with a strong password is computationally infeasible to break without it.

Completely Zeroed or Random-Byte Files

If a PDF file's contents have been replaced by zeros (common with some types of storage failure) or random bytes (malware-damaged files), there is no content to recover — the file contains no valid PDF object structures, and repair is impossible regardless of the tool used.

Heavily Corrupted Content Streams

If the individual page content streams are corrupted (not just the xref/header), the page will render incorrectly or not at all. The repaired PDF will mark these pages as failed and skip them. This tool cannot recover the text and visual content of such pages; command-line tools like Ghostscript or QPDF can sometimes salvage more, although badly damaged streams may be unrecoverable with any tool.

Searchable Text Layer After Recovery

Because recovery works by rendering pages to canvas images and re-embedding them as JPEGs, the recovered PDF is image-based — the text is no longer selectable or searchable. If searchable text is critical, use the recovered PDF as a visual reference and consider running OCR software on the output to restore text selectability.

Who Benefits?

Office & Admin Professionals

Administrative staff who encounter "cannot open PDF" errors on important documents — contracts, invoices, reports received from clients or systems — use repair tools to recover the content without waiting for the sender to re-send the file. Even partial recovery is valuable when a document is urgently needed.

IT Support & Helpdesk

IT support technicians who receive support tickets about corrupted PDF files use repair tools as a first-line diagnostic: they run the diagnostic report to identify the specific corruption type before deciding whether to attempt recovery, ask the user to re-download, or escalate to command-line recovery tools such as Ghostscript or QPDF.

Legal & Compliance Teams

Legal professionals handling archived contracts, court filings, and compliance documents in PDF format occasionally find historical files that have become corrupted in storage. Repair tools provide a first attempt at recovery before escalating to professional data recovery services for critical legal documents.

Students & Researchers

Students who download academic papers or receive course materials as PDFs sometimes encounter corrupted downloads — especially from older repositories or slow institutional servers. The repair tool provides a fast, free first recovery attempt before re-downloading or contacting the source.

Real-World Use Cases

📋 Recovering a Truncated Contract

A sales manager receives a 30-page contract as a PDF email attachment. The email server truncates the file at 5MB, cutting off the last 8 pages. The PDF opens but shows only 22 pages, with the final pages missing. The repair tool diagnoses "truncated EOF — 22 of 30 pages recovered" and rebuilds the PDF with the 22 recoverable pages, allowing the manager to review the available content while requesting a re-send of the full document.

💾 Recovering an Archived Report from a Failing Drive

An IT administrator finds a quarterly financial report PDF on a partially failing hard drive. The file opens in Acrobat Reader with "There was an error reading this document" on pages 4–7, while pages 1–3 and 8–12 render normally. The repair tool's page-by-page recovery mode renders all pages individually, successfully recovering 10 of 12 pages (including two of the four that failed in Acrobat Reader) and skipping the 2 with corrupted content streams.

🔧 Diagnosing a PDF Generator Bug

A developer receives user reports that PDFs exported from their web application won't open in Safari. The repair tool's diagnostic report identifies "invalid xref offset — objects appear sequentially but xref table references incorrect byte positions." This pinpoints the issue to the PDF generator's cross-reference calculation logic, enabling a targeted fix rather than a full PDF library replacement.

📚 Salvaging a Research Paper Download

A student downloads a 50-page journal article over a slow university Wi-Fi connection. The browser reports the download completed, but the PDF shows "file is damaged and cannot be repaired" in Adobe Reader. The repair tool's binary analysis immediately identifies "file truncated — downloaded 847KB of estimated 1.2MB — EOF marker missing." The student knows to re-download rather than spending time on recovery attempts.

Key Features of Our PDF Repair Tool

    A comprehensive PDF diagnostic and recovery suite — with structural analysis, page-by-page recovery, repair logs, and health scoring — all running privately in your browser.

    01

    Comprehensive Diagnostic Suite

    Analyses 12+ PDF structural components: file header validity, PDF version, xref table integrity, EOF marker presence, page tree accessibility, encryption flags, content stream decodability, linearisation status, and file size vs. expected size. Each check produces a pass/warn/fail result with a specific technical explanation.

    02

    PDF Health Score

    A 0–100 health score summarises the overall integrity of the PDF, colour-coded from critical (red) through poor/fair/good/excellent (green). The score is calculated from the weighted results of all diagnostic checks, with critical issues (missing header, no readable pages) reducing the score more heavily than minor warnings.
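A weighted score of this kind could be computed roughly as follows. The weights, the half credit for warnings, and the band thresholds are all assumptions made for this sketch; the tool's real values are not documented here, and healthScore and healthBand are hypothetical names.

```typescript
type CheckStatus = "pass" | "warn" | "fail";

interface CheckResult {
  name: string;
  status: CheckStatus;
  weight: number; // larger weights for critical checks (assumed values)
}

type HealthBand = "critical" | "poor" | "fair" | "good" | "excellent";

// Weighted average: pass earns full weight, warn half, fail nothing.
function healthScore(checks: CheckResult[]): number {
  let total = 0;
  let earned = 0;
  for (const c of checks) {
    total += c.weight;
    earned += c.weight * (c.status === "pass" ? 1 : c.status === "warn" ? 0.5 : 0);
  }
  return total === 0 ? 0 : Math.round((100 * earned) / total);
}

// Band thresholds are illustrative, evenly spaced cut-offs.
function healthBand(score: number): HealthBand {
  if (score < 20) return "critical";
  if (score < 40) return "poor";
  if (score < 60) return "fair";
  if (score < 80) return "good";
  return "excellent";
}
```

For example, a passing header check with weight 5, a failing xref check with weight 3, and an EOF warning with weight 2 earn 6 of 10 weighted points, which gives a score of 60 and lands in the "good" band under these assumed thresholds.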

    03

    Page-by-Page Recovery

    Every accessible page is individually rendered to a canvas at 2× resolution, with per-page error handling that skips failed pages without blocking the recovery of subsequent pages. A recovery grid shows a thumbnail of each page with its status — recovered (green), partial (amber), or failed (red).

    04

    Zero Upload · 100% Private

    Your PDF — whether it contains confidential contracts, personal documents, financial records, or proprietary content — never leaves your device. All analysis, parsing, rendering, and PDF rebuilding runs in your browser's JavaScript engine. Fully safe for the most sensitive documents.

    Pro Tips for PDF Recovery

    🔍
    Read the diagnostic report before deciding on next steps

    The diagnostic report often reveals the specific issue immediately — "file truncated", "xref offsets incorrect", "encryption detected". Knowing the exact problem saves you from trying recovery methods that won't work. A truncated file needs re-downloading; a password-protected file needs the password; a generator-bug file may recover fully with this tool.

    💾
    Always keep a backup of the original corrupted file

    Before any repair attempt, ensure you have a copy of the original corrupted file. Some repair tools overwrite the corrupted file with a rebuilt version, and even when the rebuilt file is a separate download, as it is here, it may be missing pages that a different tool or approach could recover. You need the original to try again, so keep both the original and the repaired version.

    🖨️
    Try "Print to PDF" as an alternative repair method

    If a PDF opens in a viewer but shows rendering issues or garbled text, opening it in Google Chrome and using File → Print → Save as PDF often produces a clean, re-rendered PDF by having Chrome's built-in renderer reprocess the content. This works for many minor corruption issues and preserves text selectability in the output.

    ⚙️
    For severe corruption, try Ghostscript or QPDF as next steps

    If browser-based repair recovers only a fraction of pages, command-line tools provide deeper recovery. Ghostscript (gs -o repaired.pdf -sDEVICE=pdfwrite input.pdf) and QPDF (qpdf input.pdf output.pdf, which attempts to reconstruct a damaged cross-reference table automatically) are widely used for serious PDF structural repair and are available free for Windows, macOS, and Linux.

    Conclusion

    PDF corruption is a common, frustrating problem that affects everyone who works with PDF documents — from home users who find they can't open a downloaded form to IT professionals managing document archives with thousands of files. A comprehensive diagnostic report that clearly identifies the type and severity of corruption is often as valuable as the repair itself — it tells you exactly what happened, which pages can be recovered, and whether more powerful tools are needed. This free browser-based repair tool provides instant structural analysis, page-by-page recovery with a visual grid, a detailed health score, and a clean rebuilt PDF download — all without uploading your document to any server, making it safe for the most sensitive files in your possession.

    Ready to Repair Your PDF?

    Drop your damaged PDF above — get an instant diagnostic report, page recovery grid, and a repaired PDF download. Zero uploads, completely free.