Converter · PDF → Text (OCR)

PDF to text.

Free · OCR · 20 MB · batch of 5

Works on real text PDFs and scanned image-PDFs alike.

Drop a PDF, get a .txt.

Use the converter on the home page and pick OCR (Extract Text).

What you get

A plain UTF-8 .txt file containing the text from the PDF, with a --- Page N --- marker delimiting each page. The text is extracted via Google Cloud Vision OCR, so it works whether the PDF is real text or a scanned image of text; both go through the same pipeline and yield the same kind of output.

Why OCR a PDF?

How it works

  1. Open the converter. Go to the Formatly converter — no signup required.
  2. Drop your PDF. Drag and drop one or more PDFs into the upload box (up to five files, 20 MB each). Both real text PDFs and scanned image-PDFs work.
  3. Pick OCR (Extract Text) from the dropdown. Each page is rendered to an image and sent to Google Cloud Vision — typical processing time is 1 to 3 seconds per page.
  4. Convert and download the .txt. Click Convert; a download link appears for a UTF-8 text file, page-delimited with --- Page N --- markers for easy splitting.
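The upload limits in step 2 (up to five files, 20 MB each) can be checked locally before dragging anything into the upload box. The following is a minimal sketch, not part of the converter itself: the function name `check_batch` is hypothetical, and it assumes "20 MB" means 20 × 1024 × 1024 bytes, which the page does not specify.

```python
import os

MAX_FILES = 5                 # "up to five files" per batch
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" is interpreted as 20 MiB


def check_batch(paths):
    """Return a list of problems; an empty list means the batch looks fine."""
    problems = []
    if len(paths) > MAX_FILES:
        problems.append(f"{len(paths)} files selected; the limit is {MAX_FILES}")
    for path in paths:
        # Only the file extension is checked here, not the file contents.
        if not path.lower().endswith(".pdf"):
            problems.append(f"{path}: not a .pdf file")
            continue
        size = os.path.getsize(path)
        if size > MAX_BYTES:
            problems.append(f"{path}: {size} bytes exceeds the 20 MB limit")
    return problems
```

Calling `check_batch(["report.pdf", "scan.pdf"])` on existing files returns an empty list when both are within the limits; any oversized or non-PDF file is reported by name.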

What works well

What doesn't

Tips

FAQ

Does this work on scanned PDFs? Yes — that's exactly what OCR is for. A scanned PDF is just a stack of images, so plain copy-paste or text extraction won't find anything. The converter renders each page to an image, runs it through Google Cloud Vision OCR, and assembles the recognized text into a single .txt file with per-page delimiters.

Does the PDF to text output preserve formatting? No. Output is plain UTF-8 text, page-delimited with --- Page N --- markers. Bold, italics, fonts, columns, and tables are not preserved. If you need a formatted, editable version of the PDF, convert to DOCX instead — that's a different tool because the OCR pipeline is text-only.
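Because the output is delimited with --- Page N --- markers, it can be split back into pages with a few lines of code. This is a sketch under one assumption the page does not confirm: that each marker sits on its own line, written exactly as `--- Page N ---` with N as a decimal number. The name `split_pages` is illustrative.

```python
import re

# Assumption: each marker is on its own line, exactly "--- Page N ---".
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---[ \t]*\r?$", re.MULTILINE)


def split_pages(text):
    """Map page number -> that page's text, using the markers as boundaries."""
    pages = {}
    matches = list(PAGE_MARKER.finditer(text))
    for i, m in enumerate(matches):
        # A page runs from the end of its marker to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(m.group(1))] = text[m.end():end].strip()
    return pages
```

To use it on a downloaded file, read the file as UTF-8 first, for example `split_pages(open("output.txt", encoding="utf-8").read())`, where `output.txt` is a placeholder file name. Any text before the first marker is ignored.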

What languages does the PDF OCR support? English by default, plus most Latin-script European languages (French, Spanish, German, Italian, Portuguese, Dutch). CJK and right-to-left scripts (Arabic, Hebrew) are available on request via the contact form.

How long does PDF OCR take? Roughly 1 to 3 seconds per page. A 50-page document typically takes about a minute to render and OCR, and up to about two and a half minutes at the slower end of that range. If you only need a few pages, extract them first in Preview or Adobe Reader and upload the smaller file, which is much faster than processing the full document.
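The per-page figure above gives a quick way to estimate a job before uploading. The sketch below simply multiplies the page count by the stated 1 to 3 seconds per page; the function name `estimate_seconds` is hypothetical, and actual times may differ from this range.

```python
def estimate_seconds(pages, low=1.0, high=3.0):
    """Return (fastest, slowest) estimated OCR time in seconds.

    low and high are the per-page bounds quoted on this page (1 to 3 s).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * low, pages * high
```

For a 50-page document, `estimate_seconds(50)` returns `(50.0, 150.0)`, so trimming the file to the 5 pages you need brings the estimate down to between 5 and 15 seconds.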

What happens to my PDF after the OCR runs? The uploaded PDF and the resulting text are auto-deleted from our servers after one hour. We don't store or analyze your document beyond completing the OCR pass. See Security for the full data-handling policy.
