There are many ways to convert a PDF to text, and some of those methods won’t give you the results you need.
The Easiest Ways to Convert PDF to Text on Mac
Here, we’ll show you the right ways to convert PDF to text and highlight some PDF-to-text converter apps that we like.
OCR PDF to text
Prizmo is a very powerful document scanner and PDF exporter for Mac. It stands out when you need to convert a scanned PDF to text on your desktop, but it can do a lot more than that!
Prizmo also converts PDF files to text using advanced OCR (optical character recognition). Its ‘Recognize’ feature scans your PDF for characters, separating what should be converted into text from images or formatting that can be skipped.
We like Prizmo because its PDF-to-text OCR is brilliant and strips out formatting that other converters may leave in. In side-by-side testing, we found that it produces better text documents than many other services.
Here’s how to use Prizmo to convert your PDF document into a text file:
- Open Prizmo. Select “New Document.”
- Choose “Open Image File.”
- Select the PDF you want to convert to text on your computer.
- Select ‘Recognize’ in the top right corner of the screen.
- On the menu bar, go to File > Export.
- From the drop-down menu, select “Rich Text.”
- Select “Export to File”.
- Name your new text file and choose where you want to save it.
Convert PDF to searchable text
If you have large PDFs that you want to convert into text documents, chances are you want the data in those new files to be as easy to search as it was in your PDF. This is especially important for long PDFs, because text documents lack formatting: instead of visual cues that tell you where to look, you’ll have long stretches of uniform text.
PDFPen is a great application designed to make PDF files editable, which is perfect when you need to sign documents or have people fill out forms. It’s also a really powerful PDF-to-text converter and allows for bulk conversion in case you need that functionality.
Here’s how you can convert a PDF to text with PDFPen:
- Open PDFPen and select the document you want to convert. Select ‘Open’.
- On the menu bar, go to File > Export.
- Choose ‘Rich Text’ from the drop-down menu.
- Select “Export” and you’re done.
That’s all you have to do! You now have a searchable text document from your PDF.
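If you also keep a plain-text copy of the export, a short script can search it and report where a term appears. The sketch below is a minimal Python example and is not part of PDFPen; the function names and the sample text are made up for illustration, and it assumes the file is plain text rather than Rich Text.

```python
from pathlib import Path


def find_lines(text: str, term: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines containing term, ignoring case."""
    needle = term.lower()
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


def search_file(path: str, term: str) -> list[tuple[int, str]]:
    """Search a plain-text file on disk (hypothetical path supplied by the caller)."""
    return find_lines(Path(path).read_text(encoding="utf-8"), term)


if __name__ == "__main__":
    # Made-up sample standing in for converted PDF text.
    sample = "Annual fee: none\nIntro APR: 0% for 12 months\nStandard apr applies after"
    for number, line in find_lines(sample, "apr"):
        print(f"{number}: {line}")
```

Line numbers stand in for the visual cues a formatted PDF would have given you.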
Keep a few things in mind when using PDFPen to convert PDF documents into text files. It retains some of the formatting, which can be useful in certain cases. While there are times when you’ll definitely want to remove all the formatting that a PDF has, sometimes objects are actually necessary bits of information.
We converted a financial document that listed zero percent introductory rates for a credit card, and most text converters omitted the “0” because it was an object, not a character. PDFPen had no problem with that, but it retained some formatting after converting the PDF to a text document, which may not be exactly what you’re looking for.
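One way to catch a dropped value like that “0” is to check the converted text for the figures you know should be there. The following is a minimal sketch in Python; `missing_values` and the sample strings are hypothetical and only illustrate the idea.

```python
def missing_values(converted_text: str, expected: list[str]) -> list[str]:
    """Return the expected strings that do not appear in the converted text."""
    return [value for value in expected if value not in converted_text]


if __name__ == "__main__":
    # Made-up example: the converter dropped the "0" in front of the percent sign.
    converted = "Intro APR: % for 12 months"
    print(missing_values(converted, ["0%", "12 months"]))
```

An empty list means every value you listed survived the conversion. It does not prove the rest of the text is correct.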
Convert handwriting to text
When people talk about converting PDF to text, they usually mean OCR (optical character recognition). While many OCR services try to remove anything from the final product that isn’t immediately recognizable as a character, MathKey does things a little differently.
The app is very useful for those who prefer to write by hand, especially for math. It is designed to recognize mathematical equations, which is ideal if you like to work through math problems on an iPad with Apple Pencil.
Because it’s focused on math, MathKey lets you export what you scan as an image, LaTeX, or MathML.
MathKey also has a very useful option to link your iPad or iPhone with your Mac. In the mobile app, you can scan a QR code to link it to the MathKey website. Once linked, you can transfer your math problem drafts to your Mac, where you can convert your handwriting. After converting it, you can export it as an image file, LaTeX, or MathML, which can be used in any document you want.
This is great for those times when you have to include some math equations in a document, but you don’t feel like wasting time with Mac keyboard shortcuts to type the equations.
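As a purely illustrative example (generic LaTeX, not captured MathKey output), a handwritten quadratic formula would correspond to markup along these lines once exported as LaTeX:

```latex
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```

Pasting markup like this into a LaTeX document, or into an editor that renders LaTeX, reproduces the typeset equation.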
How to Use Automator to Convert PDF Files to Text: A Native macOS Solution
You can use Preview to save text files as PDFs, but you can’t use it to save a PDF as text. Instead, Apple’s built-in Automator is your ideal solution. Keep in mind that Automator is a “professional” tool, so follow these steps carefully to avoid mistakes:
- Open Automator on your Mac.
- Select “New Document”.
- Select “Workflow” from the menu that appears.
- From the menu on the left side, select “Files and Folders”.
- In the submenu to the right of the main menu, find “Ask for Finder Items”. Drag it to the open space at the far right of the window.
- Select ‘PDFs’ from the menu.
- Select “Extract PDF Text” from the submenu that appears. Drag it to the right side of the screen, under “Ask for Finder Items”.
- Under “Extract PDF Text,” select “Rich Text” as the output type.
- From the menu bar, select File > Save.
- Enter the name of your new app.
- Choose where you want to save your app.
- Select “Application” as the file type. (The default value is “workflow.”)
That’s all you have to do to set up your new Automator app. Now, let’s run it and extract the text from the PDF:
- Double-click your app.
- Choose the PDF file you want to convert and select ‘Choose’ at the bottom right of the window.
Now your PDF has been converted into a text document and saved to your desktop. All you have to do is open it and your PDF can be read as a text document!
A few things to keep in mind. First, your PDF will not be destroyed or altered in any way: this Automator app only extracts the text from the PDF and saves it as a new file.
Second, because the app only converts PDF text, images will not be carried over.
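If the rich text output carries more formatting than you want, macOS includes a command-line tool, textutil, that can convert an RTF file to plain text. The sketch below wraps that call in Python. It assumes the Automator output is an .rtf file; `report.rtf` is a placeholder name and the helper functions are hypothetical.

```python
import subprocess
from pathlib import Path


def textutil_command(rtf_path: str) -> list[str]:
    """Build the textutil call that converts one RTF file to plain text."""
    source = Path(rtf_path)
    target = source.with_suffix(".txt")  # same name, .txt extension
    return ["textutil", "-convert", "txt", str(source), "-output", str(target)]


def strip_formatting(rtf_path: str) -> Path:
    """Run textutil (available on macOS only) and return the new .txt path."""
    command = textutil_command(rtf_path)
    subprocess.run(command, check=True)  # raises if textutil reports an error
    return Path(command[-1])


if __name__ == "__main__":
    # Placeholder file name; only prints the command, does not run it.
    print(" ".join(textutil_command("report.rtf")))
```

On a Mac, calling `strip_formatting("report.rtf")` would write `report.txt` next to the original and leave the rich text file untouched.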
How to convert PDF to text in Adobe Acrobat
You can use OCR with Adobe Acrobat, although there are a few things to keep in mind. Acrobat is a professional tool and can be really difficult for beginners to use. Because it was designed for professionals, those who don’t need a powerful PDF viewer will likely find Adobe Acrobat too complicated for most uses.
And remember, while we are talking about a specific case of converting PDF files into text documents, that does not mean that it has to be difficult. That’s why we suggest Prizmo, PDFPen, and MathKey. Chances are, one of those apps will suit your needs much better than Adobe Acrobat.
But if you want to use Adobe Acrobat’s OCR functionality, here’s how:
- Open a PDF file in Acrobat.
- Select ‘Tools’ in the upper right corner of the window.
- Select “Recognize Text.”
- Select “In this file”.
- On the next screen, choose how many pages of your document you want to scan. You can also edit the language or output preferences by clicking “edit”.
That’s all you need to do to scan the document, and the result is much more accurate than with other apps.
Keep in mind that this doesn’t export your document. All you’ve done with Adobe Acrobat is make each character recognizable so that you can search within the PDF.
What about online PDF OCR services?
There are several online services that scan PDF files using OCR. They all work in a similar way, scanning your PDF files and converting them into text documents with an optical character recognition service, but there are things to keep in mind here as well.
First, you’re using a cloud-based service and there’s no way to know what’s going on with your document. While we don’t doubt that most simply convert PDF files to text, we wonder why they do it for free. It may be to train a machine learning algorithm using what amounts to crowdsourced data, or they may be saving copies of your PDFs or text files for some purpose.
Online services are sometimes a front to get your email address for marketing purposes. They may also have a business model where your first scan is free and subsequent scans cost money, or where you need to subscribe to some service. Many people subscribe just to run a few scans, but if you forget you have the subscription, the cost will add up over time.
Simply put, we prefer to use native apps to get things done.
Converting a PDF to a text file is one of those cases where your needs are unique enough that it may be difficult to find a solution, but urgent enough to need a solution in no time.
It’s hard to recommend Adobe Acrobat. While it’s robust, it’s usually quite difficult for most of us to use. Adobe thrives in legacy enterprise environments where businesses need the power that Acrobat provides.
Automator is practical and provides good results, but it kept too much formatting for our liking and rendered text in colors. We opened the text documents in the TextEdit app on Mac, and much of the text that Automator generated was difficult to read. We wanted clear, accurate text from a PDF that was both readable and searchable.
Prizmo and PDFPen stand out here. We like both because they make these tasks easy. Each has different strengths: Prizmo has a much stricter OCR engine, while PDFPen captures more page information. We ran side-by-side tests on three PDF documents and found this to be true for all of them.
There’s no clear winner. Both are easy enough to use that it wouldn’t be fair to name one. If one doesn’t work the way you want, the other app can meet your needs quickly and easily.
Best of all, Prizmo, PDFPen, and MathKey are available as part of a free trial of Setapp, the most comprehensive suite of productivity apps for Mac.