Posted By: JoBoy OCR to convert .tiff to .doc or docx - 07/14/14 11:27 PM
Somewhere on my Mac Pro I have used an application that contains an OCR function to convert a .tiff document to .doc or .docx. It has been so long since I've done it that I can't remember where it is. I have the Adobe Creative Suite CS4. It's old but it does the job for me just fine. I also have the Microsoft Office 2008 collection. I thought it was in Photoshop, but I can't find it there. Any suggestions?
Posted By: ryck Re: OCR to convert .tiff to .doc or docx - 07/15/14 11:32 AM
Is it included with your printer software? I have a Canon that came with software that does OCR. It's not as good as a dedicated OCR program but, with the limited OCR I need to do and the help of a spell checker, it gets the job done.

I haven't used it for a while but, as I recall, it didn't convert directly to any writing software. It created a text file that could then be popped into Word or whatever.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/15/14 04:03 PM
Thanks for the reply. My printer is an HP LaserJet 4200n. However, I see no reference to OCR associated with it. I had a stand-alone OCR app before OS X came on the scene and it was superb, but it did not carry over to OS X.

I'd be grateful for a text file. That would be all I need.
Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/15/14 04:52 PM
It seems to me that you should be able to import a .tiff file directly into Word, view it and manipulate it there, much like a PDF file.
(But then I've never seen a TIFF file. If there's one easily accessible on the Web, I could check it out, play with it a bit, and see what I come up/out with.)

Every PDF file that I've come across (but especially those with text, such as journal articles) allows me to highlight passages and then cut and paste them into a Word document. That should work with a TIFF file which is tantamount to the same thing (from what I can gather).
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/15/14 05:59 PM
Thank you for the suggestion. I tried it and it definitely did not work. A .tiff document is a picture just the same as one that I get when I take a picture with my digital camera. There is no text involved. The text I'm looking for results from using an Optical Character Reader (OCR) that has been programmed to interpret an image into text. As I mentioned above, I had a stand alone OCR before OS X, but I don't have one now. I still believe that there is one somewhere on my current machine, but it has been so long since I used it that I can't remember where it is located. Spotlight doesn't help, so I'm wondering if it has disappeared with an application that I have discontinued.
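To illustrate the point that a .tiff is a picture container rather than a text document, here is a minimal Python sketch that classifies a file by its leading "magic" bytes. The function names `sniff_kind` and `sniff_file` are hypothetical, and only classic TIFF and PDF signatures are checked. Knowing the container says nothing about whether any selectable text is inside; a TIFF holds pixels only, so the text has to come from OCR.

```python
# Leading "magic" bytes that identify each container format.
TIFF_MAGIC = (b"II*\x00", b"MM\x00*")  # little-endian and big-endian classic TIFF
PDF_MAGIC = b"%PDF-"


def sniff_kind(header: bytes) -> str:
    """Classify a file from its first few bytes (hypothetical helper)."""
    if header.startswith(TIFF_MAGIC):
        return "tiff"
    if header.startswith(PDF_MAGIC):
        return "pdf"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as handle:
        return sniff_kind(handle.read(8))
```

Calling `sniff_file` on a scanned page saved as TIFF should return `"tiff"`, which is a reminder that the file must go through OCR before Word can treat its contents as text.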
Posted By: joemikeb Re: OCR to convert .tiff to .doc or docx - 07/16/14 12:52 AM
There are a number of OCR applications in the App Store, but I have not found any that begin with a TIFF, JPEG, or similar image. Instead they all start with a PDF. However, Adobe Acrobat, PDF Pen, PDF Pen Pro, Evernote and several others will work from a PDF image, and all of them can, with some degree of success, convert the OCRed output into some Word-compatible output, though it often requires a substantial bit of editing to get the format close to correct.

You might start by using GraphicConverter 9 to convert the TIFF to PDF and then use one of the aforementioned applications to OCR the resulting PDF file. If you have the printed image and a scanner, GraphicConverter can drive the scanner and go direct to PDF. Evernote can drive the scanner to scan the document and go directly to OCRed PDF text.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/16/14 03:44 AM
Thanks for the info. I have Adobe Acrobat Pro and have no problem converting .tiff to PDF. I'm more concerned about getting a good result from a PDF file into text. Do you have a favorite? If you're uncomfortable endorsing a specific product, never mind. I'll check them out.
Posted By: joemikeb Re: OCR to convert .tiff to .doc or docx - 07/16/14 03:15 PM
If you have Adobe Acrobat Pro why not use its built-in OCR capability? Certainly you paid enough for it. Quite honestly, since I got Evernote most of my work goes through it and I have found its OCR to be very reliable and accurate. The software that came with my Fujitsu ScanSnap scanner also has a very satisfactory OCR capability. Although I do not use it often (it is simply easier and more direct to use other tools), PDF Pen has good OCR capability.

There are all sorts of applications with built-in OCR ability, but often the trick is finding out where it is hidden in the menus.

You can find the instructions for using Acrobat OCR here.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/17/14 10:42 PM
Thank you for your very helpful post. With great enthusiasm, I carefully followed the instructions for Acrobat OCR. I converted the PDF document to an MS Word .doc file. The result was an MS Word document in proper layout, but there were strange lines enclosing every sentence and paragraph. I could not select and modify words or sentences. I also tried .rtf and .txt, but these did not even yield any text. It was just a blank page for both of them. Assuming the directions were error-free, I conclude that my version of Acrobat Pro is too old to accomplish the purpose of the instructions. At the very beginning, the instructions say I should click the blue Tools button in the top right of the toolbar. There is no such thing on my version of Acrobat 9.5.5. Instead, I start by going to Document>OCR Text Recognition>Recognize Text Using OCR. I'm afraid I can't get it done. Unless you have an ace up your sleeve that I can borrow, I'll have to look for an OCR app on the App Store. Best regards. It's always good to hear from you. I hope you healed completely from your brush with the doctors.
Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/24/14 07:03 PM
Just out of curiosity ...
If you're running MacOS X 10.8.5, have you tried Preview to convert file formats (by opening the file and then choosing File → Export)?
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/24/14 10:15 PM
I just tried it. There is no OCR function available as far as I could determine. It will change a PDF to other image formats, but not to a text format.
Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/25/14 12:10 AM
I must be missing something salient in this discussion. I was under the impression that you had a TIFF file which you wanted to convert to some other format so that you could perform OCR on text which formed part of the file. If you can convert your TIFF file to PDF (via Preview, for example) then you should be able to grab (highlight) the text portion and then cut-and-paste it into a Word document and manipulate it from there. (I do the latter all the time, as I've pointed out.)

So, what am I misconstruing?

Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/25/14 12:39 AM
I want to change an entire document from .tiff to text. I have Acrobat Pro 9.5.5 that has a built-in OCR function, but it is apparently too old to properly create a useable text document. (See item #30590 - 07/17/14 05:42 PM sent to joemikeb in this thread.) Acrobat Pro has already converted the .tiff to PDF.

I then decided that I needed to buy a stand-alone OCR app from the App Store, but I have not yet done so.

Since I already have the document in a PDF file, all I need is an OCR app to do the job so I can then copy and paste it into MSWord.

I looked at Preview to see if it had an OCR function, but I did not see one. If there is one there, I'd love to know where in Preview it is located.

I hope this clears up what I perceive my need to be. Thanks for your patience.
Posted By: artie505 Re: OCR to convert .tiff to .doc or docx - 07/25/14 05:06 AM
I'm running Snowy, and I'll take a look to see what I can see if you are able to link me to a .tiff similar to the one you want to work with.
Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/25/14 08:23 AM
Same for me – re TIFF example.
I still don't understand the problem (since an OCR app isn't required to do what you seem to want to do). So I'd like to fiddle around with a specific example too.
Posted By: joemikeb Re: OCR to convert .tiff to .doc or docx - 07/25/14 01:39 PM
Originally Posted By: JoBoy
I want to change an entire document from .tiff to text. I have Acrobat Pro 9.5.5 that has a built-in OCR function, but it is apparently too old to properly create a useable text document. (See item #30590 - 07/17/14 05:42 PM sent to joemikeb in this thread.) Acrobat Pro has already converted the .tiff to PDF.

I then decided that I needed to buy a stand-alone OCR app from the App Store, but I have not yet done so.

Since I already have the document in a PDF file, all I need is an OCR app to do the job so I can then copy and paste it into MSWord.

I looked at Preview to see if it had an OCR function, but I did not see one. If there is one there, I'd love to know where in Preview it is located.

I hope this clears up what I perceive my need to be. Thanks for your patience.

In other words you need selectable, searchable, and editable text, not a graphic image of the text. Correct?

No matter what image format you are working from (TIFF, JPEG, PDF, etc.), it will have to be OCRed to make it searchable, selectable, and editable. I got to thinking that the very accurate OCR feature in Evernote is performed remotely on the Evernote servers, so I did a Google search for "online OCR" and came up with a number of hits for FREE online OCR services. Admittedly I have not tried any of them, but at FREE the price seems right. Among the services I found are:
That should be enough to provide a reasonable cross section of the available online OCR tools. Hope this helps.
Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/25/14 02:07 PM
Originally Posted By: joemikeb
Originally Posted By: JoBoy
I want to change an entire document from .tiff to text. ...

Since I already have the document in a PDF file, all I need is an OCR app to do the job so I can then copy and paste it into MSWord.

In other words you need selectable, searchable, and editable text, not a graphic image of the text. Correct?

No matter what image format, tiff, jpeg, PDF, etc, you are working from it will have to be OCRed to make it searchable, selectable, and editable.


To my mind this makes no sense.
A PDF file (essentially a photograph) is selectable, searchable and editable via Preview and Adobe Reader inter alia.
Why would it need to be OCRed when it's already manipulable?

I'm getting a surreal feeling of being in a Monty Python sketch or a Dilbert cartoon.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/25/14 03:21 PM
Thank you. It is very helpful. However, price is not the key issue. My work involves privileged or confidential information and I am very hesitant to run the info through an online service not knowing what may be done with it. I haven't purchased a free standing OCR yet because the need went away. I did a work-around that was sufficient for that situation. It wasn't perfect, but it was good enough. I'll follow up on your recommendations when I get a little time to give them a good look. Again, thanks.
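For confidential material, one way to keep everything on the local machine is to call an OCR engine installed on the Mac itself. The sketch below assumes the open-source Tesseract command-line tool is installed and on the PATH; Tesseract is not mentioned anywhere in this thread, and the helper names `build_ocr_command` and `run_local_ocr` are hypothetical. It relies on Tesseract's basic calling form of an input image followed by an output base name, to which Tesseract appends `.txt`.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(image_path: str, lang: str = "eng") -> list[str]:
    """Build a Tesseract call that writes <image stem>.txt next to the image."""
    image = Path(image_path)
    output_base = image.with_suffix("")  # Tesseract appends ".txt" itself
    return ["tesseract", str(image), str(output_base), "-l", lang]


def run_local_ocr(image_path: str, lang: str = "eng") -> Path:
    """Run OCR locally; raises if Tesseract is not installed (an assumption)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    subprocess.run(build_ocr_command(image_path, lang), check=True)
    return Path(image_path).with_suffix(".txt")
```

Nothing in this sketch touches the network, so the document never leaves the machine. The result is plain text only, so layout would still have to be rebuilt in Word afterwards.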
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/25/14 03:49 PM
Being manipulable and being completely and conveniently selectable, searchable and editable are two different things. I need to go from a graphic image to something with Word's capabilities.
Posted By: alternaut Re: OCR to convert .tiff to .doc or docx - 07/25/14 04:09 PM
Originally Posted By: grelber
To my mind this makes no sense.
A PDF file (essentially a photograph) is selectable, searchable and editable via Preview and Adobe Reader inter alia.
Why would it need to be OCRed when it's already manipulable?

To begin with your final question: because it isn't already manipulable. Elaborating on JoBoy's previous post, it seems to me that you may have missed the following: PDFs may contain both text and images. Depending on the settings chosen by the PDF's author, you may select and copy those elements. JoBoy is referring to image elements which depict text, but can only be selected and copied as an image. It's those text images he'd like to convert to editable text.
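The distinction between a PDF with real text and a PDF that only wraps a scanned picture can be roughly checked in code. The Python sketch below is a crude heuristic, not a PDF parser: it assumes that a page with selectable text references a `/Font` resource somewhere in the file, while a bare scan contains only `/Subtype /Image` objects. The function name `guess_pdf_kind` is hypothetical, and the check can be fooled by PDFs that store their object dictionaries in compressed streams, or by mixed documents where only some pages are scans.

```python
import re

# A page with real text must reference a font; a bare scan usually does not.
FONT_KEY = re.compile(rb"/Font\b")
IMAGE_KEY = re.compile(rb"/Subtype\s*/Image\b")


def guess_pdf_kind(raw: bytes) -> str:
    """Guess whether raw PDF bytes carry a text layer (rough heuristic only)."""
    if FONT_KEY.search(raw) is not None:
        return "has-text-layer"
    if IMAGE_KEY.search(raw) is not None:
        return "image-only"  # text appears only as pixels, so OCR is needed
    return "undetermined"
```

An "image-only" result corresponds to the case described above: the text can be seen but only selected and copied as a picture until an OCR pass adds a text layer.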
Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/25/14 05:20 PM
Originally Posted By: alternaut
To begin with your final question: because it isn't already manipulable. Elaborating on JoBoy's previous post, it seems to me that you may have missed the following: PDFs may contain both text and images. Depending on the settings chosen by the PDF's author, you may select and copy those elements. JoBoy is referring to image elements which depict text, but can only be selected and copied as an image. It's those text images he'd like to convert to editable text.

OK, now I get it. Merci.
Posted By: tacit Re: OCR to convert .tiff to .doc or docx - 07/25/14 10:39 PM
Acrobat 9 should absolutely be able to do OCR; that capability has existed in Acrobat for a really long time. I remember doing it in Acrobat 6, if memory serves.

Posted By: grelber Re: OCR to convert .tiff to .doc or docx - 07/25/14 11:38 PM
In Acrobat 10 (and probably 9):
Open document.
Bring View menu down and select Tools, under which select Recognize Text. Alternatively, just select Tools in the Acrobat toolbar.
That should open a Tools sidebar on the right under which OCR manipulations can be carried out.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/26/14 02:35 AM
The OCR function is under Document on Acrobat Pro 9.5.5. I found no reference to OCR under the View menu or under Tools. Recognize Text could not be found there either.

I really appreciate your desire to be of help, but I have decided to just get a stand alone OCR app and let it go at that. Again, thanks for the effort. By the way, I am using old software because it suits my purposes. When it fails to do that, I'll move up, but I don't grab the latest and greatest every time a new offering appears. I notice a lot of us on this forum tend to do that. If you're using Acrobat 10, then you know what I mean. Best regards.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/26/14 02:48 AM
I know it SHOULD, but it just won't do what it's supposed to do. I could reinstall the app, but that is a very tedious thing with Adobe CS4. I don't have the steam to try it. Thanks for the encouragement, but it isn't worth the effort for the few times that I really need OCR.
Posted By: ryck Re: OCR to convert .tiff to .doc or docx - 07/26/14 01:34 PM
Originally Posted By: JoBoy
....but it isn't worth the effort for the few times that I really need OCR.

If a page at a time might work for you, perhaps there's an option here. I cannot personally speak for this software but it seems to be reasonably well regarded. Maybe someone else at FTM knows more.

This product has a full-blown version that costs 30 beans....but also has a free download whose caveat is that you can only do single pages.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 07/26/14 05:17 PM
Thanks, Ryck. I'll look at it when I pick my stand-alone OCR. I often have multi-page jobs when I do an OCR. I don't mind paying for the app.
Posted By: JoBoy Re: OCR to convert .tiff to .doc or docx - 08/11/14 05:14 AM
I spent a lot of time browsing the App Store for OCR applications. After reading a huge number of comments from users, I finally decided to buy a $10.00 app named Text Extractor. I have been pleasantly surprised. It extracted the contents of the four-page PDF containing a graphic image of text and presented it as extracted or converted text showing each of the four pages.

I tested it by giving it a workout extracting a document that had three changes in font size and boldness and also had some text centered and some against the right margin and the rest against the left margin. I then took the option to download it to my desktop in .txt format. The placement of text against the right margin and also text that was centered was lost. Everything was against the left margin.

After that, I placed it in Word and began to work with it. Word seemed to be resisting some of my attempts to arrange things, so I started over using Adobe InDesign, which is my preferred app for producing finished documents. It enables stable arrangement on a page and leaves it there until I decide to change it. I also have the choice of making each page independent so stuff on page 2 doesn't slop over onto page 3 unless I want to put it there.

So it was a big workout to put things back together, but it's done and I'm very happy with it. I was especially pleased by the low number of errors in the extracted text. For ten bucks, it is a great value, but there is a lot of effort required when the document is as complex as the one I tried it on. A simple document would not have been difficult. Thanks again for everyone's input on this project. My problem is solved and I'm grateful to you.
© FineTunedMac