OS X Service for Copying Comments from PDFs

Today I was trying out ReaddleDocs on the iPad for reading PDFs of papers. This app is not as full-featured as GoodReader, but it supports syncing entire folders with Dropbox, which matters a lot to me, and I like the cleaner interface.[1]

Apps like this make it very easy to annotate PDFs with comments, which can then be synced across all your devices through Dropbox. Getting those comments back out again once you are back at your Mac can be a pain, though.

This is a Ruby script that uses the pdf-reader gem to find all the comments in a PDF document and copy them to the clipboard:

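require 'rubygems'   # needed on Ruby 1.8.7 so that gems can be found
require 'pdf-reader'

comments = []

ARGV.each do |file|
  pdf = PDF::Reader.new(file)

  # Walk every object in the PDF and keep the contents of comment annotations.
  pdf.objects.each_pair do |_key, value|
    next unless value.is_a?(Hash) && value[:Name] == :Comment
    comments << "* #{value[:Contents]}\n\n"
  end
end

# Copy the comments from all files at once, so that a later file does not
# overwrite an earlier one. The block form closes the pipe to pbcopy.
IO.popen('pbcopy', 'w') { |clipboard| clipboard.puts comments }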

This script can be accessed as a Service with the following setup in Automator:

The only complication is Automator’s insistence on using the system version of Ruby (1.8.7) instead of the 1.9.3 I have installed elsewhere in my path, which meant that the service initially could not find the pdf-reader gem. I fixed this by installing pdf-reader into the system gem path with /usr/bin/gem install pdf-reader.
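If a service cannot find a gem, it helps to see which Ruby it is running and whether the gem loads there. The snippet below is a minimal sketch of such a check; the helper name gem_loadable? is made up for this example and is not part of RubyGems or pdf-reader.

```ruby
require 'rubygems'

# Hypothetical helper: true if the named library can be required by the
# Ruby interpreter that is running this script, false otherwise.
def gem_loadable?(name)
  require name
  true
rescue LoadError
  false
end

# Report the interpreter version and whether pdf-reader is visible to it.
puts "Ruby #{RUBY_VERSION}"
puts "pdf-reader available: #{gem_loadable?('pdf-reader')}"
```

Running this from the same place as the service (for example, pasted into the same Automator action) should show whether the interpreter in use can see the gem.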

I would also like to be able to pull out highlighted text, and to add a page number to each comment in the copied output, but these appear to be advanced topics.[2]
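For the page numbers, one possible approach is to walk the document page by page and look at each page's annotation list instead of scanning every object. What follows is an untested sketch, not a finished solution: it assumes a pdf-reader version that provides pages, Page#attributes and Page#number, it keeps the same :Name == :Comment check as the script above, and the method names format_comment and comments_with_pages are invented for this example. It does not attempt highlighted text.

```ruby
require 'rubygems'

# Format one annotation dictionary as a bullet with its page number, or
# return nil if it is not a comment (same :Name == :Comment check as above).
def format_comment(annotation, page_number)
  return nil unless annotation.is_a?(Hash) && annotation[:Name] == :Comment
  "* (p. #{page_number}) #{annotation[:Contents]}"
end

# Collect comments page by page so that each one can carry its page number.
def comments_with_pages(file)
  require 'pdf-reader' # loaded here so format_comment can be tried without the gem
  pdf = PDF::Reader.new(file)
  found = []
  pdf.pages.each do |page|
    # :Annots may be an indirect reference, and so may each entry inside it.
    annots = pdf.objects.deref(page.attributes[:Annots]) || []
    annots.each do |ref|
      line = format_comment(pdf.objects.deref(ref), page.number)
      found << line if line
    end
  end
  found
end
```

The result of comments_with_pages could be piped to pbcopy in the same way as the comments array in the original script.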

Update: An alternative workflow using AppleScript and PDFpen is described here.


  1. The only feature that I miss from GoodReader is the ability to crop all the pages in a document to remove margins.

  2. Or left as an exercise for the reader.