Together with my father, I decided to digitize our family's photo collection. By photo collection, I mean real physical photos in big albums of up to 80 pages each.
We went the boring way: scanning each album page with a flatbed scanner and then cutting each individual photo out of the scanned page (each page contained four photos on average).

There were 8 albums, each with roughly 50 to 80 pages. This resulted in a lot of images and, lazy as I am, I didn't want to crop each photo manually. So I built a little PHP command which uses ImageMagick to do the heavy lifting: detecting the photos in a scanned image, cropping each photo and saving it in a separate file.

Sidenote: I started the project in June 2016. Since then, Google has released an app called PhotoScan which does something similar to what I describe here. I tested the app with multiple images, but the quality of the final photos didn't match the quality of my "manual" workflow. Besides, I had to digitize more than 1000 photos. That would be a tedious job if you have to hold your phone over each picture for several seconds and wait for the app to generate the final image.

Scan and Cut

My father scanned each album page on his flatbed scanner with the DPI setting maxed out and sent the files to me. In the end, I had hundreds of files ready to be analyzed by my script.

A handful of scanned album pages. These are the source files which are being processed in the next step.

As a programmer, I obviously didn't want to open each image, copy and paste each photo onto a new canvas and store it in a given folder structure. So I researched a bit and found the perfect solution to my problem: ImageMagick Multicrop. As the name suggests, it's a script which relies on ImageMagick. It does a wonderful job of recognizing shapes (photos) in a larger image, cropping them and storing them separately.

I developed a small wrapper around the script so I can pass it an entire folder, let it run for an hour or two and let it do its magic ✨. I've published my wrapper on GitHub and on Packagist.
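To give an idea of what such a wrapper boils down to, here is a minimal sketch in Python. It is not the published PHP wrapper. The script name `multicrop`, the `<script> <infile> <outfile>` call shape, the list of file extensions and the function names are all assumptions made for illustration, so adjust them to your own setup.

```python
import subprocess
from pathlib import Path

# Hypothetical defaults: point MULTICROP at wherever the script lives on your system.
MULTICROP = "multicrop"
EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def build_commands(source_dir, output_dir, script=MULTICROP):
    """Return one command (as an argument list) per scanned page in source_dir."""
    commands = []
    for page in sorted(Path(source_dir).iterdir()):
        if page.is_file() and page.suffix.lower() in EXTENSIONS:
            # Assumed call shape: <script> <infile> <outfile>. The script is
            # expected to derive numbered output names from the outfile it gets.
            target = Path(output_dir) / f"{page.stem}{page.suffix.lower()}"
            commands.append([script, str(page), str(target)])
    return commands


def crop_folder(source_dir, output_dir, run=subprocess.run):
    """Run the crop script on every page and return the pages that failed."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for command in build_commands(source_dir, output_dir):
        result = run(command)
        if result.returncode != 0:
            # Remember the input file so it can be cropped by hand later.
            failed.append(command[1])
    return failed
```

Calling `crop_folder("scans", "photos")` would then work through the whole folder, and the returned list tells you which pages need a manual look. Passing the runner in as a parameter is just a convenience so the loop can be tried out without ImageMagick installed.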

After the script had run through all the scanned pages, I manually reviewed each scan and its cropped photos. I kept track of each image in a simple Excel sheet and manually cropped the images which didn't look right.

In the end, I had to manually crop and adjust only 100 out of 1700 images. Not bad, right?

This is the output of the `multi-photo-crop` command. Each photo has been cut out and saved in a separate file.

However, it was quite a time-intensive task. I pitched this idea to my father in August 2016, and it took until July 2017 before I finished the last batch of photos and sent him a copy of the images.
Even though I worked on this project on many evenings, I never felt that those hours were wasted. I thoroughly enjoyed going through those old images.