Thursday, 27 November 2008

iPhoto Script to Tag Duplicates

I merged a whole lot of photo folders and albums recently and ended up with hundreds of duplicates. Not wanting to clean up the mess manually, I searched for something that would help me out. I eventually found a script from Karl Smith.

It inspired me to develop the idea a bit more. I wanted it to tag exact duplicate photos, photos whose dates and dimensions match but whose files differ, and photos that have been processed in some way (same date and dimensions but a smaller file). I also added some dialogs to remind you how to use it.

To use it, open iPhoto, select some or all of your photos and run the script.
I have also written a short script to remove the "duplicate", "similar" and "processed" comment tags so that you can 'reset' everything.

Note, this script only tags photos. It will not delete any photo (although it could be altered to do that if you wanted it to).
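
The three tags come down to a file-size check followed by a byte-for-byte comparison. As a rough illustration outside iPhoto, here is a minimal Python sketch of that step. It is not part of the script: the function name classify_pair is made up for this example, and it assumes the caller has already confirmed that the two photos share the same date, width and height.

```python
import filecmp
import os


def classify_pair(last_path, this_path):
    """Return (tag_for_last, tag_for_this) for two photo files.

    Assumption: the caller has already checked that both photos have the
    same date and pixel dimensions. None means "leave this photo untagged".
    """
    last_size = os.path.getsize(last_path)
    this_size = os.path.getsize(this_path)

    if last_size == this_size:
        # Byte-for-byte comparison, the job /usr/bin/diff does in the script.
        identical = filecmp.cmp(last_path, this_path, shallow=False)
        return (None, "duplicate" if identical else "similar")

    # Different sizes: the larger file is assumed to be the original.
    if last_size > this_size:
        return (None, "processed")
    return ("processed", None)
```

Calling classify_pair("a.jpg", "b.jpg") on two byte-identical files would return (None, "duplicate"); the file names here are placeholders.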

This is the AppleScript code. Copy and paste it into a file called pcIPhotoMarkDuplicates.scpt.

(*
iPhoto Mark Duplicates

Based on work by Karl Smith
http://blog.spoolz.com/2008/11/03/iphoto-applescript-to-remove-duplicates/
Copyright 2008 Phil Colbourn

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*)
tell application "iPhoto"
    display alert "This script will tag photos that are
* duplicates - identical,
* similar - dates and sizes match but somehow different, and
* processed - dates match but smaller than the 'original'.

WARNING: This script is only effective if the selected photos are in date order. Please ensure that the photos are sorted by selecting the

View - Sort Photos - By Date, Ascending

from the iPhoto menu."
    set curPhotos to selection
    if (count of curPhotos) ≤ 1 then
        display alert "You need to select the photos you want me to process."
    else
        -- this assumes the selected list of photos is in date order
        set lastPhoto to item 1 of curPhotos
        repeat with thisPhoto in rest of curPhotos
            -- skip photos already tagged as duplicates
            if comment of thisPhoto is not "duplicate" then
                set dupFound to false
                try
                    if (date of thisPhoto = date of lastPhoto) and (width of thisPhoto = width of lastPhoto) and (height of thisPhoto = height of lastPhoto) then

                        set thisSize to size of (info for (image path of thisPhoto as POSIX file))
                        set lastSize to size of (info for (image path of lastPhoto as POSIX file))

                        if thisSize = lastSize then
                            set diff to "anything but empty"
                            try
                                -- run the unix diff program to compare the files.
                                -- if they are the same the variable diff will be empty.
                                -- if they are different or an error occurs then diff will not be empty.
                                -- quoted form guards against apostrophes and spaces in the file paths.
                                set diff to (do shell script "/usr/bin/diff -q " & quoted form of (image path of thisPhoto as text) & " " & quoted form of (image path of lastPhoto as text))
                            end try

                            if diff = "" then
                                set comment of thisPhoto to "duplicate"
                                set dupFound to true
                            else
                                -- there must be subtle changes so I will mark thisPhoto
                                --set comment of lastPhoto to "similar" -- for testing
                                set comment of thisPhoto to "similar"
                                set dupFound to true
                            end if
                        else
                            -- here I assume that the larger file has more information
                            -- and therefore it is the original
                            if lastSize > thisSize then
                                set comment of thisPhoto to "processed"
                                set dupFound to true
                            else
                                set comment of lastPhoto to "processed"
                                -- thisPhoto is assumed to be the original
                            end if
                        end if
                    end if
                end try
                -- Last=This keep using Last
                -- Last~This keep using Last
                -- Last>This keep using Last - Last is assumed to be the original but the next could be the original
                -- Last<>This step onto This
                if not dupFound then
                    set lastPhoto to thisPhoto
                end if
            end if
        end repeat

        beep
        beep
        display alert "All duplicate, similar and processed photos have been marked.

Switch to the Photos Library and search for one of these keywords: duplicate, similar or processed.

Then delete the photos you do not want.

NOTE: If you do nothing then no photos will be harmed.

(I will now try to switch you to the Photo Library.)"
        set current album to photo library album
    end if
end tell
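
The warning about date order matters because the script makes a single pass and only ever compares a photo with the most recent photo it kept as a reference. The following Python sketch of that pass is an illustration only: the names tag_in_date_order, same_shot and classify are hypothetical, the photos are plain dictionaries rather than iPhoto objects, and the skip for photos already tagged "duplicate" is left out for brevity.

```python
def tag_in_date_order(photos, same_shot, classify):
    """Single pass over photos, mirroring the lastPhoto/thisPhoto loop.

    photos    -- list of records, expected to be sorted by date
    same_shot -- function(a, b) -> True when date and dimensions match
    classify  -- function(last, this) -> (tag_for_last, tag_for_this)
    Returns a dict mapping list index -> tag.
    """
    tags = {}
    last = 0
    for this in range(1, len(photos)):
        dup_found = False
        if same_shot(photos[last], photos[this]):
            tag_last, tag_this = classify(photos[last], photos[this])
            if tag_last is not None:
                tags[last] = tag_last
            if tag_this is not None:
                tags[this] = tag_this
                dup_found = True
        if not dup_found:
            # No tag on this photo: it becomes the new reference.
            last = this
    return tags


def same_shot(a, b):
    return a["date"] == b["date"]


def classify(a, b):
    # Stand-in for the file comparison: equal sizes count as identical.
    if a["size"] == b["size"]:
        return (None, "duplicate")
    if a["size"] > b["size"]:
        return (None, "processed")
    return ("processed", None)


unsorted = [{"date": 1, "size": 10}, {"date": 2, "size": 7}, {"date": 1, "size": 10}]
by_date = sorted(unsorted, key=lambda p: p["date"])

print(tag_in_date_order(unsorted, same_shot, classify))  # {}
print(tag_in_date_order(by_date, same_shot, classify))   # {1: 'duplicate'}
```

In the unsorted list the two matching photos are never neighbours, so nothing is tagged. Once the list is sorted by date they sit next to each other and the second one is tagged.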


This is the AppleScript code for the script that clears the tags. Copy and paste it into a file called pcIPhotoClearDuplicateComments.scpt.

(*
iPhoto Clear Duplicate Comments

Copyright 2008 Phil Colbourn

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*)
-- clear the comment field of every photo in the Photo Library if it is duplicate, similar or processed
tell application "iPhoto"
    set current album to photo library album
    repeat with thisPhoto in (photos of current album)
        if (comment of thisPhoto) is in {"duplicate", "similar", "processed"} then
            set comment of thisPhoto to ""
        end if
    end repeat
end tell

16 comments:

  1. Useful script. I modified it for iPhoto 8 to use keywords instead of comments. That makes it easier to clean up at the end: you don't need an extra script, and it is easy to select all the photos that were tagged, so you can see the originals next to those that were presumed to be duplicates. So yes, I tag the photos that are presumed to be originals with 'original'. I also changed the algorithm to check only that the ratio of height to width is the same. That way it will pick up duplicates that were scaled down in size (which is useful for me), but it shouldn't pick up cropped photos, which presumably really are different.

    The script adds the keywords 'original', 'duplicate', 'similar' and 'processed' to iPhoto by scripting the Keywords window, which is why it will only work with iPhoto 8. It is a nasty hack, but it works.

    It is not the fastest, but it is free, and it could be cleaned up plenty, but it appears to be functional as is.

    (*
    iPhoto Mark Duplicates

    Based on work by Karl Smith and Phil Colbourn
    http://blog.spoolz.com/2008/11/03/iphoto-applescript-to-remove-duplicates/


    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program. If not, see <http://www.gnu.org/licenses/>.
    *)

    property delayAfterOpeningPreferences : 0.25
    property delayAfterSwitchingKeywords : 0.25
    property delayAfterTypingAKeyword : 0.25
    property delayAfterReturn : 0.2

    tell application "Finder"
        set theapp to application "iPhoto"
        set theversion to version of theapp
        -- stop unless the version number reported by iPhoto starts with 7
        if theversion does not start with "7" then return 0
    end tell

    tell application "iPhoto"
        activate
        display alert "This script will tag photos that are
    * duplicates - identical,
    * similar - dates and sizes match but somehow different, and
    * processed - dates match but smaller than the 'original'.
    * original - the photo that is likely the 'original'

    WARNING: This script is only effective if the selected photos are in date order. Please ensure that the photos are sorted by selecting the

    View - Sort Photos - By Date, Ascending

    from the iPhoto menu. Only works with iPhoto 8."
        set curPhotos to selection
        -- check for keywords, if not there, add them . . .
        set existingKeywords to name of every keyword
    end tell

    repeat with newkey in {"original", "processed", "duplicate", "similar"}
        -- copy the loop variable (a reference) into plain text
        set keyName to contents of newkey

        if existingKeywords does not contain keyName then
            tell application "System Events"
                tell application process "iPhoto"

                    -- open keywords
                    keystroke "k" using {command down}

                    delay my delayAfterOpeningPreferences

                    -- switch to keywords editing
                    try
                        click button "Edit Keywords" of window "Keywords"
                    end try
                    -- use the property declared at the top of the script
                    delay my delayAfterSwitchingKeywords

                    -- add keyword
                    tell window "Edit Keywords" to click button 6


                    delay my delayAfterTypingAKeyword
                    keystroke "k" using {command down}
                    keystroke "k" using {command down}
                    keystroke keyName
                    delay my delayAfterTypingAKeyword
                    keystroke tab
                    keystroke "h" using {control down}
                    keystroke return
                    delay my delayAfterReturn

                    tell window "Edit Keywords"
                        -- exit the keywords windows
                        click button "OK"
                    end tell

                    keystroke "k" using {command down}
                end tell
            end tell

        end if

    end repeat

    -- check for selection of photos
    tell application "iPhoto"

        if (count of curPhotos) ≤ 1 then
            display alert "You need to select the photos you want me to process."
        else
            -- this assumes the selected list of photos is in date order
            set countDups to 0
            set lastPhoto to item 1 of curPhotos
            set countPhotos to count of items in curPhotos
            repeat with i from 2 to countPhotos
                set thisPhoto to item i of curPhotos
                set allkeys to name of keywords of thisPhoto
                -- skip photos already tagged as duplicates
                if "duplicate" is not in allkeys then
                    set dupFound to false
                    set ThisIsOriginal to false
                    --try
                    if (date of thisPhoto = date of lastPhoto) and ((width of thisPhoto) / (height of thisPhoto) = (width of lastPhoto) / (height of lastPhoto)) then


                        set thisSize to size of (info for (image path of thisPhoto as POSIX file))
                        set lastSize to size of (info for (image path of lastPhoto as POSIX file))

                        if thisSize = lastSize then
                            set diff to "anything but empty"
                            try
                                -- run the unix diff program to compare the files.
                                -- if they are the same the variable diff will be empty.
                                -- if they are different or an error occurs then diff will not be empty.
                                -- quoted form guards against apostrophes and spaces in the file paths.
                                set diff to (do shell script "/usr/bin/diff -q " & quoted form of (image path of thisPhoto as text) & " " & quoted form of (image path of lastPhoto as text))
                            end try

                            if diff = "" then
                                select thisPhoto
                                assign keyword string "duplicate"

                                set dupFound to true
                            else
                                -- there must be subtle changes so I will mark thisPhoto
                                --set comment of lastPhoto to "similar" -- for testing
                                select thisPhoto
                                assign keyword string "similar"
                                set dupFound to true
                            end if
                        else
                            -- here I assume that the larger file has more information
                            -- and therefore it is the original
                            if lastSize > thisSize then
                                select thisPhoto
                                assign keyword string "processed"
                                set dupFound to true
                            else
                                select lastPhoto
                                assign keyword string "processed"
                                select thisPhoto
                                assign keyword string "original"
                                set ThisIsOriginal to true
                                set countDups to countDups + 1
                                -- thisPhoto is assumed to be the original
                            end if
                        end if
                    end if
                    --end try
                    -- Last=This keep using Last
                    -- Last~This keep using Last
                    -- Last>This keep using Last - Last is assumed to be the original but the next could be the original
                    -- Last<>This step onto This
                    if not dupFound then

                        set lastPhoto to thisPhoto
                    else
                        if not ThisIsOriginal then
                            select lastPhoto
                            assign keyword string "original"
                        end if
                        set countDups to countDups + 1
                    end if

                end if


            end repeat
            beep
            beep
            display dialog "Finished Processing - Found " & (countDups as text) & " Duplicates in the Selection. Check your keywords for duplicate, similar, processed, or original. (Hold down the shift key as you select keywords for 'or' searching.) Enjoy the feeling of freedom from redundancy . . ."
        end if

    end tell

  2. Nice work.

    Thanks for posting your work and let's hope others find them useful.

    And maybe someone will improve upon them even more.

    Thanks again.

  3. You might also want to consider adding a try clause to catch files that are 'missing' from iPhoto. I've run into a problem where pictures have been moved around and iPhoto has, for some reason, failed to properly clean up after itself. Wrapping the file check in a try and then marking those photos with 'missing' as a keyword makes it possible to find and delete them. It also helps keep the script from crashing.

  4. Hi Eileen,

    Thanks for your suggestion. If you have already modified the script would you like to post it?

    Phil

  5. Thank you for the script, I am a complete novice to writing and running scripts, so please forgive my question...

    I want to just use the last part of your script to delete the comments on some photos which were mistaken dupes.

    I have pasted the script into script editor and run, but with no success, either an error occurs or nothing happens.

    I saved the script as pcIPhotoClearDuplicateComments.scpt.
    and ran again, but still nothing,

    I followed the steps in Karl's post, but with your script...


    Download and unzip the script
    Double-click the script to open in Script Editor
    Go into iPhoto and select a group of photos you want to compare
    Switch back to Script Editor and run the script
    Don’t Touch Anything! Just let the script finish, it could take a while if you are comparing a lot of photos

    But still nothing...

    Please advise me of what to do, I would really appreciate your help,


    Thanks, Emma

  6. For those of you working with iPhoto 8, this script will still work. Follow all the above steps and then in line 9 of the script change the "7" to an "8".

    This should do it; it worked fine for me.

  7. Thanks Blake. I suppose when iPhoto 9 is released we will need to change the number again.

  8. Trying this with iPhoto 9. I changed the number on line 7 to 9, but all it does is give an immediate 0 in the results, with no scanning. Any ideas?

  9. Like others, I tried this with iPhoto 9 with no luck at first, but without much tinkering I got it to work. Here is what you need to do.

    Change the version from 7 to 8.

    Then set up keywords in iPhoto named original, duplicate, processed and similar, and run the script from AppleScript. It works a treat.

    The keywords window can be opened from the iPhoto Window menu or with Command+K.

  10. Thanks for sorting out the iPhoto 9 problem.

  11. Thanks a lot ! Works very well.

    How does the level of recognition of this script compare with the MD5 checksums used by other software?

    (If I select a lot of files I get a timeout message, but it still tags my photos and I can find the similar/duplicate ones.)

  12. Hi Thierry, thanks for your comment.

    I guess that they use MD5 to create a checksum for each photo and then simply compare checksums. Perhaps the MD5 is stored for future runs to speed things up?

    I simply compare the files each time. To detect the original I simply assume that any photo processing loses information and therefore the file size will be smaller.

    I have not had a time-out. Is your library on a network drive perhaps?

  13. hey.

    i just tried that script with iphoto 8.1.2 (iPhoto '09).
    i changed the code to match the version (i.e. replaced the 7 by an 8) and added the needed keywords to iphoto.
    unfortunately i still get the immediate 0-reply.
    any ideas what's wrong?

  14. What do you get if you display the version?

    ie. Place something like this before the 'if' statement that you modified:

    display alert "Version = " & (theversion as text)

  15. It seems that iPhoto 9.1.5 doesn't want to reveal the keyword names. I get a bunch of missing values when I try to find that out, and various attempts on my part to suss it out have failed (which I find is all too typical with AS).

    Anyway, my solution was just to comment out the section of the script that checks for the right keywords. If they aren't there, iPhoto seems to fail silently, so no harm in the end.

  16. OK, these boxes don't accept enough text, but here's what to do:

    Remove the bit that checks for the keywords.
    If you like, remove the now unnecessary assignments to the various time variables.
    Be nice to yourself and insert a line or two in the warning explaining that those keywords have to exist.

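
Two ideas from the comments, the MD5 checksums asked about in comments 11 and 12 and the handling of missing files suggested in comment 3, can be sketched together. This is a Python illustration under stated assumptions, not something either script does: the function names md5_of_file and group_by_checksum are invented here, and the input is assumed to be a plain list of file paths rather than an iPhoto selection. A checksum match only finds byte-identical files (the "duplicate" tag); it says nothing about "similar" or "processed" photos. Unlike the single-pass script, it does not depend on the photos being in date order.

```python
import hashlib
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_checksum(paths):
    """Group paths by checksum.

    Returns (duplicates, missing): a list of groups with two or more
    byte-identical files, and a list of paths that could not be read.
    """
    groups = defaultdict(list)
    missing = []
    for path in paths:
        try:
            groups[md5_of_file(path)].append(path)
        except OSError:
            # The file is listed but not readable on disk; record it
            # instead of stopping, like the 'missing' keyword idea.
            missing.append(path)
    duplicates = [found for found in groups.values() if len(found) > 1]
    return duplicates, missing
```

Each file is read once, so the work grows with the number of photos rather than the number of pairs. Storing the digests between runs, as comment 12 speculates other tools might, would be a further step that is not shown here.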
