Facial Recognition in Photos
One facet of my DFIR Summit talk I want to expand upon is a look into the Photos application and a few of the derivative pieces of that endeavor. While the focus here is facial recognition, it seemed prudent to include a brief progression from snapping a photo through to a person’s name being placed beside their face in the Photos application.
When you use the native Camera app and snap a photo, depending on user options, at least a few standard things occur. The newly taken photo is ultimately written to /private/var/mobile/Media/DCIM/1**APPLE/IMG_0001.HEIC (or .JPG). As the photo is taken, the Photos.sqlite database is updated with much of the metadata about the photo, which will be covered a bit later. Additionally, the “PreviewWellImage.tiff” is created. The “PreviewWellImage.tiff” represents the photo you see when you open your Photos application and see a preview of the most recent image, which in this instance is the photo just taken by the camera.
The user’s photos initially reside in the ../100APPLE/ directory, but this directory iterates upward (101APPLE, 102APPLE, etc.) as more and more photos and videos are saved. If iCloud syncing is turned on by the user, then several other behaviors occur - but that is for another time.
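If you are working from a file system extraction and want to see how far the DCIM folder has iterated, a few lines of Python will walk it for you. This is just a minimal sketch, assuming the extraction has been copied locally; the EXTRACT_ROOT path is a placeholder for wherever your copy of /private/var/mobile/Media/DCIM lives, and the extension list is only an example.

# Minimal sketch: count media files in each 1xxAPPLE directory of an extracted DCIM folder.
from pathlib import Path

EXTRACT_ROOT = Path("extraction/private/var/mobile/Media/DCIM")  # placeholder path

for apple_dir in sorted(EXTRACT_ROOT.glob("1*APPLE")):
    media = [p for p in apple_dir.iterdir()
             if p.suffix.upper() in (".HEIC", ".JPG", ".MOV", ".PNG")]
    print(f"{apple_dir.name}: {len(media)} media files")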
Let’s focus on the analysis and intelligence built into the Photos application. I’m able to type text strings and my photos are immediately searched for matching objects within them. There is a section of my Photos dedicated to “People” where a name has been associated with a face that Apple has analyzed.
Some of the analysis occurring with Photos happens in the mediaanalysis.db file. This file holds the analysis and scoring of the media files, producing results that seem to feed into other pieces of the analysis. Some scoring results to highlight are the ones that focus on humans, faces, and pets.
Path: /private/var/mobile/Media/MediaAnalysis/mediaanalysis.db
The ‘Results’ table of the mediaanalysis.db file contains an ‘assetId’, which represents a media file; a ‘resultsType’, which identifies the specific analytical type and score value found in that media file; and a BLOB (binary large object), which is a binary .plist (bplist). In the image below, ‘assetId’ 3 has numerous ‘resultsType’ values associated with it; the BLOB for ‘resultsType’ 1 is selected, and on the right you can see the bplist.
That bplist can be saved out as a file and then converted to readable text with ‘plutil’. As you can see below beside the red star, the printed text is a clean presentation of that bplist, and it tells us that a ‘resultsType’ of 1 is associated with faces, based on the scoring.
I repeated that process for the remaining pieces of data and wrote a brief description of the results type for each, although my device still had a few types that did not have any results.
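That plutil round trip can also be scripted. Below is a minimal sketch using Python’s plistlib, assuming the BLOBs are standard binary plists as they were on my device; the assetId of 3 and resultsType of 1 are just the example values from above.

import plistlib
import sqlite3

# Pull one results BLOB from mediaanalysis.db and decode the bplist directly.
conn = sqlite3.connect("mediaanalysis.db")
row = conn.execute(
    "select results from results where assetId = ? and resultsType = ?",
    (3, 1),  # assetId 3, resultsType 1 (faces) from the example above
).fetchone()
conn.close()

if row and row[0]:
    # plistlib reads binary plists natively, so no plutil export is required.
    print(plistlib.loads(row[0]))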
After running a SQL query against this database, you can sort the results to see, for example, just the files that have the results types for ‘humanBounds’ and ‘humanConfidence’. The ‘localIdentifier’ column from the ‘assets’ table is a UUID which matches up to the ZGENERICASSET table of Photos.sqlite.
Here is the query for the mediaanalysis.db file. It’s big and ugly, but please test it out if you’re interested. This piece just seems to feed into what we will see later in Photos.sqlite, where it all comes together.
select
a.id,
a.localIdentifier as "Local Identifier",
a.analysisTypes as "Analysis Types",
datetime(a.dateModified+978307200, 'unixepoch') as "Date Modified (UTC)",
datetime(a.dateAnalyzed+978307200, 'unixepoch') as "Date Analyzed (UTC)",
CASE
when results.resultsType = 1 then "Face Bounds / Position / Quality"
when results.resultsType = 2 then "Shot Type"
when results.resultsType in (3, 4, 5, 15, 19, 22, 23, 24, 25, 27, 36, 37, 38, 39, 48) then "Duration / Quality / Start"
when results.resultsType in (6, 7) then "Duration / Flags / Start"
when results.resultsType in (8, 11, 13, 21, 26, 31, 42, 45, 49) then "UNK"
when results.resultsType = 9 then "Attributes - junk"
when results.resultsType = 10 then 'Attributes - sharpness'
when results.resultsType = 12 then "Attributes - featureVector"
when results.resultsType = 14 then "Attributes - Data"
when results.resultsType = 16 then "Attributes - orientation"
when results.resultsType = 17 then 'Quality'
when results.resultsType = 18 then "Attributes - objectBounds"
when results.resultsType = 20 then "Saliency Bounds and Confidence"
when results.resultsType = 28 then "Attributes - faceId / facePrint"
when results.resultsType = 29 then "Attributes - petsBounds and Confidence"
when results.resultsType = 30 then "Various Scoring Values"
when results.resultsType = 32 then "Attributes - bestPlaybackCrop"
when results.resultsType = 33 then "Attributes - keyFrameScore / keyFrameTime"
when results.resultsType = 34 then "Attributes - underExpose"
when results.resultsType = 35 then "Attributes - longExposureSuggestionState / loopSuggestionState"
when results.resultsType = 40 then "Attributes - petBounds and Confidence"
when results.resultsType = 41 then "Attributes - humanBounds and Confidence"
when results.resultsType = 43 then "Attributes - absoluteScore/ humanScore/ relativeScore"
when results.resultsType = 44 then "Attributes - energyValues/ peakValues"
when results.resultsType = 46 then "Attributes - sceneprint/ EspressoModelImagePrint"
when results.resultsType = 47 then "Attributes - flashFired, sharpness, stillTime, texture"
end as "Results Type",
hex(results.results) as "Results BLOB"
from assets a
left join results on results.assetId=a.id
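If you only care about the human detection rows, the same idea can be trimmed down and run from a script. This is a rough sketch rather than a polished parser: it assumes the resultsType mapping above (41 for humanBounds / humanConfidence) holds for your iOS version, and it simply hands each BLOB to plistlib rather than interpreting the keys.

import plistlib
import sqlite3

# Minimal sketch: list assets in mediaanalysis.db carrying humanBounds / humanConfidence results.
conn = sqlite3.connect("mediaanalysis.db")
rows = conn.execute(
    """
    select a.localIdentifier, r.resultsType, r.results
    from assets a
    left join results r on r.assetId = a.id
    where r.resultsType = 41
    """
).fetchall()
conn.close()

for local_id, results_type, blob in rows:
    decoded = plistlib.loads(blob) if blob else None
    # localIdentifier is the UUID that should line up with the asset's record in Photos.sqlite.
    print(local_id, results_type, decoded)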
Before diving into the Photos.sqlite file, I want to first point out the file that records the text strings apparently produced by Apple’s object analysis of the photos. This file stores the text results that power our ability to search text strings in the Photos application and get results back. The text strings are not necessarily the result of any specific user activity, but rather an output of the analysis Apple automatically runs against the user’s media files.
Path: /private/var/mobile/Media/PhotoData/Caches/search/psi.sqlite
The ‘word_embedding’ table within psi.sqlite contains the columns ‘word’ and ‘extended_word’, which are just strings stored in BLOBs. Using DB Browser for SQLite, you can export the table to a CSV and it prints the strings from the BLOBs fairly cleanly. Separately, there is also a table named ‘collections’ with ‘title’ and ‘subtitle’ columns that appear to be a history of the Memories and Categories that have been used, or ones that will be used.
The last table in psi.sqlite to mention for this piece is the ‘groups’ table. Within it, the ‘content_string’ column contains some really interesting data. I initially set out to find just the words “Green Bay”, as that was something populated in my text search for the letter “g”. What I found was far more interesting. I did find “Green Bay”, but in one of the other BLOBs I also found “Green Bay Packers vs. Miami Dolphins”. That BLOB has a little extra flavor added by Apple for me. Whether they simply used the geo coordinates baked into my photos from being at Lambeau Field, or analyzed the content of the photos and found the various Miami Dolphins jerseys - I’m not sure. But it is a very interesting artifact, and an absolutely accurate one, dropped in there for me. Thanks Apple!
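If you would rather pull those strings out programmatically than export CSVs, the sketch below decodes the BLOBs and keeps only the printable runs. The table and column names come straight from my copy of psi.sqlite; the string extraction is deliberately naive, so treat it as a starting point rather than a parser.

import re
import sqlite3

# Minimal sketch: dump readable strings from the psi.sqlite BLOB columns discussed above.
def printable_runs(blob):
    # Decode the BLOB loosely and keep runs of four or more printable characters.
    text = blob.decode("utf-8", errors="ignore")
    return re.findall(r"[ -~]{4,}", text)

conn = sqlite3.connect("psi.sqlite")

for (word,) in conn.execute("select word from word_embedding"):
    if word:
        print("word_embedding:", printable_runs(word))

for (content,) in conn.execute('select content_string from "groups"'):
    if content:
        print("groups:", printable_runs(content))

conn.close()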
Now let’s tackle Photos.sqlite, but only as it pertains to facial recognition and associating photos of people with an actual name - because, quite honestly, this single file is nearly a full-time job if someone wanted to parse every inch of it and maintain that support.
Path: /private/var/mobile/Media/PhotoData/Photos.sqlite
My instance of Photos.sqlite is a beast, weighing in at over 300MB and containing 67 tables packed full of data about my Photos. We are going to focus on two tables - ZDETECTEDFACE and ZPERSON.
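If you want a feel for the scale of the file yourself, enumerating the tables and their row counts is a quick sanity check. A throwaway sketch, assuming a local copy of Photos.sqlite:

import sqlite3

# Minimal sketch: list every table in Photos.sqlite with its row count.
conn = sqlite3.connect("Photos.sqlite")
tables = [r[0] for r in conn.execute(
    "select name from sqlite_master where type = 'table' order by name"
)]
for name in tables:
    count = conn.execute(f'select count(*) from "{name}"').fetchone()[0]
    print(f"{name}: {count} rows")
conn.close()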
ZDETECTEDFACE
This table contains values that indicate features about faces, including an estimate of age, hair color, baldness, gender, eyeglasses, and facial hair. Additionally, there are indicators for whether the left or right eye was closed, and X and Y axis measurements for the left eye, right eye, mouth, and center. The data in this table is extremely granular and was quite fun to work through. Who doesn’t like looking at old photos?
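A quick way to eyeball that granularity is to pull the geometry columns directly. The column names below (ZCENTERX/ZCENTERY, ZLEFTEYEX/ZLEFTEYEY, ZRIGHTEYEX/ZRIGHTEYEY, ZMOUTHX/ZMOUTHY) are my reading of the measurements described above and can shift between iOS versions, so verify them against your own schema before relying on this.

import sqlite3

# Minimal sketch: sample the per-face geometry stored in ZDETECTEDFACE.
# Column names are an assumption based on the description above - check your schema.
conn = sqlite3.connect("Photos.sqlite")
rows = conn.execute(
    """
    select ZASSET, ZCENTERX, ZCENTERY,
           ZLEFTEYEX, ZLEFTEYEY,
           ZRIGHTEYEX, ZRIGHTEYEY,
           ZMOUTHX, ZMOUTHY
    from ZDETECTEDFACE
    limit 10
    """
).fetchall()
conn.close()

for row in rows:
    print(row)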
ZPERSON
This table contains a count of the number of times Apple has been able to identify a certain face in the media files. On my device, I am recognized by name in hundreds of photos, but there are also photos of me where my name has not been associated with my face. For each face identified, a UUID (unique identifier) is assigned, so although the analytics piece may not be able to connect a face with a name, it can still group all identified instances of an unknown face as being the same person.
If an association is made between the person’s face and a saved contact, the BLOB data in the ZCONTACTMATCHINGDICTIONARY column can possibly reveal a full name and phone number. This again can be achieved by printing the bplist to a .txt file.
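To tie the ZPERSON pieces together, here is a small sketch that lists each person’s name and face count and decodes the ZCONTACTMATCHINGDICTIONARY bplist with plistlib instead of plutil. It assumes the BLOB is a standard bplist, as it was on my device; the keys inside the decoded dictionary vary, so it just prints whatever is there.

import plistlib
import sqlite3

# Minimal sketch: names, face counts, and the decoded contact-matching bplist from ZPERSON.
conn = sqlite3.connect("Photos.sqlite")
rows = conn.execute(
    """
    select ZFULLNAME, ZFACECOUNT, ZCONTACTMATCHINGDICTIONARY
    from ZPERSON
    order by ZFACECOUNT desc
    """
).fetchall()
conn.close()

for full_name, face_count, contact_blob in rows:
    contact = plistlib.loads(contact_blob) if contact_blob else None
    print(full_name, face_count, contact)

With that background, here is the query I ran against Photos.sqlite: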
select
zga.z_pk,
zga.ZDIRECTORY as "Directory",
zga.ZFILENAME as "File Name",
CASE
when zga.ZFACEAREAPOINTS > 0 then "Yes"
else "N/A"
end as "Face Detected in Photo",
CASE
when zdf.ZAGETYPE = 1 then "Baby / Toddler"
when zdf.ZAGETYPE = 2 then "Baby / Toddler"
when zdf.ZAGETYPE = 3 then "Child / Young Adult"
when zdf.ZAGETYPE = 4 then "Young Adult / Adult"
when zdf.ZAGETYPE = 5 then "Adult"
end as "Age Type Estimate",
case
when zdf.ZGENDERTYPE = 1 then "Male"
when zdf.ZGENDERTYPE = 2 then "Female"
else "UNK"
end as "Gender",
zp.ZDISPLAYNAME as "Display Name",
zp.ZFULLNAME as "Full Name",
zp.ZFACECOUNT as "Face Count",
CASE
when zdf.ZGLASSESTYPE = 3 then "None"
when zdf.ZGLASSESTYPE = 2 then "Sun"
when zdf.ZGLASSESTYPE = 1 then "Eye"
else "UNK"
end as "Glasses Type",
CASE
when zdf.ZFACIALHAIRTYPE = 1 then "None"
when zdf.ZFACIALHAIRTYPE = 2 then "Beard / Mustache"
when zdf.ZFACIALHAIRTYPE = 3 then "Goatee"
when zdf.ZFACIALHAIRTYPE = 5 then "Stubble"
else "UNK"
end as "Facial Hair Type",
CASE
when zdf.ZBALDTYPE = 2 then "Bald"
when zdf.ZBALDTYPE = 3 then "Not Bald"
end as "Baldness",
CASE
when zga.zlatitude = -180
then 'N/A'
else zga.ZLATITUDE
end as "Latitude",
CASE
when zga.ZLONGITUDE = -180
then 'N/A'
else zga.ZLONGITUDE
end as "Longitude",
datetime(zga.zaddeddate+978307200, 'unixepoch') as "Date Added (UTC)",
ZMOMENT.ztitle as "Location Title"
from zgenericasset zga
left join zmoment on zmoment.Z_PK=zga.ZMOMENT
left join ZDETECTEDFACE zdf on zdf.ZASSET=zga.Z_PK
left join ZPERSON zp on zp.Z_PK=zdf.ZPERSON
where zga.ZFACEAREAPOINTS > 0
Below is a sample of the output of this analysis, paired with the photo the metadata came from. You can see it is able to identify me and my two daughters by name, and accurately assess our genders, my sunglasses, and my facial hair.
Please test, verify, and give me a shout on Twitter @bizzybarney with any questions or concerns.