Image Analysis in SharePoint Online: some first steps

A while ago Microsoft announced their efforts around AI and automation for SharePoint content. You can find a great write-up in Enrich your SharePoint Content with Intelligence and Automation. Part of these efforts concerns Image Analysis: a form of auto-tagging images in your SharePoint environment. It sounds like a great way to make images easier to find again, so it only makes sense to play around with it a bit more.

Image analysis

According to the original blog post, any image that is uploaded to your SharePoint libraries is scanned. The scan uses object recognition to identify objects and extracts any text it finds. This information is then added as metadata to the original file, as shown in the screenshot shared in the original post.

[Screenshot: Image analysis]

Having seen that, and the line: Image Analysis – Create immersive formatting for any list or library with scripting [Available now] I couldn’t wait to start playing with it, but a few things were a bit confusing.

Starting with image analysis on your tenant

The blog clearly states that the feature is available now, so I picked my development tenant and uploaded some images, expecting the fields to be available. However, they were not there yet. I then had a quick look at the search schema to figure out whether the columns were there as metadata instead of list columns; again no luck. On to the next tenant: as it turns out, the columns weren't there either on a new group or team site. Slightly confused, I gave up, only to find one of the columns available the next day on the SharePoint Starter Kit provided by PnP. My first assumption was that it had to do with the fact that the Starter Kit contains a Hub site, but on closer inspection it comes from an entry in the template that is deployed over the Hub site.

In that provisioning template on the Asset Library they add the following field:

<Field ID="{d1cff744-ba61-4189-94d6-97d0a9eb4f6a}" Type="Text" 
    DisplayName="MediaServiceAutoTags" Name="MediaServiceAutoTags" 
    Group="_Hidden" Hidden="FALSE" Sealed="TRUE" 
    ReadOnly="TRUE" ShowInNewForm="FALSE" ShowInDisplayForm="FALSE" 
    ShowInEditForm="FALSE" ShowInListSettings="FALSE" Viewable="FALSE" 
    Json="FALSE" SourceID="{{listid:Site Assets}}" 
    StaticName="MediaServiceAutoTags" ColName="nvarchar14" RowOrdinal="0" />

As you can see on line three, the Hidden property of the field is set to FALSE, and as it turns out it was the only field that was visible. Based on that I did some more searching and retrieved all the related fields that had been created on that tenant, so I could set the Hidden property to FALSE on all of them, resulting in the following template:

<pnp:Fields>

<Field ID="{d1cff744-ba61-4189-94d6-97d0a9eb4f6a}" Type="Text" DisplayName="MediaServiceAutoTags" Name="MediaServiceAutoTags" Group="_Hidden" Hidden="FALSE" Sealed="TRUE" ReadOnly="TRUE" ShowInNewForm="FALSE" ShowInDisplayForm="FALSE" ShowInEditForm="FALSE" ShowInListSettings="FALSE" Viewable="FALSE" Json="FALSE" SourceID="{{listid:Site Assets}}" StaticName="MediaServiceAutoTags" ColName="nvarchar13" RowOrdinal="0" />

<Field ID="{f34611d5-65a6-322f-ac39-d880b14ce28f}" Type="Text" DisplayName="MediaServiceDateTaken" Name="MediaServiceDateTaken" Group="_Hidden" Hidden="FALSE" Sealed="TRUE" ReadOnly="TRUE" ShowInNewForm="FALSE" ShowInDisplayForm="FALSE" ShowInEditForm="FALSE" ShowInListSettings="FALSE" Viewable="FALSE" Json="FALSE" SourceID="{{listid:Site Assets}}" StaticName="MediaServiceDateTaken" ColName="nvarchar14" RowOrdinal="0" />

<Field ID="{617f8947-74b2-36bc-9f7e-21ded7029bb5}" Type="Note" DisplayName="MediaServiceMetadata" Name="MediaServiceMetadata" Group="_Hidden" Hidden="FALSE" Sealed="TRUE" ReadOnly="TRUE" ShowInNewForm="FALSE" ShowInDisplayForm="FALSE" ShowInEditForm="FALSE" ShowInListSettings="FALSE" Viewable="FALSE" Json="TRUE" SourceID="{{listid:Site Assets}}" StaticName="MediaServiceMetadata" ColName="ntext2" RowOrdinal="0" />

</pnp:Fields>
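Setting the Hidden attribute on each retrieved field definition can also be scripted. The following is a minimal sketch, assuming the field definitions are available as XML strings; the function name unhide_field and the shortened sample field (with Hidden="TRUE") are illustrative assumptions, not something taken from the Starter Kit.

```python
import xml.etree.ElementTree as ET


def unhide_field(field_xml: str) -> str:
    """Return the field definition with its Hidden attribute set to FALSE."""
    field = ET.fromstring(field_xml)
    # SharePoint field XML uses upper-case TRUE/FALSE strings for booleans.
    field.set("Hidden", "FALSE")
    return ET.tostring(field, encoding="unicode")


# Hypothetical, shortened field definition used only for illustration.
sample = (
    '<Field ID="{617f8947-74b2-36bc-9f7e-21ded7029bb5}" Type="Note" '
    'Name="MediaServiceMetadata" Hidden="TRUE" />'
)

print(unhide_field(sample))
```

All other attributes are left untouched, so the result can be pasted back into the pnp:Fields element of a template.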

You can add these fields to the asset library to make them available. Once you have done so you can see the values yourself:

[Screenshot: Image analysis on your tenant]

You can apply this to any document or image library to see the values that Microsoft will apply for you. Currently, however, it looks like your tenant needs to be on First Release for this to work. On tenants that have First Release enabled for selected users only, or not enabled at all, the webhook that pushes the values into the columns seems to be missing or not triggering yet. It is also worth noting that the MediaServiceOCR column isn’t working in any of my tenants yet. Having said that, with the above template you can start by enabling the columns on at least one library just to play around with it a bit.

What can you expect from image analysis

You will get four or five columns by default. The first three are working on my tenant and contain the tags, a date and time, and the metadata. Both the tags and the date will only be filled in if the analysis is confident enough. Having said that, you can start using those columns in search results by promoting them to Managed Properties, or use them in your views. The metadata column contains the JSON result that the Analysis Service writes back. It can be used for a bit of debugging so you understand the process. As you can see in the following sample, the JSON contains all tags and their confidence level.

{
    "ctag": "\"c:{1a20e2fb-244e-48dd-94b1-d7d2948c8f5b},3\"",
    "generationTime": "2018-07-27T12:27:37.0011819Z",
    "lastStreamUpdateTime": "2018-07-27T12:27:37.0011819Z",
    "modules": [{
        "module": "ThumbnailGeneration",
        "version": 1
    }, {
        "module": "MetadataExtraction",
        "version": 1
    }, {
        "module": "ObjectDetection",
        "version": 7
    }, {
        "module": "ReverseGeocoding",
        "version": 1
    }, {
        "module": "TextRecognition",
        "version": 2
    }],
    "photo": {
        "width": 2048,
        "height": 2048
    },
    "tags": [{
        "name": "person",
        "localizedName": null,
        "confidence": 0.999962449
    }]
}
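If you want to inspect the tags outside of SharePoint, the metadata JSON is easy to parse. The following is a minimal sketch, assuming you have copied the value of the MediaServiceMetadata column into a string; the function name tags_from_metadata and the min_confidence parameter are my own and not part of any SharePoint API.

```python
import json


def tags_from_metadata(metadata_json: str, min_confidence: float = 0.0) -> list[tuple[str, float]]:
    """Return (name, confidence) pairs from a MediaServiceMetadata JSON string."""
    data = json.loads(metadata_json)
    # "tags" may be missing or null when the analysis found nothing.
    tags = data.get("tags") or []
    return [
        (tag["name"], tag["confidence"])
        for tag in tags
        if tag["confidence"] >= min_confidence
    ]


# Shortened version of the sample above, keeping only the photo and tags parts.
sample = (
    '{"photo": {"width": 2048, "height": 2048}, '
    '"tags": [{"name": "person", "localizedName": null, "confidence": 0.999962449}]}'
)

for name, confidence in tags_from_metadata(sample):
    print(f"{name}: {confidence:.4f}")
```

The min_confidence filter is only there for experimenting; it says nothing about the threshold SharePoint itself applies.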

There is no official documentation yet on what confidence level is required before tags are populated. Uploading the same sample image to both the Azure Analyze an image demo and SharePoint results in somewhat different tags. The image can be found on Pexels so you can reproduce it yourself.

[Image: sample photo from Pexels]

Uploading the image to Azure results in the following tags:

[{
    "name": "sky",
    "confidence": 0.997940838
}, {
    "name": "outdoor",
    "confidence": 0.9965514
}, {
    "name": "building",
    "confidence": 0.9741702
}, {
    "name": "city",
    "confidence": 0.9601095
}, {
    "name": "factory",
    "confidence": 0.7408691
}, {
    "name": "background",
    "confidence": 0.7064705
}, {
    "name": "day",
    "confidence": 0.338326663
}]

While uploading the image to SharePoint results in the following tags:

[{
    "name": "outdoor",
    "localizedName": null,
    "confidence": 0.9857998
}, {
    "name": "city",
    "localizedName": null,
    "confidence": 0.9325022
}]
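To make the difference between the two results easier to see, you can compare them with a few lines of code. This is a minimal sketch that hard-codes the tags and confidence values from the two results above; the variable names are my own.

```python
# Tags and confidence values copied from the Azure result above.
azure_tags = {
    "sky": 0.997940838,
    "outdoor": 0.9965514,
    "building": 0.9741702,
    "city": 0.9601095,
    "factory": 0.7408691,
    "background": 0.7064705,
    "day": 0.338326663,
}

# Tags and confidence values copied from the SharePoint result above.
sharepoint_tags = {
    "outdoor": 0.9857998,
    "city": 0.9325022,
}

# Tags that Azure returned but SharePoint did not.
only_in_azure = sorted(set(azure_tags) - set(sharepoint_tags))

# Confidence difference (Azure minus SharePoint) for tags present in both.
confidence_delta = {
    name: azure_tags[name] - sharepoint_tags[name]
    for name in sorted(set(azure_tags) & set(sharepoint_tags))
}

print("Only in Azure:", ", ".join(only_in_azure))
for name, delta in confidence_delta.items():
    print(f"{name}: Azure is higher by {delta:.4f}")
```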

Final thoughts

As you can see, the confidence levels differ slightly and several tags, including sky, are missing from the SharePoint result. It is safe to say that Microsoft is using a different model for the Azure demo than for SharePoint Online. So the best way to see if this is a fit for your use case is to play around with it on a demo tenant just to see what tags you can expect. The image size does not seem to be a bottleneck, as even 35 MB files are processed properly. So Image Analysis looks like a promising step towards adding additional context to your content in SharePoint Online.
