In this article I will demonstrate the steps necessary to correctly extract the embedded image MIME entities referenced in the HTML returned by the Microsoft Graph API (v1.0).
Using the Microsoft Graph API we can extract the HTML body of an email given its message id. That HTML could then, in theory, be stored separately and the email reviewed later. Here is a simple example using the Graph Explorer website.
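For readers who would rather script this than use Graph Explorer, here is a minimal sketch of the same request in Python. It assumes the message lives in the signed-in user's mailbox (the /me path), that you already have a valid access token, and that the helper names (message_url, extract_html, fetch_html_body) are my own for illustration and not part of any SDK.

```python
import json
import urllib.parse
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def message_url(message_id: str) -> str:
    """Build the URL for one message, asking only for its body."""
    # Message ids can contain characters such as '=' or '/', so encode them.
    encoded_id = urllib.parse.quote(message_id, safe="")
    return f"{GRAPH_BASE}/me/messages/{encoded_id}?$select=body"


def extract_html(message: dict) -> str:
    """Return the HTML from a message resource, or fail if the body is plain text."""
    body = message["body"]
    if body["contentType"].lower() != "html":
        raise ValueError("message body is not HTML")
    return body["content"]


def fetch_html_body(message_id: str, access_token: str) -> str:
    """Request the message from Graph and return its HTML body (needs network access)."""
    request = urllib.request.Request(
        message_url(message_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_html(json.load(response))
```

Writing the string returned by fetch_html_body to a file gives you the equivalent of marky.html used in the rest of this article.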
Here is the original email
Here is the Graph API representation of that email with the HTML highlighted
Here is the HTML extracted and saved as marky.html
and here is the marky.html file displayed in the browser.
This works well for this kind of email because it is HTML that references external CSS, images, and pictures, so it can be recreated with high fidelity.
This is unfortunately not the case when you send internal email through Outlook and other email clients.
Generally, when email is sent directly from a mail client it contains embedded MIME images, and when you download the HTML for that message you get Content-ID (cid:) references like this.
The original email sent with an inline image:
The HTML generated from the Graph API contains a “cid” reference
which ultimately leads to a failed image as there is now no real reference to the image:
The image reference is src="cid:image001.png@01D1C6EE.58865610", and that is the reference we will use in the next section.
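If you want to find these references programmatically, a small regular expression is enough for simple cases. This is only a sketch: the function name find_content_ids is hypothetical, and it assumes the src attribute is quoted, as it is in the example above.

```python
import re

# Matches src="cid:..." or src='cid:...' and captures the part after "cid:".
CID_SRC = re.compile(r"""src\s*=\s*["']cid:([^"']+)["']""", re.IGNORECASE)


def find_content_ids(html: str) -> list[str]:
    """Return the distinct cid values referenced by src attributes, in document order."""
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(CID_SRC.findall(html)))


sample = '<img width="200" src="cid:image001.png@01D1C6EE.58865610">'
print(find_content_ids(sample))
```

For the sample above this prints a list containing the single value image001.png@01D1C6EE.58865610.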
Getting the image
We have to use the Graph API to retrieve the images in a different way. The images are actually recorded as “Attachments” within the Graph API and are accessible via the API in the following manner.
What we get back from that request looks like this with the cid reference in the contentId.
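As a rough sketch of this step in code, the snippet below builds the attachments URL for a message and picks out the attachment whose contentId matches the cid reference. It assumes the /me mailbox path and the usual Graph collection shape (a "value" array of attachment objects); the function names attachments_url and find_attachment are mine, chosen for illustration.

```python
import urllib.parse

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def attachments_url(message_id: str) -> str:
    """Build the URL that lists the attachments of one message."""
    encoded_id = urllib.parse.quote(message_id, safe="")
    return f"{GRAPH_BASE}/me/messages/{encoded_id}/attachments"


def find_attachment(response: dict, content_id: str) -> dict:
    """Return the attachment whose contentId equals the cid taken from the HTML."""
    for attachment in response.get("value", []):
        if attachment.get("contentId") == content_id:
            return attachment
    raise KeyError(f"no attachment with contentId {content_id!r}")
```

The dictionary passed to find_attachment is the parsed JSON from a GET request to attachments_url, sent with the same Authorization header as any other Graph call.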
The contentBytes value of the attachment is actually the base64-encoded version of the image originally referenced in the email. If we take that base64-encoded value (in this case it was 96k in size) and use it to replace the image source in marky.html, it looks like this:
and when we view marky.html we now see the image.
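The manual replacement can be sketched in a few lines of Python. I am assuming here that the image is embedded as a data: URI built from the attachment's contentType and contentBytes, which is the standard way to inline base64 image data in HTML; inline_attachment is a hypothetical helper name.

```python
def inline_attachment(html: str, attachment: dict) -> str:
    """Replace a cid: reference with a data: URI built from the attachment."""
    # contentBytes is already base64, so it can be dropped into the URI unchanged.
    data_uri = f"data:{attachment['contentType']};base64,{attachment['contentBytes']}"
    return html.replace(f"cid:{attachment['contentId']}", data_uri)


attachment = {
    "contentId": "image001.png@01D1C6EE.58865610",
    "contentType": "image/png",
    "contentBytes": "aGVsbG8=",  # placeholder bytes, not a real image
}
html = '<img src="cid:image001.png@01D1C6EE.58865610">'
print(inline_attachment(html, attachment))
```

Because the whole image ends up inside the HTML, the saved file grows by roughly the size of the base64 string for every image you inline.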
In this article we have looked at where all the information can be found to get the images for storage outside of the Microsoft Graph, but we have not looked into the complications of how you would go about automating this process. That would require a potentially large number of requests (depending on the number of embedded images) and would also be time consuming.
But, the information is out there if you know where to look.