Using RegEx to extract MIME parts from a Microsoft Graph API email stream

In this article I will demonstrate how a Regular Expression can be used to extract all MIME content references from within an HTML Stream.

Introduction

In the previous article we looked at how the base64-encoded version of an embedded MIME image can be extracted from the Microsoft Graph. In this article we will start to automate that process by identifying all the MIME-encoded image references within the Graph API HTML stream.

Regular expressions

RegEx is mentally challenging for most of us, and that is why some beautiful people created Stack Overflow and Google. I was able to find this solution to a similar problem of extracting tag attributes from an HTML string.

Modifying this slightly for my needs in the context of the Graph, I was able to come up with the following:

var content = data.body.content;
var cids = content.match(/cid["']?((?:.(?!["']?\s+(?:\S+)=|[>"']))+.)?/g);

This takes an HTML stream containing different kinds of MIME reference and returns them all as an array.

Here is the sample HTML string with just the images highlighted:

c1

And here is the result of the test in Firebug, logging the cids array:

c2
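For anyone who wants to try this without a live Graph response, here is a minimal sketch that wraps the same regular expression in a function and runs it against a made-up HTML string. The function name extractCids and the sample HTML are my own assumptions for illustration; only the regular expression itself comes from the code above.

```javascript
// Same regular expression as above, wrapped in a small helper.
// extractCids is a hypothetical name, not part of the Graph API.
var CID_PATTERN = /cid["']?((?:.(?!["']?\s+(?:\S+)=|[>"']))+.)?/g;

function extractCids(html) {
  // match() returns null when nothing is found, so fall back to an empty array
  return html.match(CID_PATTERN) || [];
}

// Made-up sample body with two embedded image references
var sampleHtml =
  '<p>Hello</p>' +
  '<img src="cid:image001.png@01D1C6EE.58865610">' +
  '<img src="cid:image002.jpg@01D1C6EE.58865610" alt="logo">';

console.log(extractCids(sampleHtml));
// logs [ 'cid:image001.png@01D1C6EE.58865610', 'cid:image002.jpg@01D1C6EE.58865610' ]
```

Note that the pattern matches on the letters "cid" anywhere in the string, so ordinary words containing those letters in the body text would also be picked up.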

 

Conclusion

In this article we have seen that with a simple Regular expression we can extract the Image src attributes relating to the MIME parts within the MS Graph API feed.

Caveat: This assumes a lot about the structure of the API response and that it will continue to conform to this structure.

 

IBM Cloud Champion 2016-17

I am very flattered and honoured to be awarded in the first ever group of IBM Cloud Champions. It has been a really great year learning about IBM Bluemix and working with IBM ICS dev team on the XPages runtime capabilities.

I want to say thank you to those people who nominated me and encouraged me to look into Bluemix.

Thanks also need to go to PSC Group for giving me the job I love doing and the opportunity to look into cloud and other modern technologies. 

Where are the MIME-embedded images in a Microsoft Graph REST API call?

In this article I will demonstrate the steps necessary to correctly extract the embedded image MIME entities from within the HTML provided from the Microsoft Graph API (v1.0).

Introduction

Using the Microsoft Graph API we are able to extract the HTML body from an email given the message id. That HTML could then, in theory, be stored separately and the email reviewed later. Here is a simple example using the Graph Explorer example website.

https://graph.microsoft.com/v1.0/users('GUID')/messages/messageId

Here is the Original email

h1

Here is the graph API representation of that email with the HTML highlighted

h2

Here is the HTML extracted and saved as marky.html

h3

and here is the marky.html file displayed in the browser.

h4

This works well for this kind of email because it is HTML and references external CSS images and pictures – it can be recreated with a high confidence in fidelity.

This is unfortunately not the case when you send internal email through Outlook and other email clients.

Embedded MIME

Generally, when an email is sent directly from a mail client it contains embedded MIME images, and when you download the HTML for that message you get Content-ID (cid:) references like this.

The original email sent with an inline image:

h5

The HTML generated from the Graph API contains a “cid” reference

h6

which ultimately leads to a failed image as there is now no real reference to the image:

h7

(sad face)

The image reference is src="cid:image001.png@01D1C6EE.58865610", and we will use that reference in the next section.

Getting the image

We have to use the Graph API to retrieve the images in a different way. The images are actually recorded as "Attachments" within the Graph API and are accessible via the API in the following manner.

https://graph.microsoft.com/v1.0/users('GUID')/messages/messageId/attachments

What we get back from that request looks like this with the cid reference in the contentId.

h8
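As a rough sketch, the same request could be made from JavaScript along these lines. The function names buildAttachmentsUrl and getAttachments are hypothetical, and the code assumes that you already have a valid access token and that fetch is available (a modern browser or a recent version of node.js). Graph collection responses carry their items in a value array, which is what is returned here.

```javascript
// Hypothetical helper: builds the attachments URL in the same form as shown above.
function buildAttachmentsUrl(userId, messageId) {
  return "https://graph.microsoft.com/v1.0/users('" +
    encodeURIComponent(userId) + "')/messages/" +
    encodeURIComponent(messageId) + "/attachments";
}

// Hypothetical helper: requests the attachments for one message.
// accessToken is assumed to have been obtained elsewhere.
async function getAttachments(userId, messageId, accessToken) {
  var response = await fetch(buildAttachmentsUrl(userId, messageId), {
    headers: { Authorization: 'Bearer ' + accessToken }
  });
  if (!response.ok) {
    throw new Error('Graph request failed with status ' + response.status);
  }
  var data = await response.json();
  // Each item should carry contentId and contentBytes, as in the response above
  return data.value || [];
}
```

Each attachment returned this way can then be matched to the cid reference in the HTML by its contentId.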

The contentBytes value of the attachment is actually the base64-encoded version of the image originally referenced in the email. If we take that base64-encoded value (in this case it was 96k in size) and use it to replace the image source in marky.html, it looks like this:

src="data:image/jpeg;base64,contentBytesHERE"

h9

and when we view marky.html we now see the image.

h10
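The manual replacement above could be sketched in code roughly as follows. This is only an illustration under some assumptions: inlineCidImages is a hypothetical name, each attachment object is assumed to have the contentId and contentBytes properties seen in the response, and contentType is used for the data URI when present, falling back to image/jpeg as in the example above. It also assumes the contentId matches the text after "cid:" exactly.

```javascript
// Hypothetical helper: swaps every cid: reference in the HTML for a data URI
// built from the matching attachment's base64 content.
function inlineCidImages(html, attachments) {
  var result = html;
  attachments.forEach(function (att) {
    // Skip attachments that cannot be matched to a cid reference
    if (!att.contentId || !att.contentBytes) {
      return;
    }
    var dataUri = 'data:' + (att.contentType || 'image/jpeg') +
      ';base64,' + att.contentBytes;
    // split/join replaces every occurrence without regex escaping concerns
    result = result.split('cid:' + att.contentId).join(dataUri);
  });
  return result;
}

// Made-up example values, not a real image
var exampleHtml = '<img src="cid:image001.png@01D1C6EE.58865610">';
var exampleAttachments = [{
  contentId: 'image001.png@01D1C6EE.58865610',
  contentType: 'image/png',
  contentBytes: 'iVBORw0KGgo='
}];

console.log(inlineCidImages(exampleHtml, exampleAttachments));
// logs <img src="data:image/png;base64,iVBORw0KGgo=">
```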

Conclusion

In this article we have looked at where all the information can be found to get the images for storage outside of the Microsoft Graph, but we have not looked into the complications of how you would go about automating this process. That would require a potentially large number of requests (based on the number of embedded images) and would also be time consuming.

But, the information is out there if you know where to look.

 

 

Setting up a secure, custom domain, node.js site on Azure

In this article I will demonstrate the necessary steps to set up a node.js server running https, hosted in Azure.

Introduction

This article is a combination of my own work and a conglomeration of reference point blog articles which I had to find to achieve all of this.

Creating a node.js site on Azure

If you follow the instructions on this site you should be able to create an Azure site (Get started with Node.js web apps in Azure App Service).

a1

Creating a custom domain

Once you have registered your new domain (in my case marky.co) you need to go to the Azure portal and follow the instructions posted here (Configuring a custom domain name for an Azure cloud service). You cannot do this on the free tier, though, and this is where you have to choose your plan carefully. To be able to interact with Office Add-ins I need my service to be SSL enabled.

a2 a3

Once you have selected a Basic plan you should get the following options

a4

Assign your site and, as the instructions state, you can "Bring your domain" by changing the CNAME record within your domain name provider's DNS management tools.

a5

Adding SSL

There are a number of methods for getting an SSL certificate, but I have taken to doing it for free. You can use the same process I detailed here for exposing your node server to manually collect the Let's Encrypt certificates (Using Let’s Encrypt to create an SSL certificate for my Bluemix hosted web site) to create the .pem files.

To turn the .pem files into a .pfx file you need to follow the openssl instructions here (How To: Get LetsEncrypt working with IIS manually).

openssl pkcs12 -export -out "certificate.pfx" -inkey "privkey.pem" -in "cert.pem" -certfile chain.pem

a6

The certificate.pfx file can then be loaded into the Azure portal. When you import the certificate successfully it is displayed on the main blade automatically.

a7

 

Add the SSL binding

a71

aaaah we love the cloud….

a8

a9

IMPORTANT – Restart your instance and there we go

a10

Conclusion

In this article we have seen how to deploy an instance of node.js on Azure, apply a custom domain to it, create an SSL certificate and add it to the Azure instance. Once this is complete you should have an SSL-secured node.js instance running, which can then be used for Office Add-in deployments.