How to improve Azure: Can you keep a secret?

In this blog series I explore some of the shortcomings of the Windows Azure platform (as of this date, March 2014) and discuss ways it could be improved. This isn’t a rant against the platform: I’ve been using and promoting the platform for more than four (4) years now and I’m very passionate about it. Here I am pointing at problems and suggesting solutions. Feel free to jump in the discussion in the comments section!

   
 

What is a secret in the context of a Cloud Application?

A secret is any credential giving access to something. Do I mean a password? Well, I mean a password, a username, an encryption key, a Shared Access Signature (SAS)… whatever gives access to resources.

A typical Cloud application interacting with a few services accumulates a few of those. As an example:

  • User name / password to authenticate against the Azure Access Control Service (ACS) tied to an Azure Service Bus namespace (you access more than one Service Bus namespace? You’ll have as many credentials as namespaces you interact with)
  • SAS to access a blob container
  • Storage Account access key to access a table in a Storage Account (yes, you could do it with SAS now, but I’m striving for diversity in this example ;) )

All those secrets are used as input to some Azure SDK libraries during the runtime of the application. For instance, in order to create a MessagingFactory for the Azure Service Bus, you’ll need to call a CreateAsync method with the credentials of the account you wish to use.
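
To make that concrete, here is a minimal sketch of what this typically looks like with the Service Bus SDK, inside an async method (the config key names and namespace are made up for the example):

// The secrets come straight out of configuration and live in the
// application's memory: that's the weakness discussed below.
var issuerName = ConfigurationManager.AppSettings["ServiceBusIssuerName"];
var issuerKey = ConfigurationManager.AppSettings["ServiceBusIssuerKey"];

var settings = new MessagingFactorySettings
{
    TokenProvider = TokenProvider.CreateSharedSecretTokenProvider(issuerName, issuerKey)
};

var factory = await MessagingFactory.CreateAsync(
    ServiceBusEnvironment.CreateServiceUri("sb", "my-namespace", string.Empty),
    settings);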

This means your application needs to know the credentials: a weakness right there!

Compare this with the typical way you configure an application on Windows Server. For instance, you want an IIS process to run under a given service account? You ask your favorite sys-admin to punch the service account name & password into the IIS console at configuration time (i.e. not at runtime). The process then runs under that account and the app never needs to know the password.

This might look like a convenience but it’s actually a big deal. If your app is compromised in the Windows Server scenario, there is no way it can reveal the account credentials. In the case of your Azure app, well, it could reveal them. Once a malicious party has the account credentials, it has far more freedom to attack you than if it merely controlled an app running under that account.

But it doesn’t stop there…

Where do you store your secrets in your Azure app? 99% of the time, in the web.config. That makes it especially easy for a malicious party to access them.

Remember, an application deployed in Azure is accessible by anyone. The only thing protecting it is authentication. If you take an application living behind your firewall and port it to the cloud, you just made it much more accessible (which is often an advantage: partners, or even your employees from a hotel room, can reach it without going through the hoops of VPN connections), but you are also forced to store credentials in a less secure way!

On top of that, in terms of management, it’s a bit awkward because it mixes application parameters with secrets. Once a developer creates a deployment package and passes it to the sys-admin (or whoever plays that role; it might be a dev-ops developer, but typically not everyone in the dev group will know the production credentials), they must specify some arbitrary config keys the sys-admin has to override.

So in summary, we have the following issues:

  • The application knows the secrets
  • Secrets are stored in an insecure way in the web.config
  • Secrets are stored alongside other configuration parameters and do not have standard naming (you need to come up with one)

 

Ok. How do we fix it?

This one isn’t easy. Basically, my answer is: in the long run we could, but cloud platforms haven’t reached a mature enough level to implement that today. But we can establish a roadmap and get there one day, with intermediary steps easing the pain along the way.

Basically, the current situation is:


That is, the app gets credentials from an insecure secret store (typically the web.config), then requests an access token from an identity / token provider. It then uses that token to access the resource. The credentials aren’t used any further.

So a nice target solution would be:


Here the application requests the token from Windows Azure (we’ll discuss how) and Azure reads the secrets and fetches the token on behalf of the application. The application never knows the secrets. If the application is compromised, it might still be able to get tokens, but not the credentials. This is comparable to the Windows Server scenario we talked about above.

Nice. Now how would that really work?

Well, it would require a component in Azure, let’s call it the secret gateway, to have the following characteristics:

  • Have access to your secrets
  • Know how to fetch tokens using those secrets (credentials)
  • Have a way to authenticate the application so that only the application can access it

That sounds like a job for an API. Here the danger is to design a .NET-specific solution. Remember that Azure isn’t only targeting .NET: it is able to host PHP, Ruby, Python, node.js, etc. On the other hand, if we move it to something super accessible (e.g. a web service), we’ll have the same problem authenticating the calls (i.e. requirement #3) as the one we started with.

I’m not aiming at a final solution here, so let’s just say that the API would need to be accessible from any runtime. It could be a local web service, for instance. The ‘authentication’ could then be a simple network rule. This isn’t trivial in the case of a free Web Site where a single VM is shared (multi-tenant) between customers. Well, I’m sure there’s a way!
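
To make it concrete, here is an entirely hypothetical sketch of what calling such a local secret gateway could look like from an async method (the port, route and key name are invented for illustration):

// Entirely hypothetical endpoint: the application asks the gateway for a
// token by logical name and never sees the underlying credentials.
using (var client = new HttpClient())
{
    var token = await client.GetStringAsync(
        "http://localhost:8999/secret-gateway/tokens/BUS_SVC_IDENTITY");

    // ... use the token against the Service Bus, Storage, etc.
}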

The first requirement is relatively easy. It would require Azure to define a vault that only the secret gateway has access to. No rocket science here, just basic encryption, maybe a certificate deployed with your application without your knowledge…

The second requirement is where the maturity of the cloud platform becomes a curse. Whatever you design today, e.g. OAuth 2 authentication with SWT or JWT, is guaranteed to be obsolete within 2-3 years. The favorite token type seems to change every year (SAML, SWT, JWT, etc.) and so does the authentication protocol (WS-Federation, OAuth, OAuth 2, XAuth, etc.).

Nevertheless it could be done. It might be full of legacy stuff after 2 years, but it can keep evolving.

I see the secret gateway being configured in two parts:

  • You specify a bunch of key / value pairs (e.g. BUS_SVC_IDENTITY : “svc.my.identity”)
  • You specify token mechanisms and their parameters (e.g. Azure Storage SAS using STORAGE_ACCOUNT & STORAGE_ACCOUNT_ACCESS_KEY)

You could even have a trivial mechanism simply providing you with a secret. The secret gateway would then act as a vault…

We could actually build it today as a separate service if it weren’t for the third requirement.

 

Do you think this solution would be able to fly? Do you think the problem is worth Microsoft putting resources behind it (for any solution)?

Hope you enjoyed the ride!

How to improve Azure

I’m very passionate about Windows Azure. I’ve been using and promoting the platform for more than four (4) years now.

So I’ve been working with the technology for a while, but in recent months I’ve been involved in an intensive architecture project where we pushed the envelope of the platform. As a consequence, we hit quite a few limitations of the platform.

I also had the pleasure of working directly with Microsoft to resolve some of those issues.

In this blog series I will address what remain, to this date (March 2014), limitations of the platform. Instead of whining about them, I will suggest ways Azure could be improved to address those shortcomings. That will be more constructive and should generate some discussion. Feel free to jump in the discussion in the comments section!

Azure ACS fading away

ACS has been on life support for quite a while now.  It was never fully integrated into the Azure Portal, keeping the UI it had in its Azure Labs days (circa 2010, for those who were born back then).

In an article last summer, Azure Active Directory is the future of ACS, Vittorio Bertocci announced the roadmap:  the demise of ACS as Windows Azure Active Directory (WAAD) beefs up its feature set.

In a more recent article about the Active Directory Authentication Library (ADAL), it is explained that ACS didn’t get feature parity with WAAD on refresh token capabilities.  So it has started.

For me, the big question is Azure Service Bus.  The Service Bus uses ACS as its Access Control mechanism.  As I explained in a past blog, the Service Bus has a quite elegant and granular way of securing its different entities through ACS.

Now, what is going to happen to that when ACS goes down?  It is anyone’s guess.

Hopefully the same mechanisms will be transposed to WAAD.

Full Outer Join with LINQ to objects

Quite a few times I’ve found myself looking for a way to perform a full outer join using LINQ to objects.

To give a general enough example of where it is useful, I would say ‘sync’. If you want to synchronize two collections (e.g. two collections of employees), then an outer join gives you a nice collection to work with.

Basically, a full outer join returns a collection of pairs. Every time both items of a pair are present, you are facing an update: the item was present in both collections, so you need to update it to synchronize. If only the first item of the pair is available, you have a creation, while if only the second one is, you have a delete (I’m saying first and second, but it really depends on how you formulated the query; you get the meaning).

Whatever the reason (a sync is the best example I could find), here is the best way I found to do it. It is largely inspired by an answer I found on Stack Overflow.

public static IEnumerable<TResult> FullOuterJoin<TOuter, TInner, TKey, TResult>(
    this IEnumerable<TOuter> outer,
    IEnumerable<TInner> inner,
    Func<TOuter, TKey> outerKeySelector,
    Func<TInner, TKey> innerKeySelector,
    Func<TOuter, TInner, TResult> resultSelector,
    IEqualityComparer<TKey> comparer)
{
    if (outer == null)
    {
        throw new ArgumentNullException("outer");
    }
    if (inner == null)
    {
        throw new ArgumentNullException("inner");
    }
    if (outerKeySelector == null)
    {
        throw new ArgumentNullException("outerKeySelector");
    }
    if (innerKeySelector == null)
    {
        throw new ArgumentNullException("innerKeySelector");
    }
    if (resultSelector == null)
    {
        throw new ArgumentNullException("resultSelector");
    }
    if (comparer == null)
    {
        throw new ArgumentNullException("comparer");
    }

    //  Index both collections by key (passing the comparer so the lookups
    //  and the key union agree on key equality)
    var innerLookup = inner.ToLookup(innerKeySelector, comparer);
    var outerLookup = outer.ToLookup(outerKeySelector, comparer);

    //  All the keys present on either side
    var allKeys = (from i in innerLookup select i.Key).Union(
        from o in outerLookup select o.Key,
        comparer);

    //  For each key, pair up the elements;  DefaultIfEmpty yields a default
    //  value (null for reference types) when one side has no element
    var result = from key in allKeys
                 from innerElement in innerLookup[key].DefaultIfEmpty()
                 from outerElement in outerLookup[key].DefaultIfEmpty()
                 select resultSelector(outerElement, innerElement);

    return result;
}

So here it is and it works.
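
For example, here is a sketch of the ‘sync’ scenario described above (the Employee collections and their Id key are hypothetical; how you map the cases to create / update / delete depends on your sync direction):

// Pair up the 'current' and 'target' collections by Id and classify each pair
var pairs = current.FullOuterJoin(
    target,
    c => c.Id,
    t => t.Id,
    (c, t) => new { Current = c, Target = t },
    EqualityComparer<int>.Default);

foreach (var pair in pairs)
{
    if (pair.Current != null && pair.Target != null)
    {
        // present in both collections:  update
    }
    else if (pair.Target != null)
    {
        // present only in 'target'
    }
    else
    {
        // present only in 'current'
    }
}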

You can easily specialize the signature for common cases (e.g. omitting the comparer, handling two collections of the same type so that only one key selector is required, etc.).
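
For instance, a possible specialization (just a sketch) that defaults the comparer:

public static IEnumerable<TResult> FullOuterJoin<TOuter, TInner, TKey, TResult>(
    this IEnumerable<TOuter> outer,
    IEnumerable<TInner> inner,
    Func<TOuter, TKey> outerKeySelector,
    Func<TInner, TKey> innerKeySelector,
    Func<TOuter, TInner, TResult> resultSelector)
{
    //  Delegate to the full overload with the default key comparer
    return outer.FullOuterJoin(
        inner,
        outerKeySelector,
        innerKeySelector,
        resultSelector,
        EqualityComparer<TKey>.Default);
}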

For performance, I didn’t bother… but I wonder if creating those two lookups isn’t actually slower than doing a cross product (double loop) over both collections and checking for key equality. My gut feeling is that the lookups are probably wasteful for small collections and worth it for big ones; hence, if you optimize, you optimize for small collections, which don’t have performance problems anyway.

Enjoy!

Copy blob using SAS

I have been trying for a couple of days to find an easy way (read:  using tools) to copy blobs in Windows Azure Storage, not by using management keys but using Shared Access Signature (SAS).

Sounds simple enough.  I remembered the AzCopy tool.  I looked around and found a blog post explaining how to use it with SAS, using the DestSAS switch.

I spent hours and I could never make it work.  For starters, AzCopy is designed to copy folders rather than individual files.  But also, I could never get the SAS to work.

After those lost hours, I turned around and looked at the Storage REST API.  It turns out you simply need to do an HTTP PUT in order to write a blob into a container.  If the blob doesn’t exist, the PUT creates it; if it exists, it overwrites it.  Simple?

In PowerShell:

$wc = new-object System.Net.WebClient
$wc.UploadFile(<Blob Address>, "PUT", <Local File Path>)

The Blob Address needs to be the URI containing a SAS.
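
If you prefer doing the same thing from .NET code, here is a rough sketch for an async method (the SAS URI and file path variables are placeholders; note that the Put Blob operation expects an x-ms-blob-type header when creating a block blob):

using (var client = new HttpClient())
using (var content = new StreamContent(File.OpenRead(localFilePath)))
{
    // Required when the PUT creates a new block blob
    client.DefaultRequestHeaders.Add("x-ms-blob-type", "BlockBlob");

    var response = await client.PutAsync(blobAddressWithSas, content);

    response.EnsureSuccessStatusCode();
}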

Enjoy!

Securing Azure Messaging Service Bus access

I am currently working on a very exciting project involving systems integration across the Azure Messaging Service Bus. I thought I would share some of the painfully acquired knowledge nuggets with you.

About 90% of the examples you’ll find on the Internet use the Azure Service Bus SDK with ‘owner’. That is basically ‘admin’ privilege, because owner has read/write AND manage rights on the entire Service Bus namespace.

Although that is fine to get used to the SDK, it isn’t a very secure setting for a production environment. Indeed, if the owner credentials get compromised, the entire namespace is compromised. To top it off, Microsoft recommends not changing the password & symmetric key of the owner account!

So what can we do?

I’ll give you a few guidelines here, but you can read at length in this excellent blog post or watch Clemens Vasters’s video.

Entities in Service Bus (i.e. queues, topics & subscriptions) are modelled as relying parties in a special Azure Access Control Service (ACS) namespace: the Service Bus trusts its buddy ACS, i.e. the one having the same name with -sb appended to it, as a token issuer. So access control happens in that ACS.

You do not have access to that ACS directly, you must go through the Service Bus page:


Once on that ACS, you can find the Service Identities tab:


And there, you’ll find our friend the owner:


So owner is actually a Service Identity in the buddy-ACS of the Service Bus.

Now, let’s look at the relying parties:


As I said, relying parties represent Service Bus entities. Basically, any topic has the realm:

http://<namespace>.servicebus.windows.net/<topic name>

while any subscription has the realm:

http://<namespace>.servicebus.windows.net/<topic name>/Subscriptions/<subscription name>

But there is a twist: if you do not define a relying party corresponding exactly to your entity, ACS will look at the other relying parties, basically chopping off the right-hand side of the realm until it finds a matching one. In the case shown here, since I haven’t defined anything, the root of my namespace is the fallback realm.

If we click on Service Bus, we see the configuration of the Service Identity and at the end:


The permissions are encoded in the rules. A rule is basically an if-then statement: if that user authenticates against this relying party, emit that claim. For Service Bus, the only interesting claim type is net.windows.servicebus.action:


So here you have it. Service Bus performs access control with the following steps:

  1. Check ACS for a relying party corresponding to the entity it’s looking at
  2. If that relying party can’t be found, strip URL parts until a matching one is found
  3. ACS runs the rules of that relying party with the Service Identity of the consumer
  4. ACS returns an SWT token with claims in it
  5. Service Bus looks for the claim corresponding to the action it needs to perform: Listen (receiving messages), Send or Manage

So… if you want to give a specific agent (e.g. a web role) access to send messages on a topic, you create a Service Identity for the agent and a relying party corresponding to the topic. You then enter a rule that emits a Send action and you should be all set.
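
As a sketch (the identity name, key and topic name are placeholders), the agent would then authenticate with its own Service Identity instead of owner:

// The 'orders-sender' identity only gets a Send claim on the 'orders' topic,
// so compromising the agent doesn't expose the whole namespace
var tokenProvider = TokenProvider.CreateSharedSecretTokenProvider("orders-sender", ordersSenderKey);
var address = ServiceBusEnvironment.CreateServiceUri("sb", "my-namespace", string.Empty);
var factory = MessagingFactory.Create(address, tokenProvider);
var topicClient = factory.CreateTopicClient("orders");

topicClient.Send(new BrokeredMessage("some payload"));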

This still requires you to store the agent’s Service Identity secret in the agent itself.

 

Hope this very quick overview gives you some ideas. As mentioned at the beginning, I recommend you read this excellent blog post or watch Clemens Vasters’s video.

WADL in a bottle eating noodles

In my last entry about REST web services I talked about their biggest weakness for me: the lack of a description model for REST services.

The idea of hitting an HTTP endpoint as a shot in the dark is for me quite a leap of faith, and very likely an invitation to spend hours troubleshooting.

But despair no more, enter WADL! If it sounds like WSDL, it’s because it’s essentially the same acronym:

Web Services Description Language -> WSDL

Web Application Description Language -> WADL

So WADL aims to be the WSDL of REST.

But… it was submitted to the W3C in December 2009 by Sun Microsystems… one month before it was acquired by Oracle. Since then, it hasn’t budged… coincidence?

No other party seems to have backed it, so it seems doomed to join the junkyard of unilateral attempts at standardizing global assets!

You can look at an example on Wikipedia.

Maybe we’ll have another standard one day. Or maybe it’s a non-issue and I’m the only one to worry about it.

REST style with Hypermedia APIs

Once upon a time there was SOAP. SOAP really was a multi-vendor response to CORBA. It even shares the same type of acronym, derived from ‘object’. Objects are so 90′s, dude… The S in SOAP stands for Simple, by the way. Have a go at a bare WSDL and try to repeat in your head that it is simple…

Then REST came along. I remember reading about REST back in 2002. It was a little after Roy Fielding‘s seminal article (actually his PhD thesis). Then there were a few articles about how SOAP bastardized the web and how XML RPC was so much better. But like the VHS vs Betamax battle before, the winner wasn’t going to be chosen on technical prowess. At least not at the beginning.

Then I stopped hearing about REST in 2003 and started seeing SOAP everywhere. We implemented it like COM+ interfaces, really. A classic in the .NET community was to throw DataSets on the wire via SOAP services. That really was a great way to misuse a technology… Ah… the youth… (a tear).

Microsoft tried to correct the trajectory by introducing WCF, which enforced, or at least strongly suggested, a more SOA approach with a stronger focus on contracts and making boundaries explicit. But somehow it was too late… something else was brewing beneath the SOA world…

In 2007, REST came back into fashion, but now it was mainstream, i.e. people didn’t understand it, misquoted it and threw it everywhere. Basically, it was: cool man, no more bloody contracts, I just send you an XML document, it’s so much simpler! Which of course works awesomely for 2-3 operations; then you start to get lost without a service repository because there is no explicit documentation!

If you see a parallel with the No-SQL movement (cool man, no more bloody schema, I just throw data in a can without ceremony, it’s so much simpler), I got no idea what you are talking about.

Anyway, if it wasn’t obvious, I’m not at all convinced that REST services solve that many issues by themselves. Ok, they don’t require a SOAP stack, which makes them appealing for a broader reach (read: browsers & mobile). But without the proverbial Word document next to you telling you which service to call to do something, they aren’t that easy to use.

Then, finally, came Hypermedia APIs… I’ve read a few articles about those, including the very good Designing and Implementing Hypermedia APIs by Mike Amundsen. I found in Hypermedia APIs the same magic I found when looking at HTML for the first time: simple, intuitive & useful.

Hypermedia APIs are basically REST web services where you have one (or a few) entry-point operations from which you can find links to the other operations. For instance, a list operation would return a list of items and each item would contain a URL pointing to the detail of that item. Sounds familiar? That’s how a portal (or dashboard) works in HTML.
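
For instance, here is a rough sketch of consuming such an API from an async method (the endpoint and the payload shape are invented for illustration; I parse with Json.NET):

using (var client = new HttpClient())
{
    // The only thing we hard-code is the entry point
    var listJson = await client.GetStringAsync("https://api.example.com/employees");

    foreach (var item in JArray.Parse(listJson))
    {
        // Each item carries the URL of its own detail resource:  we follow
        // the link instead of relying on out-of-band documentation
        var detailJson = await client.GetStringAsync((string)item["url"]);

        // ... process the detail representation
    }
}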

Actually, you already know the best Hypermedia API there is: OData. With OData, you group many entities under a service. The root operation returns you a list of entities with a URL to an operation listing the instances of those entities.

The magic with Hypermedia APIs is that you just need to know your entry points and then the service becomes self-documenting. It replaces a metadata endpoint (à la WSDL) with the service content itself.

The difference between now and the 2000′s when SOAP was developed is that now we really do need Services. We need them to integrate different systems within and across companies.

SOAP failed to deliver because of its complexity but mostly because it’s a nightmare to interoperate (ever tried to get a System.DateTime .NET type into a Java system? Sounds trivial, doesn’t it?).

REST seems easier on the surface because it’s just XML (or JSON). But you do lose a lot: the metadata, but also the WS-* protocols. Ok, it was nearly impossible to interoperate with those, but at least there was a willingness, a push, to standardise on things such as security & transactions. With REST, you’re on your own. You want atomicity across many operations? No worries, I’ll bake that into my services! It won’t look like anything else you’ve ever seen or are likely to see, though.

Mostly, you lose the map. You lose the ability to say ‘Add Web Reference’ and have your favorite IDE pump in the metadata and generate nice strongly typed proxies that show up in Intellisense as you interact with them. Sounds like a gadget, but how much is Intellisense responsible for your discovery of APIs? For me, it must be above 80%.

Hypermedia API won’t give you Intellisense, but it will guide you in how to use the API. If you use it in your designs, you’ll also quickly find out that it will drive you to standardise on representations.

ePub Factory NuGet package

I’ve published this NuGet package.

Ok, so why build yet another ePub library on NuGet when there are already a few?

Well, there aren’t that many actually, and none are Portable Class Libraries (PCL).

So I’ve built an ePub library portable to both Windows 8+ and .NET 4.5.1. Why not Windows Phone? My library is based on System.IO.Compression.ZipArchive, which isn’t available on Silverlight in general. That being said, what would be the use case for generating an ePub archive on a smartphone?

I have in my possession a Kobo Touch (yes, my Canadian fiber got involved when I chose the Kobo). I love to read on it: it is SO much more relaxing for my eyes than a tablet. It’s like reading a book but where I can change the content all the time. You see I use it to read a bunch of technical articles on public transport, so I upload new stuff all the time.

I wanted to automate parts of that process and hence I needed an ePub library. I would like to embed that code in a Windows App at some point (this is mostly pedagogical for me, you see), so I needed something PCL.

Anywho, two technical things to declare:

1. ePub is complicated!

If you ever want to handcraft an ePub, use an ePub validator such as the excellent http://www.epubconversion.com/ePub-validator-iBook.jsp. Otherwise the ePub just doesn’t work, and ePub tools (whether eReaders or Windows Apps) are quite silent about the problems.

The biggest annoyance for me was the spec requirement that the content of the first file in the archive starts at byte 38. That file is the mime type of the ePub and is meant to be a sort of sanity check: no need to open the archive (an ePub is a zip file underneath), a client can simply go to byte 38 and check the ePub mime type is there to validate it has a valid ePub in its hands.

Well, for that you need to write the mime type file first AND not compress it. Apparently that’s too much to ask of System.IO.Compression.ZipArchive. I really needed that library since it works in async mode. So I created a ‘prototype’ ePub file containing only the mime type using another zip library (the excellent DotNetZip) and used that prototype as the starting point of every future ePub!
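
Here is roughly what that prototype step looks like (a sketch using DotNetZip; the file and entry names are mine):

// Write the uncompressed 'mimetype' entry first, as the ePub spec requires,
// then reuse the resulting archive as the seed for every generated ePub
using (var zip = new Ionic.Zip.ZipFile())
{
    zip.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
    zip.AddEntry("mimetype", "application/epub+zip");
    zip.Save("prototype.epub");
}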

2.  My first NuGet package

Yep! So I went easy on myself and downloaded a graphical tool, NuGet Package Explorer.

I didn’t use many NuGet features besides embedding the XML comment file in the NuGet package.

Quite neat!

It’s quite cool to handle packages the NuGet way. You can update them at will completely independently…