Sunday, October 27, 2013

Invalid credentials : OmniAuth + oAuth2 + Rails 4 encrypted cookie store + simultaneous requests

OmniAuth is a very well-known gem in the Ruby/Rails world. Almost every Rails application out there is probably using it to authenticate with one of the various mechanisms it supports. OmniAuth is just awesome!

I have been using OmniAuth for about 3 years now in various Rails projects and it has worked very well, although I have had to monkey patch it once or twice to allow me to exploit some of the Facebook features (like authenticated referrals). But all in all, it just works as advertised and you will have an authentication system in pretty much no time at all.

Yesterday, however, I was having quite a hard time getting OmniAuth to do a simple "Login with Facebook" oAuth2 authorization. It was something that had worked seamlessly on innumerable previous occasions, but yesterday it just kept failing repeatedly, succeeding only once in a while. And it always failed with the same obscure error, "Invalid Credentials", during the callback phase. (OmniAuth operates in three phases : Setup, Request and Callback. Apart from the OmniAuth wiki on GitHub, this is a good place to read about it : http://www.slideshare.net/mbleigh/omniauth-from-the-ground-up). The fact that the error message was not so helpful made the whole process a lot more frustrating. After some hunting on the inter-webs, I found that the culprit could be a bad "state" parameter.

Wait, what is this "state" parameter?

Background : oAuth2 CSRF protection


oAuth2 specifies the use of a non-guessable secure random string as a "state" parameter to prevent CSRF attacks on oAuth. More details here : http://tools.ietf.org/html/rfc6749#section-10.12. This came out almost a year back and many oAuth providers, including Facebook, already implement it. OmniAuth also implemented it last year. Although not written with the best grammar, this article will tell you why this "state" parameter is needed and what happens without it.

To sum it up, the oAuth client (our web application) creates a random string, stores it in an accessible place and also sends it to the oAuth provider (Ex : Facebook) as "state" during the request phase. Facebook keeps it, authenticates the end user, asks for permissions and, when granted, sends a callback to our web application by redirecting the user back to our website with the "state" as a query parameter. The client (our web application) compares the "state" sent by the provider with the one it stored previously and proceeds only if they match. If they don't, then there is no proof that the callback our web application received is actually from the provider. It could be from some attacker trying to trick our web application into thinking (s)he is someone else.

In the case of OmniAuth oAuth2, this state parameter is stored as a property in the session, under the key 'omniauth.state', during the request phase. The result of the request phase is a redirect to the provider's URL. The new session with the "state" stored in it is set on the client's browser when it receives this redirect (302) response for the request /auth/:provider (this is the default OmniAuth route to initiate the request phase). After the provider (Facebook) authenticates the user and the user authorizes our application, the provider makes a callback to our application by redirecting the user back to our web application at the callback URL /auth/:provider/callback, along with the "state" as a query parameter. When this callback URL is requested by the browser, the previously stored session cookie containing the 'omniauth.state' property is also sent to our web application.

OmniAuth checks both of these and proceeds only if they match. If they don't match, it raises the above-mentioned "Invalid Credentials" error. (Yeah, I know, not really a helpful error message..!)
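
In code, the whole round trip looks roughly like this. This is a minimal sketch of the idea inside a Rails controller, not OmniAuth's actual implementation, and the placeholder values are mine :

require 'securerandom'

# Request phase : generate the state, remember it in the session and
# send it along to the provider.
state = SecureRandom.hex(24)
session['omniauth.state'] = state
redirect_to 'https://www.facebook.com/dialog/oauth?client_id=YOUR_APP_ID' \
            "&redirect_uri=YOUR_CALLBACK_URL&state=#{state}"

# Callback phase : the provider echoes the state back as a query
# parameter; proceed only if it matches what the session carried.
if params[:state].to_s.empty? || params[:state] != session.delete('omniauth.state')
  raise 'Invalid Credentials' # effectively what OmniAuth ends up reporting
end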

Ok, that is good to hear, but why will there be a mismatch?


A mismatch is possible only if the session cookie stored on the user's browser is changed such that the 'omniauth.state' property is removed or altered after the request phase has set it. This can happen if a second request to our web application is initiated while the request phase of oAuth is running, and it completes after the request phase completes but before the callback phase starts. Sounds complex? The diagram below illustrates it.

[Diagram : timeline of the oAuth request and callback phases, with the "other simultaneous request" completing in between and clobbering the session cookie]
The diagram makes it clear when and how the 'omniauth.state' gets removed from the session, leading to the error. However, apart from the timing requirements (i.e. when requests start and end), there is another essential criterion for this error to occur :
The response of the "other simultaneous request" must set a new session cookie, overriding the existing one. If it does not explicitly specify a session cookie in the response headers, the client's browser will retain the existing cookie and 'omniauth.state' will be preserved in the session.
Now, from what I have observed, Rails (or one of the Rack middlewares) has this nifty feature of not serializing the session and not setting the session cookie in the response headers if the session has not changed in the course of processing a request. So, in our case, if the intermediate simultaneous request does not make any changes to the session, Rails will not explicitly set the session cookie, thereby preventing the loss of the 'omniauth.state' property in the session.
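
Here is a toy illustration of that optimization, with made-up names; Rails/Rack does something equivalent internally when committing the session at the end of a request :

require 'json'
require 'digest'

# Write the session cookie into the response headers only if the
# session data changed during the request.
def commit_session(session, digest_at_load, headers)
  return headers if Digest::SHA1.hexdigest(session.to_json) == digest_at_load
  headers['Set-Cookie'] = "_app_session=#{session.to_json}; path=/"
  headers
end

session = { 'omniauth.state' => 'abc123' }
digest_at_load = Digest::SHA1.hexdigest(session.to_json)

# The request handler never touched the session :
headers = commit_session(session, digest_at_load, {})
puts headers.key?('Set-Cookie') # => false, the browser keeps its existing cookie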

Ok, then why will the session cookie change and lose the 'omniauth.state' property?


One obvious thing is that the "other simultaneous request" might change the session - either add or remove or edit any of the properties. There is, however, another player involved.

This is where the "Encrypted Cookie Store" of Rails 4 comes into the picture. Prior to Rails 4, Rails did not encrypt its session cookie. It merely signed it and verified the signature when it had to de-serialize the session from the request cookie. Read how Rails 3 handles cookies for a detailed breakdown. Rails 4 goes one step further and encrypts the session data with AES-256, along with the old signing mechanism (more details on that coming up in a new post). The implementation used is AES-256-CBC from OpenSSL. I am not a Cryptography expert, but AES-CBC itself is deterministic for a given key and initialization vector; the reason the output keeps changing is that the Rails encryption scheme initializes the encryptor with a new random initialization vector every time it encrypts a session (Implementation here). Consequently, the session cookie contents are new on every request, even when the actual session object or session contents remain unchanged. As a result Rails always sets the session cookie in the response headers for every request, and the browser updates that cookie in its cookie store every time.
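
You can see this behaviour with Ruby's OpenSSL bindings directly. A standalone sketch (not the actual Rails code path) showing that the same plaintext encrypts to different bytes each time because of the fresh random IV :

require 'openssl'
require 'securerandom'

key = SecureRandom.random_bytes(32) # a 256-bit key, like Rails derives from its secret

# Encrypt the same plaintext twice, with a fresh random IV each time,
# which is what the Rails 4 encryptor does on every session write.
ciphertexts = 2.times.map do
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # generate and use a new random IV
  iv + cipher.update('same session data') + cipher.final
end

# Identical plaintext and key, yet different cookie bytes every time :
puts ciphertexts[0] == ciphertexts[1] # => false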

In our case, this results in the session being clobbered at the end of the "other simultaneous request" : we end up losing the 'omniauth.state' property, and oAuth fails.

Umm.. ok, but when and how does this happen in the real world, if at all it can?!


All these requirements/constraints described above, especially the timing constraints, make one wonder if this can really happen in the real world. Well, for starters, it happened to me (hence this blog post..!!). I also tried to think of scenarios other than mine where this would happen. Here are a couple that I could think of :

Scenario - 1) FB Login is in a popup window and the "simultaneous request" is an XHR - Ex : an analytics or tracking request.

Here is the flow :
  1. User clicks on a "Login with FB" button on your website.
  2. You pop up the FB Login page in a new popup window. The request phase is initiated, but there is a small window of time before the redirect response for '/auth/facebook' is received and 'omniauth.state' is set.
  3. During that small window of time, in the main window, you send an XHR to your web app to, maybe, track the click on the "Login with FB" button. You might do this to track usage, or for some A/B testing, or to build a funnel, etc. This request sends the session without the 'omniauth.state'.
  4. While the XHR is in progress, the redirect from the request phase completes and the session with 'omniauth.state' is set. The user now sees the FB Login page loading and proceeds to login once it is loaded.
  5. While the user is logging in to FB and approving our app, the XHR completes and comes back with a session without 'omniauth.state'. This is stored by the browser now.
  6. Once the user logs in and approves your app, the callback phase starts. But the session sent to your web app is now missing 'omniauth.state'.
  7. oAuth fails.
How big a deal is this scenario?

If you are indeed making an XHR in the background, then this scenario needs to be taken care of. Since the "other simultaneous" request is automatically triggered every time, it is very likely that the session will get clobbered.

How to solve this?

You can first send the XHR and then, in its response handler, open the FB Login page in the popup. Also have a timeout, just to make sure you don't wait too long (or forever) for a response to that XHR.

Alternatively, if you can push your tracking events into a queue stored in a cookie, do that and then open the FB Login page. Once the FB Login completes, pull the event out of the queue and send it. As a backup, have code that runs on every new page load, looks for pending events in the queue in the cookie and sends those events.

With HTML5 in place, it's probably better to use localStorage for the queue than a cookie, provided your target browsers support it. Your call.

Scenario - 2) FB Login is in the same window/tab but the user has the website opened in two tabs.

Here is the flow :
  1. User has your website opened in a browser tab - Tab-1.
  2. User opens a link from your website in a second tab - Tab-2 (Ctrl + Click or the 'Open in a new tab' menu item). This request sends the session without 'omniauth.state'.
  3. While Tab-2 is loading, the user clicks on "Login with FB" in Tab-1, initiating the request phase.
  4. If the request loading in Tab-2 is a little time-consuming, the redirect of the oAuth request phase in Tab-1 completes before the request in Tab-2 does, setting the session with 'omniauth.state'. After that the FB Login page is shown and the user proceeds to login and authorize.
  5. While the user is logging in, the request in Tab-2 completes, but with a session that is missing 'omniauth.state'.
  6. After logging in to FB, the callback phase is initiated with a redirect to your web app, but with a session that doesn't have 'omniauth.state'.
  7. oAuth fails.
How big a deal is this scenario?

Not a big deal actually. In your web app's oAuth failure handler, you can just redirect the user back to /auth/facebook, redoing the whole process - and guess what, this time it will succeed, and without the user having to do anything, because the user is already logged in to FB and has already authorized your app. But just to be on the safer side, you want to make sure this loop doesn't go infinite (i.e. you start FB auth, it fails, and the failure handler restarts FB auth). Setting a cookie (different from the session cookie) with the attempt count should be good enough. If the attempt count crosses a certain limit, send the user back to the homepage, or show an error page, or show a lolcats video - c'mon, be creative. See the sketch below.
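
Something along these lines should do. A minimal sketch; the route, cookie name and attempt limit here are all mine, not from OmniAuth :

# Assuming OmniAuth's failure redirect is routed here, e.g. :
#   get '/auth/failure', to: 'auth_failures#show'
class AuthFailuresController < ApplicationController
  MAX_ATTEMPTS = 2

  def show
    attempts = cookies[:fb_auth_attempts].to_i
    if attempts < MAX_ATTEMPTS
      cookies[:fb_auth_attempts] = attempts + 1
      # Redo the whole flow; the user is already logged in to FB and has
      # authorized the app, so this round trip needs no interaction.
      redirect_to '/auth/facebook'
    else
      cookies.delete(:fb_auth_attempts)
      redirect_to root_path # or an error page, or that lolcats video
    end
  end
end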

Ok, those are two scenarios that I could think of. I am not sure if there are more.

Can OmniAuth change something to solve this?


I believe so. If OmniAuth used a separate signed and/or encrypted cookie to store the state value, instead of the session cookie, none of this session clobbering would result in the loss of the state value. OmniAuth is a Rack-based app and relies on the Session middleware. I am not entirely sure, but it could probably use a cookie of its own instead : just set its own '_oa_state' cookie and use that during the callback for verification.
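
Purely as a hypothetical sketch of the idea (none of this is actual OmniAuth code, and the '_oa_state' cookie name is made up) :

require 'rack'
require 'securerandom'

# Request phase : redirect to the provider, carrying the state in a
# dedicated cookie rather than in the session.
state = SecureRandom.hex(24)
response = Rack::Response.new([], 302, { 'Location' => "https://provider.example/oauth?state=#{state}" })
response.set_cookie('_oa_state', value: state, path: '/auth', httponly: true)

# Callback phase (given a Rack::Request `request`), verify against that
# cookie, so a clobbered session cookie no longer matters :
#   request.params['state'] == request.cookies['_oa_state']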

Will you send a pull request making this change?


I am not sure. I will first hit the OmniAuth mailing list and find out what the wise folks there have to say about this. If it makes sense and nobody in the awesome Ruby community provides an instant patch, I will try and send a patch myself.

THE END
 
Ok, so that was the awesome ride through the oAuth workings inside the OmniAuth gem. Along the way I got to know quite a bit of Rails and Ruby internals. Looking forward to writing posts about those too. Okay, okay.. fine. I will try and keep those posts short and not make them this long..!!

Till then, happy oAuthing. :-/ !

P.S : Security experts, excuse me if I have used "authentication" and "authorization" in the wrong places. I guess I have used them interchangeably, as web applications typically do both with oAuth2.

Wednesday, October 16, 2013

Creating Wildcard self-signed certificates with openssl with subjectAltName (SAN - Subject Alternative Name)

For the past few hours I have been trying to create a self-signed certificate covering all the sub-domains of my staging setup, using a wildcard subdomain.

There are a lot of guides and tutorials on the internet which explain the process of creating a self-signed certificate using openssl in a good amount of detail. There are also guides to create a self-signed cert for a wildcard domain. It's fairly easy : you just specify that your Common Name (CN) a.k.a FQDN is *.yourdomain.com while creating the certificate signing request (CSR).

This will take care of all of your sub-domains under yourdomain.com (like www.yourdomain.com or mail.yourdomain.com); however, your top-level bare domain (yourdomain.com) itself is not covered by this certificate. When you use a certificate generated by specifying *.yourdomain.com, browsers will throw an error when you hit your server at the top-level domain name https://yourdomain.com/.

To address this, the X.509 certificate standard allows for a type of extension named subjectAltName (http://en.wikipedia.org/wiki/SubjectAltName). Using this you can specify a few other domain names for which the certificate is valid.

This requires specifying the use of this extension while generating the certificate request AND while signing the certificate. To do this you will have to add a few things to your openssl configuration file (typically /etc/ssl/openssl.cnf on an Ubuntu-like machine). Alternatively you can copy the config file to another location, add the extension configuration there and then specify the new config file in all your openssl commands. The commands shown below assume the default config file at /etc/ssl/openssl.cnf was updated with the extension details.

Here is one blog post which details the updates needed for the openssl config file : http://grevi.ch/blog/ssl-certificate-request-with-subject-alternative-names-san. It also has commands for generating the private key, converting the key to a format which does not ask for a password (a.k.a an unencrypted key), generating the certificate request (CSR) and finally signing the certificate. The steps mentioned there up until the generation of the certificate request are correct. It is the last step of signing the certificate that is missing one small piece of information, because of which the extension mentioned above (subjectAltName) doesn't get added to the final certificate despite being present in the certificate request.

After a lot of searching on the internet, copy-pasting the commands exactly and trying my luck on IRC (irc.freenode.net#openssl), the answer finally appeared to me in the man pages (duh..! RTFM dude..!!). The man page for the x509 command of openssl (man x509) has this little entry :

-extfile filename
           file containing certificate extensions to use. If not specified then no extensions are added to the certificate.

So it turns out that just specifying the extensions in the openssl config file is not sufficient; you must also pass the same file on the command line as the file containing the extensions, using the -extfile option above.

With that added, you will get a self-signed certificate for your wildcard subdomain which is also valid for your top level bare domain.

Changes made to /etc/ssl/openssl.cnf

Uncomment the req_extensions = v3_req
req_extensions = v3_req # The extensions to add to a certificate request


Add subjectAltName to the v3_req section

[ v3_req ]

# Extensions to add to a certificate request

basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

Finally, add the alternate names for which you want this certificate to be valid. Here it would be your top-level bare domain.

[alt_names]
DNS.1 = yourdomain.com

This last section [alt_names] will not be present in the file. You can add it right after the [ v3_req ] section.

Once the config update is done, create your certificate.
Here are the commands that I used :

Create a private key :
openssl genrsa -des3 -out ssl/staging/yourdomain.com.key 2048

This will ask you for a password. Key in a simple one and remember it.

Convert the private key to an unencrypted format

openssl rsa -in ssl/staging/yourdomain.com.key -out ssl/staging/yourdomain.com.key.rsa

This will ask you for a password. Key in the same thing that you used in the previous step.

Create the certificate signing request

openssl req -new -key ssl/staging/yourdomain.com.key.rsa -out ssl/staging/yourdomain.com.csr

This will ask you for a bunch of fields. Enter *.yourdomain.com when it asks for the Common Name (or FQDN). The fields after that can be left blank by just hitting the return (Enter) key.

Sign the certificate with extensions

openssl x509 -req -extensions v3_req -days 365 -in ssl/staging/yourdomain.com.csr -signkey ssl/staging/yourdomain.com.key.rsa -out ssl/staging/yourdomain.com.crt -extfile /etc/ssl/openssl.cnf

Note that here we specify the openssl config file as the file containing the extensions, as that is where we defined them. We could probably put the extensions in a separate file too, but I haven't tried that. (Homework?! :P)

That's it. Now you have a self-signed wildcard subdomain certificate which is valid for your top-level domain too.
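
If you want to double-check that the subjectAltName entries actually made it into the final certificate, you can inspect it :

openssl x509 -in ssl/staging/yourdomain.com.crt -noout -text | grep -A 1 "Subject Alternative Name"

You should see a line listing DNS:yourdomain.com.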

Despite this, the first time you hit your SSL-enabled website with the above-generated certificate, your browser will show the standard "Invalid Certificate - Untrusted" page. That is because your certificate is self-signed. You can view the errors in the "Technical Details" section.

Tuesday, March 5, 2013

Package installation error on Ubuntu because of MTS MBlaze package installed

A couple of weeks ago I was trying to get the MTS MBlaze data card to work with my Ubuntu machine (a VM, actually). I was happy to see that the MTS guys had provided a .deb package to be used with their data card. Installing that .deb package installed something under the name "crossplatformui" (very misleading, btw). There was probably some problem during the installation, some error thrown, but I am unable to recall it now. Despite several attempts I was not able to use the data card and I finally gave up.

However, that was not the problem. It was what followed. Ever since that attempt, during every Ubuntu software update I would encounter an error and be presented with a huge error log. Every time it was the same error :

make -C /lib/modules/3.2.0-38-generic/build M=/usr/local/bin/ztemtApp/zteusbserial/below2.6.27 modules
make[1]: Entering directory `/usr/src/linux-headers-3.2.0-38-generic'
  CC [M]  /usr/local/bin/ztemtApp/zteusbserial/below2.6.27/usb-serial.o
/usr/local/bin/ztemtApp/zteusbserial/below2.6.27/usb-serial.c:34:28: fatal error: linux/smp_lock.h: No such file or directory
compilation terminated.
make[2]: *** [/usr/local/bin/ztemtApp/zteusbserial/below2.6.27/usb-serial.o] Error 1
make[1]: *** [_module_/usr/local/bin/ztemtApp/zteusbserial/below2.6.27] Error 2
make[1]: Leaving directory `/usr/src/linux-headers-3.2.0-38-generic'
make: *** [modules] Error 2
dpkg: error processing crossplatformui (--configure):
 subprocess installed post-installation script returned error exit status 2

This happened during literally every update. My guess is that the "crossplatformui" package never got fully installed and was stuck in a "to be installed" zombie state because of the compilation error in its post-installation script. Because of that, the package manager would try to configure it and run its post-install script on every software update/installation, and end up reporting a package installation error.

Looking at the compile commands, it was clear from the "zte" string that this was related to the internet data card software. However, there was no package with that name and, as I said earlier, the package name "crossplatformui" is very misleading! After a little bit of looking around I found that the "crossplatformui" package is indeed the MTS MBlaze software causing this problem.

My immediate first instinct was to remove the package. But alas, its post-uninstall (post-remove) script was also buggy and resulted in an error. Repeated attempts to remove it, or even "Completely Remove" it, failed and the package just stayed there. Finally I inspected the post-install and post-uninstall scripts. They are present at :

/var/lib/dpkg/info/crossplatformui.postinst
/var/lib/dpkg/info/crossplatformui.postrm

The "postrm" script was trying to kill a process, but that was failing. So I commented it out and tried to kill the process myself, but there was no such process. The next failure was because of a "-d" option being passed to the "rm" command. I could not find any documentation for such an option. So I just removed it. Finally at the end of the script I put a "exit 0" because the package manager was complaining that the post-uninstall script exited with a exit value of 1.

And boy, it finally was uninstalled. However, these scripts still lingered around. I once again had to go to the Synaptic Package Manager, mark the package for "Complete Removal" and purge it away.

Now it appears that I have gotten rid of this package and all its files. Sadly there were no updates available to check if the error had gone away, but I am thinking (and hoping) it has. Will wait for the next update to find out.

Here is the updated "postrm" script : https://gist.github.com/brahmana/5087790 . If you are seeing the same package installation error, replacing your "postrm" with this script should do the trick. The changes are on lines 12, 14 and 44.