Friday, July 25, 2014

How to build ADFS (SAML 2.0) to KCD "proxy" using Citrix NetScaler - Part 1

Story behind this post

Some time ago I got a request from a customer project: they needed to give the customer Excel access to SQL Analysis Services, which is located in our cloud environment, and the customer would connect to it from their own network over the internet.

The first tricky part was that this connection had to be single sign-on from the user's point of view.

Using SharePoint's Excel Services it is possible to create an Excel sheet that is stored on SharePoint while Excel Services connects to the back-end service. Because the data returned by SQL Analysis Services also differs depending on who is connecting, Kerberos delegation was the only possible way to continue.

SharePoint contains native support for ADFS federation, but another tricky part was that if you enable ADFS federation on SharePoint, it authorizes users without actually authenticating them. Because the users are not authenticated, you can't get Kerberos tickets for them, so authentication to SQL Analysis Services no longer works. More info about that here: http://blogs.msdn.com/b/andrasg/archive/2010/05/04/setting-up-sharepoint-2010-excel-services-to-get-external-data.aspx


How can Citrix NetScaler help with this situation?

Unlike SharePoint, NetScaler supports extracting user information from ADFS (SAML 2.0) claims and retrieving a Kerberos ticket on the user's behalf. This concept is called Kerberos constrained delegation (KCD). With that Kerberos ticket, NetScaler can forward the user's session to any web service that supports Kerberos authentication.

When I say any web service, I really mean it. So what actually happened was that I found a solution that can provide SAML 2.0 federation support to any application that uses IIS native authentication, without coding it manually into every application. And because it uses Kerberos authentication to the back-end services, double or even triple hops are no longer a problem.

Which basically means that you can, for example, have a web page that gets data from another web server, which in turn gets data from a SQL server, and still use integrated authentication on the SQL server side.

How the authentication process works in this concept

This picture shows how the authentication process works in this concept.
What is missing from the picture is the communication with the Claims Provider, which actually issues the SAML claims for the user.

If you want to get this working like federation to Office 365, you also need these:
  • An ADFS server in the customer's network where users can actually be authenticated using their own domain accounts.
    • In the picture above that means that after Step 3, ADFS would redirect the user's browser first to the customer's ADFS server and wait until it comes back with a valid ADFS claim.
  • Because we want to use Kerberos delegation to the back-end service(s), the users need to be created in the cloud environment's Active Directory with the same identifier field (best practice is to use the UPN) as they have in the customer's domain.
    • An important note is that these are standards-based solutions, which means you can use any SAML 2.0 product (ADFS, Shibboleth, etc.) on your side or the customer's side, and you can mix them if needed. On the customer's side there can also be any directory (Active Directory, OpenLDAP, SQL, etc.) from which the authentication information can be obtained.

Configuring

Next I will explain step by step how to configure this concept in a lab environment. I will use one empty IIS server in this example.

I used a completely empty NetScaler when building this lab, so all the needed steps should be in this guide. I used NetScaler version 10.1, but all configurations are done from the command line, so they should work at least on all 10.x versions.

In this example the Domain Controller, the IIS server and the NetScaler are all part of the same Active Directory domain (yes, we will join the NetScaler to the domain :) )

Basic configs

Because the NetScaler needs to get Kerberos tickets from the domain controllers, it needs working DNS settings. And because Kerberos in an Active Directory domain tolerates only five minutes of clock skew by default, the NetScaler's clock must be in sync with the domain.

You can configure both of these using the following commands:
add dns nameServer 192.168.100.11
add dns suffix contoso.com
add ntp server 192.168.100.11

We also need to create a virtual server in front of the real IIS server on the NetScaler.
You can create it using the normal procedure, but there are two important things you should remember.
  1. You must create a server record on the NetScaler instead of using the destination server's IP address directly on the service.
  2. You must use the server's real hostname on the server record.
    1. That means you can't use, for example, an "srv_" prefix on server records.
    2. This is important because the NetScaler will request Kerberos tickets for that hostname, so if it isn't exactly the same as the server's real name, the domain controller will reject the request.
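If you want to double-check on the Windows side which names tickets can be issued for, you can list the service principal names registered on the back-end server's computer account (the account name IIS here is an assumption matching this lab; run it on a domain-joined machine):

```powershell
# List the SPNs registered for the IIS server's computer account;
# the HOST/<fqdn> entry must match the hostname used on the server record
setspn -L IIS
```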



I created a *.contoso.com certificate for the NetScaler and imported it using the keypair name "wildcard", and added it to all servers' Trusted Root Certificates. That solved all the certificate problems I had in this lab.

The easiest way I know is to generate a self-signed certificate using the following PowerShell command (PowerShell 4.0 is needed):
New-SelfSignedCertificate -DnsName *.contoso.com -CertStoreLocation cert:\LocalMachine\My
Then you can export the certificate in PFX format and import it to the NetScaler according to these instructions: http://support.citrix.com/article/CTX136444
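The export step can be scripted too. A minimal sketch, assuming the certificate object returned by New-SelfSignedCertificate was saved into $cert (the file path and password are placeholders, not values from my lab):

```powershell
# Capture the certificate object when generating it
$cert = New-SelfSignedCertificate -DnsName *.contoso.com -CertStoreLocation cert:\LocalMachine\My

# Export the certificate with its private key in PFX format (placeholder password)
$pfxPwd = ConvertTo-SecureString -String "Password1" -Force -AsPlainText
Export-PfxCertificate -Cert $cert -FilePath C:\temp\wildcard.pfx -Password $pfxPwd
```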


I used the following commands to configure IIS service load balancing (and to enable the needed features).
enable feature LB
enable feature SSL
enable feature AAA

add server IIS 192.168.100.21
add service svc_IIS IIS HTTP 80
add lb vserver vsrv_IIS SSL 192.168.100.20 443 -persistenceType NONE
bind lb vserver vsrv_IIS svc_IIS 
bind service svc_IIS -monitorName http
set ssl vserver vsrv_IIS -tls11 DISABLED -tls12 DISABLED
bind ssl vserver vsrv_IIS -certkeyName wildcard
At this point it is a good idea to create a DNS record for your web page and check that you can connect to it. I'm using the URL https://iis.contoso.com in this example.

Creating authentication vserver

We need an authentication vserver for this concept, so I created that one next.
I configured it to use LDAP authentication first, because it is much easier to configure than SAML, so I was able to verify that the auth vserver works.

There is a good guide for this part here, so I won't explain it further, but my commands are below: http://support.citrix.com/article/CTX126852

add authentication vserver auth_vsrv SSL 192.168.100.30 443 -AuthenticationDomain contoso.com
bind ssl vserver auth_vsrv -certkeyName wildcard

add authentication ldapAction auth_ldap_srv -serverIP 192.168.100.11 -ldapBase "dc=contoso,dc=local" -ldapBindDn ns@contoso.local -ldapBindDnPassword Password1 -ldapLoginName samAccountName
add authentication ldapPolicy auth_ldap_policy ns_true auth_ldap_srv
bind authentication vserver auth_vsrv -policy auth_ldap_policy -priority 100

add tm sessionAction sessionLDAPSSO -SSO ON -ssoCredential PRIMARY -ssoDomain contoso.local
add tm sessionPolicy sessionLDAPSSO ns_true sessionLDAPSSO
bind authentication vserver auth_vsrv -policy sessionLDAPSSO -priority 1

When the authentication vserver was ready, I enabled it for my IIS page using the following command and tested that authentication with username and password works.
set lb vserver vsrv_IIS -AuthenticationHost auth.contoso.com -Authentication ON -authnVsName auth_vsrv

Configuring SAML support on the NetScaler

Here is a guide on how to configure SAML support on the NetScaler: http://support.citrix.com/article/CTX133919

You can follow that guide, but there is one important note: "Metadata file is not created by default. NetScaler administrator has to create the metadata file".

That guide also contains an example metadata file, but it is in screenshot format, and the guide doesn't really explain what the file should contain. So if you are not familiar with the SAML protocol, it can be hard to get it done.


That is why I will provide my NetScaler's metadata file from my lab and try to explain all the relevant parts of it.

First of all, you need to give some DNS name to your NetScaler SAML IdP. You can use the same IdP for multiple URLs as long as all of them can use the same ADFS (or any SAML provider) policies.


I used the DNS name nsidp.contoso.com in this example. You don't need to configure that name in DNS, but it will be included in the SAML signing certificate. The SAML certificate can be self-signed, because only the other SAML providers (ADFS in this example) need to trust it, and it will be included in the metadata XML. I used the same method as earlier to generate this certificate.

After that I exported the nsidp.contoso.com certificate again from the server, but this time only the public key, and saved it in base64-encoded format.

Then I opened that .cer file in Notepad, removed all the line breaks from it, and copied the certificate without the header lines into the metadata XML.
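Instead of editing the file by hand in Notepad, the same cleanup can be done with a short PowerShell snippet (the file name nsidp.cer is an assumption for this example):

```powershell
# Read the base64 .cer file, drop the -----BEGIN/END CERTIFICATE----- header
# lines and join the remaining lines into one continuous base64 string
$lines = Get-Content .\nsidp.cer | Where-Object { $_ -notmatch 'CERTIFICATE' }
($lines -join '') | Set-Content .\nsidp-oneline.txt   # paste this into the metadata XML
```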


My whole metadata file is visible here:
The important settings in that file are:
  • entityID
    • This is your NetScaler IdP's unique identity. It is sent to ADFS in every request so ADFS knows which policy it should use.
  • ds:X509SubjectName
    • Your IdP's name.
  • ds:X509Certificate
    • Your IdP's public certificate.
  • md:AssertionConsumerService
    • This is the URL where ADFS redirects the user's session after successful authentication. The NetScaler sends this URL to ADFS in every request, but ADFS rejects the request if the URL is not configured on it.
    • The URL is generated automatically for every site where you use SAML authentication.
    • You must add a new URL to the metadata every time you configure a new vserver to use SAML authentication. Each URL must have a unique index.
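To make the structure concrete, here is a minimal sketch of what such a metadata file can look like. This is an illustration built from the elements listed above, not the exact file from my environment; the entityID, subject name, certificate value and ACS URL are placeholders for this lab:

```xml
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
                     entityID="nsidp.contoso.com">
  <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:KeyDescriptor use="signing">
      <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
        <ds:X509Data>
          <ds:X509SubjectName>CN=nsidp.contoso.com</ds:X509SubjectName>
          <ds:X509Certificate>MIIC...single-line-base64-from-the-cer-file...</ds:X509Certificate>
        </ds:X509Data>
      </ds:KeyInfo>
    </md:KeyDescriptor>
    <md:AssertionConsumerService index="0"
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://iis.contoso.com/cgi/samlauth"/>
  </md:SPSSODescriptor>
</md:EntityDescriptor>
```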
When the metadata file is ready, you can import it as a relying party trust in the ADFS console and follow the guide for all the other steps.

Because the NetScaler's certificate is self-signed, I also disabled its CRL check using the following PowerShell command on the ADFS server:
Set-AdfsRelyingPartyTrust -SigningCertificateRevocationCheck None -TargetName nsidp.contoso.com


I uploaded the ADFS server's signing certificate to the NetScaler and configured the ADFS server as the NetScaler's claims provider trust using the following commands:
add ssl certKey adfs-signing -cert adfs-signing.cer 
add authentication samlAction auth_saml -samlIdPCertName adfs-signing -samlSigningCertName nsidp -samlRedirectUrl "https://adfs.contoso.com/adfs/ls/" -samlUserField "Name ID" -samlIssuerName auth.contoso.com


Changing the authentication vserver to use SAML authentication

At this point I created a new authentication policy that uses SAML and changed the authentication vserver to use it.
unbind authentication vserver auth_vsrv -policy auth_ldap_policy

add authentication samlPolicy auth_saml_policy ns_true auth_saml
bind authentication vserver auth_vsrv -policy auth_saml_policy

Now you should be able to connect to your web page using ADFS federation. When testing, you probably need a browser that does not log in to the ADFS server automatically; otherwise you can't see whether ADFS was used or not.


The next steps are joining the NetScaler to the domain and generating the needed Kerberos configurations.
I will write part 2 of this guide about them later.

Monday, July 14, 2014

Automatic failover in a two-data-center SQL AlwaysOn solution with a minimum number of components


Using SQL AlwaysOn you can very easily create a solution where you have two SQL servers (physical or virtual) located in different data centers, with all data replicated between them.

Because shared storage is not needed with AlwaysOn, you don't need expensive storage replication systems, and you can even keep all data on local storage.

Automatic failover challenges

If you want failover to be automatic when the active node goes down or loses network connectivity, you need to plan and test carefully how traffic between the nodes, and from the nodes to the quorum, works in all situations.

There are of course multiple ways to solve this, but I will explain how we solved it with our network provider.

We are using Node and File Share Majority as our quorum model because it is the only model that can be used in this case. More information on why a witness share is the only choice here can be found at: http://blogs.technet.com/b/askpfeplat/archive/2012/06/27/clustering-what-exactly-is-a-file-share-witness-and-when-should-i-use-one.aspx
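For reference, this quorum model can be configured from PowerShell with the FailoverClusters module; the witness share path below is a placeholder, not our real environment:

```powershell
# Configure Node and File Share Majority quorum, pointing at the witness share
# in the third data center (placeholder UNC path)
Import-Module FailoverClusters
Set-ClusterQuorum -NodeAndFileShareMajority \\witness.contoso.local\ClusterWitness
```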

Our network provider also created a support request to Microsoft and got the recommendation that both cluster nodes should always be able to see the witness share, even if the connection between the data centers is lost. That was one important reason behind our solution.
Another one was that if both cluster nodes can see each other but lose the connection to the witness share, it only causes an alert in the event log (which you should be monitoring) and the services stay alive.

Our solution

Our solution to the problem was to create completely separate routes from both data centers to a third data center where the witness share server is located. These routes use different devices, even at the physical layer, than the connections between VLANs or to the default gateway.

In our solution the traffic between data centers 1 and 2 uses layer 2, so both cluster nodes are in the same subnet and only the witness share connections are routed.

With that solution the route to the witness share keeps working even if the connection between the data centers is lost, or the normal route between VLANs/to the internet is lost. Just like Microsoft's support recommended.

Logical picture

The following picture explains how the SQL network is split at the logical layer and how the routes to/from the witness share server are done.

In this example you would use, for instance, the following IP settings on the SQL cluster nodes, plus the persistent routes which you can see in the picture.
  • SQL cluster node 1
    • IP: 10.10.10.100
    • Netmask: 255.255.255.0
    • Gateway: 10.10.10.1
  • SQL cluster node 2
    • IP: 10.10.10.200
    • Netmask: 255.255.255.0
    • Gateway: 10.10.10.1
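The persistent routes from the picture can be sketched as follows; the witness subnet 10.30.30.0/24 and the dedicated router IP 10.10.10.2 are assumptions for illustration, since the real values come from the picture:

```powershell
# On each cluster node: add a persistent route to the witness share subnet
# via the dedicated router, bypassing the default gateway 10.10.10.1
route -p ADD 10.30.30.0 MASK 255.255.255.0 10.10.10.2
```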

Limitations/challenges in our solution

I figured out at least these limitations/challenges in our solution which you should keep in mind.
  • You need to check very carefully that you are using the correct router for the witness share connection. If you misconfigure it, you will lose the route to the witness share when the connection between data centers 1 and 2 is lost.
  • You need to check very carefully that you are using an IP address from the correct part of the address range. If you misconfigure it, you will lose the route from the witness share server back to your cluster node.

Benefits of this solution

The biggest benefit of this solution is that you get a low-cost SQL cluster solution that can handle even a whole data center crash automatically. That is very important, especially with applications that store all their important data in databases.