ClusterFunk


Rate shaping Traffic with Zeus ZTM v6


Dec 13

Posted: under Networking, Zeus ZTM, Zeus ZXTM.

A couple of weeks ago I built out a rate shaping solution for a client hosting a web site that is very, very, very popular at the moment.

So what is this rate shaping all about? Well, for a kick-off, it is simplicity itself to implement using the SLM capabilities of Zeus ZTM v6 (or indeed previous ZXTM versions).

Solution Components

ZTM provides two technologies that are useful in service monitoring/protection. The first is the SLM class and the second is the Rate class.

SLM Class

An SLM (Service Level Monitoring) class is a mechanism for monitoring the response times of the site/service you provide. Through TrafficScript, the SLM class reports the percentage of requests that conform to the response-time threshold configured in the class, that is, the percentage of requests whose response arrives back within that time.
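To make the "percentage conforming" idea concrete, here is a minimal sketch in Python rather than TrafficScript. It is not how ZTM calculates the figure internally; the function name, the sample timings and the 40 ms threshold are all made up for illustration.

```python
def percent_conforming(response_times_ms, threshold_ms):
    """Return the percentage of responses that arrived within threshold_ms."""
    if not response_times_ms:
        # Assumption for this sketch: no traffic counts as fully conforming.
        return 100.0
    within = sum(1 for t in response_times_ms if t <= threshold_ms)
    return 100.0 * within / len(response_times_ms)


# Hypothetical response times (milliseconds) for ten requests.
samples = [12, 35, 38, 41, 90, 22, 39, 40, 15, 300]

# Seven of the ten samples are at or under 40 ms.
print(percent_conforming(samples, threshold_ms=40))  # prints 70.0
```

A service in this state would be "70% conforming", which is the kind of number you then test against your own target.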

Rate Class

The Rate Class is like a pipe with a definable capacity down which requests against your service flow. The pipe can carry its maximum capacity and no more. The Rate Class (via TrafficScript) provides an overflow queue (let’s think of it as a bucket catching the excess flow that isn’t getting through the pipe) that can be processed once requests have dropped below maximum capacity.
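The pipe-and-bucket picture can be sketched in a few lines of Python. This is only a toy model of the idea, not ZTM's implementation: the class name, the per-tick capacity and the request labels are assumptions of mine.

```python
from collections import deque


class RatePipe:
    """Toy model: a pipe with a fixed capacity per time slice plus an overflow queue."""

    def __init__(self, capacity_per_tick):
        self.capacity = capacity_per_tick
        self.backlog = deque()  # the "bucket" catching the excess

    def tick(self, arrivals):
        """Process one time slice and return the requests let through."""
        # New arrivals join the back of the queue so older requests go first.
        self.backlog.extend(arrivals)
        passed = []
        while self.backlog and len(passed) < self.capacity:
            passed.append(self.backlog.popleft())
        return passed


pipe = RatePipe(capacity_per_tick=3)
first = pipe.tick(["r1", "r2", "r3", "r4", "r5"])  # two requests overflow
second = pipe.tick([])                              # quiet slice drains the bucket
print(first, second)
```

The first slice lets three requests through and queues two; the second, quieter slice empties the queue.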

Put it all together: Step by Step

So let’s set up a service that uses SLM and Rate Shaping.

I’m using the ZTM r6.02 virtual appliance in my home lab and built this config as I wrote this blog (it’s that intuitive :) )

Create a Virtual Service

I’ve created one called “HTTP Service” and a Pool called “HTTP Servers”.

For the purpose of this post I have used Google to provide the web servers by simply adding the node www.google.com:80

Set Up an SLM Class

Click Catalogue and then the SLM tab.

In this case I have called the Class “Subscription”.

The SLM class offers several values to modify but I am only interested in the millisecond response time as I am going to use TrafficScript to test the other values.

That’s the SLM class created :)

Now apply it to the Virtual Service.

Click Edit next to the Classes tab in the “HTTP Service” Virtual Service configuration summary.

Select Subscription and click Update.

The Virtual Service is now being monitored against the response_time value set in the SLM Class. In this case 40 milliseconds.

Now we need to check the value and do something with it.

Rate Class

From the Catalogue tab select Rate and create a new Rate Class

I’ve called mine “Premium”. You can have many rate classes and, as is typical with ZTM, the values used to determine which class to apply are numerous and highly configurable via TrafficScript. E.g. it could be the host name, referrer, GeoIP check, username, cookie value and so on that determines which class to apply.

The values to configure in a rate class are simple and represent the volume of requests that your service can handle, measured in requests per second and requests per minute. There are two values so that you can quantify what your service can sustain. If, for example, we could only configure 10,000 requests per minute, in theory these could all be delivered in the first ten seconds, leaving 50 seconds in which the rate class will not allow any additional connections.
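The arithmetic behind having two limits is easy to check with a small simulation. This is a sketch in Python under my own assumptions (a steady 1,000 requests per second being offered, and a hypothetical per-second limit of 167), not a description of how ZTM enforces the limits.

```python
def admitted_per_second(offered_per_second, per_minute_limit, per_second_limit=None):
    """Simulate one minute and return the number of requests admitted in each second."""
    remaining = per_minute_limit
    admitted = []
    for _ in range(60):
        allowed = offered_per_second
        if per_second_limit is not None:
            allowed = min(allowed, per_second_limit)
        allowed = min(allowed, remaining)  # never exceed the per-minute budget
        remaining -= allowed
        admitted.append(allowed)
    return admitted


# Per-minute limit only: the whole budget is spent in the first ten seconds.
minute_only = admitted_per_second(1000, 10_000)
print(sum(1 for n in minute_only if n > 0))  # prints 10

# Adding a per-second limit spreads the same budget across the whole minute.
both = admitted_per_second(1000, 10_000, per_second_limit=167)
print(sum(1 for n in both if n > 0))  # prints 60
```

Both runs admit exactly 10,000 requests in the minute; the difference is whether users see ten seconds of service followed by fifty seconds of nothing, or a steady trickle throughout.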


Putting it all together

TrafficScript:

You need to do something if your service is incredibly popular.

This TrafficScript checks whether the service is conforming to our agreed SLA (Service Level Agreement). It is for you to decide what is acceptable. In this example it’s 95% of transactions being completed within the millisecond response time configured in the SLM Class “Subscription”. If our service drops below 95%, the Rate Class is used to limit the number of connections that the service will handle. This is a simple TrafficScript that achieves this:

connection.setServiceLevelClass( "Subscription" );
$conforming = slm.conforming( "Subscription" );

log.info("Percentage Conforming is : ". $conforming);

# Test our SLM threshold. If response times are degrading
# apply rate shaping class to protect service

if( $conforming < 95 ) {
   rate.use("Premium");
}

The Rate Class is applied while the SLM Class detects that the service is performing below 95% conforming (to the configured 40 millisecond response).

The Rate Class limits the number of connections that will be processed and also provides a mechanism for queuing excess connection attempts. This queue will be held until the level of activity drops below the per second threshold OR the TCP connection times out (which is bad for user experience if left unhandled).

A second TrafficScript is required to handle the excess traffic:

# How many queued requests are allowed before we send the busy page.
$shapeQueue = 10;
$backlog = rate.getBacklog("Premium");
if( $backlog > $shapeQueue ) {
   http.sendResponse( 503, "text/html", resource.get( "busy.html" ), "" );
}

This script sets a value as an acceptable queue length ($shapeQueue). While the Rate Class is applied, each request is checked to see if the number of connections in the queue is greater than the desired maximum queue length.

If it is, then we can handle the connection in a number of ways. In this example I have configured the ZTMs to serve a busy page and, importantly, used the HTTP Error 503 – Service Unavailable in the response. The reason I have configured this is to prevent upstream servers from caching this response.

Testing the configuration

The key to a successful deployment is making sure that the millisecond response value is realistic and that the number of connections configured in the Rate Class accurately reflects the threshold that the service can deliver, less a small margin of error.

In many cases this can be difficult to establish without sufficiently complex load testing. If you have a very modular architecture with well established performance characteristics then simply plug in the values and go home for the weekend safe in the knowledge that everything is well with the world.

If you are not so lucky, there is a nice way to monitor the performance of your service in real time in relation to the SLM and Rate Class configuration.

Example Interactive

I use Apache JMeter to create load (and that’s my next blog article :) ) and use the ZTM current activity monitors to get real-time feedback.


Publishing web applications in complex network environments: More points to consider


Jan 21

Posted: under Networking, Tool, Tips and Tricks, Zeus ZXTM.

In my first post on this subject we looked at how network routing affects the configuration of web application publishing. In this post I consider how load balancing affects applications and discuss some of the problems.

Here is an example nTier platform. This is a common approach to web application publication using layered security to minimise the effects of compromise of any one area of the solution. 

LB Environment

In this example an ISA Server farm acting as a reverse proxy hides the internal infrastructure behind a single IP address. If you need to publish multiple HTTPS domains you will need an IP for each (SSL) site. In this example each site resolves to a Windows NLB VIP address hosted by the ISA farm.

The ISA Servers also act as the perimeter firewall, with an External and Internal NIC configuration for true network segmentation. ISA Server 2006 provides basic load balancing functionality, and the solution uses this capability to publish the web servers. ISA has a wizard to create a Web Farm.

Web Farms

A collection of servers is organised into a Web Farm. ISA has two techniques for balancing the traffic to the servers in the Web Farm. Both techniques rely on round robin to balance requests. As such it’s a very basic mechanism: it distributes requests, not load, across the web servers.

1) Session Affinity
ISA inserts a cookie into the HTTP payload, creating a session id for each client. All subsequent requests from the host include the session cookie, which ISA uses to direct the client to the same web server. This technique relies on the browser supporting HTTP v1.1 and cookies; if it doesn’t, or cookies are disabled, ISA cannot use this method.

2) IP Based
ISA uses the client IP to direct the request to a specific web server. This technique is problematic if your clients are behind multiple proxies.
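To see why IP-based affinity struggles with proxies, here is a rough Python sketch. It is not ISA's algorithm: hashing the client IP is just one common way of pinning an address to a server, and the server names and IP addresses are hypothetical.

```python
import hashlib

SERVERS = ["web1", "web2", "web3"]  # hypothetical web farm members


def server_for_ip(client_ip, servers=SERVERS):
    """Pin a client IP to one server by hashing the address."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# One hundred different users who all sit behind the same outbound proxy
# present the same source IP, so they all land on the same web server.
proxy_ip = "203.0.113.10"
chosen = {server_for_ip(proxy_ip) for _ in range(100)}
print(len(chosen))  # prints 1
```

The reverse problem also applies: a client whose requests leave via several proxies shows up with several source IPs and may be sent to a different server each time.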

 

Session State

Stateless

Web applications are either stateful or stateless. HTTP is by definition stateless. A client makes a request against a server. A TCP/IP connection is created, the server responds, the connection is closed and there is no persistent connection between the client and server. This is fine for static content, such as reading this blog.

Stateless Application

Request

state step0

Response

state step0.1 

Transaction is completed and after a timeout period the connection is closed.


Client Sends another Request and a new connection is created…..


Stateful Application (in load balanced environment)

Many modern web based applications are stateful, that is, they need to maintain a logical link between the client and a specific server, e.g. shopping based activity where you want to pay for your goods. Frequently (due to PCI compliance) you are connected to a 3rd party to process credit card payments. Once this transaction is completed you are then returned to the original site for order confirmation. Without session state being maintained, the application server processing your purchase may or may not be the one that continues the process.

This example focuses on the application state at the App Server Tier, but the issue of maintaining state has to be addressed at each point where there is a load balancing decision between the client and the application tier.

Typical stateful application in stateless configuration.

Request

state step1

Requests back and forth between client and server until a period of inactivity at the client results in a timeout of the TCP connection.

Connection Time Out


The next request is directed to a different application server (AP1), which is unable to process it, resulting in an error on the client’s system.

stateful error 

 

Using infrastructure to maintain session state.

We can address this in a number of ways:

Hardware load balancing can provide session affinity based on the originating client IP or MAC address. This is successful for simple load balanced configurations. However, it is problematic for multiple tier load balanced configurations, as the second tier always receives requests from a limited number of hosts at the first tier, which can easily result in an uneven loading across application servers.

Application Layer load balancing

Using Cookies for session affinity
The load balancing solution inserts a cookie into the HTTP response; the client returns it in the header of each subsequent request, where it is used to identify the specific user and maintain the relationship between a client and server. This is ideal for the multiple tier load balanced environment.

It does impose the requirement that the client supports HTTP v1.1 and cookies, so it’s not going to work for most mobile users or users that disable cookies.
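Cookie-based affinity boils down to "honour the cookie if it is there, otherwise pick a server and set one". The Python below is a minimal sketch of that logic, not the behaviour of any particular product; the cookie name `lb_server` and the server names are invented for the example.

```python
import itertools


class CookieAffinityBalancer:
    """Toy balancer: round robin for new clients, cookie pinning for returning ones."""

    COOKIE = "lb_server"  # hypothetical cookie name

    def __init__(self, servers):
        self.servers = list(servers)
        self._round_robin = itertools.cycle(self.servers)

    def route(self, request_cookies):
        """Return (server, cookies_to_set) for a single request."""
        pinned = request_cookies.get(self.COOKIE)
        if pinned in self.servers:
            return pinned, {}  # returning client: keep it where it was
        server = next(self._round_robin)
        return server, {self.COOKIE: server}  # new client: pin it via a cookie


lb = CookieAffinityBalancer(["app1", "app2"])
server, set_cookies = lb.route({})    # first visit: app1 is chosen and a cookie is set
again, _ = lb.route(set_cookies)      # same client returns with the cookie
no_cookie, _ = lb.route({})           # a client without cookies gets the next server
print(server, again, no_cookie)       # prints app1 app1 app2
```

The last call shows the limitation: a client that never sends the cookie back is simply balanced afresh on every request.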

Protocol Inspection
One such method uses http session id (Session Identification URI to give it its proper name) to maintain session state between client and application server. Alternatively you can insert into the http header a value of your own choosing on which to make load balancing / session affinity decisions.

SSL (HTTPS) and Load Balancing.

When you deal with SSL traffic, it is encrypted between the client and the destination web server. It removes the opportunity to inspect the content of the request / response. Particularly useful then is the ability to load balance based on SSL session ID. Most solutions (ZXTMs and Cisco ACE for example) allow you to do this.  

You could terminate the SSL encryption on the edge of your environment and pass HTTP traffic through internally, re-encrypting the traffic as it leaves your environment. Alternatively you could decrypt and then re-encrypt with internal and external SSL certs. You need to consider the load that the SSL offload will place on your solution and also consider the needs of your organisation. Hardware based solutions such as the Cisco ACE modules are licensed based on a number of SSL transactions in combination with network I/O, so you will need to consider the costs associated with the solution you choose.

Development led options

There are a number of ways that this can be addressed by the developers of the application. 

Record Session State in Cookie/s

Using this method it doesn’t matter which server receives the request as the session state is recorded in the cookie. This is limited by the number and size (payload) of cookies that can be added to the http header. It requires the application to be developed to accommodate this approach so needs to be designed into the web application. The cookies can add significantly to the amount of data that is transmitted and also increase the processing overhead on the web servers. 
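As an illustration of state travelling in the cookie itself, here is a small Python sketch. The signing step is my own addition (the client can edit a cookie, so the servers need a way to detect tampering); the secret, the field names and the helper names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret that every application server in the farm would share.
SECRET = b"demo-secret-shared-by-all-app-servers"


def encode_state(state):
    """Serialise the session state and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_state(cookie_value):
    """Return the state if the signature checks out, otherwise None."""
    payload, _, signature = cookie_value.rpartition(".")
    expected = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


state = {"basket": ["sku-1"], "step": "payment"}
cookie = encode_state(state)      # written by whichever server handled the request
restored = decode_state(cookie)   # readable by any other server holding the secret
print(restored == state, len(cookie))
```

Printing the cookie length is a reminder of the trade-off described above: everything stored this way is sent with every request.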

stateful cookie based

Record Session state in Database

The session state can be written to a database by the application server. The session id is then used to retrieve the session state. A suitable database tier is required and obviously this has to be designed into the application from the start.
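A minimal sketch of the database approach, using Python's built-in sqlite3 module with an in-memory database standing in for the real database tier. The table layout, function names and session id are assumptions for the example only.

```python
import json
import sqlite3

# In-memory database standing in for the shared database tier.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, state TEXT NOT NULL)")


def save_state(session_id, state):
    """Write (or overwrite) the state for a session."""
    db.execute(
        "INSERT OR REPLACE INTO sessions (session_id, state) VALUES (?, ?)",
        (session_id, json.dumps(state)),
    )
    db.commit()


def load_state(session_id):
    """Fetch the state for a session, or None if the id is unknown."""
    row = db.execute(
        "SELECT state FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# "Server A" stores the state; "server B" picks it up using only the session id.
save_state("abc123", {"basket": ["sku-1"], "step": "payment"})
resumed = load_state("abc123")
print(resumed)
```

Because only the session id has to reach the server, it no longer matters which application server the load balancer picks.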

 


Other Considerations

The return path is equally susceptible to problems relating to state as highlighted below. This needs to be accommodated  in your application / infrastructure design.

Response

stateful no ISA NAT 

Which is the best method to adopt?

Obviously you need to consider the platform, the applications and the infrastructure. If you are managing a simple web application, such as a basic MOSS deployment with a pair of load balanced web servers, your requirements are considerably different from those for the delivery of DRM protected video assets that need to be restricted based on the location of the user making the request.

If you are managing a complex multi-tier load balanced environment you are likely to be maintaining a highly dynamic set of web based applications. The business will be constantly responding to the environment in which it operates, which is likely to include frequent development of the applications and changes to the environment, often with very aggressive delivery deadlines.

In my experience the key to successfully managing such environments is to be able to respond quickly and utilise solutions that are versatile. 

If you have access to a product such as the excellent Zeus ZXTMs, you have the opportunity to inspect and manipulate the client requests and server responses directly via TrafficScript. It’s possible to make decisions based on a huge number of parameters, providing extremely granular control of the data flowing through your network, to manage service levels and to respond to requests differently depending on the load on the platform.

Combine that with the load balancing capabilities, where decisions can be tailored based on any number of factors such as response times, time of day, requested resource or even the geographic location of the originating request, and you have the tools to operate effectively in such fast moving, dynamic environments. This is why I am such a fan of the ZXTMs. They are a software solution that can be deployed very rapidly and tailored very easily by system admins and developers without input from networks. They put the control of the application function into the hands of the guys that are interested in it (no offence to the network guys out there ;) ) and they don’t cost more money if you want to increase the load that they handle.


Publishing web applications in complex network environments: Points to consider


Jan 17

Posted: under ISA Server, Networking, Zeus ZXTM.

I have recently completed a consolidation / redesign of an ISA 2006 infrastructure. The resulting platform publishes multiple web farms both internally and externally. Two factor authentication for the external clients via RADIUS was a requirement. The internal clients were presented via an internal firewall cluster and the external clients via an edge firewall cluster, with the ISA farm sitting between the two firewall clusters.

Like this:

ISASchematic 

There were a number of problems that needed to be addressed, so I have decided to highlight them. Some are common to all load balanced environments, such as application state. Others only arise with multi-tier firewall / load balancer environments, such as routing/firewall spoofing issues.

Network routing

One of the key requirements businesses have is to measure the popularity of the web applications that they publish. This can be for a large number of reasons, such as to establish the effectiveness of an advertising campaign, to track usage trends or even to inform strategic decisions regarding the organisation’s future. The information that is available from data recorded in the logs is therefore strategically and operationally valuable.

So we want the client’s details recorded in the web server logs. Easy, right? That’s done by default…

An Internet user requests a web page via the published IP hosted by the ISA servers. ISA Server passes the request through to the web back-end. The response is sent (to the client) as shown below. Everyone’s happy.

external traffic ISA Schematic

Problems!

So now we consider the Internal users.

The INT VLAN has a static route that enables the internal client request to be routed to the ISA server EXT VLAN. The ISA Server publishes the request to the web servers. The web servers send the response to the client IP, which in this case is on a VLAN with a route via the internal firewall. The internal firewall processes the response, having expected it to be routed via the ISA INT VLAN. Because of the mismatch in source and destination IPs, the response is considered to be spoofed traffic (see below) and the firewall closes the connection. The destination port on the firewall for the response is different from the port that the request was received on.

Firewall Anti-Spoof checking
This mechanism protects against activity from spoofed or forged IP addresses, mainly by blocking packets appearing on interfaces and in directions which are logically not possible.
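In essence, the check asks "could a packet with this source address legitimately arrive on this interface?". Here is a much simplified Python sketch of that question; real firewalls do considerably more, and the interface names and address ranges are invented for the example.

```python
import ipaddress

# Hypothetical map of firewall interfaces to the networks expected behind them.
EXPECTED_SOURCES = {
    "int": [ipaddress.ip_network("10.1.0.0/16")],  # internal client VLANs
    "isa": [ipaddress.ip_network("10.2.0.0/24")],  # ISA internal NICs
}


def passes_anti_spoof(interface, source_ip):
    """True if source_ip belongs to a network expected on the given interface."""
    source = ipaddress.ip_address(source_ip)
    return any(source in network for network in EXPECTED_SOURCES.get(interface, []))


# An internal client address arriving on the internal interface is plausible...
print(passes_anti_spoof("int", "10.1.4.20"))  # prints True
# ...but the same address turning up on the ISA-facing interface is not.
print(passes_anti_spoof("isa", "10.1.4.20"))  # prints False
```

Traffic that takes a different path back from the one it took out can fail this kind of test even though nobody is forging anything.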

This diagram shows the routing problem

routing 

In order to resolve this issue, the internal client IP is replaced by the ISA Server’s internal NIC IP, NATing the client behind the ISA Server’s NIC address.

ISA NAT Schematic

So now when you look in your web server logs every request seems to come from your ISA server Internal NIC.

Immovable object meets irresistible force

So the business wants the data but the network topology prevents this from happening. What do you do?

You have several options

Don’t Log at the Web Servers

Use ISA Server logging capability to provide the information you need.

ISA Server logs a huge amount of information. It can also log to SQL Server, and Microsoft provides example reports for SQL Reporting Services. However, it may be that you must get the client IP to arrive at the web server.

Modify the HTTP Header

The industry standard approach is to insert an X-Forwarded-For header into the HTTP request. This method gets the client IP recorded in the web server log, as it is part of the HTTP request header.

ISA Server does not natively support this for two reasons.

1) X-Forwarded-For is not a ratified standard. There is no Internet Engineering Task Force (IETF, www.ietf.org) documentation.

2) The value recorded in the HTTP header is easily spoofed and is not authenticated, so it cannot be relied on to actually represent the client’s IP.
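The spoofing concern in point 2 is usually handled by only believing the header when the request actually arrived from one of your own proxies. The Python below is a sketch of that idea and nothing more; the trusted subnet, the function name and the sample addresses are assumptions of mine.

```python
import ipaddress

# Hypothetical subnet containing the proxies we control (e.g. the ISA internal NICs).
TRUSTED_PROXIES = [ipaddress.ip_network("10.2.0.0/24")]


def is_trusted(address):
    return any(address in network for network in TRUSTED_PROXIES)


def client_ip(peer_ip, x_forwarded_for):
    """Pick the address to log: the header is only believed if a trusted proxy sent it."""
    if not x_forwarded_for or not is_trusted(ipaddress.ip_address(peer_ip)):
        return peer_ip
    # Walk the list right to left and stop at the first hop we do not control.
    for hop in reversed([h.strip() for h in x_forwarded_for.split(",")]):
        try:
            address = ipaddress.ip_address(hop)
        except ValueError:
            return peer_ip  # malformed header: fall back to the connection's peer
        if not is_trusted(address):
            return hop
    return peer_ip


print(client_ip("10.2.0.5", "198.51.100.7"))           # prints 198.51.100.7
print(client_ip("203.0.113.9", "1.2.3.4"))             # prints 203.0.113.9
print(client_ip("10.2.0.5", "1.2.3.4, 198.51.100.7"))  # prints 198.51.100.7
```

In the third call the client supplied its own fake "1.2.3.4" entry; reading from the right means the address added by our own proxy wins.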

There is a plug-in for ISA Server 2004/2006 that inserts the value into the HTTP header. 

http://www.winfrasoft.com/X-Forwarded-For.htm

And an article about how to use it here

http://www.isaserver.org/tutorials/X-Forwarded-For-ISA-Firewall-Track-Originating-Client-Web-proxy-Chain-IIS.html

Alternatives

There are other options but they may or may not be viable. If you are building the solution from scratch, design out routing issues. Another option is to utilise the functionality of other devices in your network topology, e.g. Cisco ACE (Application Control Engine) load balancing modules can insert the client IP into the HTTP request.

In this infrastructure* I would remove the ISA servers and use Zeus ZXTM v5.1 instead. The internal and external firewalls provide the TCP/IP packet inspection, which effectively reduces the ISA server farm to a proxy / web publishing solution. This is ZXTM’s core capability and for this environment there is no comparison: ZXTM’s load balancing, traffic inspection and modification are far superior to ISA Server 2006 R1. ZXTM v4.2 and above automatically adds an "X-Cluster-Client-IP" value to the HTTP header, which will be recorded in the web server logs.

* There are a number of reasons for this not just the Client IP. The next post includes more of these reasons

In the next post I will look at Load Balancing, Application State and DNS.
