Rate Shaping Traffic with Zeus ZTM v6
Dec 13
Posted: under Networking, Zeus ZTM, Zeus ZXTM.
A couple of weeks ago I built out a rate shaping solution for a client hosting a web site that is very, very, very popular at the moment, largely because the nation is gripped by the antics of the likes of John and Edward and fascinated by Jordan/Katie Price’s bikini selections and swimming/caving capabilities.
So what is this rate shaping all about? Well, for a kick off, it is simplicity itself to implement using the SLM capabilities of Zeus ZTM v6 (or indeed previous ZXTM versions).
Solution Components
ZTM provides two technologies that are useful for service monitoring and protection. The first is the SLM class and the second is the Rate class.
SLM Class
An SLM (Service Level Monitoring) class is a mechanism for monitoring the response times of the site or service you provide. Through TrafficScript, the SLM class reports the percentage of requests that conform to the threshold configured in the class, that is, the percentage whose response arrives back within the configured response time.
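To make the idea of "percentage conforming" concrete, here is a minimal Python sketch of the calculation. It is only a model of the concept, not how ZTM implements it: the function name, the sample response times, the use of "less than or equal to" for the threshold and the treatment of an empty sample are all my own assumptions.

```python
def percent_conforming(response_times_ms, threshold_ms):
    """Return the percentage of responses that arrived within threshold_ms."""
    if not response_times_ms:
        # Assumption: with no traffic observed, treat the service as conforming.
        return 100.0
    within = sum(1 for t in response_times_ms if t <= threshold_ms)
    return 100.0 * within / len(response_times_ms)


# Hypothetical response times in milliseconds for ten requests.
samples = [12, 25, 38, 41, 39, 120, 30, 22, 35, 40]
result = percent_conforming(samples, threshold_ms=40)
print(result)
```

With a 40 millisecond threshold, two of these ten made-up samples are too slow, so a 95% target would not be met.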
Rate Class
The Rate Class is like a pipe with a definable capacity down which requests against your service flow. The pipe can carry its maximum capacity and no more. The Rate Class (via TrafficScript) provides an overflow queue (let’s think of it as a bucket catching the excess flow that isn’t getting through the pipe) that can be processed once requests have dropped below maximum capacity.
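The pipe and bucket picture can be sketched in a few lines of Python. This is a toy model under my own assumptions (one-second ticks, a simple first-in first-out queue, the class name RatePipe); it says nothing about how ZTM actually schedules queued requests.

```python
from collections import deque


class RatePipe:
    """Toy model: a pipe with fixed per-second capacity and an overflow queue."""

    def __init__(self, max_per_second):
        self.max_per_second = max_per_second
        self.backlog = deque()  # the "bucket" catching the excess flow

    def tick(self, arrivals):
        """Process one second: queue the arrivals, release up to capacity."""
        self.backlog.extend(arrivals)
        released = []
        while self.backlog and len(released) < self.max_per_second:
            released.append(self.backlog.popleft())
        return released


pipe = RatePipe(max_per_second=3)
first = pipe.tick(["r1", "r2", "r3", "r4", "r5"])  # five arrive, three get through
queued_after_first = len(pipe.backlog)             # two wait in the bucket
second = pipe.tick([])                             # quiet second: the bucket drains
```

The point of the model is that excess requests are delayed rather than dropped, as long as traffic eventually falls back below capacity.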
Put it all together: Step by Step
So let’s set up a service that uses SLM and Rate Shaping.
I’m using the ZTM r6.02 Virtual Appliance in my home lab and built this config as I wrote this blog (it’s that intuitive).
Create a Virtual Service
I’ve created one called “HTTP Service”
and a Pool called “HTTP Servers”.
For the purpose of this post I have used Google to provide the web servers by simply adding the node www.google.com:80.
Set Up a SLM
Click Catalogue and then the SLM tab.
In this case I have called the Class “Subscription”
The SLM class offers several values to modify but I am only interested in the millisecond response time as I am going to use TrafficScript to test the other values.
That’s the SLM class created
Now apply it to the Virtual Service
Click Edit next to the Classes tab in the “HTTP Service” Virtual Service configuration summary.
Select Subscription and click Update.
The Virtual Service is now being monitored against the response_time value set in the SLM Class, in this case 40 milliseconds.
Now we need to check the value and do something with it.
Rate Class
From the Catalogue tab select Rate and create a new Rate Class
I’ve called mine “Premium”. You can have many rate classes and, as is typical with ZTM, the values used to determine which class to apply are numerous and highly configurable via TrafficScript. For example, it could be the host name, referrer, GeoIP check, username, cookie value and so on that determines which class to apply.
The values to configure in a rate class are simple and represent the volume of requests that your service can handle, measured in requests per second and requests per minute. There are two values so that you can quantify what is sustainable by your service. If, for example, we could only configure 10,000 requests per minute, in theory these could all be delivered in the first ten seconds, leaving 50 seconds where the rate class will not allow any additional connections.
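The arithmetic behind that example is worth spelling out. The Python sketch below uses the 10,000 per minute figure from above; the arrival rate of 1,000 requests per second is simply what "delivered in the first ten seconds" implies, and the function names are my own.

```python
import math


def seconds_until_minute_quota_spent(per_minute_limit, arrivals_per_second):
    """How long a per-minute allowance lasts at a steady arrival rate."""
    return math.ceil(per_minute_limit / arrivals_per_second)


def even_per_second_limit(per_minute_limit):
    """Per-second cap that spreads the per-minute allowance over 60 seconds."""
    return math.ceil(per_minute_limit / 60)


burst_seconds = seconds_until_minute_quota_spent(10_000, 1_000)  # quota gone in 10 s
idle_seconds = 60 - burst_seconds                                # 50 s of refusals
smooth_limit = even_per_second_limit(10_000)                     # roughly 167 per second
print(burst_seconds, idle_seconds, smooth_limit)
```

A per-second value in the region of 167 would spread the same 10,000 requests across the whole minute instead of letting a burst use them all up front.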
Putting it all together
TrafficScript:
You need to do something if your service is incredibly popular.
This TrafficScript checks whether the service is conforming to our agreed SLA (Service Level Agreement). It is for you to decide what is acceptable. In this example it’s 95% of transactions being completed within the millisecond response time configured in the SLM Class “Subscription”. If our service drops below 95%, the Rate Class is used to limit the number of connections that the service will handle. This is a simple TrafficScript that achieves this:
connection.setServiceLevelClass( "Subscription" );
$conforming = slm.conforming( "Subscription" );
log.info("Percentage Conforming is : ". $conforming);
# Test our SLM threshold. If response times are degrading
# apply rate shaping class to protect service
if( $conforming < 95 ) {
   rate.use( "Premium" );
}
The Rate Class is applied while the SLM Class detects that the service is performing below 95% conforming (to the configured 40 millisecond response).
The Rate Class limits the number of connections that will be processed and also provides a mechanism for queuing excess connection attempts. This queue will be held until the level of activity drops below the per second threshold OR the TCP connection times out (which is bad for user experience if left unhandled).
A second TrafficScript is required to handle the excess traffic:
# How many queued requests are allowed before we track users.
$shapeQueue = 10;
$backlog = rate.getBacklog( "Premium" );
if( $backlog > $shapeQueue ) {
   http.sendResponse( 503, "text/html", resource.get( "busy.html" ), "" );
}
This script sets an acceptable queue length ($shapeQueue). While the Rate Class is applied, each request is checked to see if the number of connections in the queue is greater than the desired maximum queue length.
If it is, then we can handle the connection in a number of ways. In this example I have configured the ZTMs to serve a busy page and, importantly, used the HTTP error 503 (Service Unavailable) in the response. The reason I have configured this is to prevent upstream servers from caching this response.
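Taken together, the two scripts amount to a three-way decision for each request. The Python sketch below models that decision so the thresholds are easy to reason about. It is not TrafficScript; the function name, the return labels and the choice to check the backlog before the conformance figure are my own assumptions about one plausible ordering.

```python
def decide(percent_conforming, backlog, sla_percent=95, max_queue=10):
    """Model of the per-request decision made by the two scripts."""
    if backlog > max_queue:
        # Queue already too long: answer with the busy page and a 503.
        return "busy-503"
    if percent_conforming < sla_percent:
        # Service is degrading: send the request through the rate class.
        return "rate-shaped"
    # Service is healthy: no shaping needed.
    return "pass-through"


print(decide(percent_conforming=99, backlog=0))
print(decide(percent_conforming=90, backlog=3))
print(decide(percent_conforming=90, backlog=11))
```

Note that a backlog of exactly 10 is still queued in this model, because the script tests for "greater than" rather than "greater than or equal to".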
Testing the configuration
The key to a successful deployment is making sure that the millisecond response value is realistic and that the number of connections configured in the Rate Class accurately reflects the threshold that the service can deliver, less a small margin for error.
In many cases this can be difficult to establish without sufficiently complex load testing. If you have a very modular architecture with well-established performance characteristics then simply plug in the values and go home for the weekend, safe in the knowledge that all is well with the world.
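Once you do have a capacity figure, turning it into Rate Class values is simple arithmetic. The Python sketch below shows one way to do it; the measured capacity of 500 requests per second and the 10% margin are made-up numbers, and the function name is my own.

```python
def rate_class_values(measured_per_second, margin_percent):
    """Derive per-second and per-minute limits, leaving a safety margin."""
    # Integer arithmetic keeps the result exact.
    per_second = measured_per_second * (100 - margin_percent) // 100
    per_minute = per_second * 60
    return per_second, per_minute


# Hypothetical load test result: the service sustains 500 requests per second.
limits = rate_class_values(measured_per_second=500, margin_percent=10)
print(limits)
```

Here the per-minute value is just sixty times the per-second value; if your service can absorb short bursts but not sustain them, you might set the per-minute figure lower than that.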
If you are not so lucky, there is a nice way to monitor the performance of your service in real time in relation to the SLM and Rate Class configuration.
Example Interactive
I use Apache JMeter to create load (that’s my next blog article) and the ZTM current activity monitors to get real-time feedback.