Part of SKIPJAQ's Chief Performance Officer series of blogs
SKIPJAQ CEO Rob Harrop's essential briefing for all the budding Chief Performance Officers now running (or thinking about running) applications on AWS
Performance matters more than ever. But don’t take my word for it. Mark Holt, CTO of rail ticket retailer Trainline, recently greeted the ‘opening’ of AWS’ London Region with the following quote: “From extensive testing, we know that 0.3s of latency is worth more than 8 million pounds”. While Holt was looking forward to benefitting from the lower latencies (and higher revenues) associated with the creation of a new, geographically proximate AWS datacentre, the same stats were a major talking point at AWS re:Invent in Las Vegas earlier this month - where a series of new AWS features with major implications for application performance were unveiled.
As the following notes make clear, opening up new datacentres in diverse geographical locations is by no means the only way in which Amazon is helping AWS users to reduce latencies (and reap the associated financial rewards).
One last thing: yes, there have already been quite a few posts published summarising the news from re:Invent ‘16, but I believe this post is the first to feature a review of the announcements from the narrow (but essential) perspective of application performance.
Read, share, and if you have any follow-up questions for me, please do drop me a line.
Distributed Tracing with AWS X-Ray
The biggest performance-related announcement of the event was easily Amazon’s new distributed tracing framework: AWS X-Ray. X-Ray records the latency of each call into your system on a service-by-service basis. From these latency records, you can identify where time is being spent in your system, see which services are slow, and check for any unusual calls between services.
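To make the idea concrete, here’s a minimal sketch of the kind of per-service latency analysis X-Ray automates. The service names and latency figures are invented for illustration; this is plain Python, not the X-Ray API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace segments: (service, latency in ms). X-Ray gathers
# records like these automatically; here they are hard-coded.
segments = [
    ("frontend", 120.0), ("frontend", 95.0),
    ("checkout", 310.0), ("checkout", 290.0),
    ("inventory", 45.0), ("inventory", 55.0),
]

def latency_by_service(segments):
    """Group recorded latencies by service name."""
    grouped = defaultdict(list)
    for service, latency_ms in segments:
        grouped[service].append(latency_ms)
    return grouped

# Average latency per service reveals where time is being spent.
averages = {svc: mean(vals) for svc, vals in latency_by_service(segments).items()}
slowest = max(averages, key=averages.get)
print(slowest)  # checkout
```

With real traces, the same aggregation is what lets you spot the slow service in a chain of calls at a glance.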
Expanded Burstable Instances
The t2 series of burstable instances is great for workloads that deal with intermittent traffic. Sadly, the usefulness of the burst model has been limited by the relatively small amount of memory available in the t2 series. With the advent of the t2.xlarge and t2.2xlarge instances, you can now get burstable instances with 16GB and 32GB RAM respectively.
With these new higher memory instances, you can now take advantage of the cost savings associated with the burstable model for many more of your workloads.
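The burst model is easiest to understand as a credit bucket: credits accrue while the instance sits at or below its baseline and are spent when it bursts above it. The sketch below simulates such a bucket; the earn and spend rates are illustrative numbers, not AWS’s published credit rates for any t2 size.

```python
def simulate_credits(earn_per_hour, spend_per_hour, hours, start=0.0, cap=None):
    """Track a CPU-credit balance hour by hour.

    Credits accrue at a fixed rate and are spent when the instance
    bursts above its baseline. Rates here are illustrative only.
    """
    balance = start
    history = []
    for spend in spend_per_hour[:hours]:
        balance = max(balance + earn_per_hour - spend, 0.0)
        if cap is not None:
            balance = min(balance, cap)
        history.append(balance)
    return history

# Quiet overnight hours accrue credits; a morning traffic burst spends them.
workload = [0, 0, 0, 0, 60, 60, 10, 10]
print(simulate_credits(earn_per_hour=24, spend_per_hour=workload, hours=8))
```

The takeaway: intermittent workloads that mostly idle can bank enough credits to absorb their bursts, which is exactly the traffic shape the t2 series is priced for.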
New Instance Types
Good news for those workloads that are particularly compute-, IO-, or memory-limited - the new C5, I3 and R4 generation instances provide significant improvements across the board.
The new C5 instances double the maximum number of vCPUs from 36 to 72 and more than double the maximum amount of memory, from 60GB to 144GB. These new instances run on Xeon Skylake processors, bringing the AVX-512 instruction set to AWS - a great boon to anybody doing significant number crunching.
The I3 instances bring a significant upgrade for IO-heavy workloads. Using NVMe SSDs, I3 instances can deliver up to 3.3M random IOPS and a whopping 16GB/s of disk throughput. Such high-speed IO means these instances are ideal for transactional database workloads, and this suitability is further increased by a maximum of 64 vCPUs, 488GB RAM and up to 15.2TB of storage.
For memory-heavy workloads, the new R4 instances not only double maximum memory from 244GB to 488GB and double the maximum number of vCPUs from 32 to 64, they also introduce much larger L3 caches and new, higher-frequency DDR4 RAM. These instances don’t just have more memory, they have faster memory and bigger caches. As if these machines weren’t large enough on their own, when clustered together in a Placement Group they have 20Gbps throughput for intercommunication and 12Gbps throughput to any EBS volumes.
Elastic GPUs
With G2 and P2 instances, you can already access GPUs in the cloud, a fantastic feature for machine learning, graphics processing and other compute-heavy workloads. However, not all workloads fit cleanly with the configurations offered by the G2 and P2 instances. Perhaps you need the IO throughput of an I3 instance, but you also want to do some number crunching on the data once it’s stored; with Elastic GPUs, you can do just that.
Just as you attach storage volumes today with Elastic Block Store, so too will you be able to attach GPUs to your instances with Elastic GPU.
Elastic GPU is still in the works, and early indications are that it will only support Windows at launch. We think this indicates that AWS is aiming Elastic GPU at graphics workstation workloads. However, once Linux is fully supported, the easy availability of GPUs for all instance types has the potential to improve performance for myriad workload types.
Log Analysis with Athena
If you’re an avid user of AWS Elastic Load Balancing you’ll be all too aware that the performance metrics that come out of CloudWatch are next to useless. To get real insight into the performance of your ELB systems you need to analyse the access logs. Tucked away in the ELB documentation you’ll find instructions on how to send the log files to an S3 bucket of your choosing. Once your logs are in S3, though, you’re on your own.
With the introduction of the Athena database, AWS has made analysing data stored in S3 accessible with just a few clicks of your mouse. With Athena, log files stored in S3 are accessible using standard SQL. There’s no need to post-process the log files arriving in S3, you simply point an Athena table at them and they can be queried immediately.
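To give a feel for the kind of query this enables, here’s a sketch that runs an ELB-style latency query against an in-memory SQLite database. Athena itself runs SQL over S3 (no SQLite involved) and requires a table definition over your log bucket first; this example only demonstrates the shape of such a query, borrowing field names from the ELB access-log format, with made-up rows.

```python
import sqlite3

# A toy stand-in for an Athena table over ELB access logs in S3.
# Column names follow the ELB access-log format; data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE elb_logs (
        elb TEXT,
        backend_processing_time REAL,
        elb_status_code INTEGER
    )
""")
rows = [
    ("my-elb", 0.045, 200),
    ("my-elb", 0.250, 200),
    ("my-elb", 1.900, 504),
    ("my-elb", 0.030, 200),
]
conn.executemany("INSERT INTO elb_logs VALUES (?, ?, ?)", rows)

# Which requests were slow? Standard SQL like this is exactly what
# Athena lets you run directly over the log files in S3.
slow = conn.execute("""
    SELECT elb, backend_processing_time, elb_status_code
    FROM elb_logs
    WHERE backend_processing_time > 1.0
    ORDER BY backend_processing_time DESC
""").fetchall()
print(slow)  # [('my-elb', 1.9, 504)]
```

The point is that the analysis step becomes ordinary SQL rather than a bespoke log-parsing pipeline.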
Snowmobile: 21st Century Sneakernet
When Andrew Tanenbaum said “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway”, Amazon clearly took him seriously. If you really need to transfer 100PB of data into AWS as fast as possible, it’s hard to imagine anything having the bandwidth to match a shipping container on the back of a truck.
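The arithmetic behind the quip is worth spelling out. Assuming, purely for illustration, a ten-day door-to-door journey (real transfers also include load and unload time at each end):

```python
# Effective sustained bandwidth of a truck carrying 100PB,
# assuming an illustrative ten-day journey.
capacity_bytes = 100 * 10**15          # 100 PB
journey_seconds = 10 * 24 * 60 * 60    # 10 days

bandwidth_gbps = capacity_bytes * 8 / journey_seconds / 10**9
print(round(bandwidth_gbps))  # ~926 Gbps
```

Nearly a terabit per second, sustained for the whole trip - a rate no ordinary network link into AWS will approach.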
Check back soon for our guide on analysing ELB logs with Athena!