Monthly Archives: May 2015

EMC World 2015 Highlights: Building For The Cloud

By | May 27, 2015

A version of this article appeared on TechTarget SearchSOA as EMC builds for cloud and next-generation applications

EMC World is the company’s traditional venue for major announcements, but this year the company’s premier event for customers, partners, analysts and journalists included more than just a laundry list of product updates. Sure, there was plenty of new hardware, like a major upgrade to the XtremIO all-flash array, a new VMAX3 tiered storage system and updates to the Data Domain data protection software, but in retrospect EMC World 2015 will be seen as a milestone in the firm’s transformation from a purveyor of bulletproof, big-iron storage systems to an integrated supplier of complete software and infrastructure stacks for building next-generation enterprise applications. Of course, EMC has long had its hands in many facets of IT infrastructure via its Federation partners VMware, Pivotal and now VCE (after buying out Cisco’s share of the joint venture), but there was more emphasis this year on how all the parts fit together to build clouds and next-generation applications.


EMC’s transformation mirrors that occurring within enterprise software writ large, as each migrates from the world of what EMC calls Platform 2 (client-server applications and infrastructure designed for thousands of PC users) to Platform 3 systems built with virtual servers, cloud services, big data and social features designed for millions of primarily mobile clients. Indeed, the event’s theme, Redefine Next, captured its essence: EMC redefining itself and its product portfolio to better serve changing enterprise IT strategies and priorities, what VMware’s chief strategist calls the New IT Agenda.

EMC Platform 3 Model

EMC Embraces Openness

The first shocker came not from what EMC is selling, but what it’s giving away. The company announced that the ViPR software defined storage controller will be going into the wild as an open source project. Sticking with the snake metaphor, Project CoprHD is a version of ViPR released under the Mozilla Public License, including all existing storage automation and control features, with code available on GitHub in early June. The extent of corporate and developer adoption for ViPR won’t be clear for months, but the project has already been endorsed by executives at Intel and Canonical. This is the first step towards creating an open ecosystem around EMC’s SDS technology and is likely a precursor to ViPR’s eventual incorporation into OpenStack. When asked about the prospect, EMC executives were predictably cautious, stating they didn’t want to get ahead of themselves and were more focused on nurturing the open source community, but didn’t rule out the possibility. Although EMC will continue to sell a supported version of ViPR, it will be based on the public code base and all new feature development will occur within the open source project.

Comparing CoprHD and the ViPR Controller

EMC’s next concession to openness comes via ScaleIO, although in this case the company isn’t giving away the store, just free samples. ScaleIO, software that turns direct-attached storage across multiple servers into shared block devices on a virtual SAN, is now free to download for test and development; users must still license the product for production use. Although freemium distribution is nothing new, EMC’s move is more significant than it may appear, since ScaleIO has been almost impossible for developers to get their hands on short of paying for a full license, a problem EMC execs admit was greatly limiting adoption. Acknowledging new norms among developers weaned in the era of open source and GitHub, Jeremy Burton, EMC’s President of Products and Marketing, said, “if developers want to steal software, we want them stealing our software.” He believes that getting ScaleIO into the hands of developers will illustrate the product’s performance advantages versus public cloud services (e.g. AWS EBS, Google Persistent Disks) and open source alternatives like Ceph.

EMC’s OpenStack Erector Set

Outside the realm of storage systems and services, EMC has been busy building software and services to support next-generation Platform 3 applications. The company’s Pivotal division, a well-known leader in PaaS with a commercial Cloud Foundry release, announced several updates to its big data suite of Hadoop, Greenplum, Spring and other software for data aggregation and analysis.

The bigger news came out of EMC’s Federation Solutions group, which serves as a systems integrator for products spanning the company’s major product divisions (EMC II, VMware, VCE, Pivotal and RSA). The company unveiled and demoed Project Caspian, an enterprise OpenStack-in-a-box hardware and software bundle. Chad Sakac, EMC’s President of Global Systems Engineering, describes Caspian as an “industrialized software stack” plus converged infrastructure, designed for cloud-native, Platform 3 applications using open source components. The core of Caspian is a customized OpenStack using technology from EMC’s CloudScaling acquisition; however, Sakac stresses that the goal is to “stay as close to the ‘vanilla’ OpenStack core as possible.” Next, Caspian adds a persistent data layer of object storage (like Swift), HDFS (Hadoop) and block storage using ScaleIO. Finally, Caspian includes what EMC calls a Cluster Manager, an orchestration layer akin to Kubernetes or Mesos, to automate deployments and monitor workload and system health.
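For readers who like to see the layering spelled out, the stack Sakac describes can be summarized as a plain data structure. The sketch below is purely illustrative: the keys and strings paraphrase the description above and do not correspond to any actual Caspian API or configuration format.

```python
# Purely illustrative: a plain-Python summary of the Caspian layering described
# above. The keys and values paraphrase the article, not a real Caspian API.
caspian_stack = {
    "control_plane": "customized OpenStack (CloudScaling-derived, kept close to vanilla)",
    "persistence_layer": {
        "object": "Swift-style object storage",
        "hdfs": "Hadoop Distributed File System",
        "block": "ScaleIO virtual SAN over direct-attached storage",
    },
    "cluster_manager": {
        "role": "automate deployments, monitor workload and system health",
        "analogous_to": ["Kubernetes", "Mesos"],
    },
    "hardware": "VCE VxRack hyper-converged, rack-scale systems",
}

for layer, detail in caspian_stack.items():
    print(f"{layer}: {detail}")
```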


Caspian runs on VCE’s newly announced VxRack hyper-converged systems (think rack-scale EVO:RAIL), although Caspian’s primary hardware dependency appears to be in the orchestration layer, meaning customers prepared for some DIY reconfiguration could likely repurpose other converged hardware into a Caspian cloud. Indeed, Sakac’s blog mentions a hardware abstraction layer, meaning there’s some attempt to insulate the software control plane from underlying hardware details. Caspian’s persistence layer currently uses ScaleIO on DAS, but as an impressive demonstration of the new DSSD rack-scale all-flash array made clear, future versions will include an all-flash tier.

Hardening the Cloud

In total, EMC World demonstrates a traditional IT vendor working to make the cloud easy, reliable and safe enough for even the most conservative enterprise. A final example to illustrate the point is Cloudlink, a small Canadian company EMC acquired earlier this year. EMC talked more about Cloudlink during a media event, but as Sakac puts it, “Cloudlink does something simple – and does it well. They deliver a product called SecureVM that encrypts (and attests to, and otherwise controls and manages) compute instances in clouds, whether the VM images be Windows or Linux, or reside on AWS, Azure, Google Cloud or vCloud Air.” Cloudlink integrates on-site key management with in-cloud key repositories, allowing companies to maintain full control over encryption credentials and policies while providing end-to-end VM, boot volume and application encryption.
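The key-handling model described here, where the master key never leaves the customer’s site and only wrapped per-VM keys live in the cloud, is essentially envelope encryption. The sketch below illustrates that general pattern with the Python cryptography library; it is not Cloudlink’s implementation, and the key names and simplifications are mine.

```python
# A minimal sketch of envelope encryption: the pattern (not Cloudlink's code)
# behind keeping key control on-site while encrypted volumes live in the cloud.
# Requires the third-party "cryptography" package.
from cryptography.fernet import Fernet

# Master key-encryption key (KEK) stays on premises, e.g. in a corporate HSM.
on_prem_kek = Fernet.generate_key()
kek = Fernet(on_prem_kek)

# Each VM gets its own data-encryption key (DEK); only the *wrapped* DEK is
# stored alongside the VM image in the cloud provider's repository.
vm_dek = Fernet.generate_key()
wrapped_dek = kek.encrypt(vm_dek)          # safe to store off-site

# The boot volume (or any payload) is encrypted with the DEK before upload.
boot_volume = b"...raw boot volume bytes..."
ciphertext = Fernet(vm_dek).encrypt(boot_volume)

# At boot time the instance presents the wrapped DEK; only the on-prem key
# server can unwrap it, so revoking the KEK revokes access everywhere.
recovered_dek = kek.decrypt(wrapped_dek)
plaintext = Fernet(recovered_dek).decrypt(ciphertext)
assert plaintext == boot_volume
```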


EMC World didn’t feature a single blockbuster announcement, but it held several surprises significant to both EMC and its customers. Overall, the event shows a company modernizing, expanding and normalizing its products into a full-stack enterprise cloud suite that can be deployed on premises or consumed as shared services. In retrospect, EMC World 2015 was ‘Pivotal’ to the company’s future.

Former Allies Now Frenemies, Cisco and EMC Chart Similar Course To Maintain Dominance

By | May 18, 2015
Commodification promised to kill old tech stalwarts like Cisco, EMC and Oracle as standardized, mass-produced hardware paired with open source software undermined their high-margin businesses. However, the road to irrelevance was detoured by an instinctive response: adapt or die. Each company’s resulting metamorphosis from peddling boxes to building full-stack IT systems has been both predictable and impressive. Predictable in that the old-tech response to ankle-biting commodity products follows the classic Clayton Christensen disruptive technology playbook: move up the performance and feature axis. But today’s response deviates from the economic model by introducing a third axis: services. Cisco and EMC realize that they can’t win the price/performance game. Instead, each has concluded that businesses, even tech-savvy firms, don’t obsess over speeds and feeds, but over business outcomes and competitive advantage. Old tech vendors can’t win a price war, but they can win a results war. As my column explains, juxtaposing recent statements by executives at Cisco and EMC shows the common conclusion each has reached to avoid marginalization in an era of cheap hardware and free software.

As IT bluebloods like Cisco, EMC, Dell, HP and Oracle have morphed from product-segment leaders into full-stack IT providers, former allies have turned into adversaries. Companies that once saw collaborative opportunities in combining strengths in different product silos to deliver complete IT packages have gradually expanded into each other’s turf, each trying to be the one-stop IT shop. Although these colliding worlds have turned collaboration into competition, the big firms still have much in common. Indeed, Cisco and EMC, former allies now frenemies, face a common threat from commodity hardware and open source software. Juxtaposing recent comments by their respective executives makes clear they are responding to comparable threats with similar strategies. Firms that grew by selling products with demonstrably superior performance now stress broad portfolios they’ve stitched into end-to-end technology platforms. According to Cisco’s legendary CEO John Chambers, customers want “outcomes.”

Source: Cisco

As I explain in the full column, both companies are taking an Apple-like approach to enterprise IT, emphasizing higher-margin, pre-integrated hardware and software “solutions” that just work, an approach that appeals to beleaguered IT departments without the time or expertise to learn, test and integrate white-box hardware and open source software.


Calculating Total Cost Of AWS Deployments Is Easier Said Than Done

By | May 14, 2015

A version of this article appeared on TechTarget SearchAWS as Calculating the true cost of AWS application development

Cloud services are an easy sell: the continuously declining prices, frictionless setup, low cost of entry, offloaded admin headaches and inherent scalability are a compelling combination. It’s equally hard to argue with the long-term economic advantage of warehouse-scale computing, in which cloud giants like Amazon, Google and Microsoft deploy servers by the thousand, turning them into disposable units of computation and storage: metaphorically treating servers like cattle, not pets. The scale and intense competition have translated into dramatic price declines for cloud infrastructure, with one IaaS price analysis showing that “the average price drop for the base level services included in this survey from the initial 2012 snapshot to today was 95%: what might have cost $0.35 – $0.70 an hour in 2012 is more likely to cost $0.10 – $0.30 today.” Another study finds that the hourly cost of AWS EC2 instances has dropped 56% in the last two years. However, shopping for cloud application platforms is more like pricing a car with an extensive option sheet and complex lease financing plans than buying a movie download. The base price, whether in VMs per hour or GB per month, is a loss-leader to snag drive-by shoppers, but one can easily get sticker shock after pricing in all the necessary details. Below, I’ll describe how to avoid nasty surprises when reviewing your first cloud invoice, and given usage trends, it’s advice more organizations should heed.

Source: ScienceLogic Blog
http://blog.sciencelogic.com/hostingcon-2014-how-to-compete/06/2014

According to a recent RightScale survey, AWS remains the most popular public cloud service, although Azure is closing the gap. Its maturity, mindshare and rich set of application services make it the default choice for many, but before assuming AWS is the best target for cloud application deployment, it’s wise to build a more accurate model of the underlying application and its various service and capacity requirements, then run it through a complete price analysis. Although tracking by Cloud Spectator and Strategic Blue typically finds AWS instance pricing in the middle of the pack, the cloud cost equation has too many degrees of freedom to allow easy summarization by a single number. One attempted metric is the Cloud Price Index from 451 Research, which measures the average hourly price for a typical Web application including compute, storage, relational and NoSQL databases and network traffic. Its aggregate across more than ten vendors finds the typical Web application costs $1.70 per hour, or around $15,000 a year, a figure that 451 Research says can be cut by almost half in a best-case scenario with longer-term service commitments. Much like car fuel-efficiency claims: your mileage may vary.
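The Cloud Price Index figures are easy to sanity-check: at $1.70 per hour, a 24×7 application lands right around the quoted $15,000 a year, and the best-case “almost half” figure follows directly. A quick check in Python:

```python
# Sanity check of the 451 Research Cloud Price Index figures quoted above.
hourly_rate = 1.70                      # $/hour for a "typical" web application
annual_on_demand = hourly_rate * 24 * 365
best_case = annual_on_demand * 0.5      # "cut by almost half" with commitments

print(f"On-demand, 24x7: ${annual_on_demand:,.0f}/year")   # about $14,900
print(f"Best case:       ${best_case:,.0f}/year")          # about $7,450
```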

Source: RightScale 2015 State of the Cloud Report

The most accurate, albeit time-consuming, option for understanding cloud costs is to model an application’s service requirements and run the mix through a spreadsheet using one of the various online price comparison sites. Besides the calculators available from each vendor, we found four good tools to assist with cloud price shopping.

The most sophisticated and automated analysis tool is PlanForCloud, which supports complex application configurations using an arbitrary mix of compute servers, databases, storage and data transfer. Using the other sites requires manually building a spreadsheet with the various resource types and plugging in pricing information by hand.

Example Pricing Exercise

The effort required to build an accurate cost model is obviously a function of the application’s complexity. Developers trying to ballpark a new design should turn to the AWS Reference Architectures, 16 datasheets that include a high-level schematic and basic description of prototypical designs for deployments ranging from legacy batch processing to online gaming. One problem with using the reference designs for price comparisons is that they employ many of the AWS platform services, like CloudFront (global CDN), Elastic Load Balancing, DNS or DynamoDB (NoSQL), that might not be available or have close equivalents on other clouds, so it’s best to stick with the core infrastructure.

Source: author

By way of example, we’ve built a simple three-tier Web application with the following configuration (details below):

  • 3 front-end Web servers (medium Linux)
  • 2 mid-tier application servers (large Linux)
  • 2 SQL databases (large MySQL, multi-zone)
  • 1 TB object store (S3 on AWS)
  • 1 TB block store (EBS on AWS)

Starting with AWS instance types, we then duplicated the configuration as closely as possible on Azure, Google and Rackspace. Assuming 24×7 usage and month-to-month pricing (no reserved instances), and adding estimates for data traffic to and from external users and between each tier of the design, we ran each configuration through the PlanForCloud calculator. The following chart (online here) summarizes the results:

Source: author
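For those who prefer scripting the model to maintaining a spreadsheet, the sketch below shows the shape of the calculation for the configuration above. Every rate in it is an illustrative placeholder, not a current AWS, Azure or Google price; substitute figures from each provider’s price list (or a tool like PlanForCloud) before drawing any conclusions.

```python
# A minimal cost-model sketch for the three-tier configuration above.
# All rates are ILLUSTRATIVE placeholders, not actual AWS/Azure/Google prices.
HOURS_PER_MONTH = 24 * 30

servers = [
    # (description,                count, assumed $/hour per instance)
    ("medium Linux web server",        3, 0.07),
    ("large Linux app server",         2, 0.14),
    ("large MySQL DB, multi-zone",     2, 0.35),
]
storage = [
    # (description,        GB,   assumed $/GB-month)
    ("object store (S3)",  1024, 0.03),
    ("block store (EBS)",  1024, 0.10),
]
data_transfer_gb, transfer_rate = 500, 0.09   # assumed outbound GB and $/GB

compute = sum(n * rate * HOURS_PER_MONTH for _, n, rate in servers)
stores  = sum(gb * rate for _, gb, rate in storage)
egress  = data_transfer_gb * transfer_rate

print(f"Compute:  ${compute:9,.2f}/month")
print(f"Storage:  ${stores:9,.2f}/month")
print(f"Transfer: ${egress:9,.2f}/month")
print(f"Total:    ${compute + stores + egress:9,.2f}/month")
```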

Going through this exercise demonstrates several things, chiefly that cloud services aren’t uniform, making it almost impossible to clone a particular configuration from one provider to another. Second, there are significant price differences for roughly comparable servers. For example, a large SQL DB with 100 GB of local storage runs $248 per month on AWS versus $176 on Azure. In contrast, a medium Web front end goes for $52 monthly on AWS versus $119 on Azure and almost $600 on Google.

The analysis is also overly simplified, assuming hourly pricing for systems operating 24×7, which completely neutralizes the dynamic pricing and scalability benefits of the cloud. Furthermore, anyone in that scenario could save a substantial amount of money (generally 60% or more on AWS) by using reserved instances with an annual lease (see CloudVertical’s comparison table for details). Complicating the calculus even further, Google offers sub-hour pricing with lower rates for sustained use; for example, using an instance for 100% of the billing cycle, as we assumed, nets an automatically applied 30% discount. Google’s model is much more attractive for highly variable workloads, while AWS is cheaper for sustained or reserved applications. Finally, we’ve discovered that PlanForCloud’s instance choices and prices aren’t always up to date; for example, it only includes a subset of the available RDS instances, so users should always double-check figures on the vendor’s own site.
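The discount mechanics mentioned above are easy to put side by side. The snippet below applies the roughly 60% reserved-instance saving and the 30% sustained-use discount quoted in this section to an arbitrary example rate; the $0.50-per-hour figure is illustrative only.

```python
# Side-by-side view of the discount models quoted above for a steady 24x7
# workload. The on-demand rate is an arbitrary example; the percentages are
# the ones cited in the text (about 60% for an annual reserved instance on
# AWS, 30% for Google's full-month sustained-use discount).
hourly_on_demand = 0.50              # $/hour, illustrative only
hours_per_month = 24 * 30

on_demand = hourly_on_demand * hours_per_month
reserved  = on_demand * (1 - 0.60)   # AWS reserved instance, annual lease
sustained = on_demand * (1 - 0.30)   # Google sustained use, 100% of the month

print(f"On-demand, 24x7:          ${on_demand:7.2f}/month")
print(f"Reserved (about 60% off): ${reserved:7.2f}/month")
print(f"Sustained use (30% off):  ${sustained:7.2f}/month")
```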

AWS Still the Best Default Choice

Unfortunately, cloud pricing is like some Facebook relationships: it’s complicated. Storage is straightforward, but mapping different workloads to the most appropriate server instance isn’t an exact science. AWS has the richest service offerings, both in variety of instance types and in higher-level application services, so it’s a great place to start your cloud shopping. It probably isn’t the cheapest for any particular application, but it’s also not the most expensive. Still, it pays to shop around, and those doing some up-front planning may find cheaper and more appropriate cloud alternatives.


 

AWS Server Details

AWS Configuration Summary
Source: PlanForCloud

Haswell Redesign of Intel Xeon E7 Made for Big Memory Workloads

By | May 13, 2015

Note: a version of this article appeared on TechTarget: The Intel E7 v3 processor entered the ring against IBM Power and other systems. But will the new BI features be enough to take over scale-up workloads?

Intel’s tick-tock CPU development strategy of alternating between process node shrinks and new microarchitectures has a long, consistent track record of product improvements. With the Broadwell release earlier this year, its consumer products have begun the ‘tick’ transition to a 14nm process; the data center Xeon product line, however, always lags and is just finishing its ‘tock’ migration to the Haswell microarchitecture. The high-end E7 is the third and final Xeon series to get the Haswell treatment, with Intel announcing 12 new version 3 products now shipping.

Source: Intel

Building upon the same core and internal dual-ring interconnect as the E5 v3 released last fall, the E7 adds several key features to support scale-up, high-memory, mission-critical workloads. The processor core itself is the main attraction: the Haswell cores process instructions roughly 10% faster than their Ivy Bridge predecessors, with E7 models sporting up to 18 cores sharing 45 MB of last-level cache. The E7 v3 also adds other features designed for big workloads: new transaction-related and crypto-acceleration instructions (TSX and AES-NI), better memory performance (DDR4), improved power management and I/O throughput, and greater system resiliency (Run Sure, MCA/machine check).

The E7 is designed for what used to be considered mainframe workloads — OLTP, big data business intelligence, scientific simulations — the type of applications that crunch a lot of data, require high I/O throughput, aren’t easily segmented across separate machines and support critical business functions. Unlike the E5 series, which is made for 2-socket, scale-out, cloud-native workloads, the E7 provides concentrated compute firepower for a sweet spot of 4- and 8-socket systems (scalable to 32 sockets), with 6 to 12 terabytes of memory spread across 96 and 192 DIMM slots respectively.


Big, consolidated systems running equally big data applications require both bulletproof reliability and maximum transaction throughput. The new E7 delivers on the first count via a set of reliability, availability and serviceability (RAS) features, including memory mirroring and sparing (like RAID for RAM), recovery from parity errors with DDR4 memory and circuitry that allows firmware to intercept and handle both corrected and uncorrected error events. The E7 v3 enhances transaction throughput via additions and updates to Intel’s transactional extensions (TSX), which speed multithreaded database applications using a technique known as hardware lock elision. A base set of TSX functions was included with the E5 v3 product last fall, but later disabled due to unspecified bugs. The Haswell E7 fixes and improves upon the E5 TSX feature set and provides up to six times greater OLTP throughput, for example on SAP HANA, by delivering fine-grained locking performance from coarse-grained code.

Big Systems Ripe For Upgrades

Intel believes the E7 v3 can exploit a significant upgrade opportunity in 4- to 8-socket systems, which are typically refreshed about every five years, putting the Xeon 7400 series in the crosshairs. Intel also sees the Haswell E7 appealing to organizations consolidating virtualized x86 infrastructure onto fewer servers and as a replacement for aging POWER systems.

Intel Xeon shipment volumes

The generational improvements in Xeon performance are dramatic. For example, Intel estimates that the OLTP performance provided by 10 racks of circa-2010 7400-series Xeons can now be delivered by a single rack of v3 systems. For virtualized workloads, Intel’s benchmarks on VMware ESXi show the Haswell E7 yielding up to a 2.7x performance improvement over a first-generation E7-4800 series part. Unfortunately, in an era of high-core-count CPUs, software licenses can sometimes cost much more than the underlying hardware. In these situations, one of the new segment-optimized processor SKUs, which trade core count for CPU frequency and/or power budget, can provide more bang for the buck. For example, for OLTP applications with high per-core license fees, Intel estimates that substituting its 4-core, high-frequency E7-8893 part for the mainstream 18-core 8890 product can deliver roughly the same performance for a 20% savings.
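Since the argument for the segment-optimized SKUs hinges on per-core software fees, the arithmetic is worth making explicit. The sketch below compares the two parts named above using the list prices from the SKU table at the end of this article and a hypothetical per-core license fee; the fee is an assumption chosen only to show how quickly licensing can dominate the hardware cost.

```python
# Illustrative total-cost comparison for per-core-licensed software.
# CPU list prices come from the SKU table below; the per-core license fee is
# a HYPOTHETICAL figure used only to illustrate the tradeoff described above.
PER_CORE_LICENSE = 10_000            # $/core, hypothetical database license

skus = {
    "E7-8890 v3 (18 cores, 2.5 GHz)": {"cores": 18, "cpu_price": 7175},
    "E7-8893 v3 (4 cores, 3.2 GHz)":  {"cores": 4,  "cpu_price": 6841},
}

for name, sku in skus.items():
    licenses = sku["cores"] * PER_CORE_LICENSE
    total = sku["cpu_price"] + licenses
    print(f"{name}: CPU ${sku['cpu_price']:,} + licenses ${licenses:,} "
          f"= ${total:,}")
```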

The E7 v3 argument against POWER8 systems centers on ROI, namely price/performance and long-term TCO. According to as-yet-unpublished SPECint_rate benchmarks from Intel, a high-end E7 v3 provides roughly equivalent performance to a POWER8 system costing 10 times the price, based on an analysis including initial CapEx, facilities expenses (power, cooling) and software plus support licenses. Intel’s claim of a SPECint_rate_base2006 score of nearly 5,000 for an 8-socket system is plausible, since published scores for 4-socket Cisco and Dell E7-8890 v3 systems are 2,770 and 2,740 respectively. Whether by coincidence or as a preemptive counter to the E7 v3 release, IBM just announced two POWER8 configurations optimized for SAP HANA, one with 24 cores and 1 TB of memory, the other with 40 cores and 2 TB. IBM didn’t announce benchmarks for either configuration, but based on the specs they should compete well with a loaded E7-8800 series system.

Intel Xeon OLTP performance per rack

The performance advantages of in-memory databases for analytics workloads operating on large data sets are significant, but costly. 16 GB DDR4 server DIMMs run about $200, meaning a terabyte of RAM costs roughly $13,000. In contrast, a 960 GB enterprise SSD can be had for under $700. This roughly 17:1 price difference is the driving force behind innovative new flash storage designs and interfaces. IBM has used the high-speed, low-latency CAPI interface to its POWER processors for a memory adapter that makes flash look and perform like internal RAM. At the OpenPOWER Summit, Redis Labs showed comparative results from a large NoSQL application in which a system with 90% CAPI-attached flash provided virtually identical performance (200K IOPS, sub-millisecond latency) to a 100% in-memory database at over 70% cost savings. At EMC World, EMC demonstrated the DSSD rack-scale PCIe flash product executing Hadoop queries for a typical analytics application; on synthetic performance benchmarks the flash array nearly matched native RAM speeds.
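The memory-versus-flash economics quoted above reduce to a quick dollars-per-gigabyte comparison, reproduced below from the figures in this paragraph.

```python
# Reproducing the RAM-vs-flash price gap quoted above.
dimm_price, dimm_gb = 200, 16        # 16 GB DDR4 server DIMM, about $200
ssd_price, ssd_gb = 700, 960         # 960 GB enterprise SSD, under $700

ram_per_gb = dimm_price / dimm_gb    # $12.50 per GB
ssd_per_gb = ssd_price / ssd_gb      # about $0.73 per GB

print(f"1 TB of DRAM: ${ram_per_gb * 1024:,.0f}")          # about $12,800
print(f"Price ratio:  {ram_per_gb / ssd_per_gb:.0f}:1")    # about 17:1
```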

DSSD System
Source: https://community.emc.com/thread/213405

Creative new flash memory system designs and processor interfaces like DSSD and CAPI, along with others sure to follow, may dampen demand in the E7’s target market of very high-memory, scale-up systems. However, the processor’s RAS capabilities and other features optimized for mission-critical workloads should still ensure a healthy future for the pinnacle of x86 processor engineering.


 

E7v3 Product SKUs

Intel® Xeon® processor SKU   Cores   Frequency (GHz)   Cache   Price
E7-8890 v3                   18      2.5               45 MB   $7,175
E7-8880 v3                   18      2.3               45 MB   $5,896
E7-8880L v3                  18      2.0               45 MB   $6,062
E7-8870 v3                   18      2.1               45 MB   $4,672
E7-8893 v3                   4       3.2               45 MB   $6,841
E7-8891 v3                   10      2.8               45 MB   $6,841
E7-8867 v3                   16      2.5               45 MB   $4,672
E7-8860 v3                   16      2.2               40 MB   $4,060
E7-4850 v3                   14      2.2               35 MB   $3,004
E7-4830 v3                   12      2.1               30 MB   $2,169
E7-4820 v3                   10      1.9               25 MB   $1,502
E7-4809 v3                   8       2.0               20 MB   $1,224
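
One way to read the table is price per core, which makes the positioning of the frequency-optimized parts obvious; a few lines of Python over the listed prices does the sorting.

```python
# Price per core for the E7 v3 SKUs listed above (list price / core count).
skus = [
    ("E7-8890 v3", 18, 7175), ("E7-8880 v3", 18, 5896),
    ("E7-8880L v3", 18, 6062), ("E7-8870 v3", 18, 4672),
    ("E7-8893 v3", 4, 6841),  ("E7-8891 v3", 10, 6841),
    ("E7-8867 v3", 16, 4672), ("E7-8860 v3", 16, 4060),
    ("E7-4850 v3", 14, 3004), ("E7-4830 v3", 12, 2169),
    ("E7-4820 v3", 10, 1502), ("E7-4809 v3", 8, 1224),
]

for name, cores, price in sorted(skus, key=lambda s: s[2] / s[1]):
    print(f"{name:11s} {cores:3d} cores  ${price:5,d}  ${price / cores:8,.2f}/core")
```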

 

Should IT Scale Up Or Scale Out? Intel, EMC Committed to Both

By | May 13, 2015

A juxtaposition of events last week saw several major technology companies reveal key strategies and product updates that collectively illustrated the ongoing struggle of old-tech bluebloods from the client-server era to stay relevant in a world of mobile devices, social software, cloud services and big data analytics. Together, this amalgam of technologies is the foundation for so-called Platform 3 applications, and as EMC’s president of products and marketing, Jeremy Burton, told attendees at EMC World this week, building them requires new approaches to application architecture and infrastructure hardware. While EMC executives spent much time explaining the business imperatives for a new generation of distributed, inherently resilient, scale-out platforms, the company isn’t about to give up on the legacy systems still running most enterprises, and neither is Intel, whose own announcement of a new generation of E7-series processors focused on decidedly traditional workloads and system designs.

Intel Xeon tick-tock roadmap

While EMC walked a tightrope by repeatedly explaining that VxRack, a new line of hyper-converged, rack-scale, cloud-ready systems, in no way replaces its VCE Vblock big iron, Intel was squarely focused on its own mainframe-class x86 processor, the E7, a chip that not coincidentally powers all but the smallest Vblock configurations. Together, both announcements served notice that while the future may belong to warehouse-scale distributed cloud infrastructure, large, monolithic systems and applications aren’t going away anytime soon. For both companies, the message is about choice: delivering the right technology, optimized for particular types of workloads.

In the rest of this column, I explain why, and which workloads still need a scale-up hardware design.
