RE: [load balancing] RE: LoadBalancing products: Foundry ServerIron
This was my experience with Arrowpoint. (BTW, we ended up abandoning the architecture outlined below. It was originally designed with certain goals in mind, and we decided to go another route.)

###########################INTERNAL MEMO ####################

Date: 29 December 1999
To: #########, President
    #########, Sr. Vice President Technology
From: ##########, Chief Systems Architect
RE: Arrowpoint Communications CS100 / CS800 Solution

The review and testing of the Arrowpoint Communications CS100 and CS800 products as possible solutions for the CompanyXYZ distributed virtual hosting platform has been concluded, and this product line has been determined to be insufficient for our defined goals. This conclusion is based on development testing, production service, information provided by the upper levels of technical support and senior development staff, and Arrowpoint-defined development timelines. Additional facts pertaining to Arrowpoint’s sales practices are referenced here for consideration in future product assessments.

Architectural Goals for Product

Outline of the SystemXYZ architectural goals as initially provided to Arrowpoint Communications. These goals were repeated in their entirety to various technicians, developers, and technical managers:

Two clusters (one Unix, one NT), initially served by one CS100 device.

Each cluster will initially contain two servers but should be able to scale up to at least 16 servers per cluster.

Each cluster is expected to support 2000 websites, for a total of 4000 websites.

Due to the unique architecture of the platform, each server will maintain unique bindings for each website instance.

DNS will present one (1) single IP to the Internet to represent each cluster.

The Arrowpoint device will intercept requests to that address and, based on layer 4 HTTP/1.1 compliant headers, will determine redistribution from the ‘Host’ key-value pair of the HTTP header.
The redistribution rules will use NAT to redirect each request to the local IP:Port binding unique to each website/server instance.

Rule utilization is as follows:

2 Clusters x 2000 sites = 4000 HTTP Services
2 Clusters x 2 Servers x 2000 IP:Ports = 8000 rules

Persistence methods: Load (primary), cookie (on request)
Expected throughput for both clusters: 8 Mbps to 3 Gbps

Product Claims

White papers available online for full release and beta versions indicated that our expectations were well within the defined product guidelines. These technical outlines were reinforced by the product documentation supplied on delivery. Discussions with the attending salesman and sales engineers confirmed that the product could comfortably achieve the defined goals. Based upon discussions regarding development cycles and product extensibility, it was determined that product testing was desirable.

Testing Results

Testing and review consisted of several phases covering manageability, resource utilization, network throughput, reliability, and scalability, across two versions of full release software and several beta versions.

Manageability: All CLI-based management is straightforward. Configuration is for the most part logical and hierarchical. Later in testing, it became apparent that some features needed some massaging. Stalling while scrolling through large configurations is a known bug, to be addressed in a version due for release in mid-2000 using “collapse-able” configurations.

Resource Utilization: Concerns first arose when 40 sites (20 per cluster x 2 clusters) were configured and tested. CPU utilization climbed to 38% and stayed there. This test included NAT, round robin, and approximately 35 Mbps of traffic. Later testing would indicate that CPU utilization rose in conjunction with rule sets and load. Addressing this issue became the focus of the next four months of testing and product revisions, and the source of our major concerns.
During the following months, various concessions were made to the SystemXYZ platform to accommodate redefinitions of the CS100 resource availability. Although Arrowpoint sales staff and sales engineers indicated that certain ranges could be expanded to accommodate larger rule sets than those defined in the technical specification papers, it quickly became apparent that the 5000 rule sets defined in the white sheets was an inflated number. Apparently, modifications in the software that added functionality also introduced additional memory requirements, and these were not reflected in the white sheets. Over a period of six months, we went from assurances that the CS100 product could accommodate 5000+ rule sets down to Arrowpoint offering to give us three additional CS100s to support a total of 250 rule sets, after determining that the product could reliably support 250 rule sets at that level. Additionally, under this configuration, certain methods of determining service status were no longer available. This meant that service status was determined only via ICMP calls, which can indicate whether a server is reachable on a network but are poor indicators of a running WWW server. A review of future code revisions indicated that by mid-2000 a partial solution would probably be made available.

A review of the CS800 solution showed that it offers an impractical remedy for the resource demands. The CS800 offers only a 100% to 200% increase in processing capabilities due to the software architecture’s memory demands. Explorations of memory expansion capabilities with Arrowpoint development members resulted in the consensus that the boards they base the architecture on could not physically accommodate the additional memory that would be required to meet our requirements. The CS800 uses up to two of these same boards, with up to 256MB RAM each.
This revelation indicated that our concept of the CS800 as an upgrade path was invalid, since all experience with the product indicated that 512MB would be required to run 250 services with 2 rulesets each.

Network Throughput: As a matter of course, two moderate-volume anonymous FTP servers were introduced to the CS100. The two servers serviced only FTP and telnet access (for maintenance). The CS100 was configured with 20 basic WWW / round robin rulesets for use by SystemXYZ platform developers and the handful of rulesets required to support the two FTP servers. The FTP servers were returned to standard TCP/IP round robin DNS outside of the CS100 environment after several code revisions, a full release code, and several undefined CS100 device crashes that resulted in either no restoration or only partial restoration of service. Through two full releases and various in-between beta releases, a bug was apparently allowed to persist that did not release closed sessions, resulting in up to 250 stale FTP sessions that cleared only on power cycling (but apparently not on crashing). The claimed 5 Gbps on the switch fabric was never reached in testing because of our own resource limitations in benchmarking. However, 35 Mbps of WWW traffic did test successfully using the resources defined above.

Reliability: Throughout the six-month review and testing of the CS100 product, various code versions, both full releases and betas, were explored. Beta code was not considered when assessing the stability of the product, except as evidence of Arrowpoint’s efforts to address ongoing stability issues in release versions. The first release version that was provided was stable. The beta versions of the following release were understandably unstable, but apparently did not address an “undefined” crashing problem that continued to occur in the full release.
The consensus has been that some type of memory leak causes the crashing. However, up to the time of our decision to pursue other solutions, the issue of the device crashing with or without load had not been resolved.

Scalability: Due to the problems with other facets of the Arrowpoint product, scalability testing was never reached.

Product Development Goals

Arrowpoint personnel are courteous professionals who devote a great deal of time to seeking acceptable solutions. This level of devotion was one of the reasons that attracted us to Arrowpoint and kept us pursuing the product for six months. Arrowpoint had indicated their willingness to incorporate enhancements to their product based on feedback from SystemXYZ, and there is no doubt that Arrowpoint would have accommodated us to the limit of their hardware. Discussions with senior developers of the Arrowpoint product line outlined their development cycle for the next year which, unfortunately, did not include using or exploring solutions to the memory barrier imposed by the physical limits of the circuit boards used for the CS100, CS150, and CS800 products. Additionally, existing memory would be consumed by new functions that were being added to the code.

Summary Conclusion

Despite both Arrowpoint’s and SystemXYZ’s best efforts, the Arrowpoint product line is not well suited to the SystemXYZ virtual webhosting platform or for recommendation to CompanyXYZ clients interested in load balancing solutions. It may come to pass that, beyond their currently defined development cycle, Arrowpoint will find a solution to their physical memory limitations that lets them support the number of rulesets outlined in their white sheets. As of this writing, the supportable rule sets have dropped sharply from 5,000 on one box to 250 across four boxes.
Although Arrowpoint offered to provide the three additional boxes at no additional fee, subsequent boxes would carry a fee, not to mention the cost of the rackspace for the additional devices. Additionally, coordinating the management of four devices to service 250 sites would introduce its own instability factors. The combination of the additional inferred costs, the loss of critical supported feature sets, and a projected development cycle that does not include resolution of the memory constraints outweighs any prospects for continuing to pursue the Arrowpoint products as a practical solution. Once the product has survived its infancy, the feature set that Arrowpoint has outlined will certainly be of interest to many hosting companies. As for recommendations to clients, there is a wide variety of load balancing platforms with minimal feature sets that provide acceptable load balancing at a much lower price point and, at minimum, with reputations for stability.
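
For readers unfamiliar with the redistribution scheme described under the architectural goals, the following is a rough Python sketch of the idea only. It is not Arrowpoint configuration or code. All names in it (parse_host, build_rules, Redistributor), the 10.0.0.x addresses, the example.com site names, and the base port of 8000 are assumptions made for illustration, and plain round robin stands in for the "Load" persistence method named in the memo.

```python
from itertools import cycle


def parse_host(request_bytes):
    """Return the lower-cased Host header of a raw HTTP/1.1 request, or None.

    Any ":port" suffix is dropped. IPv6 literals are not handled here.
    """
    head = request_bytes.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return value.strip().lower().split(":")[0]
    return None


def build_rules(sites, servers, base_port=8000):
    """Build one (server IP, port) rule per site per server.

    Hypothetical layout: each site gets its own port, bound on every server
    in the cluster, so a cluster holds len(sites) * len(servers) rules.
    """
    return {
        site.lower(): [(server, base_port + index) for server in servers]
        for index, site in enumerate(sites)
    }


class Redistributor:
    """Pick a backend IP:port for a request based on its Host header."""

    def __init__(self, rules):
        # One independent round-robin cursor per site (an assumption; the
        # memo lists load as the primary method).
        self._cursors = {site: cycle(backends) for site, backends in rules.items()}

    def pick(self, request_bytes):
        host = parse_host(request_bytes)
        cursor = self._cursors.get(host)
        return next(cursor) if cursor is not None else None


def rule_count(rules):
    """Total number of IP:port rules held for one cluster."""
    return sum(len(backends) for backends in rules.values())
```

Built this way, one cluster of 2000 sites on 2 servers holds 4000 rules, and two such clusters hold the 8000 rules given in the rule utilization figures above. A NAT device would then rewrite the destination of each request to the IP:port returned by pick().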
participants (1)
-
Karyn Ulriksen