Server cluster load balancing (F5, LVS, DNS, CDN): differences and selection


The following is from "Suggestions on the Use of Large-scale Website Architecture Optimization (PHP) and Related Open Source Software" by "Black Night Passerby".


F5, in full: F5 BIG-IP GTM (Global Traffic Manager).

It is a layer 4 to layer 7 switch developed by a company called F5 Networks, sold as bundled software and hardware.

It reportedly ran on BSD at first and now runs on Linux; the hardware is Intel PC architecture plus peripheral network and dedicated acceleration hardware.

Of course, the price must be mentioned: it costs hundreds of thousands of RMB.

This device manages and distributes traffic and content; in other words, it does load balancing.

You can tell from the name: BIG-IP.

From the outside it looks like a single IP, but behind it sit dozens of application servers. It appears as one large virtual server.

Hence the name: what a big IP.

LVS = Linux Virtual Server

It was developed by a Chinese engineer, Dr. Zhang Wensong.

His web:

Information on the IBM website: Cluster scalability and its distributed architecture (4)

Dr. Zhang's comparison of LVS and F5:

The difference from F5 is hard to explain in one sentence; both are load balancing devices.

Although F5 is also built on a modified BSD system (the latest version is said to be based on Linux), the critical switching work is done by a dedicated switching chip (much as a dedicated graphics chip saves the CPU a large amount of image-processing work), so its performance does not depend heavily on the processing power of the host operating system.

Load balancing on F5 is mostly based on NAT/SNAT; proxy mode can also be used but is less common. As a listed company, F5 naturally does productization well, whether in ease of configuration management, flexibility, performance, or stability.

In NAT mode, LVS offers basically the same functionality as F5, but LVS is pure software, so its performance depends on the computing power of the host.

Moreover, LVS is an open source project and should not be compared directly with a commercial product. A commercial product is sold for money and has many people maintaining and developing it, whereas LVS has always been maintained and developed by Dr. Zhang on a voluntary basis; better features require more people to get involved.

That is quite a thorough answer.

DNS polling (round robin) is the simplest and most effective way to implement load balancing, and its cost in every respect is extremely low. It is cheap and good enough.

The disadvantage is that, because there is no health-detection mechanism, the load is not truly balanced and recovery from a failure takes a long time.

Many domestic portals use this technique, and it works well together with Squid.
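The weakness of DNS polling can be seen in a small simulation. The sketch below is a simplified model, not a real DNS server: the addresses are hypothetical, and the resolver just hands out one address per lookup in fixed rotation, with no knowledge of which server is alive.

```python
from itertools import cycle

def make_dns_round_robin(addresses):
    """Return a resolver that hands out addresses in fixed rotation."""
    rotation = cycle(addresses)

    def resolve():
        return next(rotation)

    return resolve

# Hypothetical A records for one site; 10.0.0.2 is assumed to be down.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
down = {"10.0.0.2"}

resolve = make_dns_round_robin(servers)
answers = [resolve() for _ in range(6)]

# The resolver has no health check, so dead servers keep receiving clients.
failed = [ip for ip in answers if ip in down]
print(answers)
print(f"{len(failed)} of {len(answers)} lookups went to a dead server")
```

In this model a third of the lookups keep landing on the dead server until someone edits the records, which is the long fault-recovery time mentioned above.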

Of course, the most primitive approach, with no load balancing at all, is to pull lines from several ISPs and serve each one separately.

CDN = Content Delivery Network.

Looked at in detail, all of the above are ways of implementing a CDN.

Few open CDN services exist in China (ChinaCache is one), but they are very popular abroad.

A CDN provides cache nodes so that access to the target network content becomes access to a nearby node.

The gains in response speed, security, transparency, and scalability are even greater under China's network structure, where, as the joke goes, north and south will stay divided until Taiwan is liberated.

But it is also a service for the wealthy, and the construction cost is very high.

ADSL + DDNS + CDN is another way to build a small site.

Putting the hosting rental money into bandwidth instead is direct and effective. However, the electricity bill and the stability are not encouraging.

In fact, CDN is not just for website services. For example, in South Korea, most CDN traffic is taken up by online games.

Just imagine: if CDN nodes could be deployed on a large scale, would a game still need to be divided into so many zones and occupy so many servers?

The conclusion is that it would. Although connection quality and response speed do greatly affect current online games, the development bottleneck lies in computing power and data storage access.

Imagine combining CDN and P2P: any personal PC that is online and provides CDN service earns a monthly commission of xx US dollars, paid for by the information and application providers.

However you look at it, this is a healthy industry chain, just as Google will launch a free mobile phone next year and let advertisers pay.

The only unhappy party would be the ISPs, who already block sharing protocols such as BT.

Unless they monopolize this business themselves, it is a dead end.

Nor is it harmless for users: data security and the timeliness of information are challenged.

The government is no idle spectator either: today there are still websites to block, and even a blog requires 1 million in registered capital;

if a mass of SSL-encrypted data is flowing around with no visible head or tail, how could it be blocked or filtered, and how could the flood be held back?

Looking back, this reads more and more like a collection of web excerpts, so I may as well excerpt one complete section.

F5 function introduction:

1. Multi-link load balancing and redundancy

Key Internet-facing business needs multiple ISP access links to ensure network service quality, eliminate single points of failure, and reduce downtime. Multi-ISP access is not simply a matter of routing across several WAN networks. Because different ISPs have different autonomous domains, load balancing across multiple links has to be considered in two situations:

How can internal application systems and workstations be dynamically distributed and load balanced across multiple links when they access Internet services and websites? This is known as load balancing of OUTBOUND traffic.

How can external Internet users who access internal websites and application systems be dynamically distributed across multiple links and, when one link is interrupted, be switched intelligently and automatically to another link to reach the servers and application systems? This is known as load balancing of INBOUND traffic.

F5's BIG-IP LC can intelligently solve the above two problems:

For OUTBOUND traffic, BIG-IP LC receives the traffic, intelligently distributes it across the different Internet interfaces, and performs source address NAT. You can specify a legal IP address for source NAT, or let it map automatically to the BIG-IP LC interface address, so that returning packets are received correctly.

For INBOUND traffic, BIG-IP LC is bound to public addresses from both ISPs and answers DNS resolution requests arriving through either ISP. BIG-IP LC can answer the LDNS with an IP address chosen according to server health and response speed; it can also connect to the LDNS over both links, judge link quality by RTT, and combine the two parameters when choosing the IP address to return.
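The INBOUND decision described above, combining a health result with a measured RTT, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Link` record, the field names, and the sample addresses are hypothetical, and BIG-IP LC's real selection logic is not documented in this article.

```python
from dataclasses import dataclass

@dataclass
class Link:
    isp: str
    public_ip: str   # address returned to the LDNS for this link
    healthy: bool    # outcome of the server/link health check
    rtt_ms: float    # measured round-trip time to the LDNS

def choose_answer(links):
    """Pick the IP to return: healthy links only, lowest RTT wins."""
    candidates = [link for link in links if link.healthy]
    if not candidates:
        return None  # nothing usable to answer with
    return min(candidates, key=lambda link: link.rtt_ms).public_ip

links = [
    Link("ISP-A", "203.0.113.10", healthy=True, rtt_ms=48.0),
    Link("ISP-B", "198.51.100.10", healthy=True, rtt_ms=21.0),
]
print(choose_answer(links))
```

If the faster link fails its health check, the same function falls back to the remaining link, which is the automatic switching behaviour the text describes.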

2. Firewall load balancing

Because most firewalls reach only about 30% of line-rate throughput, reaching the line-rate processing capacity the design calls for requires adding several firewalls. However, a firewall requires that a connection's traffic enter and leave through the same firewall; otherwise the connection is rejected. Solving the firewall load balancing problem is therefore key to the stability of the whole system.

F5's firewall load balancing solution provides heterogeneous firewall load balancing and automatic fault handling. The typical way to raise firewall processing capacity is the "firewall sandwich" method, which maintains persistence across the transparent devices. This satisfies applications that require a client to pass through the same firewall in order to complete a transaction safely, and it also preserves the original network security isolation requirements. The F5 standard firewall solution is shown in the figure:

Schematic diagram of firewall load balancing connection
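The requirement that both directions of a connection cross the same firewall can be illustrated with a symmetric hash. This is one common way to get that property, shown here as a sketch; it is not claimed to be how BIG-IP implements the firewall sandwich, and the firewall names are hypothetical.

```python
import hashlib

FIREWALLS = ["fw-1", "fw-2", "fw-3"]  # hypothetical firewall names

def pick_firewall(src_ip, dst_ip, firewalls=FIREWALLS):
    """Map both directions of a flow to the same firewall."""
    # Sorting the endpoints makes (a, b) and (b, a) produce the same key.
    key = "|".join(sorted([src_ip, dst_ip])).encode()
    digest = hashlib.sha256(key).digest()
    return firewalls[int.from_bytes(digest[:4], "big") % len(firewalls)]

outbound = pick_firewall("192.168.1.20", "203.0.113.80")
inbound = pick_firewall("203.0.113.80", "192.168.1.20")
print(outbound, inbound)
```

Because the load balancers on either side of the firewalls compute the same key, request and reply packets meet the same firewall and its connection state.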

3. Server load balancing

For all servers that provide external services, a Virtual Server can be configured on BIG-IP to load balance them. BIG-IP also continuously checks server health; once a faulty server is found, it is removed from the load balancing group.

BIG-IP uses a virtual IP address (a VIP consists of an IP address plus a TCP/UDP application port) to provide service on behalf of one or more target servers (called nodes; a node consists of the target server's IP address plus a TCP/UDP application port, and the address may be a private network address). It can therefore provide server load balancing for a large number of TCP/IP-based network applications. Server groups are defined by service type, and traffic can be directed to the corresponding servers by service port. BIG-IP continuously performs L4 to L7 health checks on the target servers. When a user requests service through the VIP, BIG-IP selects the best-performing server, based on server performance and network health, to answer the request. Making full use of all server resources and distributing traffic evenly across the servers effectively avoids load imbalance.
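The idea of a VIP in front of a group of nodes, with failed nodes dropped from rotation, can be sketched as below. It is a minimal model under stated assumptions: the class name and node addresses are hypothetical, the health check is a stand-in callable, and the selection is plain round robin rather than the performance-based choice the text attributes to BIG-IP.

```python
class ServerPool:
    """Round-robin pool that skips nodes failing a health check."""

    def __init__(self, nodes, is_healthy):
        self.nodes = list(nodes)      # (ip, port) pairs behind the VIP
        self.is_healthy = is_healthy  # callable: node -> bool
        self._next = 0

    def pick(self):
        """Return the next healthy node, or None if all are down."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if self.is_healthy(node):
                return node
        return None

# Hypothetical private-network nodes; the second one is assumed to be down.
nodes = [("192.168.1.11", 80), ("192.168.1.12", 80), ("192.168.1.13", 80)]
down = {("192.168.1.12", 80)}

pool = ServerPool(nodes, lambda node: node not in down)
picks = [pool.pick() for _ in range(4)]
print(picks)
```

Requests alternate between the two live nodes and never reach the failed one; once its health check passes again it rejoins the rotation without any reconfiguration.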

With UIE + iRules, BIG-IP can open the TCP/UDP packet, search for characteristic data inside it, and apply the rule that matches what it finds. Traffic can therefore be directed to the appropriate server according to the content the user is accessing, for example according to the URL the user requests.
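Content-based direction by URL can be pictured with a small rule table. The sketch below is written in Python for illustration only; real iRules are written in F5's own rule language, and the prefixes and pool names here are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical rules: (URL path prefix, server pool name); first match wins.
RULES = [
    ("/images/", "static_pool"),
    ("/api/", "app_pool"),
]
DEFAULT_POOL = "web_pool"

def route(url):
    """Return the pool that should serve the requested URL."""
    path = urlsplit(url).path
    for prefix, pool in RULES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL

print(route("http://example.com/images/logo.png"))
print(route("http://example.com/api/v1/users?id=3"))
print(route("http://example.com/"))
```

Matching on the path rather than the whole URL keeps query strings from affecting the decision.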

4. High system availability

The high availability of the system can be considered from the following aspects:

4.1. High availability of the device itself: F5 BIG-IP's specially optimized architecture and processing capacity are said to ensure 99.999% uptime. In dual-machine redundancy mode it can fail over within milliseconds to keep the system running stably. A redundant power supply module is optional. In dual-machine backup mode, the standby unit can take over in as little as 200 ms. BIG-IP is described as the only product in the industry capable of millisecond failover, and its design is well thought out. While all sessions pass through the Active BIG-IP, session information is synchronized to the Backup BIG-IP over the synchronization data line, so the Backup BIG-IP also holds all user session information; in addition, the watchdog chip in each device monitors the electrical signal of the other device over the heartbeat line. When the Active BIG-IP fails, the watchdog detects it first and notifies the Backup BIG-IP to take over the Shared IP, VIP, and so on, completing the switchover. Because the Backup BIG-IP has already synchronized the session information, user access continues uninterrupted.

4.2. Link redundancy: BIG-IP can detect the operating status and availability of each link, and achieve real-time detection of link and ISP failures. In the event of a failure, traffic will be transparently and dynamically directed to other available links. By monitoring and managing the two-way traffic in and out of the data center, both internal and external users can maintain a full-time network connection.

4.3. Server redundancy: multiple servers provide service at the same time, so when one server fails, user access is not interrupted. BIG-IP can health-check servers at different layers of the OSI seven-layer model and monitor server health in real time. If a server fails and BIG-IP determines that it cannot provide service, the server is taken out of the service queue, ensuring that users can access the application normally and that the response content is correct.
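The active/standby behaviour in 4.1, where sessions are mirrored ahead of time so that a takeover loses nothing, can be modelled in a few lines. This is a toy sketch: the class names and unit names are hypothetical, and the heartbeat is reduced to a boolean flag instead of a real heartbeat line.

```python
class Unit:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.sessions = {}  # session id -> chosen backend

class HaPair:
    def __init__(self, active, standby):
        self.active, self.standby = active, standby

    def record_session(self, session_id, backend):
        # Mirror every session to the standby as it is created.
        self.active.sessions[session_id] = backend
        self.standby.sessions[session_id] = backend

    def heartbeat(self):
        # If the active unit has stopped responding, the standby takes over.
        if not self.active.alive:
            self.active, self.standby = self.standby, self.active
        return self.active.name

    def lookup(self, session_id):
        return self.active.sessions.get(session_id)

pair = HaPair(Unit("bigip-a"), Unit("bigip-b"))
pair.record_session("s1", "192.168.1.11:80")
pair.active.alive = False  # simulate a failure of the active unit
print(pair.heartbeat(), pair.lookup("s1"))
```

The session is still found after the switch because it was copied before the failure, not during it.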

5. High security

BIG-IP follows firewall design principles and denies by default. It can add an extra layer of protection to any site against common network attacks. To secure the device itself, it supports remote management through command-line SSH or browser-based SSL; it tears down idle connections to resist denial-of-service attacks; it performs source route tracking to prevent IP spoofing; it rejects SYN packets that lack ACK buffer confirmation to prevent SYN attacks; it rejects teardrop and land attacks; it protects itself and the servers from ICMP attacks; and it does not run SMTP, FTP, TELNET, or other vulnerable background programs.

The Dynamic Reaping feature of BIG-IP efficiently deletes idle connections during network DoS attacks, protecting BIG-IP from being paralyzed by excessive traffic. BIG-IP speeds up the rate at which it drops connections as attack volume grows, giving a highly adaptive defense that can withstand large attack volumes.
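The principle of reaping idle connections faster as load rises can be sketched with an idle timeout that shrinks as the connection table fills. The thresholds, default values, and function names below are assumptions chosen for illustration, not BIG-IP settings.

```python
def idle_timeout(active, capacity, base=300.0, floor=5.0, low=0.85, high=0.95):
    """Seconds a connection may stay idle, shrinking as the table fills."""
    load = active / capacity
    if load <= low:
        return base
    if load >= high:
        return floor
    # Linear interpolation between base and floor inside the band.
    fraction = (load - low) / (high - low)
    return base - fraction * (base - floor)

def reap(connections, now, active, capacity):
    """Keep only connections whose idle time is within the current limit."""
    limit = idle_timeout(active, capacity)
    return {cid: last for cid, last in connections.items() if now - last <= limit}

# Connection id -> time of last activity (seconds); values are made up.
connections = {"a": 0.0, "b": 95.0}
print(reap(connections, now=100.0, active=50, capacity=100))
print(reap(connections, now=100.0, active=99, capacity=100))
```

Under light load both connections survive; when the table is nearly full, the long-idle connection is dropped while the recently active one is kept.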

BIG-IP's Delay Binding technology can provide comprehensive SYN Flood protection for servers deployed behind BIG-IP. At this time, the BIG-IP device acts as a security agent to effectively protect the entire network.

BIG-IP can cooperate with other security devices to build a dynamic security defense system. It can generate an access control list based on the number of connections each user makes per unit of time and load that list onto other security devices to control attack traffic effectively.
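Building a deny list from connections per unit of time can be sketched with a sliding window per client. The class name, thresholds, and addresses are hypothetical, and how such a list is pushed to other security devices is outside this sketch.

```python
from collections import defaultdict, deque

class ConnectionRateTracker:
    def __init__(self, max_conns, window_s):
        self.max_conns = max_conns
        self.window_s = window_s
        self.events = defaultdict(deque)  # client ip -> connection times

    def record(self, ip, now):
        """Note a new connection and drop entries older than the window."""
        times = self.events[ip]
        times.append(now)
        while times and now - times[0] > self.window_s:
            times.popleft()

    def deny_list(self):
        """Clients over the limit as of their most recent connection."""
        return sorted(ip for ip, times in self.events.items()
                      if len(times) > self.max_conns)

tracker = ConnectionRateTracker(max_conns=3, window_s=1.0)
for i in range(5):
    tracker.record("198.51.100.7", now=i * 0.1)  # 5 connections in 0.4 s
tracker.record("198.51.100.8", now=0.2)          # a single connection
print(tracker.deny_list())
```

Only the client that opened five connections inside the one-second window ends up on the list.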

6. SSL acceleration

Each BIG-IP has an SSL hardware acceleration chip and comes with a 100 TPS license. Users get 100 TPS of SSL acceleration without paying a separate fee, which saves investment. When the system is expanded later, upgrading the license is enough to obtain higher SSL acceleration performance.

7. System Management

BIG-IP provides multiple management methods such as HTTPS, SSH, Telnet, SNMP, etc. The user client only needs the browser software that comes with the operating system, and no other software is required. Remote management can be carried out conveniently and securely through SSH which supports command line or SSL which supports browser management. The intuitive and easy-to-use Web graphical user interface service reduces the implementation cost and daily maintenance cost of the multi-homing infrastructure.

BIG-IP includes detailed real-time reports and historical records reports, which can be used to evaluate site traffic, related ISP performance, and estimated bandwidth billing cycle. The administrator can fully grasp the utilization status of bandwidth resources through the comprehensive report function.

In addition, F5 provides the i-Control development kit. There is currently a domestic network management product, x-control, built on i-Control, which can tailor the monitoring system to the characteristics of the system's services and visually display service traffic, connections per service, access patterns, node health status, and so on.

Alarms can be delivered through syslog, SNMP trap, mail, and other methods.

8. Other

Memory expansion capability: F5 BIG-IP 1000 and above can be expanded to at most 2 GB of memory per unit, at which point a unit can support 4 million concurrent sessions.

Upgradability: all F5 equipment can be upgraded through software. During the service validity period, F5 provides the upgrade packages. F5 Networks has released BIG-IP V9.0, the latest version of its system, whose stated features include: virtualized IPv4/IPv6 applications, web application acceleration of up to 3 times, infrastructure cost reductions of 66% or more, guaranteed performance for high-priority applications, a higher level of availability, greatly improved network and application security, simple management, and strong adaptability and scalability. Its HTTP compression function is said to shorten user download time by 50% and save 80% of bandwidth.

IP address filtering and bandwidth control: BIG-IP can filter data packets according to the access control list, and perform bandwidth control for a key application to ensure the stable operation of key applications.

Configuration management and system reports: F5 BIG-IP provides WEB interface configuration mode and command line mode for configuration management, and provides a wealth of system reports, and can develop complex configurations and report generation through i-Control.

As a Linux/Unix system engineer, I have worked on external projects for the past few years, including the architecture of many small and medium-sized websites, and I have had a lot of contact with F5, LVS, and Nginx. I want to explain, in an easy-to-understand tone, what load balancing is and what a Linux cluster is, to help you get past the misconceptions and understand them properly. For project construction cases, please refer to my similar article on .

1. Today's website architecture is generally divided into a load balancing layer, a web layer, and a database layer. I usually add one more, a file server layer, because as a site's PV grows, the pressure on the file server grows with it; but with the maturity of MooseFS and DRBD+Heartbeat+NFS, this is no longer a big problem. The front-end load balancing layer of a site is called the Director; it distributes requests, most commonly by round robin (polling).

2. F5 implements load balancing in hardware. It is often used in CDN systems to load balance Squid reverse-proxy acceleration clusters. It is a professional hardware load balancer, especially suited to scenarios with high requirements for new connections per second and concurrent connections. LVS and Nginx are software implementations, but they are also quite stable and handle high concurrency well.

3. Nginx depends little on the network. In theory, as long as a backend can be pinged and web access to it works, Nginx can connect to it. Nginx can also distinguish between internal and external networks; a node with both internal and external interfaces effectively has a backup line on a single machine. LVS depends more on the network environment. At present, the result can be guaranteed when the servers are on the same network segment and LVS distributes traffic in direct routing mode.

4. The more mature high-availability technologies for load balancing today include LVS+Keepalived and Nginx+Keepalived. Nginx used to lack a mature dual-machine backup solution, although one could be built with shell script monitoring. If you are interested, please refer to my posts on 51cto. If you are considering high availability for Nginx load balancing, you can also implement it through DNS polling; see the related articles by Zhang Yan.

5. Cluster originally referred to the web cluster or Tomcat cluster behind the load balancer, but the term now refers to the entire system architecture, including the load balancers and the back-end application server clusters. Many people now use "Linux cluster" to mean LVS, but I think the two should be strictly distinguished.

6. High availability here refers to the HA of the load balancer itself: when one load balancer fails, the other takes over in less than 1 second. The most commonly used software is Keepalived and Heartbeat; mature load balancer solutions in production include LVS+Keepalived and Nginx+Keepalived.

7. LVS has many advantages: strong load capacity; stable operation (thanks to its mature HA solution); no traffic of its own; support for basically all applications. Because of these advantages, LVS has many fans. But nothing in this world is absolute: LVS depends too heavily on the network, and in application scenarios with a relatively complex network environment I have had to abandon it and choose Nginx.

8. Nginx depends little on the network, its regular-expression matching is powerful and flexible, its features have attracted many people, and it is convenient and simple to configure. It is basically my default for small and medium-sized projects; of course, if funds are sufficient, F5 is the best choice.

9. In a large-scale website architecture, F5, LVS, and Nginx can be used in combination; pick two or three of them. If F5 is ruled out for budget reasons, the front-end entry point of the site should be LVS; that is, DNS should point to the LVS balancer. LVS's strengths make it very well suited to this job. Important IP addresses, such as database IPs and web service server IPs, are best hosted on LVS. Such addresses come to be used more and more widely over time, and replacing them brings failures with it. Handing these important IPs to LVS is therefore the safest option.

10. The VIP is Keepalived's virtual IP. It is an external public IP and the IP that DNS points to; therefore, when designing the site architecture, you must apply to your IDC for one additional external IP.

11. In actual projects, both LVS and Nginx have handled HTTPS very well; LVS in particular is relatively easier to deal with.

12. Troubleshooting is convenient with both LVS+Keepalived and Nginx+Keepalived: if a system or server failure occurs, DNS can be pointed straight at one of the real web servers behind them as a quick short-term workaround. After all, for advertising and e-commerce sites, PV is money, which is why load balancing and high availability are designed in here. For large advertising sites, I recommend connecting directly to a CDN system.

13. Linux clusters are often mythologized these days. In fact they are not that complicated; the key is your application scenario. Nginx, LVS, and F5 are not myths: choose whichever is convenient and suitable.

14. Session sharing is another old question. Nginx can use the ip_hash mechanism to solve the session problem, while F5 and LVS both have session persistence mechanisms. Alternatively, sessions can be written to the database, which is also a good way to share sessions, though it increases the load on the database. The choice is up to the system architect.

15. The e-commerce site I currently maintain sees about 1000 concurrent connections, the securities information site I maintained before about 100, and the large online advertising site about 3000. I feel that concurrency at the web layer is less and less of a problem: with today's server performance plus Nginx's strong concurrency handling as the web server, the web layer copes well. On the contrary, the pressure on the file server layer and the database layer keeps growing, and NFS alone can no longer cope; the good solutions now are MooseFS and DRBD+Heartbeat+NFS. For MySQL, which I like, the mature solution is still master-slave; if the pressure is too great, I have to choose Oracle's RAC dual-machine solution.

16. Under Zhang Yan's influence, everyone is now taking up Nginx (especially as a web server). In fact, given good server performance and enough memory, Apache's ability to handle concurrency is not weak, and the bottleneck of the whole site is still likely to be the database. I suggest learning both Apache and Nginx: Nginx for load balancing at the front end and Apache as the web server at the back end works quite well.

17. Heartbeat's split-brain problem is not as serious as people imagine, and it can be considered for production use. DRBD+Heartbeat counts as a mature solution and is worth mastering. I have used this combination in place of EMC shared storage on quite a few occasions; after all, not every customer can accept a price of 300,000.

18. However mature the design, I recommend setting up a Nagios monitoring machine to watch the servers in real time; enable email and SMS alerts, since a mobile phone is always with you. If conditions permit, you can also buy a commercial website-scanning service that checks your site every minute and, if it finds the site is down, sends a warning to your email or calls you directly.

19. For website security, I recommend at least a hardware firewall; my preferred combination is a Huasai layer 3 firewall plus a Tiantai web firewall. DDoS protection must be in place; iptables and SELinux on the Linux servers themselves can be turned off. Of course, the fewer ports open, the better.
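The ip_hash idea from item 14, sending the same client to the same backend so that its session stays in one place, can be sketched as follows. Nginx's ip_hash keys on the first three octets of an IPv4 address; the CRC32 hash and the backend addresses here are stand-ins chosen for illustration, not Nginx's actual hash function.

```python
import zlib

BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical web servers

def pick_backend(client_ip, backends=BACKENDS):
    """Hash the first three octets of an IPv4 address to a backend."""
    key = ".".join(client_ip.split(".")[:3]).encode()
    return backends[zlib.crc32(key) % len(backends)]

# Two clients in the same /24 always land on the same backend.
print(pick_backend("203.0.113.5"), pick_backend("203.0.113.200"))
```

The trade-off is that many users behind one proxy or NAT address all hash to a single backend, which is one reason writing sessions to shared storage remains an alternative.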

Supplementary note: tests of website response time found that LVS+Keepalived and Nginx+Keepalived do not affect speed, so please don't worry about this. Nginx as a reverse-proxy accelerator is also maturing day by day.
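A response-time test of this kind, and the once-a-minute liveness scan from item 18, can be sketched as a small check that follows the Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL). The thresholds and function names are assumptions, and the URL to probe is left to the caller.

```python
import time
import urllib.request

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin exit codes

def classify(elapsed_s, warn_s=1.0, crit_s=3.0):
    """Map a response time (None means no response) to a Nagios status."""
    if elapsed_s is None or elapsed_s >= crit_s:
        return CRITICAL
    if elapsed_s >= warn_s:
        return WARNING
    return OK

def measure(url, timeout_s=5.0):
    """Return seconds taken to fetch url, or None on a network or HTTP error."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()
    except OSError:
        return None
    return time.monotonic() - start
```

A plugin script would end with something like `sys.exit(classify(measure(url)))`, and running the same check against a real server directly and then through the load balancer gives a rough before-and-after comparison.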