I worked through a tutorial project created by instructor Qi Yi to get hands-on experience with the popular components and tools used in high-volume internet architectures, and I finished it during the semester break. The project simulates an e-commerce website selling mother and baby products.
The main scenario is a time-limited offer, or flash sale, but much more intense. An example would be ‘10 half-price iPhone 10s at 10 AM’. This type of sale is quite popular in China and even has a name, "second KO", meaning sold out in seconds. Understandably, it attracts a high volume of requests at one specific moment. This project describes practical solutions for this type of scenario. The tech stack includes Spring Boot, FreeMarker, MyBatis, JMeter, MySQL, Redis, RabbitMQ, Nginx, etc.
The development environment, IDEA on Windows, is used directly for testing. All other components also run on Windows.
The first scenario of this project is to display the details of a product on a dynamic web page (URL: /goods?gid=xxx).
Tech stack used in this implementation includes:
LayUI, FreeMarker and MyBatis are relatively new to me; they are alternatives to Bootstrap/Semantic UI, Thymeleaf and JPA respectively. The first two are relatively easy to grasp, but MyBatis is more complex than JPA. It is more of a semi-automatic database approach, as opposed to the fully automatic one in JPA. MyBatis lets you write actual SQL statements in XML files that map database rows to Java objects, which definitely offers more flexibility. This might be why MyBatis sees more use, especially in complicated business environments.
MySQL is used as the database, and the data totals around 40 thousand records obtained by a web crawler. Product names, descriptions, prices and detailed specs are stored in several tables.
The second scenario is to allow the user to place an order for a product by pressing the Order button. The URL for this operation looks like /putOrder?gid=xxx&userId=yyy, where gid is the product ID and userId is the customer ID. The basic operation in this API is to check whether there is still inventory and, if so, create an order record and decrease the inventory.
Apache JMeter is a Java application for performance and load testing of web applications. It is quite easy to configure and use.
JMeter reports a throughput of around 266/second for the base scenario (localhost/goods?gid=xxx), meaning the server can handle about 266 requests per second.
As a single-threaded key-value store, Redis is a very important piece of the puzzle. In this project, Redis plays the following three roles.
For the first scenario, where database operations are mostly reads, caching is a perfect way to improve performance. Spring Boot provides several annotations, including @EnableCaching on the Application class and @Cacheable on the data access service, that can be used to enable caching in Redis without having to change the business code. For example, in the basic product display scenario, enabling caching increased the throughput from 266 to 541 requests per second.
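To make the idea behind the caching annotations concrete, here is a minimal sketch of the read-through pattern in plain Java. It is not the project's code and does not use Spring or Redis: the class name CachedGoodsLookup, the findGoods method and the loader function are all hypothetical, with a ConcurrentHashMap standing in for Redis and the loader standing in for the MySQL query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a cached data access service.
// The map plays the role of Redis; the loader plays the role of the database query.
class CachedGoodsLookup {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    private final Function<Long, String> loader;
    private final AtomicInteger loaderCalls = new AtomicInteger();

    CachedGoodsLookup(Function<Long, String> loader) {
        this.loader = loader;
    }

    // Return the cached value if present; otherwise load it once and remember it.
    String findGoods(long gid) {
        return cache.computeIfAbsent(gid, id -> {
            loaderCalls.incrementAndGet();
            return loader.apply(id);
        });
    }

    // How many times the (slow) loader was actually invoked.
    int loaderCalls() {
        return loaderCalls.get();
    }
}
```

Repeated calls to findGoods with the same gid reach the loader only once, which is why a read-heavy page benefits so much from caching.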
For the ordering scenario, the basic processing logic has two potential issues: high database load and overselling due to multithreading. Both can be addressed by Redis.
The process with Redis works as follows. A scheduled task monitors the promotion campaign. When the campaign starts, the task loads the campaign information (products and inventory) into a Redis List of products, and from then on operations such as ordering run entirely against Redis instead of the database. Each order request removes one product from the List until it is empty. Because Redis executes commands on a single thread, each removal is atomic, which keeps concurrent requests from overselling.
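The pop-until-empty idea can be sketched in plain Java. This is only an in-memory illustration under stated assumptions, not the project's implementation: the class FlashSaleStock and its methods are hypothetical, and a ConcurrentLinkedDeque stands in for the Redis List (in Redis itself the equivalent step would be a list pop command such as LPOP).

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the Redis List: one entry per unit in stock.
class FlashSaleStock {
    private final ConcurrentLinkedDeque<Long> units = new ConcurrentLinkedDeque<>();

    // Called when the campaign starts: push one token per unit of inventory.
    void load(long gid, int inventory) {
        for (int i = 0; i < inventory; i++) {
            units.addLast(gid);
        }
    }

    // Called per order request: atomically take one unit, or report sold out.
    Optional<Long> tryTake() {
        return Optional.ofNullable(units.pollFirst());
    }

    // Fire `requests` concurrent order attempts at `inventory` units
    // and return how many of them succeeded.
    static int simulate(int inventory, int requests) {
        FlashSaleStock stock = new FlashSaleStock();
        stock.load(1L, inventory);
        AtomicInteger sold = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                if (stock.tryTake().isPresent()) {
                    sold.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sold.get();
    }
}
```

Calling FlashSaleStock.simulate(10, 100) sends 100 concurrent requests at 10 units. Because each take is a single atomic pop rather than a separate check followed by a decrement, the number of successful orders cannot exceed the inventory.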
Another use of Redis in the project is to store user session data. Because multiple web servers may serve the requests, session information needs to be stored in a shared location instead of on individual servers. Enabling Redis session sharing in Spring Boot is also very easy: it only requires an annotation (@EnableRedisHttpSession) on the Spring Boot Application class instead of code changes.
When a time-consuming operation is involved, for example payment processing that depends on an external banking system, a message queue, which provides asynchronous processing, can be a good fit.
The project assumes that order creation/processing is costly and handles it using RabbitMQ. The whole ordering logic is as follows.
The product.ftl page sends a ‘/putOrder’ request using AJAX to the Controller, which generates an order UUID, pushes the order details as a message into the MQ and returns. A listener watching the MQ pulls the order detail message, does the actual processing and writes the order details into the DB. The frontend waits for a few seconds and then sends a ‘/checkorder’ request with the order UUID. The Controller queries the DB for the order. If it has already been created, the Controller returns orderCreated and the frontend shows order.ftl. Otherwise, it returns notYet, and the frontend shows wait.ftl and tries again later. This is the typical scenario where an MQ is used to take advantage of async processing: the waiting, if required, happens in the frontend (browser) instead of in the backend.
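The put, process and check steps can be sketched in plain Java as follows. This is an in-memory illustration only, not the project's code and not the RabbitMQ API: the class AsyncOrderFlow and its method names are hypothetical, a LinkedBlockingQueue stands in for the MQ, a map stands in for the order table, and the listener step is called by hand instead of running on its own thread so that the flow is easy to follow.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory stand-in: a BlockingQueue plays the MQ, a map plays the order table.
class AsyncOrderFlow {
    private final BlockingQueue<String[]> queue = new LinkedBlockingQueue<>();
    private final Map<String, String> orderTable = new ConcurrentHashMap<>();

    // '/putOrder': generate an order UUID, enqueue the details, return immediately.
    String putOrder(long gid, long userId) {
        String orderId = UUID.randomUUID().toString();
        queue.add(new String[] {orderId, "gid=" + gid + ",userId=" + userId});
        return orderId;
    }

    // Listener step: take one message (if any) and write the order to the "DB".
    // Returns false when the queue was empty.
    boolean processNext() {
        String[] message = queue.poll();
        if (message == null) {
            return false;
        }
        orderTable.put(message[0], message[1]);
        return true;
    }

    // '/checkorder': report whether the order record exists yet.
    String checkOrder(String orderId) {
        return orderTable.containsKey(orderId) ? "orderCreated" : "notYet";
    }
}
```

A caller that runs putOrder, then checkOrder, gets notYet until processNext has handled the message, after which checkOrder returns orderCreated. The request that placed the order never blocks on the costly processing.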
Spring Boot also has good support for RabbitMQ message producers and consumers.
Nginx is another popular component in high-volume scenarios, and it serves multiple functions. The first is load balancing: it sits in front of the multiple web servers that are typical in a high-volume scenario and forwards requests to them. The second is caching static content, such as CSS and JS files, which helps reduce the load on the web servers. The third is compressing static content, which reduces the bandwidth required. All of these functions are set up through Nginx configuration.
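The load-balancing role can be pictured as a rotation over the backend servers. The sketch below is plain Java rather than Nginx configuration, and it only illustrates round-robin, one common balancing strategy. The class RoundRobinBalancer and the server names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative round-robin selection over a fixed list of backend servers.
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        // Copy the list so later changes by the caller do not affect the rotation.
        this.servers = new ArrayList<>(servers);
    }

    // Pick the next server in rotation; floorMod keeps the index valid
    // even after the counter overflows to a negative value.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

With three servers named web1, web2 and web3, successive calls to pick() return web1, web2, web3 and then web1 again, so each web server receives roughly a third of the requests.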
A Content Delivery Network (CDN) is another common solution for web applications: some static libraries can be downloaded from CDN providers instead of from the mission-critical web servers. This helps reduce both load and bandwidth.
In this project, the cloud provider Ali is used as the CDN provider.
This project was very helpful for understanding how these popular web components and tools fit together to provide a reliable and scalable solution for the high-volume scenario. Several components, including RabbitMQ, Nginx and CDN, were new to me, and getting my hands dirty was the best way to learn them.