Soul's three major synchronization strategies for configuring cache for high-availability gateways

Preface

The gateway is the entry point for request traffic and plays a very important role in a microservice architecture, so its high availability matters a great deal. In day-to-day use, configuration such as flow control rules and routing rules often has to change to meet business needs. Dynamic configuration is therefore an important factor in keeping the gateway highly available. So how does Soul Gateway support dynamic configuration?

Those who have used Soul know that all Soul plugins are hot-swappable, and all plugin selectors and rules are dynamically configured and take effect immediately without restarting the service. However, in the process of using Soul Gateway, users also reported a lot of problems.

  • Soul depends on zookeeper, which is very troublesome for users whose registry center is etcd, consul, or nacos.
  • Soul depends on redis and influxdb. Users who have not enabled the rate-limiting plug-in or the monitoring plug-in ask why they need these at all.

Therefore, we partially refactored Soul and released version 2.0 after two months of iteration.

  • Data synchronization no longer depends strongly on zookeeper; http long polling and websocket were added as alternatives.
  • The rate-limiting plug-in and the monitoring plug-in are now truly dynamically configured: instead of yml configuration, users configure them in the admin background.

1. Some people may ask: why not use a configuration center for configuration synchronization?

Answer: First, introducing a configuration center adds a lot of extra cost, including operation and maintenance, and makes Soul very heavy. In addition, with a configuration center the data format is not under our control, which makes configuration management in soul-admin inconvenient.

2. Some people may ask: for dynamic configuration updates, isn't it enough to query the database or redis on every request? That always returns the latest data, so why go to so much trouble?

Answer: Soul is a gateway. To respond faster, all configuration is cached in Maps inside the JVM, and every request reads from this local cache, which is very fast. This article can therefore also be read as a description of three ways to synchronize in-memory data in a distributed environment.
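To make the idea of a JVM-local cache concrete, here is a minimal sketch. The class name `LocalRuleCache` and its methods are hypothetical and are not taken from Soul's source; the sketch only assumes that configuration is kept in a concurrent map that the synchronization layer writes and the request path reads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class for illustration only; not Soul's actual cache implementation.
final class LocalRuleCache {
    // Keyed by selector id; the value is the serialized rule data.
    private final Map<String, String> rules = new ConcurrentHashMap<>();

    // Called by the synchronization layer when soul-admin reports a change.
    void onUpdate(final String selectorId, final String ruleData) {
        rules.put(selectorId, ruleData);
    }

    // Called by the synchronization layer when a rule is deleted.
    void onDelete(final String selectorId) {
        rules.remove(selectorId);
    }

    // Called on the request path: a plain in-memory lookup, no database or redis round trip.
    String find(final String selectorId) {
        return rules.get(selectorId);
    }
}
```

The three strategies described in this article differ only in how `onUpdate` and `onDelete` get triggered; the lookup on the request path stays the same.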

Principle Analysis

First, a picture of the whole flow. The figure below shows the Soul data synchronization process. When the Soul gateway starts, it synchronizes configuration data from the configuration service, and it supports both push and pull modes for obtaining configuration changes and updating the local cache. The administrator changes the user, rule, plug-in, and traffic configuration in the management background, and the change information is synchronized to the Soul gateway in push or pull mode, depending on the configuration. The configuration synchronization module is in effect a simplified configuration center.

In version 1.x, the configuration service relies on zookeeper, and the management backend pushes the change information to the gateway. Version 2.x supports websocket, http, and zookeeper, and the synchronization strategy is selected through soul.sync.strategy. The default strategy, http long polling, synchronizes data within seconds. Note, however, that soul-web and soul-admin must use the same synchronization mechanism.

  • As shown in the figure below, after the user changes the configuration, soul-admin sends a configuration change notification through EventPublisher. EventDispatcher processes the notification and passes the configuration to the event handler that matches the configured synchronization strategy (http, websocket, zookeeper).
  • With the websocket synchronization strategy, the changed data is actively pushed to soul-web, where a corresponding WebsocketCacheHandler processes the data pushed by the admin.
  • With the zookeeper synchronization strategy, the changed data is written to zookeeper, and ZookeeperSyncCache watches the zookeeper data for changes and processes them.

With the http synchronization strategy, soul-web actively initiates a long polling request with a default timeout of 90 seconds. If soul-admin has no data changes, it holds the http request open. If data changes, it responds with the change information. If there is still no change after 60 seconds, it responds with empty data. After receiving the response, the gateway layer issues the next http request, and the cycle repeats.
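The dispatch step on the soul-admin side can be sketched roughly as follows. All names here (`SyncStrategy`, `ChangeHandler`, `ChangeDispatcher`) are hypothetical stand-ins for the roles played by EventDispatcher and its handlers; the sketch only assumes that exactly one strategy is active and that a change is forwarded to the handler registered for it.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical names; they mirror the roles described above, not Soul's real classes.
enum SyncStrategy { HTTP, WEBSOCKET, ZOOKEEPER }

interface ChangeHandler {
    void onChanged(String group, String payload);
}

final class ChangeDispatcher {
    private final Map<SyncStrategy, ChangeHandler> handlers = new EnumMap<>(SyncStrategy.class);
    private final SyncStrategy active;

    // The active strategy would come from configuration such as soul.sync.strategy.
    ChangeDispatcher(final SyncStrategy active) {
        this.active = active;
    }

    void register(final SyncStrategy strategy, final ChangeHandler handler) {
        handlers.put(strategy, handler);
    }

    // Forward the change only to the handler of the configured strategy.
    void dispatch(final String group, final String payload) {
        ChangeHandler handler = handlers.get(active);
        if (handler == null) {
            throw new IllegalStateException("no handler registered for " + active);
        }
        handler.onChanged(group, payload);
    }
}
```

With this shape, adding a fourth strategy would mean registering one more handler, without touching the code that publishes change events.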

Zookeeper synchronization

The zookeeper-based synchronization principle is very simple and relies mainly on zookeeper's watch mechanism. When soul-admin starts, it writes all the data to zookeeper. When data changes later, the zookeeper nodes are updated incrementally. Meanwhile, soul-web watches the nodes that hold the configuration, and as soon as something changes, it updates the local cache.

Soul writes configuration information to the Zookeeper node through careful design.

websocket synchronization

The websocket and zookeeper mechanisms are somewhat similar. When the gateway establishes a websocket connection with the admin, the admin pushes the full data once. If the configuration data changes later, the incremental data is actively pushed to soul-web over the websocket.

When using websocket synchronization, pay special attention to reconnecting after a disconnection, also known as keeping a heartbeat. Soul uses the third-party library java-websocket for its websocket connections.
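The difference between the initial full push and later incremental pushes can be sketched like this. The message kinds and the class `PushedCache` are assumptions made for illustration; the article does not show Soul's real wire format.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical message model; Soul's actual websocket payload is not shown in this article.
final class PushedCache {
    enum Kind { FULL, UPDATE, DELETE }

    private final Map<String, String> data = new ConcurrentHashMap<>();

    // FULL replaces everything (sent once after the connection is established);
    // UPDATE and DELETE touch only the entries named in the message.
    // Note: clear() followed by putAll() is not atomic, so a reader may briefly see
    // a partially filled map during a FULL push; a real implementation may need more care.
    void apply(final Kind kind, final Map<String, String> entries) {
        switch (kind) {
            case FULL:
                data.clear();
                data.putAll(entries);
                break;
            case UPDATE:
                data.putAll(entries);
                break;
            case DELETE:
                entries.keySet().forEach(data::remove);
                break;
            default:
                throw new IllegalArgumentException("unknown kind " + kind);
        }
    }

    String get(final String key) {
        return data.get(key);
    }

    int size() {
        return data.size();
    }
}
```

After a reconnect, the admin can simply send a FULL message again, which also repairs anything the gateway missed while the connection was down.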

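    import java.net.URI;
    import java.net.URISyntaxException;
    import java.util.concurrent.ScheduledThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;
    import org.java_websocket.client.WebSocketClient;
    import org.java_websocket.handshake.ServerHandshake;
    // Imports of Soul's own classes (SoulConfig, SoulThreadFactory, WebsocketCacheHandler) are omitted

    public class WebsocketSyncCache extends WebsocketCacheHandler {
        /**
         * The Client.
         */
        private WebSocketClient client;

        public WebsocketSyncCache(final SoulConfig.WebsocketConfig websocketConfig) {
            ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1,
                    SoulThreadFactory.create("websocket-connect", true));
            try {
                client = new WebSocketClient(new URI(websocketConfig.getUrl())) {
                    @Override
                    public void onOpen(final ServerHandshake serverHandshake) {
                        //....
                    }
                    @Override
                    public void onMessage(final String result) {
                        //....
                    }
                    // onClose and onError must also be overridden; omitted here
                };
                // Make the initial connection
                client.connectBlocking();
            } catch (URISyntaxException | InterruptedException e) {
                // Omitted: log the failure
            }
            // Use the scheduling thread pool to check every 30 seconds (first check after 10 seconds) and reconnect if the connection is closed
            executor.scheduleAtFixedRate(() -> {
                try {
                    if (client != null && client.isClosed()) {
                        client.reconnectBlocking();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, 10, 30, TimeUnit.SECONDS);
        }
    }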

http long polling

The data synchronization mechanism of zookeeper and websocket is relatively simple, while http synchronization is relatively complicated. Soul draws on the design ideas of Apollo and Nacos, extracts the essence, and implements the http long polling data synchronization function. Note that this is not the traditional ajax long polling!

As shown above, the soul-web gateway requests the admin's configuration service with a read timeout of 90s, which means the gateway layer waits up to 90s for the configuration service to respond. This gives the admin configuration service room to respond as soon as data changes, achieving quasi-real-time push.

After the http request reaches soul-admin, it does not respond immediately but uses the asynchronous mechanism of Servlet 3.0 to respond asynchronously. First, the long polling request task LongPollingClient is put into the BlockingQueue, and a scheduled task is started that runs after 60 seconds. Its purpose is to remove the long polling request from the queue after 60 seconds and respond even if no configuration data changed in the meantime: the gateway must be told that nothing changed rather than be left waiting. The 60-second hold also stays below the 90s read timeout that the gateway applies when requesting the configuration service.

    public void doLongPolling(final HttpServletRequest request, final HttpServletResponse response) {
        // soul-web may have missed an earlier change notification, in which case its MD5 values
        // differ from the server's; respond immediately in that case
        List<ConfigGroupEnum> changedGroup = compareMD5(request);
        String clientIp = getRemoteIp(request);
        if (CollectionUtils.isNotEmpty(changedGroup)) {
            this.generateResponse(response, changedGroup);
            return;
        }
        // Servlet 3.0: respond to the http request asynchronously
        final AsyncContext asyncContext = request.startAsync();
        // Disable the container's own async timeout; the scheduled task below enforces the 60-second limit
        asyncContext.setTimeout(0L);
        // Hold the request for at most 60 seconds
        scheduler.execute(new LongPollingClient(asyncContext, clientIp, 60));
    }

    class LongPollingClient implements Runnable {

        LongPollingClient(final AsyncContext ac, final String ip, final long timeoutTime) {
            // Omitted......
        }

        @Override
        public void run() {
            // Add a scheduled task. If there is no configuration change within timeoutTime seconds,
            // it runs after that delay and responds to the http request
            this.asyncTimeoutFuture = scheduler.schedule(() -> {
                // clients is a blocking queue that stores request information from soul-web
                clients.remove(LongPollingClient.this);
                List<ConfigGroupEnum> changedGroups = HttpLongPollingDataChangedListener.compareMD5((HttpServletRequest) asyncContext.getRequest());
                sendResponse(changedGroups);
            }, timeoutTime, TimeUnit.SECONDS); // timeoutTime is 60, so the unit must be seconds to match the 60-second hold
            // Queue this request so that a configuration change can answer it before the timeout
            clients.add(this);
        }
    }

If the administrator changes the configuration data during this period, the long polling requests in the queue are removed one by one and each is answered with a response that says which Group's data has changed (we divide plug-in, rule, traffic configuration, and user configuration data into different groups). After receiving the response, the gateway only knows which Group changed and has to request that Group's configuration data again. Some people may ask: why not write out the changed data directly? We discussed this in depth during development. The http long polling mechanism can only guarantee quasi-real-time delivery. If the gateway layer does not process a response in time, or the administrator updates the configuration frequently, a configuration change could easily be missed. To be safe, the response only tells the gateway which Group has changed.
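The per-group comparison behind a call such as compareMD5 can be sketched as follows. This is a simplified illustration under stated assumptions, not Soul's implementation: the helper `GroupDigest`, the group names, and the idea that the gateway sends one MD5 digest per group are all hypothetical here.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; group names and payload strings are placeholders.
final class GroupDigest {

    // Hex-encoded MD5 of the serialized configuration of one group.
    static String md5(final String json) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(json.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be available on every Java platform implementation
            throw new IllegalStateException(e);
        }
    }

    // Returns the groups whose server-side digest differs from the digest the gateway sent.
    static List<String> changedGroups(final Map<String, String> serverData,
                                      final Map<String, String> clientDigests) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> entry : serverData.entrySet()) {
            String current = md5(entry.getValue());
            if (!current.equals(clientDigests.get(entry.getKey()))) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

Because the gateway re-fetches a whole group whenever its digest differs, it converges to the latest state even if several changes happened between two polls.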

    // When soul-admin has a configuration change, remove the queued requests one by one and respond to each
    class DataChangeTask implements Runnable {

        // The group whose configuration changed
        private final ConfigGroupEnum groupKey;

        DataChangeTask(final ConfigGroupEnum groupKey) {
            this.groupKey = groupKey;
        }

        @Override
        public void run() {
            for (Iterator<LongPollingClient> iter = clients.iterator(); iter.hasNext();) {
                LongPollingClient client = iter.next();
                iter.remove();
                client.sendResponse(Collections.singletonList(groupKey));
            }
        }
    }

When the soul-web gateway layer receives the http response, it pulls the data of the changed Groups (if any), then requests the soul-admin configuration service again, and the cycle repeats.
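The gateway-side cycle can be sketched as a small loop. The class `LongPollingLoop` and the three injected functions are hypothetical; they stand in for the HTTP calls to soul-admin and for the local cache update, which this article does not show in code.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch: the injected functions stand in for HTTP calls to soul-admin.
final class LongPollingLoop {
    private final Supplier<List<String>> poll;            // blocks until a group changes or the server times out
    private final Function<String, String> fetchGroup;    // pulls the current data of one group
    private final BiConsumer<String, String> updateCache; // writes the data into the local cache

    LongPollingLoop(final Supplier<List<String>> poll,
                    final Function<String, String> fetchGroup,
                    final BiConsumer<String, String> updateCache) {
        this.poll = poll;
        this.fetchGroup = fetchGroup;
        this.updateCache = updateCache;
    }

    // One round of the cycle: wait for a response, then pull every group reported as changed.
    // Returns the number of groups refreshed; an empty response refreshes nothing.
    int pollOnce() {
        List<String> changed = poll.get();
        for (String group : changed) {
            updateCache.accept(group, fetchGroup.apply(group));
        }
        return changed.size();
    }

    // The gateway repeats the round until its thread is interrupted.
    void run() {
        while (!Thread.currentThread().isInterrupted()) {
            pollOnce();
        }
    }
}
```

Keeping the poll and the fetch as two separate steps matches the design choice described above: the poll response carries only group names, and the data itself always comes from a fresh read.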

Quick Use

  • Get soul-admin.jar
      wget https://yu199195.github.io/jar/soul-admin.jar
  • Start soul-admin.jar (the -D system properties must come before -jar, otherwise they are passed to the application as plain arguments)
      java -Dspring.datasource.url="your mysql url" \
           -Dspring.datasource.username="your username" \
           -Dspring.datasource.password="your password" \
           -jar soul-admin.jar
  • Visit http://localhost:8887/index.html (username: admin, password: 123456)
  • Get soul-bootstrap.jar
      wget https://yu199195.github.io/jar/soul-bootstrap.jar
  • Start soul-bootstrap.jar
      java -jar soul-bootstrap.jar

Repository Address

github: https://github.com/Dromara/soul

gitee: https://gitee.com/shuaiqiyu/soul
