Question

The company runs on Alibaba Cloud infrastructure and uses Akamai, an overseas provider, as the DNS resolution service for its domain names. Some of our applications are called by third-party applications, and some actively call third-party applications. Recently, many of these calls have failed, both application calls and GitLab pulls.

Troubleshooting

1. Perform loop packet capture on ECS and modify the resolv.conf configuration

Packet capture command:

tcpdump -i any -s 0 port 53 and host [domain name] -C 100 -W 50 -w /tmp/dns.pcap

Parameter description:

-i: the network interface to capture on. Use -i any to capture on all interfaces.

-s: the number of bytes to capture from each packet. Depending on the version, tcpdump may capture only the first 96 bytes by default; -s 0 captures the entire packet.

-C: file-size. Before writing a packet to the capture file, tcpdump checks whether the file already exceeds file-size. If it does, tcpdump closes that file and opens a new one. New files take the name given with -w plus a numeric suffix that starts at 1 and increases with each new file. The unit of file-size is millions of bytes (1,000,000 bytes, not 1,048,576 bytes), so the value here means 100 MB per file.

-W: used together with -C, this limits the number of files and overwrites the oldest ones, which gives a rotating (loop) capture. Here the capture is limited to 50 files.

-w: the path and name of the capture file. If only a file name is given, the file is written to the current working directory.
Default resolv.conf configuration:

# rotate makes the resolver select among the listed nameservers in round-robin order when resolving domain names.
options timeout:2 attempts:3 rotate single-request-reopen
nameserver 100.100.x.xxx
nameserver 100.100.x.xxx

2. Run the git pull command repeatedly and wait for the error to appear

3. Confirm the DNS egress IP address

Run the following command multiple times:

dig whoami.ds.akahelp.net txt +short

Output across the runs:

"ns" "106.xx.xxx.8"
"ns" "106.xx.xxx.8"
"ns" "106.xx.xxx.7"
"ns" "106.xx.xxx.6"
"ns" "106.xx.xxx.6"
"ns" "106.xx.xxx.7"
"ns" "106.xx.xxx.1"
"ns" "106.xx.xxx.6"
"ns" "106.xx.xxx.8"
"ns" "106.xx.xxx.6"
"ns" "106.xx.xxx.6"
"ns" "106.xx.xxx.7"

4. Check the cloud vendor's DNS servers to find the cause of the problem

The cause is as follows:

a. First, Alibaba Cloud's DNS service does not have a cache.
b. A user or application initiates domain name resolution.
c. If the Alibaba Cloud DNS server has the requested record and its TTL has not expired, the result is returned directly.
d. Otherwise, the Alibaba Cloud DNS server requests the record from Akamai overseas. Network fluctuations between China and overseas can cause some of these requests to fail.

Temporary solution

We used two temporary solutions until a long-term solution could be implemented:

1. For an A record, temporarily bind the address in the hosts file.
2. For a CNAME or other record type, use Alibaba Cloud's Private Zone intranet DNS resolution service as a temporary resolver.

Long-term solution

To solve this problem in the long term, we plan to move domain name resolution to Alibaba Cloud's cloud resolution service so that access from within China is reliable. The application also needs two changes:

1. Set up a retry mechanism for failed calls, for example, retry three times after a failure, with an interval of 3 seconds between attempts.
2. Set up a compensation mechanism for calls that still fail after the retries. The business owner needs to define the specific compensation rules.
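The two application-side changes above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in our applications: the names call_with_retry, call, compensate, and sleep are hypothetical, and the defaults of three retries at 3-second intervals simply mirror the example figures above. What the compensation step actually does is left to the business owner.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    compensate: Callable[[Exception], None],
    retries: int = 3,
    interval_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Run `call`; on failure retry up to `retries` times, then compensate.

    Returns the result of `call`, or None if every attempt failed.
    All names here are hypothetical and exist only for illustration.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")

    last_error: Optional[Exception] = None
    # One initial attempt plus `retries` retries.
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception as exc:  # in real code, narrow this to network/DNS errors
            last_error = exc
            if attempt < retries:
                sleep(interval_seconds)  # fixed pause before the next attempt

    # Every attempt failed: hand the last error to the business-defined rule.
    assert last_error is not None
    compensate(last_error)
    return None
```

A caller would wrap the outbound call in a zero-argument function and pass a compensation callback, for example one that records the failed request for later reprocessing. The sleep parameter is injectable only so the behavior can be exercised without real waiting.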