Operator (ISP) hijacking is a common rogue tactic. It targets people of all ages, children included. The hijackers brazenly pop up embarrassing ads in the lower right corner of a web page, or at its top or bottom, which leaves mothers tutoring their children with some awkward explaining to do.
1. Introduction

A classic interview question: what actually happens when you type a URL and press Enter? That depends on which URL you type. Taobao will hurt your hands, Baidu will hurt your body, and Tencent will hurt your kidneys...

2. Mysterious Return

It was a sunny day without a cloud in the sky. Latency had dropped below 50 ms. A good day for free-range crawlers. As usual, after a few quick operations, the data flowed into the database like a hundred rivers into the sea.

Just as I was about to make a cup of coffee and look at the long-lost sky, a long error message splashed across the screen like diarrhea!

Grass (a kind of plant)! Have I been found out? Let me take a quick look.

Grass (a powerful plant)! What is this? I expected an abnormal status code, or an error JSON, or at worst fake data. I did not expect the data format itself to change and a whole HTML page to be thrown at me. This interface clearly returns nothing but JSON.

I drank some water to calm down and ended up burning my mouth... Thinking it over, no product manager would come up with a requirement like this unless they had been drinking on an empty stomach. Besides, I am a tiny mosquito; nobody would bring out a cannon for me.

Something! Is! Wrong!

I quickly checked the logs and worked out the frequency: roughly one request in ten hit the exception. So I dumped the full HTML. Let's study it...
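The check I was doing by eye can be sketched in a few lines of Python. This is only a rough illustration, not the crawler's actual code: the function names `classify_response` and `hijack_rate`, the HTML markers and the sample payloads are all assumptions made up for the example.

```python
import json


def classify_response(content_type: str, body: bytes) -> str:
    """Classify a response from an endpoint that should return JSON.

    Returns "json" if the body parses as JSON, "html" if it looks like an
    injected HTML page, and "unknown" otherwise.
    """
    text = body.decode("utf-8", errors="replace").lstrip()
    try:
        json.loads(text)
        return "json"
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass

    head = text[:512].lower()
    # Hypothetical markers: an HTML content type or typical page/ad tags.
    looks_like_html = (
        "text/html" in content_type.lower()
        or head.startswith(("<!doctype", "<html"))
        or "<iframe" in head
        or "<script" in head
    )
    return "html" if looks_like_html else "unknown"


def hijack_rate(responses) -> float:
    """Fraction of (content_type, body) pairs that came back as HTML."""
    kinds = [classify_response(ct, body) for ct, body in responses]
    return kinds.count("html") / len(kinds) if kinds else 0.0
```

A crawler could log and retry every response classified as "html" rather than storing it, and a `hijack_rate` near 0.1 would match the one-in-ten frequency seen in the logs.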
3. The truth is revealed

Damn! The truth is out. This lousy code is definitely not aimed at crawlers. It is not even as good as gutter oil! It is most likely the work of the broadband operator. In the past, visits to Baidu would get wrapped in an iframe. But I never expected them to be so desperate this time that they would go after a JSON interface and make such a crude, heavy-handed modification!

It is like walking out of the airport, wanting a taxi to the tourist attraction, and ending up in an unlicensed cab that drags you off for dinner, a sauna and the full "big sword" package, then empties your wallet and slaps it on the ground!

Since it is confirmed to be an unlicensed cab, let me call up this silly fish directly and see the effect first...

Ha, ha, ha. As expected.

I endured the discomfort and went through the code, found the domain name, ran a whois to get the company name, then searched Baidu, Tianyancha and Qichacha... That's the one, that's the one.

4. The result?

Hijacking of this kind is usually impossible without the collusion of the broadband operator. My home currently has one "X letter" line and one "X mobile" line. After repeated testing, only the "X mobile" line shows the problem.

Then the matter is simple: complain to the Ministry of Industry and Information Technology! Coordinates: https://dxss.miit.gov.cn/

The storm has passed and the sky is clear again, but I am afraid this will not be the last time.

5. What should I do with my website?

I can understand that you do some hijacking; after all, that is the soil we grow in. But the hijacking bot you wrote is too stupid. It even breaks the JSON format, so how is a company supposed to keep running? Many services today have no web front end at all, so the hijacking program is due for an upgrade.

Once upon a time, our websites were all plain http, which is a hijacker's favorite. The way to deal with it is to move everything to https, which makes hijacking much harder and protects both your users and yourself.
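As a rough sketch of what "upgrading to https in an all-round way" can look like on the server side: redirect every plain-http request to its https twin, and send an HSTS header on https responses so browsers stop using http for the site. The helper name `https_redirect` and the `max-age` value are assumptions for illustration, and the sketch assumes default ports (80 and 443).

```python
from urllib.parse import urlsplit, urlunsplit

# Sent on HTTPS responses only; browsers ignore this header over plain HTTP.
# The one-year max-age is an illustrative choice, not a requirement.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000; includeSubDomains")


def https_redirect(url: str):
    """Return (status, headers) redirecting an http URL to https.

    Returns None when the URL is already https and needs no redirect.
    Assumes default ports, so the host part is reused unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        return None
    target = urlunsplit(
        ("https", parts.netloc, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, [("Location", target)]
```

The very first plain-http request can still be tampered with in transit. HSTS only helps after the browser has seen the header once over https, so clients that are not browsers, such as apps and crawlers, should simply be configured with https URLs from the start.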
About the author: Xiaojieweidao (xjjdog), a public account that keeps programmers from taking detours. Focused on infrastructure and Linux. Ten years of architecture work and hundreds of billions of daily requests; exploring the high-concurrency world with you and giving you a different taste. My personal WeChat is xjjdog0; feel free to add me for further discussion.