Preface

This article analyzes a security weakness in the new-style string formatting syntax introduced in Python, and presents a way to mitigate it. Calling str.format on untrusted user input is a security risk. I have known about this problem for a long time, but did not realize how serious it was until today: attackers can use it to bypass the Jinja2 sandbox, which can lead to serious information leakage. A safer version of str.format is provided at the end of this article.

To be clear, this is a fairly serious security risk, and the reason for writing about it is that most people probably do not know how easily it can be exploited.

Core Issues

Starting with Python 2.6, Python has had a new string formatting syntax inspired by .NET. Rust and some other programming languages support the same style of syntax. Through the .format() method, it is available on byte strings and unicode strings (in Python 3, only on unicode strings), and it also maps to the more customizable string.Formatter API.

One feature of this syntax is that a format string can address both positional and keyword arguments and can explicitly reorder them. It can also access the attributes and items of the objects passed in, and this is the root cause of the security issue discussed here.
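The features described above can be illustrated with a few short calls. This is only a sketch for Python 3; the variable names and sample values are made up for illustration.

```python
# Positional and keyword arguments can be referenced and reordered freely.
reordered = "{1} before {0}, then {name}".format("a", "b", name="c")
print(reordered)  # b before a, then c

# Item access: index into a mapping or sequence passed as an argument.
item = "{user[name]} has {scores[0]} points".format(
    user={"name": "alice"}, scores=[10, 20]
)
print(item)  # alice has 10 points

# Attribute access: this is the feature that becomes a security problem.
attr = "class of {0} is {0.__class__}".format(42)
print(attr)  # class of 42 is <class 'int'>
```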
Essentially, anyone who controls the format string can potentially access internal attributes of the objects passed to it.

What's the problem?

The first question is how an attacker gets control of the format string. Some likely places:

1. Untrusted translators working on string files. This is a realistic vector because many applications that are translated into multiple languages use new-style Python string formatting, but not everyone thoroughly reviews every string that comes in.
2. User-exposed configuration. Some systems let users configure certain behavior, and those settings may be exposed as format strings. I have seen web applications that let users configure notification emails, log message formats, or other basic templates this way.

Hazard Level

If you only pass C interpreter objects to the format string, the danger is limited, because the most you will expose is something like a few integer classes. Once Python objects are passed to the format string, however, things get tricky, because the amount that can be reached from a Python function is quite staggering. Consider a hypothetical web application that could leak a secret key in this way.
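A minimal sketch of what such an application might look like follows. All names here (CONFIG, Event, format_event) and the sample values are hypothetical, and the format string is assumed to come from user-editable configuration.

```python
# Hypothetical application code: every name here is illustrative.
CONFIG = {
    "SECRET_KEY": "super secret key",
}


class Event:
    def __init__(self, id, level, message):
        self.id = id
        self.level = level
        self.message = message


def format_event(format_string, event):
    # format_string is assumed to come from user-editable configuration.
    return format_string.format(event=event)


# Intended, harmless use: the user picks how events are rendered.
event = Event(1, "warning", "disk almost full")
rendered = format_event("[{event.level}] #{event.id}: {event.message}", event)
print(rendered)  # [warning] #1: disk almost full
```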
If a user can inject format_string here, they can use a crafted format string to read the secret.
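One way such a leak could look is sketched below. The scenario is restated in compact form so the snippet runs on its own; CONFIG, Event, and format_event are hypothetical names, and the snippet assumes it is run as a module-level script on Python 3.

```python
# Restated hypothetical scenario so the snippet runs on its own.
CONFIG = {"SECRET_KEY": "super secret key"}


class Event:
    def __init__(self, message):
        self.message = message


def format_event(format_string, event):
    return format_string.format(event=event)


# Attacker-controlled format string: from the event, reach its __init__
# method, then the globals of the module that defines it, then CONFIG.
malicious = "{event.__init__.__globals__[CONFIG][SECRET_KEY]}"
leaked = format_event(malicious, Event("hello"))
print(leaked)  # super secret key
```

Nothing in format_event refers to CONFIG directly. The format string reaches it purely through attribute and item access.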
Sandboxing Formatting

So what should you do if you need to let someone else supply the format string? You can use some undocumented internals to change how string formatting behaves.
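The following is a minimal sketch of that idea for Python 3, not a definitive implementation. It relies on the undocumented CPython helper _string.formatter_field_name_split, which may change between versions, and it assumes a simple policy of rejecting any attribute whose name starts with an underscore. The names SafeFormatter and safe_getattr are illustrative.

```python
import _string  # undocumented CPython internal; may change between versions
from string import Formatter


def safe_getattr(obj, attr):
    # Assumed policy: refuse any attribute that starts with an underscore.
    if attr.startswith("_"):
        raise AttributeError(attr)
    return getattr(obj, attr)


class SafeFormatter(Formatter):
    def get_field(self, field_name, args, kwargs):
        # Split e.g. "event.level[0]" into "event" plus its lookup steps.
        first, rest = _string.formatter_field_name_split(field_name)
        obj = self.get_value(first, args, kwargs)
        for is_attr, key in rest:
            if is_attr:
                obj = safe_getattr(obj, key)
            else:
                obj = obj[key]
        return obj, first


def safe_format(format_string, *args, **kwargs):
    return SafeFormatter().vformat(format_string, args, kwargs)


# Ordinary fields still work.
ok = safe_format("{0.real} and {name}", 42, name="x")
print(ok)  # 42 and x

# Underscore-prefixed attributes are rejected instead of being rendered.
try:
    safe_format("{0.__class__}", 42)
    blocked = False
except AttributeError:
    blocked = True
print(blocked)  # True
```

This only blocks underscore-prefixed attribute names. Public attributes and item access remain reachable, so the objects passed in still need to be chosen with care.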
Now, safe_format can be used in place of str.format.
Summary

This article analyzed a security weakness in the new-style string formatting syntax introduced in Python and provided a corresponding mitigation. Hopefully it will be helpful to readers.