Serverless Engineering Practice | Tips for Optimizing and Debugging Serverless Applications

Tips for debugging serverless applications

When an application does not behave as expected, whether during development or after it is finished, we need to debug it. Under the Serverless architecture, however, debugging is heavily constrained by the environment: an application may run correctly on a local machine and still hit unpredictable problems on the FaaS platform. In some cases the online environment cannot be simulated locally at all, which makes the project hard to develop and debug.

Debugging has long been a criticized weak point of Serverless applications, but cloud vendors continue to invest in it. For example, Alibaba Cloud Function Compute provides several debugging options, including online debugging and local debugging.

Online debugging

1. Simple debugging

Simple debugging means debugging in the console. Taking Alibaba Cloud Function Compute as an example, basic debugging can be performed in the console through the "Execute" button, as shown in the figure.

Function online simple debugging page

When necessary, we can also simulate some events by setting Event, as shown in the figure.

Simulate events by setting Event

The benefit of online debugging is that the code is tested against the real online environment. This matters when the online environment depends on resources such as a VPC, which are difficult to reproduce locally: for example, when a database must be accessed through the VPC, or when the business logic is driven by an object storage trigger.

2. Breakpoint debugging

In addition to simple debugging, some cloud vendors also support breakpoint debugging, such as the remote debugging offered by Alibaba Cloud Function Compute and by Tencent Cloud's cloud functions. Taking Alibaba Cloud Function Compute as an example, a function can be debugged online through the console. After creating a function, the user selects remote debugging and clicks the "Start Debugging" button, as shown in the figure.

Function online breakpoint debugging page (Part 1)

After turning on debugging, wait a moment and the system will enter the remote debugging interface, as shown in the figure.

Function online breakpoint debugging page (Part 2)

At this point you can do some breakpoint debugging, as shown in the figure.

Function online breakpoint debugging page (Part 3)

Local debugging

1. Command Line Tools

At present, most FaaS platforms provide users with relatively complete command line tools, including AWS's SAM CLI and Alibaba Cloud's Funcraft. There are also open source projects, such as Serverless Framework and Serverless Devs, that support multiple cloud vendors. Debugging code with a command line tool is straightforward. Here we use Serverless Devs to debug an Alibaba Cloud Function Compute project locally.

First, ensure that you have a Function Compute project locally, as shown in the figure.

Local Function Compute Project

Then execute debugging instructions under the project, such as debugging in Docker, as shown in the figure.

Command line tool to debug Function Compute

2. Editor plugin

Taking the VSCode plugin as an example, after installing the VSCode plugin for Alibaba Cloud Function Compute and configuring the account information, you can create a new function locally, set breakpoints, and debug it, as shown in the figure.

Debugging Function Compute with the VSCode plugin

After debugging is complete, the function can be deployed.

Other debugging solutions

1. Local debugging of web framework

When developing with a traditional Web framework on Alibaba Cloud's FaaS platform, taking the Python Bottle framework as an example, you can expose the application object with app = bottle.default_app() and make the run method conditional on if __name__ == '__main__':

    app = bottle.default_app()

    if __name__ == '__main__':
        bottle.run(host='localhost', port=8080, debug=True)

For example:

    # index.py
    import bottle

    @bottle.route('/hello/<name>')
    def index(name):
        return "Hello world"

    # Application object used as the entry point on the FaaS platform
    app = bottle.default_app()

    # Only start the local development server when run directly
    if __name__ == '__main__':
        bottle.run(host='localhost', port=8080, debug=True)

When deploying the application online, you only need to fill in index.app as the entry method to achieve a smooth deployment.

2. Local simulation event debugging

For functions that are not built on a Web framework, we can construct the event locally and call the handler directly, for example to debug an object storage trigger:

    import json

    def handler(event, context):
        print(event)

    def test():
        # Simulated object storage (OSS) trigger event
        event = {
            "events": [
                {
                    "eventName": "ObjectCreated:PutObject",
                    "eventSource": "acs:oss",
                    "eventTime": "2017-04-21T12:46:37.000Z",
                    "eventVersion": "1.0",
                    "oss": {
                        "bucket": {
                            "arn": "acs:oss:cn-shanghai:123456789:bucketname",
                            "name": "testbucket",
                            "ownerIdentity": "123456789",
                            "virtualBucket": ""
                        },
                        "object": {
                            "deltaSize": 122539,
                            "eTag": "688A7BF4F233DC9C88A80BF985AB7329",
                            "key": "image/a.jpg",
                            "size": 122539
                        },
                        "ossSchemaVersion": "1.0",
                        "ruleId": "9adac8e253828f4f7c0466d941fa3db81161****"
                    },
                    "region": "cn-shanghai",
                    "requestParameters": {"sourceIPAddress": "140.205.***.***"},
                    "responseElements": {"requestId": "58F9FF2D3DF792092E12044C"},
                    "userIdentity": {"principalId": "123456789"}
                }
            ]
        }
        # Pass the event to the handler as a serialized JSON string
        handler(json.dumps(event), None)

    if __name__ == "__main__":
        test()

In this way, by constructing an event object, simulated event triggering can be achieved.

Serverless Application Optimization

Resource assessment remains important

Although the Serverless architecture is pay-as-you-go, it is not necessarily cheaper than renting traditional servers. If the project is not evaluated accurately and some settings are configured unreasonably, the cost of the Serverless architecture can be very high.

Generally speaking, the charges of FaaS platforms are directly related to three indicators: the configured function specifications (such as the memory size), the time consumed by the program, and the traffic costs incurred. The time consumed depends on the memory specification and on the business logic the program executes. The traffic costs depend on the size of the data exchanged between the program and the client. Of these three indicators, the memory specification is the one most likely to cause a large billing deviation through misconfiguration. Taking Alibaba Cloud Function Compute as an example, assuming that a Hello World program is executed 10,000 times a day, the costs incurred by different memory specifications (excluding network costs) are shown in the table.

From the table, we can see that when the program can run normally in 128MB memory, if the memory specification is incorrectly set to 3072MB, the monthly fee may increase by 25 times! Therefore, before launching a Serverless application, you need to evaluate the resources so that you can further reduce costs with a more reasonable configuration.
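To see why the memory setting dominates the bill, the compute charge can be sketched as memory multiplied by execution time. The following is a minimal sketch, not the platform's actual billing formula: the unit price, the 0.1 second duration, and the function name are assumptions chosen for illustration, and free quotas and per-request fees are ignored.

```python
def monthly_compute_cost(memory_mb, duration_s, invocations_per_day,
                         price_per_gb_second, days=30):
    """Estimate the monthly compute charge from memory size and duration."""
    gb_seconds = (memory_mb / 1024) * duration_s * invocations_per_day * days
    return gb_seconds * price_per_gb_second


# Hypothetical unit price, for illustration only; check the vendor's price list.
ASSUMED_PRICE_PER_GB_SECOND = 0.000016

small = monthly_compute_cost(128, 0.1, 10_000, ASSUMED_PRICE_PER_GB_SECOND)
large = monthly_compute_cost(3072, 0.1, 10_000, ASSUMED_PRICE_PER_GB_SECOND)

# With identical duration, the charge scales linearly with the memory setting.
ratio = large / small
print(f"128 MB: {small:.4f}  3072 MB: {large:.4f}  ratio: {ratio:.0f}x")
```

Under these assumptions the compute component alone scales by 3072 / 128 = 24, which is the same order of magnitude as the difference read from the table.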

Reasonable code package specifications

Each cloud vendor's FaaS platform limits the size of code packages. Even setting those limits aside, the effect of the package size can be seen in the cold start process of a function, as shown in the figure.

Function cold start process diagram

During the cold start of a function, if the uploaded code package is too large, or contains so many files that decompression is slow, the code loading step takes longer and the cold start time grows with it.

Imagine two compressed code packages, one of 100KB and one of 200MB, both downloaded under ideal conditions over a gigabit intranet link (that is, ignoring disk write speed and similar factors). Even at the maximum speed of 125MB/s, the former downloads in less than 0.01 seconds while the latter takes 1.6 seconds. Adding the decompression time to the download time, the cold start times of the two may differ by 2 seconds. For traditional Web interfaces, a response time of more than 2 seconds is unacceptable for many businesses, so the compressed package should be kept as small as possible. Taking a Node.js project as an example, we can use tools such as Webpack to shrink the dependencies, reduce the overall package size, and improve the cold start efficiency of the function.
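The download times quoted above follow from dividing the package size by the bandwidth. A small sketch of that arithmetic, assuming the ideal 125MB/s link from the text and treating 100KB as roughly 0.1MB (the function name is our own):

```python
def download_seconds(package_mb, bandwidth_mb_per_s=125.0):
    """Ideal download time: package size divided by available bandwidth."""
    return package_mb / bandwidth_mb_per_s


small_package = download_seconds(0.1)   # about 100KB
large_package = download_seconds(200)   # 200MB

print(f"100KB package: {small_package:.4f} s")
print(f"200MB package: {large_package:.1f} s")
```

Decompression time is not modeled here; it comes on top of the download time and also grows with the package size and the number of files.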

Reuse instances sensibly

To mitigate cold starts and use resources more efficiently, the FaaS platforms of various cloud vendors reuse instances. Instance reuse means that after an instance finishes a request it is not released immediately but enters an idle state. If a new request is assigned to it within a certain time, the handler is called directly without re-initializing resources, which greatly reduces the number of cold starts. To verify this, we can create two functions:

Function 1:

    # -*- coding: utf-8 -*-
    def handler(event, context):
        print("Test")
        return 'hello world'

Function 2:

    # -*- coding: utf-8 -*-
    print("Test")
    def handler(event, context):
        return 'hello world'

Click the "Test" button in the console to invoke each function several times and check whether "Test" is output in the log. The statistical results are shown in the table.

Function reuse record

As you can see, instance reuse actually exists. If the print("Test") statement were instead initializing a database connection, or loading a deep learning model, function 1 would repeat that work on every request, while function 2 would reuse the existing object.

Therefore, in actual projects, some initialization operations can be implemented in the style of function 2, for example:

In machine learning scenarios, load the model during initialization to avoid loading the model every time the function is triggered.
Create a connection object during initialization to avoid creating a connection object for each request.
Download and load any other files needed on first load during initialization, so that reused instances do not repeat the work.
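The pattern in the list above can be sketched as follows. This is an illustration only: FakeConnection is a hypothetical stand-in for a database client or a loaded model, and the counter exists just to make the number of initializations visible.

```python
class FakeConnection:
    """Hypothetical stand-in for a database client or a loaded model."""
    instances_created = 0

    def __init__(self):
        # In a real project this would be the expensive step.
        FakeConnection.instances_created += 1

    def query(self, name):
        return f"hello {name}"


# Module-level code runs once per instance (on cold start), like function 2.
connection = FakeConnection()


def handler(event, context):
    # Reuses the object created during initialization.
    return connection.query(event)


def handler_without_reuse(event, context):
    # Like function 1: pays the initialization cost on every request.
    conn = FakeConnection()
    return conn.query(event)
```

Calling handler repeatedly on the same instance leaves instances_created unchanged, whereas each call to handler_without_reuse increases it by one.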

Make good use of platform features

The FaaS platforms of various cloud vendors each have their own features. Platform features here means capabilities that may not be specified or described in the CNCF WG-Serverless Whitepaper v1.0. They are capabilities that a cloud platform has identified and implemented from the user's perspective, based on its own business development and demands, and they may exist on only one or a few cloud platforms. Used properly, these features can significantly improve business performance.

1. PreFreeze & PreStop

Taking Alibaba Cloud Function Compute as an example, the following user pain points were identified during the development of the platform (especially ones that hinder the smooth migration of traditional applications to the Serverless architecture).

Asynchronous background metric data is delayed or lost: If the data is not sent successfully during a request, it may be delayed until the next request, or the data point may be discarded.
Sending metrics synchronously increases latency: If a Flush-style interface is called after each request, it not only increases the latency of each request but also puts unnecessary pressure on the backend service.
Graceful shutdown of function instances: When an instance is shut down, the application needs to clean up connections, stop processes, report status, and so on. In Function Compute, developers cannot know when an instance goes offline, and there is no Webhook to notify the function instance of the offline event.

Based on these pain points, Alibaba Cloud released the Runtime Extensions feature. This feature extends the existing HTTP service programming model by adding PreFreeze and PreStop Webhooks to the existing HTTP server model. Extension developers implement HTTP handlers that listen for function instance lifecycle events, as shown in the figure.

Simplified comparison of the extended programming model and the existing programming model

PreFreeze: Each time before the Function Compute service decides to freeze the current function instance, it calls the HTTP GET /prefreeze path. Extension developers are responsible for implementing the corresponding logic to ensure that necessary operations are completed before the instance is frozen, such as waiting for the metrics to be sent successfully, as shown in the figure. The time it takes to call InvokeFunction does not include the execution time of the PreFreeze Hook.

PreFreeze timing diagram

PreStop: Each time before the Function Compute service decides to stop the current function instance, it calls the HTTP GET /prestop path. Extension developers are responsible for implementing the corresponding logic to ensure that necessary operations are completed before the instance is released, such as waiting for database connections to be closed and reporting or updating status, as shown in the figure.

PreStop timing diagram
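A minimal sketch of what such handlers might look like, using only the Python standard library. The paths /prefreeze and /prestop follow the text above; the port, the in-memory metrics buffer, and all function names are assumptions for illustration, so the exact paths and wiring should be checked against the platform documentation.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical buffer of metrics collected during requests but not yet sent.
pending_metrics = []


def flush_metrics():
    """Stand-in for sending buffered metrics to a backend; returns the count."""
    sent = len(pending_metrics)
    pending_metrics.clear()
    return sent


def handle_lifecycle(path):
    """Map a lifecycle path to a (status, message) pair."""
    if path == "/prefreeze":
        # Finish outstanding work before the instance is frozen.
        return 200, f"flushed {flush_metrics()} metrics before freeze"
    if path == "/prestop":
        # Last chance to flush data and release resources before shutdown.
        return 200, f"flushed {flush_metrics()} metrics before stop"
    return 404, "not found"


class LifecycleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, message = handle_lifecycle(self.path)
        body = message.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=9000):
    """Build the HTTP server; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), LifecycleHandler)
```

Keeping the hook logic in a plain function separate from the HTTP handler makes it easy to exercise locally without starting a server.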

2. Single-instance multi-concurrency

The function services of cloud vendors are usually isolated at the request level. That is, when a client sends 3 requests to a function at the same time, 3 instances are in theory created to respond, which raises problems such as cold starts and sharing state between requests. For this reason, some cloud vendors (such as Alibaba Cloud Function Compute) provide single-instance multi-concurrency. This capability lets users set an instance concurrency (InstanceConcurrency) for the function, so that a single function instance can process multiple requests at the same time, as shown in the figure.

Simple diagram of single instance multi-concurrency effect

As shown in the figure above, assuming that there are 3 requests that need to be processed at the same time, when the instance concurrency is set to 1, Function Compute needs to create 3 instances to process these 3 requests, and each instance processes 1 request. When the instance concurrency is set to 10 (that is, 1 instance can process 10 requests at the same time), Function Compute only needs to create 1 instance to process these 3 requests.
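The relationship between concurrent requests and instance count in this example can be expressed as a ceiling division. The following is a simplified sketch with a function name of our own; real scheduling also depends on factors such as existing idle instances and scaling limits.

```python
import math


def instances_needed(concurrent_requests, instance_concurrency):
    """Minimum number of instances needed to serve the requests at once."""
    if instance_concurrency < 1:
        raise ValueError("instance_concurrency must be at least 1")
    return math.ceil(concurrent_requests / instance_concurrency)


print(instances_needed(3, 1))    # instance concurrency 1
print(instances_needed(3, 10))   # instance concurrency 10
```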

The advantages of single-instance multi-concurrency are as follows.

Reduce execution time and save costs. For example, I/O-bound functions can process requests concurrently within one instance, reducing the number of instances and thus the total execution time.
Requests can share state. Multiple requests can share the database connection pool in one instance, thus reducing the number of connections to the database.
Reduce the probability of cold starts. Since multiple requests can be processed within one instance, fewer new instances are created and cold starts become less likely.
Reduce the number of occupied VPC IPs. Under the same load, single-instance multi-concurrency reduces the total number of instances, thereby reducing the number of occupied VPC IPs.

Single-instance multi-concurrency applies to a fairly broad range of scenarios. For example, functions that spend most of their time waiting for responses from downstream services are a good fit. There are also scenarios where it is not suitable: when the function has shared state that cannot be accessed concurrently, or when a single request consumes a lot of CPU and memory resources.
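When several requests run inside one instance, any state they share must be safe for concurrent access. The sketch below simulates this locally with threads; how a platform actually dispatches concurrent requests depends on the runtime, and SharedCounter and simulate_concurrent_requests are hypothetical names used only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCounter:
    """Instance-level state shared by all requests, protected by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1
            return self.value


# Created once per instance and shared by every request it serves.
request_counter = SharedCounter()


def handler(event, context):
    request_counter.increment()
    return f"{event} handled"


def simulate_concurrent_requests(events, instance_concurrency=10):
    """Run the handler for several events concurrently within one process."""
    with ThreadPoolExecutor(max_workers=instance_concurrency) as pool:
        return list(pool.map(lambda e: handler(e, None), events))
```

If shared state cannot be protected this way, keeping the instance concurrency at 1 avoids the problem at the cost of more instances.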
