Baidu can't stand it

Baidu can't stand it

Lao Lao Noodles

Source: https://www.nowcoder.com/feed/main/detail/d39aabc0debd4dba810b4b9671d54348

Note: Get into the habit of looking at the real test first, answering it yourself, and then looking at the analysis reference (don’t forget to click three times at a time~)

1. Basic questions

  • How many network io models are there?
  • In what scenarios do you know of and have used the asynchronous network model? (Answered thread-related scenarios)
  • In addition to using threads, is there any other way to complete asynchronous operations?
  • How to implement synchronous blocking and synchronous non-blocking at the Java level? (The previous network IO model was answered smoothly, but the specific implementation details need to be improved)
  • Describe a complete HTTP request
  • How many ways to implement long connections do you know?
  • What parts does an HTTP request contain?

2. Code Questions

  • Design a HashSet (no idea at all)

3. Scenario Questions

  • How to load 1TB of data into 200MB of memory and find two rows of identical data?
  • Java opens a 1T file, what is the first operation?
  • What is the difference between opening a file with code and opening a file with the mouse?

Note: The blogger will not introduce the basic questions in detail, but only analyze the classic questions.

What network IO models do you know?

Common network IO models are as follows:

  1. Blocking IO Model: In this model, when a thread performs an IO operation, it will be blocked until the IO operation is completed.
  2. Non-Blocking IO Model: In this model, when a thread performs an IO operation, it will not block all the time, but will return immediately to tell the caller whether the IO operation is completed.
  3. Multiplexed IO Model: A thread can monitor multiple IO operations at the same time. When an IO operation is completed, it will notify the thread to process it.
  4. Signal Driven IO Model: In this model, when an IO operation is completed, the operating system sends a signal to the application, and the application processes it after receiving the signal.
  5. Asynchronous IO Model: After an application initiates an IO operation, it does not need to wait for the operation to complete, but can continue to perform other operations. When the IO operation is completed, the operating system will notify the application to process it.

In what scenarios have you used the asynchronous network model? Where can it be applied specifically?

  1. Highly concurrent Web applications: In Web applications, the asynchronous network model can improve the concurrent processing capabilities of the server, reduce the blocking waiting time of threads, and improve the throughput of the system.
  2. High-performance network server: In the network server, the asynchronous network model can improve the concurrent processing capability of the server, reduce the blocking waiting time of threads, and improve the throughput of the system.
  3. Large-scale real-time data processing system: In a real-time data processing system, the asynchronous network model can improve data processing efficiency, reduce data processing delay time, and improve the real-time performance of the system.
  4. Large-scale distributed systems: In distributed systems, the asynchronous network model can improve the system's concurrent processing capabilities, reduce thread blocking waiting time, and improve system throughput.

The asynchronous network model can be applied to any scenario that requires high concurrency, high performance, and high real-time performance to improve the performance and scalability of the system and enhance the user experience.

Can you give an example based on a specific business scenario?

Asynchronous network models are also very common in social and shopping scenarios. For example:

  1. Social applications: In social applications, asynchronous network models can be used to process user chat messages, dynamic updates and other requests, improving the real-time performance and performance of the system.
  2. Shopping websites: In shopping websites, asynchronous network models can be used to process user orders, payments, logistics and other requests, improving the system's concurrent processing capabilities and performance.

Let's take a concrete example, the often played King of Glory. (My personal opinion) It needs to handle a large number of game player requests, including login, registration, querying game data, game operations, etc. If a blocking IO model is used, each request needs to create a thread to process. When the number of concurrent requests is large, the creation and destruction of threads will bring a lot of overhead, resulting in a decrease in server performance and throughput.

If an asynchronous network model is used, requests can be processed in an event-driven manner. When a player request arrives, the server does not need to create a new thread, but processes the request through asynchronous IO operations. When the IO operation is completed, the server will call back the corresponding processing function for processing. This can greatly reduce the overhead of thread creation and destruction and improve server performance and throughput.

In addition, the asynchronous network model can also be applied to real-time data processing systems, such as financial trading systems, online advertising systems, etc. These systems need to process a large number of data requests in real time. If the blocking IO model is used, the data processing delay will be long, affecting the real-time performance of the system. The asynchronous network model can process data requests in real time in an event-driven manner, improving the real-time performance and performance of the system.

How can asynchronous operations be performed?

  1. Callback function: In Java, you can use callback functions to complete asynchronous operations, such as using Java callback interfaces or Lambda expressions to implement asynchronous callbacks.
  2. Future object: The Future object is an asynchronous programming solution in Java. It can encapsulate asynchronous operations into a Future object, and then use the Future.get() method to wait for the completion of the asynchronous operation, thereby realizing synchronous programming of asynchronous operations.
  3. CompletableFuture object: CompletableFuture is a new asynchronous programming solution in Java 8. It can encapsulate asynchronous operations into a CompletableFuture object, and then use CompletableFuture methods to process the results of asynchronous operations, such as thenApply(), thenAccept(), thenRun(), and other methods.
  4. Asynchronous framework: Some asynchronous frameworks can be used to implement asynchronous operations, such as Netty, Vert.x and other frameworks. They can implement asynchronous operations in an event-driven manner to improve the performance and scalability of the system.

In Java, synchronous blocking and synchronous non-blocking can be implemented through different IO models?

In Java, synchronous blocking and synchronous non-blocking can be implemented through different IO models.

  1. Synchronous blocking IO model: In Java, the synchronous blocking IO model is the most common IO model, which uses blocking IO classes such as InputStream and OutputStream to implement data reading and writing operations. In the synchronous blocking IO model, when a thread calls the read() or write() method of a blocking IO class, the thread will be blocked until the IO operation is completed or an exception occurs.
  2. Synchronous non-blocking IO model: In Java, the synchronous non-blocking IO model can be implemented by using Java NIO (New IO). Java NIO provides an IO model based on channels and buffers, which can implement non-blocking IO operations. In the synchronous non-blocking IO model, when a thread calls the read() or write() method of the Java NIO channel, the thread will not be blocked, but will return immediately, and then the status of the IO operation can be checked by polling, thereby implementing non-blocking IO operations.

Can you explain it with specific scenarios?

When it comes to high concurrency, high performance, and high reliability scenarios, it is very important to choose the right IO model. The following is an explanation based on specific scenarios:

  1. Web server: For Web servers, the synchronous blocking IO model is the most commonly used IO model because it can provide stable performance and reliability. In Java, you can use the Servlet API to implement the synchronous blocking IO model. If you need higher performance and scalability, you can consider using an asynchronous IO model, such as Java NIO or Netty framework.
  2. Game server: For game servers, it is necessary to handle a large number of concurrent connections and real-time data interaction, so the synchronous non-blocking IO model is a more suitable choice. In Java, you can use frameworks such as Java NIO or Netty to implement the synchronous non-blocking IO model.
  3. Database access: For database access, the synchronous blocking IO model is the most commonly used IO model because it can provide stable performance and reliability. In Java, you can use the JDBC API to implement the synchronous blocking IO model. If you need higher performance and scalability, you can consider using the asynchronous IO model, such as using an asynchronous database driver, such as HikariCP.

In addition to the synchronous blocking and synchronous non-blocking IO models, there are some other IO models, such as the asynchronous IO model, the multiplexed IO model, etc. In practical applications, the appropriate IO model should be selected according to specific scenarios and requirements.

Describe a complete HTTP request?

A complete HTTP request usually includes the following steps: (If the address request is initiated from a browser, various address resolutions are required~)

  1. Establishing a TCP connection: The client establishes a connection with the server through the TCP protocol and performs a "three-way handshake". The client sends a SYN packet, the server responds with a SYN+ACK packet, and the client responds with an ACK packet to complete the connection establishment.
  2. Send HTTP request: The client sends an HTTP request to the server, which contains a request line, a request header, and a request body. The request line includes the request method, request URL, and HTTP protocol version; the request header includes some additional information, such as request header fields, cookies, etc.; the request body contains the requested data, such as form data in a POST request.
  3. Server processing request: After receiving the HTTP request sent by the client, the server will process it according to the content of the request, such as querying the database, reading files, etc.
  4. The server returns an HTTP response: After processing the request, the server returns an HTTP response to the client. The response includes a response line, a response header, and a response body. The response line includes the HTTP protocol version, status code, and status description; the response header includes some additional information, such as response header fields, cookies, etc.; the response body contains the response data, such as HTML pages, JSON data, etc.
  5. Close TCP connection: After receiving the HTTP response from the server, the client will close the TCP connection and perform "four waves". The client sends a FIN packet, the server responds with an ACK packet, and then the server sends a FIN packet, the client responds with an ACK packet, and the connection is closed.

In short, a complete HTTP request includes establishing a TCP connection, sending an HTTP request, the server processing the request, the server returning an HTTP response, and closing the TCP connection. In practical applications, issues such as HTTP cache, cookies, and session management also need to be considered.

What are the ways to implement long connections?

  • A persistent connection means that the client and the server maintain a connection, allowing multiple requests and responses to be made within a certain period of time without having to re-establish a connection for each request.
  • Long connections can reduce the overhead of establishing and disconnecting connections and improve network transmission efficiency. They are often used in scenarios such as real-time communication and push services.

Common long connection implementation methods include:

  1. HTTP persistent connection: The HTTP/1.1 protocol supports persistent connection. The client and server can maintain a connection and make multiple requests and responses within a certain period of time. In an HTTP persistent connection, after the client sends a request, the server will maintain the connection until the client sends a request to close the connection or the timeout period is reached.
  2. WebSocket: WebSocket is a long connection technology based on the HTTP protocol. It can establish a two-way communication connection between the client and the server to achieve real-time communication and push services. The WebSocket protocol is implemented through the upgrade of the HTTP protocol. Data frames can be sent and received between the client and the server without having to re-establish a connection.
  3. TCP persistent connection: TCP protocol supports persistent connection, the client and server can maintain the connection state, and multiple requests and responses can be made within a certain period of time. In TCP persistent connection, after the connection between the client and the server is established, the connection state can be maintained until the client or server sends a request to close the connection or the network abnormality disconnects the connection.

Long connections can improve network transmission efficiency and are often used in real-time communications, push services, and other scenarios.

Design a Hashset?

A simple Hashset I designed casually (for reference only):

  1. Define a hash table array. The length of the array is a prime number. Each element is a linked list used to store elements with hash conflicts.
  2. Define a hash function to map an element to a position in the hash table array. Hash functions can be implemented using modulo operations or bitwise operations.
  3. Implement the method of adding elements. First, calculate the hash value of the element according to the hash function, and then add the element to the linked list at the corresponding position. If the same element already exists in the linked list, it will not be added.
  4. Implement a method to delete an element. First, calculate the hash value of the element according to the hash function, then search for the element in the linked list at the corresponding position, and delete it if found.
  5. Implement a method to find an element. First, calculate the hash value of the element according to the hash function, then search for the element in the linked list at the corresponding position, and return the element if found, otherwise return null.
  6. Implement a method to get the number of elements. Traverse the hash table array and count the number of elements in all linked lists.
  7. Implement a method to clear the hash table. Traverse the hash table array and clear all linked lists.

Here is a simple Java code implementation:

 public class MyHashSet<T> { private static final int DEFAULT_CAPACITY = 16; private static final float DEFAULT_LOAD_FACTOR = 0.75f; private Node<T>[] table; private int size; private int threshold; private float loadFactor; public MyHashSet() { this(DEFAULT_CAPACITY, DEFAULT_LOAD_FACTOR); } public MyHashSet(int initialCapacity, float loadFactor) { table = new Node[initialCapacity]; this.loadFactor = loadFactor; threshold = (int) (initialCapacity * loadFactor); } public boolean add(T value) { int hash = hash(value); int index = indexFor(hash, table.length); Node<T> node = table[index]; while (node != null) { if (node.value.equals(value)) { return false; } node = node.next; } Node<T> newNode = new Node<>(value, table[index]); table[index] = newNode; size++; if (size > threshold) { resize(table.length * 2); } return true; } public boolean remove(T value) { int hash = hash(value); int index = indexFor(hash, table.length); Node<T> node = table[index]; Node<T> prev = null; while (node != null) { if (node.value.equals(value)) { if (prev == null) { table[index] = node.next; } else { prev.next = node.next; } size--; return true; } prev = node; node = node.next; } return false; } public boolean contains(T value) { int hash = hash(value); int index = indexFor(hash, table.length); Node<T> node = table[index]; while (node != null) { if (node.value.equals(value)) { return true; } node = node.next; } return false; } public int size() { return size; } public void clear() { Arrays.fill(table, null); size = 0; } private int hash(T value) { return value.hashCode(); } private int indexFor(int hash, int length) { return hash & (length - 1); } private void resize(int newCapacity) { Node<T>[] newTable = new Node[newCapacity]; for (Node<T> node : table) { while (node != null) { Node<T> next = node.next; int index = indexFor(hash(node.value), newCapacity); node.next = newTable[index]; newTable[index] = node; node = next; } } table = newTable; threshold = (int) (newCapacity * loadFactor); } private static class Node<T> { T value; Node<T> next; public Node(T value, Node<T> next) { this.value = value; this.next = next; } } }

How to load 1TB of data into 200MB of memory and find two rows of identical data?

It is impossible to load 1T of data into 200M of memory, because 1T of data far exceeds the memory size of 200M. Therefore, some special algorithms and techniques are needed to solve this problem.

One solution is to use an external sorting algorithm to divide 1TB of data into multiple small files, each of which can be loaded into memory for sorting. Then, use the idea of ​​merge sort to merge these small files into a large file and find two identical rows of data during the merge process.

The specific steps are as follows (reference):

  1. Divide 1T of data into multiple small files, each with a size of 200M
  2. To sort each small file, you can use algorithms such as quick sort.
  3. To merge the sorted small files into a large file, you can use the idea of ​​merge sort.
  4. During the merging process, the last line of the previous file and the first line of the current file are recorded, and the two lines are compared to see if they are the same. If they are the same, two lines of identical data are found.
  5. Finally, just output the two rows of identical data.

In actual operation, factors such as disk reading and writing speed and file reading and writing methods also need to be considered to improve the efficiency and accuracy of the algorithm.

Java opens a 1T file, what is the first step?

To open a 1T file in Java, the first step should be to determine how and where to read the file.

  1. Determine how to read the file: Choose the appropriate file reading method based on the file type and size. If the file is a text file, you can use BufferedReader to read it line by line; if the file is a binary file, you can use DataInputStream or FileChannel to read it.
  2. Determine the file reading range: Since a 1T file is very large and cannot be read into the memory at one time, it is necessary to determine the reading range. The file can be divided into multiple blocks, and the data of one block is read each time. After processing, the data of the next block can be read. The block size can be determined based on the file size and the memory size.

Is there any difference between opening a file with code and opening it with the mouse?

The underlying difference lies mainly in the way the operating system and file system interact.

Opening a file with a mouse is achieved through the graphical user interface (GUI) provided by the operating system. The user clicks the icon, but the actual operating system will call the corresponding API according to the user's operation to achieve file opening, reading, writing and other operations. These APIs are actually usually the underlying file system interface provided by the operating system, such as Windows' Win32 API, Linux's POSIX API, etc.

Opening files with code is done through the file operation API provided by the programming language. These APIs are usually encapsulation and abstraction of the underlying file system interface of the operating system. Usually, you can use File, FileInputStream, FileOutputStream and other classes to implement file opening, reading, writing and other operations. These classes will call the underlying operating system file system interface to implement the corresponding functions.

Therefore, from a low-level perspective, the difference between opening a file with code and opening a file with the mouse is that the APIs called are different, but the underlying file system interface is the same.

What parts does an HTTP request contain?

  1. Request Line: Contains the request method, requested URL, and HTTP protocol version. Common request methods include GET, POST, PUT, DELETE, etc.
  2. Request Headers: Contains various header information of the request, such as User-Agent, Content-Type, Cookie, etc. The header information provides additional information about the request for the server to process the request.
  3. Request Body: For GET requests, the request body is usually empty. For requests that need to pass data, such as POST requests, the request body contains the data to be sent to the server.

<<:  Learn dynamic routing protocols from scratch and never get lost in network routing again!

>>:  IoT and 5G: Transforming Public Transportation Systems

Recommend

VLAN Centralized Management Protocol (VCMP) You should know

In production environments, we often configure VL...

TCP three-way handshake and four-way wave

TCP (Transmission Control Protocol) is a connecti...

2021 is the year of the explosion of the cellular IoT module industry

According to a new research report from IoT analy...

A brief discussion on common tunneling technologies: IPSec

IPSec is not a single protocol, but a set of netw...

New report identifies progress and benefits across the 5G network lifecycle

Infovista welcomes TM Forum’s new industry survey...