When developing online documents, have you solved this technical difficulty?

"Times make heroes" is an eternal truth. In the current era, online documents can be called such "heroes".

The rapid development of the new generation of information technology has profoundly affected our work and lifestyle. In recent years, remote work has completely overturned the traditional enterprise management model, and online documents, as an important part of remote work software, have also ushered in rapid development.

Even though the market already offers online office products such as Tencent Docs, Shimo Docs, Feishu, Yuque and Lingxi Docs, online documents still face many challenges in functionality, technology, data security, services and ecosystem. Examples include data processing efficiency, multi-person collaboration, secondary development, system integration and framework compatibility.

From a technical perspective, online access, data processing, and multi-person collaboration are the most critical capabilities of an online document system. Online access and data processing both have relatively mature solutions and are not difficult to implement. Multi-person collaboration is therefore the core factor that determines the usability of an online document system.

What is multi-person collaboration?

Multi-person collaboration means that multiple people can edit the same document at the same time, and users can see the changes made by others without refreshing. Google Docs, Tencent Docs, Shimo Docs, Quip, etc. all have multi-person collaboration functions.

So, how is multi-person collaboration achieved?

If any information is to be edited and displayed by multiple people in real time, the following three steps need to be implemented:

  • Operationalization
  • Transmission
  • Restoration

These three steps are similar to the encoding and decoding process: first, the information is converted into a set of operations, then the operations are transmitted to other terminals through the network, and finally the operations are restored to information at the local terminal.
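As a rough illustration of this encode, transmit, decode cycle, here is a minimal Python sketch. It assumes plain text and a JSON wire format; the function names `to_operations` and `apply_operations` and the retain/insert/delete vocabulary are choices made for this example, not the API of any product mentioned in this article.

```python
import json
from difflib import SequenceMatcher


def to_operations(old: str, new: str) -> list:
    """Step 1, operationalize: describe the change old -> new as a list of steps."""
    ops = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["retain", i2 - i1])      # keep this many characters
        if tag in ("delete", "replace"):
            ops.append(["delete", old[i1:i2]])   # remove this exact text
        if tag in ("insert", "replace"):
            ops.append(["insert", new[j1:j2]])   # add this text
    return ops


def apply_operations(text: str, ops: list) -> str:
    """Step 3, restore: replay the operations on the receiver's copy."""
    out, pos = [], 0
    for kind, arg in ops:
        if kind == "retain":
            out.append(text[pos:pos + arg])
            pos += arg
        elif kind == "delete":
            if text[pos:pos + len(arg)] != arg:
                raise ValueError("operation does not match the local document")
            pos += len(arg)
        elif kind == "insert":
            out.append(arg)
    out.append(text[pos:])
    return "".join(out)


old = "Hello world"
new = "Hello brave world"
wire = json.dumps(to_operations(old, new))            # step 2: what goes over the network
restored = apply_operations(old, json.loads(wire))    # the remote copy after replay
print(restored == new)
```

The granularity question raised below shows up here directly: a character-level diff yields many small operations, while a paragraph-level diff yields fewer but coarser ones.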

These steps may seem simple, but each step involves a lot of details. For example, during the operationalization process, when segmenting and combining information, how can we ensure that all changes in information can be decomposed into a set of operations? How can we make the operations cover all changes in information? How can we determine the granularity of segmentation?

The following points need to be considered for transmission:

1. Transmission content

  • Original text: clarity, redundancy
  • Compression technology: logical compression, protocol compression, manual compression

2. Network protocol

  • Socket: TCP, UDP
  • HTTP
  • WebSocket

3. QoS (Quality of Service)

  • Fail fast
  • Automatic rollback
  • Automatic reconnection
  • Automatic recovery
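To make the compression items above more concrete, the following Python sketch combines one possible form of logical compression (folding consecutive inserts into a single operation) with protocol-level compression using the standard `zlib` module. The operation format and the helper names `merge_adjacent_inserts`, `pack` and `unpack` are assumptions made for this example.

```python
import json
import zlib


def merge_adjacent_inserts(ops):
    """Logical compression: fold runs of consecutive inserts into one operation."""
    merged = []
    for kind, arg in ops:
        if kind == "insert" and merged and merged[-1][0] == "insert":
            merged[-1] = ("insert", merged[-1][1] + arg)
        else:
            merged.append((kind, arg))
    return merged


def pack(ops) -> bytes:
    """Protocol compression: compact JSON, then deflate with zlib."""
    raw = json.dumps(ops, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)


def unpack(payload: bytes):
    """Inverse of pack: inflate, parse, and restore the tuple form."""
    return [tuple(op) for op in json.loads(zlib.decompress(payload).decode("utf-8"))]


# One keystroke per operation, as an editor might record them.
typed = [("retain", 10)] + [("insert", ch) for ch in "collaboration"]
compact = merge_adjacent_inserts(typed)
print(len(typed), "operations reduced to", len(compact))
print(unpack(pack(compact)) == compact)
```

Merging before sending reduces the number of operations the receiver has to replay, while `zlib` only reduces bytes on the wire; the two are complementary.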

Restoration mainly involves:

1. Restoring absolute operations

  • Control the data volume
  • Show reasonable prompts

2. Restoring relative operations

  • Strict ordering
  • Guarantee order at the source
  • Remedies for out-of-order delivery

3. Restoring local operations

  • Filter the received operation set
  • Refine the operation granularity at the source
  • Save locally and execute locally

4. Non-intrusive restoration

  • Define what counts as intrusion
  • Exclude intrusion
  • A personalized view for each user
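The ordering and filtering points above can be sketched as a small receive buffer. This Python example assumes that a central server stamps every operation with a gap-free sequence number (one way to "guarantee order at the source"); the class name `OrderedInbox` and its fields are hypothetical.

```python
class OrderedInbox:
    """Replays remote operations strictly in sequence order."""

    def __init__(self, local_site: str):
        self.local_site = local_site
        self.next_seq = 1      # next sequence number we are allowed to apply
        self.pending = {}      # seq -> (site, op) for operations that arrived early
        self.applied = []      # operations replayed on the local document, in order

    def receive(self, seq: int, site: str, op: str) -> None:
        if seq < self.next_seq or seq in self.pending:
            return             # duplicate delivery, ignore it
        self.pending[seq] = (site, op)
        # Drain every operation that is now contiguous with what we applied.
        while self.next_seq in self.pending:
            origin, ready = self.pending.pop(self.next_seq)
            if origin != self.local_site:
                # Our own operations were already executed locally, so filter them.
                self.applied.append(ready)
            self.next_seq += 1


inbox = OrderedInbox(local_site="alice")
inbox.receive(2, "bob", "insert B")      # arrives early, held back
inbox.receive(3, "alice", "insert A")    # echo of our own edit
inbox.receive(1, "carol", "insert C")    # fills the gap, releases 1..3
print(inbox.applied)
```

Holding back early arrivals is the "remedy" for out-of-order delivery here; a real system would also need a timeout and a resend request for sequence numbers that never arrive.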

After understanding the basic principles of multi-person collaboration, let's study its technical difficulties.

What are the technical difficulties in multi-person collaboration?

In essence, multi-person collaboration is multi-leader replication in a distributed system: every client can be regarded as a data leader, and synchronizing data between these leaders inevitably runs into ordering and conflict problems. This is the main difficulty of multi-person collaboration.

There are three ways to handle conflicts in multi-leader replication:

  • Avoid conflicts, that is, do not allow multiple users to edit the same place at the same time. This approach is simple and blunt, so check whether it suits the product before adopting it.
  • Expose the conflict to the user and let the user resolve it. Most professional version control software currently works this way, but it is not suitable for products with a large number of non-professional users, such as online documents.
  • Give every write operation a global index, which can be a timestamp or a sequence number. The index must be global and increasing. In any conflict, the write with the higher index wins. The advantage is that conflict resolution is fully automatic and needs no user intervention. The disadvantage is that if the synchronization interval is long, a lot of user input will be lost.
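The third approach is often called last-write-wins. A minimal Python sketch, assuming a single value per paragraph and integer indexes handed out by some global source (the class name `LwwRegister` is illustrative), shows both the automatic resolution and the lost input:

```python
from dataclasses import dataclass


@dataclass
class LwwRegister:
    """Keeps only the write that carries the highest global index."""
    value: str = ""
    index: int = 0

    def write(self, value: str, index: int) -> bool:
        if index > self.index:
            self.value, self.index = value, index
            return True
        return False   # an older write, silently discarded


# Two replicas receive the same pair of concurrent writes in different orders.
replica_1, replica_2 = LwwRegister(), LwwRegister()
replica_1.write("Alice rewrote the whole paragraph", 7)
replica_1.write("Bob fixed one typo", 8)
replica_2.write("Bob fixed one typo", 8)
replica_2.write("Alice rewrote the whole paragraph", 7)

print(replica_1.value == replica_2.value)   # both replicas agree
print(replica_1.value)                      # but Alice's edit is gone
```

The replicas converge without asking anyone, yet everything Alice typed since the last synchronization is dropped, which is the drawback described above.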

In practice, Operational Transformation (OT) is a commonly used technique for resolving multi-person collaboration conflicts in online document systems. The technique dates back to 1989. It expresses every change to the text as a combination of the following three operations, with the goal of giving users eventual consistency:

  • retain(n): retain n characters
  • insert(str): insert the string str
  • delete(str): delete the string str

Once changes are expressed as these operations, the OT algorithm merges and transforms concurrent operations into a new operation stream and applies it to the previous version, achieving lock-free simultaneous editing.
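The transformation step can be illustrated with a deliberately small Python sketch. To keep it short it uses position-based operations (insert a string at a position, delete one character at a position) rather than the retain/insert/delete form listed above, and it breaks ties between inserts at the same position with a `site` number. These simplifications, and the names `Op`, `transform` and `converge`, are assumptions of this example; production OT implementations handle many more cases.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Op:
    kind: str        # "insert" or "delete"
    pos: int         # character position in the document
    text: str = ""   # inserted text; unused for delete, which removes one character
    site: int = 0    # tie-breaker when two inserts target the same position


def apply(doc: str, op: Optional[Op]) -> str:
    if op is None:                       # the operation was cancelled by transform
        return doc
    if op.kind == "insert":
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(a: Op, b: Op) -> Optional[Op]:
    """Rewrite a so that it can be applied after the concurrent operation b."""
    if b.kind == "insert":
        if a.kind == "insert":
            shifts = b.pos < a.pos or (b.pos == a.pos and b.site < a.site)
        else:
            shifts = b.pos <= a.pos
        return replace(a, pos=a.pos + len(b.text)) if shifts else a
    # b deleted the character at b.pos
    if a.kind == "delete" and a.pos == b.pos:
        return None                      # both users deleted the same character
    if b.pos < a.pos:
        return replace(a, pos=a.pos - 1)
    return a


def converge(doc: str, a: Op, b: Op):
    """Apply a and b in both orders, transforming whichever arrives second."""
    left = apply(apply(doc, a), transform(b, a))
    right = apply(apply(doc, b), transform(a, b))
    return left, right


print(converge("abc", Op("insert", 1, "X", site=1), Op("insert", 1, "Y", site=2)))
print(converge("abc", Op("delete", 1, site=1), Op("insert", 1, "Z", site=2)))
```

In both calls the two orders produce the same document, which is the convergence property that lets each client apply its own edits immediately without taking a lock.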

(Operational transformation process in OT algorithm technology)

The idea behind the OT algorithm is actually very simple: transform operations appropriately under specific conditions. OT is mainly used for text, and its implementations are usually complex and hard to extend. For more advanced structures such as rich text editing, OT accepts that complexity in exchange for behavior that matches user expectations, without much negative impact on system performance. As a result, most real-time collaborative editing logic today is implemented on top of the OT algorithm.

For this reason, the OT algorithm has become one of the most important approaches to collaborative conflict handling. However, even after more than 30 years, and with a rich body of theory on its control algorithms, it still does not handle distributed implementations well, and developing a system that supports real-time collaborative editing by multiple people is far more complicated than one might imagine.

Where is the breakthrough in achieving multi-person collaboration?

Algorithm logic alone is clearly not enough to build a complex multi-person real-time collaborative editing system. Considerable R&D cost and time must also be invested for each business scenario (project dashboards, plain text editing, undo/redo, and so on), and the balance between product performance and ease of use has to be found through continuous exploration.

So, is there a simpler and faster solution?

By analyzing the sample code of many online collaborative office products on the market, we found that, in addition to the OT algorithm described above, these products generally rely on third-party table components. With an embedded component, an online document system can support eventual consistency for multi-person collaboration, give users a friendlier and more varied experience, reduce R&D cost, handle more complex calculations, and greatly improve the efficiency of multi-person collaboration.

What functions does a table component for multi-person collaboration need to have?

First, it needs strong support for table functionality.

Because tables are far more sensitive to numerical values than other data types, they allow finer operation granularity and more complex calculations when used as multi-person collaborative documents. The selected component must therefore have strong table support: not only data entry and data reporting, but also statistics, calculated summaries, pivot analysis and charting.

Second, it needs an open API to support more customization.

Such a component needs to provide a rich set of events and APIs to control logic such as cell state, sheet protection and data transmission. For multi-person collaboration, it also needs features such as preventing users from editing the same content at the same time and attaching timestamps (serialization) to edits.
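As a generic illustration of "preventing users from editing the same content" with timestamps, here is a small Python sketch of a cell-level lock table. It is not based on the API of SpreadJS or any other component; the class `CellLocks`, its timeout, and the explicit `now` argument (used instead of the system clock so the behavior is easy to follow) are all assumptions of this example.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]   # (row, column)


class CellLocks:
    """Grants each cell to one editor at a time; stale locks expire."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self.locks: Dict[Cell, Tuple[str, float]] = {}   # cell -> (user, timestamp)

    def acquire(self, cell: Cell, user: str, now: float) -> bool:
        holder = self.locks.get(cell)
        if holder is not None and holder[0] != user and now - holder[1] < self.ttl:
            return False                   # someone else is editing this cell
        self.locks[cell] = (user, now)     # take or refresh the lock
        return True

    def release(self, cell: Cell, user: str) -> None:
        holder = self.locks.get(cell)
        if holder is not None and holder[0] == user:
            del self.locks[cell]


locks = CellLocks(ttl_seconds=30.0)
print(locks.acquire((0, 0), "alice", now=100.0))   # alice starts editing A1
print(locks.acquire((0, 0), "bob", now=110.0))     # bob is refused on the same cell
print(locks.acquire((0, 1), "bob", now=110.0))     # but can edit a neighbouring cell
print(locks.acquire((0, 0), "bob", now=131.0))     # alice's lock has expired
```

Locking at cell granularity is what makes the "avoid conflicts" strategy practical for tables: two users rarely need the same cell, so refusals are uncommon.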

Out of curiosity, I downloaded and tried many table components, and found that only a handful meet the above requirements. Of these, SpreadJS stood out the most. It is positioned as an "online Excel" that can be embedded in other systems. Because its architecture is pure front end, it can be embedded during system development without worrying about compatibility with the host system. It is worth mentioning that SpreadJS uses sparse arrays as its storage model. Compared with traditional linked storage or array storage, sparse arrays store only non-empty data and do not need to allocate extra memory for empty cells.

In addition to saving memory, sparse arrays make it easier to build a row-indexed data dictionary for loosely laid out data such as tables, so that a node at any level of the storage structure can be replaced or restored at any time. With this feature, SpreadJS achieves efficient data rollback and data recovery (undo/redo) in multi-person collaboration.


(SpreadJS's sparse matrix storage model (Sparse Array))
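The general idea of sparse, row-indexed storage with undo/redo can be sketched in a few lines of Python. This is an illustration of the concept only, not SpreadJS's actual implementation; the class `SparseSheet` and its methods are hypothetical.

```python
class SparseSheet:
    """Row-indexed sparse storage: only non-empty cells occupy memory."""

    def __init__(self):
        self.rows = {}          # row index -> {column index -> value}
        self.undo_stack = []    # (row, col, old value, new value)
        self.redo_stack = []

    def get(self, row, col):
        return self.rows.get(row, {}).get(col)

    def _write(self, row, col, value):
        if value is None:
            # Clearing a cell removes its node, and the row node once it is empty.
            cells = self.rows.get(row)
            if cells is not None:
                cells.pop(col, None)
                if not cells:
                    del self.rows[row]
        else:
            self.rows.setdefault(row, {})[col] = value

    def set(self, row, col, value):
        self.undo_stack.append((row, col, self.get(row, col), value))
        self.redo_stack.clear()
        self._write(row, col, value)

    def undo(self):
        if self.undo_stack:
            row, col, old, new = self.undo_stack.pop()
            self._write(row, col, old)          # restore the replaced node
            self.redo_stack.append((row, col, old, new))

    def redo(self):
        if self.redo_stack:
            row, col, old, new = self.redo_stack.pop()
            self._write(row, col, new)
            self.undo_stack.append((row, col, old, new))

    def stored_cells(self):
        return sum(len(cells) for cells in self.rows.values())


sheet = SparseSheet()
sheet.set(0, 0, "Name")
sheet.set(9999, 50, 42)           # a far-away cell costs one entry, not 10,000 rows
sheet.set(0, 0, "Title")
sheet.undo()
print(sheet.get(0, 0), sheet.stored_cells())
```

Because each change touches a single node, rollback only needs to remember the old value of that node rather than a copy of the whole sheet.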

Conclusion

Demand for enterprise collaborative office tools will grow sharply as digital transformation deepens. In the future, these tools will evolve toward better usability, stronger integration and secondary development capabilities, high compatibility with existing systems and business processes, and a closer fit with the habits of end users.

How to break through technical barriers and develop online document products that can meet user needs in different scenarios and have market competitiveness and differentiation is the primary consideration for SaaS companies and system suppliers.

“Good winds help me soar to the sky.” In today’s fiercely competitive online document field, in addition to spending a lot of energy on independent research and development, learning to leverage others’ strengths to meet different business scenarios and customer needs may also be a good choice.
