What are the technical difficulties in implementing a multi-person collaborative online document?

What are the technical difficulties in implementing a multi-person collaborative online document?

This is a long-awaited answer. It just so happens that Cloud Studio also implements multi-person collaborative code editing. The technical principles are similar, so I'll post one of my previous blogs here. There are two basic implementation ideas for collaborative editing, namely CRDT (Conflict-Free Replicated Data Types) and OT (Operational-Transformation).

CRDT

CRDT stands for Conflict-free Replicable Data Type. It may seem difficult to understand (I don’t really understand it either), but it is a general term for data structures in distributed systems that are suitable for different scenarios and can maintain eventual consistency. In other words, CRDT itself is just a concept. When applied to collaborative editing, you need to implement the data structure yourself, such as the one open-sourced by the GitHub team.

ATOM's real-time collaboration feature is based on this library. Data transmission uses WebRTC. Except for the initial invitation/joining stage that relies on GitHub's server, all transmissions are peer-to-peer. At the same time, to ensure privacy, all data is encrypted.

OTO

Operational-Transformation, or operational transformation, refers to a class of technologies for document editing and conflict resolution of simultaneous editing, not just an algorithm. Unlike CRDT, the OT algorithm relies on the server to maintain eventual consistency throughout the process. In terms of cost, CRDT is better than OT, but due to the complexity of CRDT implementation (I haven't learned it), this article mainly introduces real-time collaborative editing based on the OT algorithm. The OT algorithm can not only be used for plain text operations, but also supports some more complex scenarios:

  • Collaborative graphics editing

A multimedia editor that supports real-time collaboration, allowing multiple users to edit the same document simultaneously in the same Adobe Flash

  • Collaborative HTML/XML and rich text editing

Real-time collaborative web-based editor

  • Collaborative spreadsheets, Word documents, etc.
  • Computer-aided design (Maya)

The basic idea of ​​maintaining consistency in the OT algorithm for multi-person collaborative editing of Autodesk Maya documents is to convert the editing operations into a new form based on the impact of previously executed concurrent operations, so that the converted operations can achieve the correct effect and ensure that the copied documents are the same. In fact, OT is not necessary only when multiple people edit adjacent characters at the same time. The applicability of OT has nothing to do with the number of characters/objects operated concurrently. Regardless of whether these target objects overlap with each other or whether these characters are close to each other, OT will perform concurrent control on objects with positional dependencies.

OT represents document changes as three types of operations (Operational)

  1. Insert
  2. Retain
  3. Delete

For example, for a document with the original content of "abc", suppose user O1 inserts a character "x" at position 0 of the document, represented by `Insert[0,"x"]`, and user O2 deletes a character at position 2 of the document, represented by `Delete[2,1]` (or Delete[2,'c']), which will generate a concurrent operation. Under the control of OT, the local operation will be executed as expected, and the remote server will perform a transformation `Transformation` after receiving the two operations. The specific process is as follows

  1. User O1 first performs an insert operation, and the document content becomes "xabc". Then O2's operation arrives and is converted to `O2' = T(O2,O1) = Delete[3,1]`, resulting in a new operation, where the position increases by 1 because O1 inserts a character. Then O2' is executed in the document "xabc", and the document content becomes "xab", that is, "c" is correctly deleted. (If no conversion is performed, "b" will be deleted incorrectly).
  2. User O2 first performs a delete operation, and the document content becomes "ab". Then O1's operation arrives and is converted to `O1' = T(O1, o2) = Insert[0,"x"]`, which also generates a new operation. Since the previously executed O2 and O1 do not affect each other, the converted O1' is the same as O1, and the document content becomes "xab".

Cursor operations are ignored here. In fact, when multiple users are editing in real time, the application on the editor will not actually move the cursor, but will only insert a fake cursor at the corresponding position. Monaco-Editor and ot.js We use ot.js to implement collaborative editing of Monaco-Editor. ot.js contains the implementation of the client and the server. On the client, it converts the editing operation into a series of operations.

  1. // For document "Operational Transformation"  
  2. const operation = new ot.Operation()
  3. .retain(11) // retain the first 11 characters  
  4. .insert( "color" ); // insert character  
  5. // This will change the document to "Operationalcolor"  
  6.  
  7. // "abc"  
  8. const deleteOperation = new ot.Operation()
  9. .retain(2) //  
  10. . delete (1)
  11. .insert( "x" ) // axc  

At the same time, operations are also combinable, such as combining two operations into one operation

  1. const operation0 = new ot.Operation()
  2. .retain(13)
  3. .insert( " hello" );
  4. const operation1 = new ot.Operation()
  5. . delete ( "misaka " )
  6. .retain(13);
  7.  
  8. const str0 = "misaka mikoto" ;
  9.  
  10. const str1 = operation0.apply(str0); // "misaka mikoto hello"  
  11. const str2a = operation1.apply(str1); // "mikoto hello"  
  12.  
  13. // Combination  
  14. const combinedOperation = operation0.compose(operation1);
  15. const str2b = combinedOperation.apply(str0); // "mikoto dolor"  

Applied to Monaco, we need to listen to the editor's onChange event and cursor-related operation events (selectionChange, cursorChange, blur, etc.). In the event of text content modification, convert the `changes` generated by each modification into one or more operations, also called `operation`. Cursor operations are easy to handle and can be converted into a `Retain` operation.

  1.   const editor = monaco.editor.create(container, {
  2. language: 'php' ,
  3. glyphMargin: true ,
  4. lightbulb:
  5. enabled: true ,
  6. },
  7. theme: 'vs-dark' ,
  8. });
  9.  
  10. editor.onDidChangeModelContent((e) => {
  11. const { changes } = e;
  12. let docLength = this .editor.getModel().getValueLength(); // Document length  
  13. let operation = new TextOperation().retain(docLength); // Initialize an operation and retain the original content of the document  
  14. for (let i = changes.length - 1; i >= 0; i--) {
  15. const change = changes[i];
  16. const restLength = docLength - change.rangeOffset - change.text.length; // Document  
  17. operation = new TextOperation()
  18. .retain(change.rangeOffset) // retain all characters before the cursor position  
  19. . delete (change.rangeLength) // Delete N characters (this operation is invalid if it is 0)  
  20. .insert(change.text) // insert characters  
  21. .retain(restLength) // retain the remaining characters  
  22. .compose(operation); // Combine with the initial operation into one operation  
  23. });

This code first creates an editor instance, listens to the `onDidChangeModelContent` event, and iterates over the changes array. change.rangeOffset represents the cursor position when the operation occurs, change.rangeLength represents the length of the characters deleted (0 means no deletion operation), and restLength is calculated based on the final length of the document - the cursor position - the length of the inserted characters. It is used to retain the remaining characters when inserting characters in the middle of the document.

But at the same time, we also need to consider undo/redo. The processing of undo/redo in ot.js is that each editing operation needs to generate a corresponding `reverse operation` and store it in the undo/redo stack. In the loop body of the above code, we also need to add an operation called `inverse`.

  1. let inverse = new TextOperation().retain(docLength);
  2.  
  3. // Get the deleted characters, implementation omitted  
  4. const removed = getRemovedText(change, this .documentBeforeChanged);
  5. inverse = inverse.compose(
  6. new TextOperation()
  7. .retain(change.rangeOffset) // Same as editing  
  8. . delete (change.text.length) // Insertion becomes deletion  
  9. .insert(removed) // delete becomes insert  
  10. .retain(restLength); // Also retain the remaining characters  

This generates an edit operation and an inverse operation for undoing. The edit operation is sent to the server for conversion and then sent to other clients. The inverse operation is saved locally for undoing.

The idea of ​​undo/redo is very simple, because no matter what, a change event will be generated for the editor, and in the real-time editing state, the undo/redo stacks of the two users need to be independent of each other, that is, A's operation cannot enter B's undo stack, so when B performs an undo, it can only affect his previous operation and cannot undo A's edit. Therefore, we need to implement a custom undo function to override the editor's built-in undo function.

We need to override the default undo

  1. this .editor.addAction({
  2. id: 'cuctom_undo' ,
  3. label: 'undo' ,
  4. keybindings: [
  5. monaco.KeyMod.CtrlCmd | monaco.KeyCode.KEY_Z
  6. ],
  7. run: () => {
  8. this ._undoFn()
  9. }
  10. })

The implementation of _undoFn is not repeated here. In fact, it saves the inverse operation generated in the previous change event in a custom undoManager. Each time an undo is executed, undoStack.pop() takes out the most recent operation and applies it locally, and sends it to the collaborator at the same time. Because the undoManager does not save the collaborator's inverse operation, executing an undo will not affect the collaborator's operation.

ot.js also includes the server-side implementation. You only need to run the ot.js server-side code in nodejs and build a simple websocket server.

  1. const EditorSocketIOServer = require( 'ot.js/socketio-server.js' );
  2. const server = new EditorSocketIOServer( "" , [], 1);
  3.  
  4. io.on( 'connection' , function (socket) {
  5. server.addClient(socket);
  6. });

The server receives the operation of each collaborator and converts it before sending it to other collaborator clients. The conversion operation actually calls a `transform` function, which can be viewed here. In fact, this function is the core of OT technology. Due to time constraints, the source code of this function will not be explained in detail. (With the upgrade and improvement of Cloud Studio's architecture, we are preparing to abandon OT and turn to CRDT, so we will share it after all the implementations are completed.

<<:  Market forecast: China's smart home market will reach US$48.2 billion in 2027

>>:  The core technical principles behind DingTalk document collaborative editing

Recommend

F5 Future

[51CTO.com original article] Expert introduction:...

It will take time for 5G cross-network roaming to become popular

On May 17, at the opening ceremony of the "2...

What is the difference between WiFi and Ethernet connections?

In today's networking world, Wifi and Etherne...

CAICT answers hot issues on “number portability” service

On November 27, 2019, China Telecom, China Mobile...

How to protect data center power systems from winter threats

For many people, the cold winter months are upon ...

Gartner: China's IT spending is expected to exceed US$550 million in 2022

According to Gartner's latest forecast, globa...

Smart home wiring, wired or wireless?

Smart home veterans all know that wired systems a...