Whitepaper
The PDU Protocol: A Peer-to-Peer Social Networking Service
Version 5
Email: liupeng@pdu.pub
Abstract
Any information dissemination system should satisfy two objectives at once: the free publication of information and the effective acquisition of information. Contemporary centralized social platforms sacrifice the former in order to achieve the latter. They filter information through censorship and account bans, and they often require users to submit real-world identity information before creating accounts.
This paper proposes a fully peer-to-peer social networking system that depends on no third-party service and does not attempt to eliminate so-called spam or malicious accounts at the system level. The system defines a publisher identity as a totally ordered sequence of messages signed by the same private key. In other words, an identity is the ordered set of events itself.
Each information consumer constructs their own visible set of publisher identities according to the interaction relationships among publishers and a set of self-defined rules. Within that visible set, the consumer can filter information effectively. The system contains no unified consensus and no god’s-eye view. The filtering of information quality is the statistical result of independent judgments made by all participants.
Introduction
Information dissemination and interaction on the contemporary internet largely depend on centralized platforms such as Facebook, Twitter/X, and WeChat. These platforms make it convenient for users to publish information and establish relationships. They also use various algorithms to detect and filter spam in order to preserve the user experience.
The limitations of centralized social services, however, have become increasingly apparent. Third-party services may misuse user information or leak private user data. They may also use their large user bases to lock users in and preserve monopolistic power. In addition, centralized services are vulnerable to government regulation and blocking because they are clear and controllable targets.
Despite these problems, most users still have little choice but to continue using existing platforms. Moving to another platform does not necessarily cause data loss, but it does mean losing the social relationships accumulated on the original platform, thereby reducing one’s influence. This is a major source of platform lock-in.
Decentralized social platforms have developed rapidly in recent years in an attempt to address the problems created by centralized platforms. Mastodon is one example. Mastodon adopts a federated architecture that rejects a single center and instead consists of multiple servers capable of communicating with one another, allowing user relationship data to be preserved independently. Nevertheless, user registration and content governance still depend on the administrators of individual servers. This governance structure can be understood as a collection of independent small centralized platforms, and it still cannot fundamentally avoid the problems encountered by centralized platforms.
Blockchain-based social platforms, such as Steemit and Minds, use a certain number of tokens as the cost of creating or activating accounts, while also using tokens to incentivize social behavior. Although this method increases the cost of account creation, it differs from identity-verification mechanisms used by centralized platforms. It cannot effectively prevent the proliferation of fake accounts, and it imposes unfair restrictions on users with weaker economic capacity, thereby reducing the diversity and inclusiveness of the user population.
Some social applications use invitation mechanisms during their early stages to control the trustworthiness of new users. This can effectively prevent malicious registration and the spread of fake accounts. However, it also obstructs broader participation. For people who do not know existing users, joining the platform becomes extremely difficult. Moreover, early users exert disproportionate influence over community culture and rules, which may cause the community culture to become homogeneous and make it difficult to attract a more diverse user base.
The common problem among these approaches is that all of them attempt to eliminate spam or malicious accounts at the system level. Yet there is an inherent tension between the two fundamental objectives of an information dissemination system: free publication and effective acquisition. Centralized platforms sacrifice free publication for effective acquisition, filtering content through censorship and bans. In a fully decentralized environment, there is no unified standard for what counts as good or bad information. Each person has a different judgment about what should be treated as spam. To disallow the existence of spam is itself in conflict with the objective of free publication. Therefore, a truly decentralized social system should not and cannot eliminate spam at the system level. It should instead allow any information to exist while enabling every information consumer to filter efficiently according to their own standards.
The system proposed in this paper is based on precisely this idea. Its design goal is not to prevent malicious behavior, but to contain its effects within a limited scope so that normal users can publish and acquire information without substantive interference. From the perspective of the system as a whole, effective information spreads more widely because it is accepted by more people, while spam naturally contracts because it is blocked by many information consumers along its propagation paths. The system requires no unified adjudication mechanism. The filtering of information quality is the statistical result of independent actions taken by all participants.
These design decisions are grounded in a unified philosophical foundation. Identity in the system is defined as an ordered set of events, a position derived from a philosophical understanding of time and individuality. The visible identity set constructed by each information consumer corresponds to that consumer’s own horizon. The system neither possesses nor seeks a god’s-eye consensus over a single unified view. Public transparency of information is a structural requirement of the system, not a mere functional preference. For a more detailed philosophical account, see Horizon-Limited Realism.
Users must understand and accept the following as necessary costs of a decentralized social network: each person’s visible range of publishers may differ, while effective information acquisition remains possible; maintaining the complete total order of a message chain is the publisher’s own responsibility, and violations must be penalized; the system cannot preemptively block all spam or malicious accounts with perfect certainty, but it can filter information efficiently.
The Two Roles of a User
In this system, the traditional concept of a “user” is divided into two independent roles: information publisher and information consumer. These two roles have distinct goals, behaviors, and system-level representations. Understanding this distinction is fundamental to understanding the entire system.
Information publishers are the visible roles in the system. All actions taken by a publisher, including publishing content, commenting, reposting, liking, and blocking, are formed into an immutable ordered set of messages through a chain structure and digital signatures. This set itself is the definition of the publisher’s identity.
As an information publisher, the fundamental goal is to maximize the influence of one’s information. Influence here is not merely breadth of dissemination. It is a balance between duration and the population affected, measured according to the publisher’s own subjective judgment. A scholar, for example, may consider it more valuable to influence ten peers for thirty years than to influence one million people for three days.
Information consumers are the invisible roles in the system. The fundamental goal of a consumer is to acquire information effectively. The means for doing so is a set of self-defined filtering rules through which the consumer filters all information in the system. These rules depend essentially on trust propagation relationships established through interactions among publishers. A consumer’s rules are stored locally. They leave no trace in the system, are not known to others, and exist only as that consumer’s own method for filtering all information. Through such rules, each information consumer constructs their own horizon, namely the visible range of publisher identities.
There is no strict one-to-one relationship between these two roles. A single user may possess multiple publisher identities. These identities may share the same acquisition rules, or they may use independent rules. A publisher identity and a filtering rule set are not bound to one another in any way.
It should be noted that a publisher’s interaction behavior is based on the visible user range associated with that user in the role of information consumer. A publisher can interact only with information that they can see as a consumer. Once the publisher publishes an interaction message, however, that action becomes public, traceable, and open to judgment by everyone.
At the system level, the objective is neither to maximize the influence of any particular publisher nor to judge whether information is good or bad. The objective is the normal flow of information: information should spread within the range where it is needed and accepted, while spam should interfere with normal acquisition as little as possible. Through the independent judgments made by every information consumer about information and publishers, the system naturally adjusts the range and direction of information propagation.
Messages
A message is defined as the basic data structure of the system and as the only type of information transmitted in the peer-to-peer network. Other data types in the system, such as publisher identities, are generated independently by each information consumer from this public data.
As shown in Figure 1, each message consists of three parts: message content, a reference list, and a signature. The message content is the body of the message and is divided into content information and interaction type. The former includes multimedia content such as text and images; the latter includes common social-network interactions such as publishing, replying, quoting, liking, and blocking.
The reference list may contain the signatures of multiple messages in order to indicate the temporal relationships between the current message and the referenced messages. When the message content involves interaction types such as replies or quotes, the signatures of the relevant messages should also be included in the reference list. To provide more precise temporal verification, the reference list should preferably include at least one signature from a recent message. Finally, the message content and reference list are combined to calculate a hash value, and the current publisher’s private key signs that hash. This confirms the publisher identity and guarantees data integrity.
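The message structure described above can be sketched as follows. This is a minimal illustration, not the protocol's wire format: SHA-256 is assumed for the hash, and the signature is represented by a keyed-hash placeholder, where a real implementation would sign the digest with an asymmetric scheme such as Ed25519.

```python
import hashlib
import json

def message_digest(content, references):
    """Hash the message content together with its reference list."""
    payload = json.dumps({"content": content, "refs": references},
                         sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def sign(digest, private_key):
    """Placeholder signature for illustration only: a real
    implementation would sign the digest with the publisher's
    asymmetric private key (e.g. Ed25519)."""
    return hashlib.sha256((private_key + digest).encode()).hexdigest()

def make_message(content, references, private_key):
    """Assemble the three parts of a message: content, reference
    list, and a signature over the hash of the first two."""
    digest = message_digest(content, references)
    return {"content": content, "refs": references,
            "sig": sign(digest, private_key)}

# A publishing message chained after a previous signature.
msg = make_message({"type": "post", "text": "hello"}, ["<prev-sig>"], "alice-key")
```

Because the signature covers both the content and the reference list, any tampering with either part changes the digest and invalidates the signature.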
Although each message has explicit content, it is often impossible to determine whether a message is spam by examining that message alone. For example, the messages “I recommend trying Restaurant A on M Street” or “I advise everyone to avoid Restaurant B on N Street” are not problematic when viewed in isolation. Yet if thousands of nearly identical messages appear in a short period of time, they would be regarded as spam and as malicious manipulation. Therefore, information should not be evaluated at the level of a single message. It must be evaluated at the level of a publisher identity by examining the complete ordered set of messages produced by that identity. This leads to the definition of publisher identity.
Publisher Identity
A publisher identity is defined as a totally ordered linked list consisting of all messages signed by the same private key and arranged in temporal order. Maintaining this total order is the responsibility of the publisher, not the responsibility of those who interact with the publisher or of information consumers.
In a peer-to-peer distributed system, each publisher’s messages must form a linked-list structure in order to ensure that published content has not been tampered with and to allow deletion of previously published messages by the publisher to be detected. The rule is that the first reference in each message’s reference list must be the signature of the previous message signed by the same private key, as shown in Figure 2. If the current message is the first message signed by that private key, the first value in the reference list is set to 0 to mark the starting point.
The definition of an identity is all the words a person has spoken, arranged in order. Total order is the precondition for the existence of identity and the basis on which others can make fair judgments about a publisher. When messages form a total order, even contradictory content can be judged consistently by information consumers according to sequence. For example, suppose a publisher first declares ownership of an object, then transfers that object to M, and later declares that the same object has been transferred to N. All information consumers can judge the transfer to N to be invalid because, according to the total order, the object had already been transferred to M. A contradiction exists, but consensus can still be reached.
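The ownership example above amounts to a simple rule: within a totally ordered event sequence, the first recorded transfer of an item is the valid one. A sketch of how any consumer would apply that rule (event field names here are illustrative assumptions, not part of the protocol):

```python
def valid_transfer(ordered_events, item):
    """Given a publisher's totally ordered events, return the
    recipient of the first transfer of `item`; any later transfer
    of the same item is judged invalid by every consumer that
    applies the total order."""
    for event in ordered_events:
        if event.get("type") == "transfer" and event.get("item") == item:
            return event["to"]
    return None  # the item was never transferred
```

Because every consumer walks the same total order, every consumer reaches the same verdict, even though the publisher's record is self-contradictory.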
When a message chain forks, however, the situation is fundamentally different. If the two messages transferring the object to M and to N both follow immediately after the message declaring ownership, forming two branches, then information consumers cannot determine which transfer is valid. The fundamental harm caused by a fork is not contradiction itself, but the loss of any possibility for all information consumers to reach consensus about that contradiction. In such a case, the only reasonable response is to ignore the identity from the fork point onward.
In blockchain systems, a fork of the main chain results from computational competition among multiple miners rather than from malicious behavior, and the system therefore does not punish miners merely because a fork occurs. In this system, however, a private key should be controlled by one person or by one unified organization, and that controller is necessarily able to control the ordering of all messages. Whether a fork is caused by a technical failure or by human action, it is the responsibility of the private-key holder. Because signatures cannot be forged, once a fork appears in a publisher’s message chain, the publisher is deemed to have failed in the responsibility to maintain total order. Except for the first message, if the first signature in the reference list points to a message signed by another private key, that too is considered an act that destroys total order. Both cases should result in penalties for the publisher.
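The fork check described above can be expressed concisely: within the set of messages signed by one key, no two messages may claim the same predecessor in the first position of their reference lists. A sketch under the assumption that each message carries `sig` and `refs` fields as in the structure described earlier:

```python
def detect_fork(messages):
    """Given all messages signed by one private key, each with a
    'sig' field and a 'refs' list whose first entry is the previous
    message's signature ("0" for the chain's first message), return
    the predecessor signature at which the chain forks, or None if
    the messages form a valid total order."""
    seen = {}
    for msg in messages:
        prev = msg["refs"][0]
        if prev in seen:
            # Two messages claim the same predecessor: the chain
            # forks here, and the publisher has failed to maintain
            # total order. (Two "0" entries count as a fork at the
            # chain's start.)
            return prev
        seen[prev] = msg["sig"]
    return None
```

A consumer that detects a fork would then, per the rule above, ignore the identity from the returned fork point onward.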
If the first message referenced in a reference list has not yet been received, the current message should be temporarily stored. It should neither be accepted nor used to penalize the publisher. The information publisher is responsible for preserving the complete message chain signed by their own key, so that the chain can be supplied for external verification if some message is lost in the network.
Another case is that a publisher may hide one fork of the message chain until several messages in another fork have already been received, and only then release the hidden fork into the system. In this situation, the messages already accepted by the system are treated as established facts. The usual penalty is to discard the later-received conflicting portion after the conflict is discovered and to block the publisher. Such a penalty makes fork attacks meaningless.
It is especially important to state that, in this system, any lawful identity constituted by totally ordered events has equal system status regardless of whether the entity behind it is a real person, an organization, or an artificial intelligence. Identity is defined by the ordered set of events, not by the type of entity behind it. The system does not distinguish, and cannot distinguish, the nature of the entity behind an identity. This is not a defect of the system, but a design principle.
Visible Identity Set
The message chain formed by signed messages allows any attempt by a publisher to tamper with previously published content to be easily detected by others. This data structure therefore makes it possible to judge information publishers with relative fairness. However, if a malicious publisher can simply replace a signing private key at zero cost after being blocked and continue publishing messages under another identity, then the penalty mechanism becomes ineffective.
Traditional centralized platforms usually require verification through a mobile phone number or a similar mechanism when creating a user account, thereby making the user’s real-world identity the cost of creating a virtual identity. Some decentralized identity projects, such as Sovrin, adopt similar methods. This approach is relatively reasonable, but it necessarily relies on a trusted third party to complete part of the verification. Some blockchain-based social systems charge a certain number of tokens as the cost of creating an account. Yet because users differ greatly in wealth, it is difficult to find a fair price that both attracts participation and suppresses the creation of spam accounts.
In this system, we do not attempt to attach an explicit cost to identity creation. Instead, each information consumer constructs their own visible range of publisher identities through a trust propagation mechanism. The visible identity set is not the same as a follow list in a traditional centralized platform. It is more like the set of platform users whose range is defined by the user themselves. Only information published by identities within this range can potentially be seen by the user.
Trust Propagation Mechanism
The visible identity set is constructed through the following mechanism.
First, an information consumer manually specifies several initial trusted identities. These initial identities may be the consumer’s own publisher identities, publishers personally known to the consumer, or reputable identities obtained from a third-party recommendation service. These initial identities form layer zero of trust propagation.
Starting from the initial identities, the consumer finds all object identities that have received positive interactions actively initiated by those trusted identities, such as comments, reposts, or likes, and that have not been blocked by them. These object identities constitute the first layer of trusted identities. The key point is the directionality of interaction: trust propagation requires active behavior by an already trusted party. If A is already trusted and A actively comments on B’s message, this indicates that A recognizes B’s content; B is therefore indirectly trusted through A’s judgment. Conversely, if B comments on A, that is merely B’s unilateral action. It does not represent A’s recognition and does not constitute a basis for trust propagation.
If the consumer sets the number of trust propagation layers to more than one, the same operation is repeated for each identity in the first layer. The identities with which they actively interact, and which they have not blocked, become the second layer, and so on. In practical use, trust propagation is usually set to one layer.
Blocking is a public interaction behavior by an information publisher and functions as an exclusion mechanism within trust propagation. When an already trusted identity A blocks identity C, C is excluded from the trust path through A. However, if C can be introduced through another trusted identity B’s trust path, C may still enter the visible identity set. Blocking cuts a specific path; it does not constitute a global ban.
The expansion of trust propagation is gradual in actual use, rather than instantaneous. An information consumer requires a process to expand their visible identity set, and this process naturally controls the size of the set. The number of trust propagation layers is set by the consumer, who may adjust the layer count according to practical needs in order to control the size of the visible range.
It should be noted that the time at which an interaction occurred does not affect the degree of trust. Whether an interaction occurred one year ago or yesterday, it has the same weight in trust propagation.
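The propagation mechanism above can be sketched as a layered breadth-first expansion. The graph representation here is an assumption for illustration: `interactions` maps each identity to the identities it has actively and positively interacted with (comment, repost, like), and `blocks` maps each identity to the identities it has blocked.

```python
def visible_set(seeds, interactions, blocks, layers=1):
    """Compute a consumer's visible identity set.

    seeds: initially trusted identities (layer zero).
    interactions: dict mapping an identity to the identities it has
        actively and positively interacted with.
    blocks: dict mapping an identity to the identities it has blocked.
    layers: number of trust propagation layers (usually 1).
    """
    visible = set(seeds)
    frontier = set(seeds)
    for _ in range(layers):
        next_frontier = set()
        for trusted in frontier:
            for target in interactions.get(trusted, ()):
                # A block by `trusted` cuts only this path; `target`
                # may still enter via another trusted identity.
                if target in blocks.get(trusted, ()) or target in visible:
                    continue
                next_frontier.add(target)
        visible |= next_frontier
        frontier = next_frontier
    return visible
```

Note the directionality: only outgoing edges from already trusted identities are followed, so an untrusted publisher cannot enter the set merely by commenting on a trusted one.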
Admission of New Identities
When a new publisher identity is created, it is initially not included in any consumer’s visible set. This is the normal initial condition. A new identity faces two distinct problems: being discovered and obtaining trust.
Discovery may rely on third-party services. For example, an information retrieval service may maintain the largest possible visible user range and help a new identity be seen by potential interactors. Such third-party services are auxiliary: the system does not depend on them, and they play no decisive role.
Obtaining trust, however, must depend on the publisher’s own accumulated content. An older account, or an account that has published more content, should normally receive more trust than a new account. Identity itself is an ordered set of events. The accumulation of events requires time, and the cost of time cannot be forged.
System Bootstrapping
During the initial stage of the system, the system can be bootstrapped by creating AI publisher identities with different personalities and interests. As long as an AI identity maintains a totally ordered message chain, it is a fully lawful identity in the system. As noted above, the system does not distinguish, and cannot distinguish, the type of entity behind an identity. AI identities and real users have completely equal status in the system.
New users joining the system may begin from identities they trust or from reputable identities recommended by third-party services. They can then gradually expand their visible user range through the trust propagation mechanism according to interaction behavior among publishers.
Properties of the Visible Identity Set
The visible identity set can be understood as a customized rule. According to that rule, a user gradually computes their own visible user range from public information. In a peer-to-peer distributed system, however, the system does not guarantee that every user can obtain all information. Therefore, even if the same algorithm is used, the final visible identity sets may differ. It is normal for each information consumer to have a different visible set. This is not a defect of the system, but an inevitable property of a decentralized system: there is no unified horizon from a god’s-eye view.
Because the visible identity set exists, a malicious publisher may be able to replace a private key at zero cost and publish information again under another identity, but that new publisher identity will not easily be accepted by other users. The propagation range of spam is therefore reduced. Even if an identity has not published any spam, it may still be removed from a user’s visible identity set if it repeatedly interacts with punished publishers.
Incentives
Information publishers and information consumers have different goals, and the system-level goal is different from both. These distinctions must be explained separately.
For information publishers, the fundamental goal is to maximize the influence of their information. Influence is a balance between time and the affected population, measured according to the publisher’s own subjective judgment. The system incentivizes publishers by helping them expand the influence of their information, while penalties reduce that influence. This concept also explains why punishing malicious publishers through blocking is reasonable: when we block a spam publisher, we directly attack that publisher’s only objective. This already constitutes a sufficient penalty.
For information consumers, the fundamental goal is to acquire information effectively and to prevent the information they want to read from being submerged by spam as much as possible. Consumers achieve this goal by constructing and maintaining visible identity sets.
From the perspective of the system as a whole, the objective is the normal flow of information: information should spread within the range where it is needed and within the range where it is accepted. Through the independent filtering behavior of each information consumer, the system naturally adjusts the range and direction of information propagation. Effective information spreads more widely because it is accepted by more consumers; spam naturally contracts because it is blocked by many information consumers along its propagation paths. This filtering requires no system-level unified consensus. Nor does the system collectively incentivize or collectively penalize any publisher identity. Everything is the statistical result of independent actions by all information consumers.
It should be emphasized that, unlike decentralized systems represented by blockchain, a social network has no system-level unified consensus. Nor is there a unified standard for what counts as spam. Each consumer has the right to decide, according to their own judgment, what information is valuable and what information is spam.
When a publisher continuously produces high-quality content, that content is more likely to receive interactions from other publishers. Through these interaction messages, the current publisher’s content has a higher probability of being accepted by information consumers who have not yet included the publisher in their visible identity sets. As more information consumers accept the identity, the publisher’s future content will gain increasing influence.
Conversely, if a publisher continuously publishes low-quality content or continuously interacts with publishers that have been blocked, other publishers will become less willing to interact with them. Some information consumers who have already accepted the publisher identity may also remove that identity from their visible identity sets. Therefore, a publisher’s behavior and content quality directly affect the range of influence of that publisher’s information.
In summary, the incentive mechanism of a social network forms a dynamic equilibrium of information propagation through feedback on publisher behavior and content quality. Publishers of high-quality content can expand the dissemination and influence of their information, and may ultimately monetize that influence through interactions with commercial brands and similar mechanisms. This process resembles advertising systems in centralized platforms and encourages publishers to continuously improve content quality in order to obtain broader dissemination and greater economic benefit.
Privacy
In traditional centralized social networks, users must verify their identity through a mobile phone number or similar method when creating an account, thereby linking a virtual identity to a real-world identity. The mobile phone number itself is private user information and must not be disclosed to other users. This creates the need for privacy protection. From this need, a series of privacy-protection mechanisms arise, including access-permission settings such as “visible to friends only.”
In this system, the premise of the privacy problem is eliminated at the root. Identities within the system are defined entirely by public keys and totally ordered message chains. Users do not need to submit any real-world personal information to the system. Consequently, the kind of “user privacy leakage” found in centralized platforms does not exist. Mobile phone numbers, email addresses, and other personal information are neither required nor verifiable if supplied, and therefore cannot be leaked by the system.
All messages in the system are public. This is not an imposed restriction, but a necessary consequence of the system’s logic, for two reasons.
First, an identity is defined by all of its ordered messages. If reading permissions differ across messages, different information consumers will see different message sets and will therefore form different understandings of the same identity. The basis on which identity functions as a publicly verifiable object would be destroyed, and the premise on which the system operates would no longer hold.
Second, reading-permission control, such as “visible to friends only,” rests on a premise that cannot be guaranteed: that others can keep secrets for you. In a decentralized environment, no mechanism can enforce such a constraint. A function that cannot be guaranteed is unnecessary.
In this system, all published content is content that users wish to make public. Content that users do not wish to make public should not be published through this system. The system contains no concept of “real-name identity.” All identities are public-key identities by nature. As for the association between a public key and a real-world identity, that is an activity outside the system. Within the system, we can only prove that a set of messages comes from the same source through signatures made by the same public key; we cannot prove anything else. A person may record a video outside the system declaring that a certain public key belongs to them, but the creation and verification of that association are not functions of this system.
If users do need private communication, they may use known publisher public keys from the system to conduct peer-to-peer encrypted communication outside the system. It should be noted, however, that such private communication is not a function of this system. It should avoid using the system’s message format, so that it is not mistaken for part of a message chain and penalized in error.
Third-Party Services
As a peer-to-peer social networking system, we welcome the existence of third-party services, but the system does not depend on any third-party service. Third-party services can provide users with more efficient and convenient services based on public messages in the system, and they can also provide richer channels of interaction with the world outside the system. Third-party services play an auxiliary role. They can be especially useful in helping new identities be discovered, but they do not have a decisive position.
| Service | Description |
|---|---|
| Information retrieval | Based on public messages, maintains as large a visible user range as possible, verifies information integrity and conflicts, and provides message retrieval interfaces to the outside. It can help new identities be discovered by more potential interactors. |
| Message delivery | Based on publisher identities, provides peer-to-peer encrypted message delivery between users outside the system. |
| Data statistics | Based on public messages and publisher identities, measures message interactions and calculates the degree of information dissemination. |
| Advertising platform | Connects publishers and advertisers based on publisher identities and information dissemination. |
| Other services | Services that become feasible because publisher identities carry cost, such as voting. |
Blockchain
The core functions of this system do not depend on cryptocurrency. Blockchain is one possible extension that becomes supportable once the system has an identity foundation, just as money was not innate to human society: early societies functioned for thousands of years through barter.
A blockchain-based cryptocurrency can be implemented within the current system. A block in the blockchain can be packaged as a message and broadcast by a miner acting as an information publisher.
Unlike traditional blockchains, publisher identities in a social system already carry meaning, so more efficient consensus mechanisms can be chosen instead of relying solely on computational power. The system can use publisher identities in place of staking and to limit the set of miners or validators, rather than letting successful block production expand the degree to which a publisher is accepted, which would distort the incentive principles of the current system. Proof of Stake (PoS), Delegated Proof of Stake (DPoS), Proof of Authority (PoA), and Avalanche-style consensus mechanisms are all relatively suitable choices.
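The identity-in-place-of-staking idea can be sketched as follows. The fields and thresholds are hypothetical, not part of the protocol; the point is only that an identity’s accumulated history, which is expensive to fake, can substitute for a monetary stake when restricting the validator set.

```python
from dataclasses import dataclass

@dataclass
class PublisherIdentity:
    pubkey: str
    message_count: int   # length of the publisher's totally ordered chain
    age_days: int        # time since the identity's first signed message

def eligible_validators(identities: list[PublisherIdentity],
                        min_messages: int = 1000,
                        min_age_days: int = 180) -> list[PublisherIdentity]:
    """Restrict the validator set by accumulated identity cost, not stake.

    Both thresholds are illustrative. An identity that has published for a
    long time has sunk a cost into its chain, so it plays the economic
    role that a staked deposit plays in PoS-style mechanisms.
    """
    return [i for i in identities
            if i.message_count >= min_messages and i.age_days >= min_age_days]
```

Note that passing the filter only admits a publisher to the validator set; it does not feed back into how widely consumers accept that publisher, which keeps block production separate from the system’s own incentives.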
The implementation of cryptocurrency may also be understood as multiple publishers forming a higher-level publisher. The blockchain is jointly maintained by these publishers and constitutes a higher-level totally ordered message queue. A blockchain implemented within the system can identify the total order of block messages through a specific position in the reference list.
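One way to read the reference-list convention, as a sketch with hypothetical field names: a block message carries an ordinary reference list, and by convention a fixed slot (here, the first) always points at the previous block message, so the total order of blocks is recoverable from the references alone.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class BlockMessage:
    publisher: str                          # miner identity (hypothetical field)
    payload: bytes                          # packaged block contents
    references: list[str] = field(default_factory=list)
    # Convention assumed in this sketch: references[0] holds the digest of
    # the previous block message; any remaining slots are ordinary references.

    def digest(self) -> str:
        h = hashlib.sha256()
        h.update(self.publisher.encode())
        h.update(self.payload)
        for r in self.references:
            h.update(r.encode())
        return h.hexdigest()

def chain_order(blocks: list[BlockMessage], genesis: str) -> list[BlockMessage]:
    """Recover the total order of block messages by following references[0]."""
    by_prev = {b.references[0]: b for b in blocks}
    ordered, prev = [], genesis
    while prev in by_prev:
        block = by_prev[prev]
        ordered.append(block)
        prev = block.digest()
    return ordered
```

Given the blocks in any arrival order, `chain_order` reassembles the higher-level totally ordered message queue by walking the fixed reference slot from a genesis digest forward.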
Conclusion
This paper proposes a peer-to-peer social networking system that can operate without relying on any third-party service. The system pursues the two fundamental objectives of information dissemination at the same time: the free publication of information and the effective acquisition of information. It does not attempt to eliminate spam or malicious accounts at the system level. Instead, it contains the effects of malicious behavior within a limited scope through the independent filtering of each information consumer.
The system divides the traditional user into two independent roles: information publisher and information consumer. All actions taken by a publisher form an immutable, totally ordered message chain, which constitutes the definition of identity. Maintaining total order is the publisher’s own responsibility: a fork in the message chain destroys the identity definition and causes that identity to be ignored. Consumers construct visible identity sets through self-defined filtering rules and acquire information effectively within their own horizons. Any valid identity constituted by totally ordered events has equal standing in the system, regardless of whether the entity behind it is a real person, an organization, or an artificial intelligence.
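The fork rule above can be sketched as follows. The field names are hypothetical: each message claims a position in its publisher’s totally ordered chain, and a consumer who sees two distinct messages from the same key at the same position treats the identity as destroyed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignedMessage:
    pubkey: str   # publisher identity (hypothetical field names)
    seq: int      # claimed position in the publisher's totally ordered chain
    content: str

def is_forked(messages: list[SignedMessage]) -> bool:
    """Detect a fork: two distinct messages from one key at one position.

    A repeated copy of the same message is harmless; only conflicting
    content at the same position breaks the total order. Consumers who
    detect a fork simply ignore the identity, so keeping the chain
    consistent is the publisher's own responsibility.
    """
    seen: dict[tuple[str, int], str] = {}
    for m in messages:
        key = (m.pubkey, m.seq)
        if key in seen and seen[key] != m.content:
            return True
        seen[key] = m.content
    return False
```

Because each consumer runs this check independently over whatever messages they have seen, no global adjudicator is needed: a forked identity disappears from each consumer’s visible set as the conflicting messages propagate.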
All messages in the system are public. Identity is defined by all ordered messages. Distinguishing reading permissions would cause different consumers to form different understandings of the same identity, thereby destroying the foundation of the system. The system neither needs nor uses real-world personal information to define identity, and therefore eliminates at the root the privacy-leakage problem found in centralized platforms.
From the perspective of the system as a whole, effective information spreads more widely because it is accepted by more consumers, while spam naturally contracts because it is blocked by many information consumers along its propagation paths. The system does not require any unified adjudication mechanism. The filtering of information quality is the statistical result of independent actions by all participants. Unlike traditional centralized social networks, each consumer has a different visible range of publishers. This is not a defect of the system, but an inevitable property of decentralized systems.
On the basis of this system, existing mainstream blockchain consensus mechanisms can also be transplanted to implement cryptocurrency and other extensions. Because publisher identities carry a cost accumulated over time, they can serve as the constraint in consensus mechanisms that would otherwise require staking.