Communication Patterns
- Overview
- Async
- Database
- File Transfer
- GraphQL
- gRPC
- P2P
- Stream
- Sync
Communication Style | Definition | Features | Use Cases |
---|---|---|---|
Asynchronous | Services exchange messages without waiting for a response, allowing the sender to continue working while the receiver processes the message |
|
|
Database | Uses a shared database to communicate between services. Commonly uses CQRS and Event Sourcing techniques |
|
|
File Transfer | Services communicate by reading and writing files to a shared location |
|
|
GraphQL | Query language for APIs, allowing clients to request exactly what they need |
|
|
gRPC | gRPC (Google Remote Procedure Call) is a high-performance, language-agnostic remote procedure call (RPC) framework that enables efficient communication between distributed systems by utilizing Protocol Buffers for serialization and HTTP/2 for transport, ensuring low latency, bandwidth efficiency, and support for bi-directional streaming |
|
|
P2P | P2P (Peer-to-Peer) is a decentralized form of communication where each party has the same capabilities and either party can initiate a communication session. It enables direct communication and sharing of resources among multiple nodes in a network without the need for a central coordinating server |
|
|
SOAP | SOAP (Simple Object Access Protocol) is an XML-based communication protocol |
|
|
Stream | Data transmission method where information is continuously and sequentially delivered in a steady flow, often in real-time, without distinct boundaries or breaks, facilitating persistent and ongoing data exchange between communicating entities |
|
|
Synchronous | Data transmission between sender and receiver occurs in real-time, requiring both parties to be actively engaged simultaneously for message exchange, ensuring temporal alignment of communication events |
|
|
- Messaging protocol
- Delivery Semantics
- Service Integration Patterns
- Fan-In / Fan-Out
Aspect | MQTT | AMQP |
---|---|---|
Visualization | ||
Definition | Provides simple message queuing, aimed mainly at embedded systems | Offers a richer range of messaging scenarios and stronger built-in security protocols |
Background | MQTT is majorly vendor-driven and was developed by IBM | JP Morgan developed AMQP for financial apps |
Architecture | MQTT has client/broker architecture | AMQP has a client/broker and client/server architecture |
Design protocol | It simplifies the process of encrypting messaging using TLS and authenticating clients using modern protocols such as OAuth | It is a TCP-based protocol that performs both publish/subscribe and request/response types of communication |
Framework optimization | Its wire format uses a lightweight, stream-like approach suited to memory-constrained devices | Its wire format uses data framing with buffering, which boosts server performance |
Messaging services | MQTT is highly transient and is best suited to lightweight, ephemeral message routing | AMQP enables all kinds of messaging, including bulk messaging, and carries message metadata |
Transaction of messages | It is known for supporting general acknowledgments relatively quickly | It supports various acknowledgments and transactions |
Data context | MQTT has partial support for data cache and proxy | AMQP offers full support for data cache and proxy |
Proven security | MQTT has no security built into the protocol itself; it relies on TLS for encryption and add-on mechanisms such as OAuth for authentication | AMQP builds on TLS and SASL, providing stronger security out of the box |
Last value queues | It offers a Retain flag and supports last-value semantics smoothly | There is no built-in support for last-value queues, which can be a limitation |
Efficiency and scalability | Since it is wire-efficient, it requires less effort to implement on a client than AMQP | It is heavier on the wire and more complex to implement, but scales well in broker-centric deployments |
Reliable messaging | MQTT offers three QoS levels: at most once, at least once, and exactly once | AMQP likewise supports reliable delivery, with a rich set of acknowledgments and transactions |
Namespaces | MQTT deploys "namespaces" for the transmission of messages in a hierarchy | AMQP allows multiple ways for finding messages, such as queues or nodes |
Additional attributes | MQTT is deliberately minimal, covering the basic publish/subscribe needs of constrained devices | AMQP adds advanced features such as transactions, flexible routing, and queue management |
Implementation | It can be implemented on devices with less than 64 KB of RAM | It typically needs more resources and is better suited to servers than to tiny embedded devices |
Extensibility | MQTT 5 redrafted the protocol and broadened extensibility (user properties, reason codes) | AMQP has structured extension points, allowing extensibility and alteration of individual layers in isolation |
Pros |
|
|
Cons |
|
|
Use Cases |
| Widely used in critical systems in the financial, telecommunications, defense, manufacturing, internet, and cloud computing industries |
QoS (Quality of Service) Level | Definition | Message Delivery | Pace | Dependability |
---|---|---|---|---|
0 (at most once) | Each message is delivered at most once, or possibly not at all. This level prioritizes speed but compromises on reliability (messages may be lost) | Single Delivery | Swift | Low |
1 (at least once) | Ensures delivery of the message but allows the possibility of duplicates (the same message may arrive more than once) | Guaranteed Delivery | Fair | High |
2 (exactly once) | Promises a single, exactly-once delivery of the message. The most reliable level, but also the slowest | Unambiguous Delivery | Slow | Supreme |
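The trade-offs between the levels can be illustrated with a toy simulation (no real broker involved; `lossy_send`, `qos0`, and `qos1` are made-up names): QoS 0 sends once and may lose messages, while QoS 1 resends until acknowledged and may therefore duplicate them.

```python
def lossy_send(message, channel, drop):
    """Deliver message unless the channel drops it; returns True on delivery."""
    if drop:
        return False
    channel.append(message)
    return True

def qos0(messages, drops):
    """At most once: fire and forget -- no retries, so a drop means a lost message."""
    delivered = []
    for msg, drop in zip(messages, drops):
        lossy_send(msg, delivered, drop)
    return delivered

def qos1(messages, drops_iter):
    """At least once: resend until acknowledged -- no loss, but a lost ack
    makes the sender re-send an already-delivered message (duplicate)."""
    delivered = []
    for msg in messages:
        acked = False
        while not acked:
            sent = lossy_send(msg, delivered, next(drops_iter))
            ack_lost = next(drops_iter)
            acked = sent and not ack_lost
    return delivered

print(qos0(["a", "b", "c"], [False, True, False]))      # "b" is lost
print(qos1(["a"], iter([False, True, False, False])))   # "a" arrives twice
```

QoS 2 avoids both outcomes with a four-step handshake, which is why it is the slowest option in the table.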
Pattern | Description | Use Cases | Implementation | Examples |
---|---|---|---|---|
Request-Reply | Client sends a request to a service, which processes the request and sends a response back to the client | Synchronous interactions where the client expects a response within a predefined time frame | Use protocols like HTTP, gRPC, or AMQP for request-response communication. Ensure error handling and timeout mechanisms are in place to handle failures gracefully | HTTP RESTful APIs and RPC calls |
Publish-Subscribe (Pub-Sub) | Publishers broadcast messages (events) to one or more subscribers without knowledge of the subscribers' identities | Asynchronous event-driven architectures where services need to react to state changes or events | Utilize message brokers like Apache Kafka, RabbitMQ, or AWS SNS/SQS. Define topics for different event types and allow subscribers to consume messages asynchronously | User registrations, order placements, and system notifications |
Message Broker | Services communicate indirectly through a message broker, which acts as an intermediary responsible for message routing and delivery | Decoupling of producers and consumers, enabling asynchronous communication and load balancing | Choose a suitable message broker (e.g., RabbitMQ, Kafka) and define message queues for point-to-point communication or topics for pub-sub scenarios | Task queues, job processing, and distributed logging |
Event Sourcing | Services maintain a sequential record of state-changing events, which serve as the primary source of truth for data | Tracking and auditing changes to data, ensuring consistency and traceability across distributed systems | Implement event sourcing patterns using databases optimized for write-heavy workloads (e.g., Apache Kafka, Apache Pulsar) or dedicated event sourcing frameworks | Financial transactions, inventory management, and audit logs |
Command Query Responsibility Segregation (CQRS) | Separates the responsibility for handling read and write operations into separate components, optimizing performance and scalability | Applications with varying read and write loads, where optimizing data retrieval and modification operations is critical | Maintain separate data models for reads and writes, with dedicated services handling each aspect. Utilize event sourcing and eventual consistency to synchronize data between read and write stores | E-commerce platforms, social media feeds, and analytics systems |
Saga Pattern | Manages distributed transactions across multiple services by orchestrating a sequence of compensating actions to maintain consistency | Long-running business transactions spanning multiple services, where traditional ACID transactions are not feasible | Implement sagas using choreography or orchestration-based approaches. Utilize compensating transactions to rollback changes in case of failures and ensure eventual consistency | Order processing, payment processing, and booking systems |
Data Replication | Copies data from one service to another to ensure availability, performance, and fault tolerance | Replicating data across multiple services or data centers to improve read/write performance, reduce latency, and enhance fault tolerance | Use techniques like master-slave replication, multi-master replication, or distributed caching to replicate data across services. Ensure consistency and synchronization mechanisms are in place to handle updates and conflicts | Caching, database replication, and distributed data stores |
Gateway and Proxy | Provides a single entry point for clients to access multiple services, abstracting the complexities of the underlying microservices architecture | Simplifying client interactions, enforcing security policies, and aggregating data from multiple services | Deploy gateways/proxies as separate services or as part of a service mesh infrastructure. Implement routing, load balancing, and security features to manage client requests effectively | API gateways, reverse proxies, and edge computing platforms |
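As a sketch of the Publish-Subscribe row, a minimal in-process broker (the `Broker` class is illustrative, standing in for Kafka, RabbitMQ, or SNS): publishers emit to a topic without knowing who, if anyone, is subscribed.

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Toy in-memory message broker: topic-based pub-sub routing."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The publisher is decoupled: it never sees who (if anyone) is listening.
        for handler in self._subscribers[topic]:
            handler(event)

# Usage: two independent services react to the same event
broker = Broker()
audit_log, emails = [], []
broker.subscribe("user.registered", lambda e: audit_log.append(e["id"]))
broker.subscribe("user.registered", lambda e: emails.append(e["email"]))
broker.publish("user.registered", {"id": 1, "email": "a@example.com"})
```

A real broker adds what this sketch omits: persistence, delivery guarantees, and asynchronous consumption.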
Criteria | Fan-In | Fan-Out |
---|---|---|
Visualization | ||
Definition | Multiple nodes or components send data to a single destination or central point | Data is distributed from a single source or central point to multiple nodes or components |
Purpose |
|
|
Use Cases |
|
|
Aspect | Binary Logs | Polling |
---|---|---|
Visualization | ||
Definition | Structured file that records changes made to a database, typically used for data recovery, replication, and auditing purposes | Mechanism that repeatedly queries a database to check for changes |
Trigger | Database write operations | Periodic queries/checks |
Data Transmission | Only the changes made to the database | All relevant data periodically |
Updates | Near real-time | Eventual consistency |
Use Cases |
|
|
Aspect | File Transfer Protocol (FTP) | File Storage |
---|---|---|
Visualization | ||
Definition | FTP is a standard network protocol used for transferring files from one host to another over a TCP-based network, such as the Internet | File Storage involves storing files in a centralized location or distributed system, accessible by multiple users or systems, typically over a network |
Communication Model | Client-server model, where clients initiate requests and servers respond | Can follow client-server or peer-to-peer models depending on the architecture |
Transfer Mode | FTP supports two modes: ASCII and Binary, for transferring text and binary files, respectively | File storage supports various modes including block storage, object storage, and file systems |
Scalability | FTP servers can handle a limited number of concurrent connections, scalability can be achieved through load balancing and clustering | File storage systems are designed for scalability, with the ability to scale horizontally by adding more storage nodes or vertically by increasing resources |
Performance | Performance may vary based on network conditions and server load. Limited by factors like bandwidth and server processing power | Performance depends on factors like storage technology (e.g., HDD, SSD, NVMe), network bandwidth, and caching mechanisms |
Use Cases | Primarily used for transferring files between a client and a server. Considered as a legacy protocol, still widely used but less favored in modern cloud-native architectures | Used for storing, managing, and accessing files within a network or across multiple networks. Often integrated with services like object storage or distributed file systems |
Examples | FileZilla Server | Amazon S3, Google Cloud Storage, Azure Blob Storage |
- Overview
- Monograph vs Supergraph
- Federation
- GraphQL Composition
Aspect | REST | GraphQL |
---|---|---|
Visualization | ||
Design Philosophy | Based on standard HTTP methods (GET, POST, PUT, DELETE for CRUD operations) and status codes | A query language for APIs, not tied to HTTP. Provides single endpoint for clients to query for precisely the data they need |
Data Fetching | Multiple requests might be required to gather all necessary data | Allows fetching all necessary data in a single request |
Over-fetching/Under-fetching | Possible, as the server defines what data is returned for each endpoint | No over- or under-fetching, as the client defines exactly what data it needs. May lead to the N+1 requests problem, which the developer must handle |
Efficiency | Less efficient due to over-fetching and under-fetching | More efficient due to minimized data transfer |
Versioning | Requires versioning as changing the structure can lead to breaking changes | No versioning needed, as old fields can be deprecated and new ones added |
Error Handling | Uses HTTP status codes | Provides error messages in the response, not tied to HTTP status codes |
Real-time Updates | Requires additional technologies like WebSockets | Supports subscriptions which allows real-time updates |
Flexibility | Less flexible as server defines what data is sent for each endpoint | API-First approach: More flexible as client specifies exactly what data it needs |
Caching | Caching is straightforward with HTTP caching mechanisms | Requires more effort to implement as it doesn't leverage HTTP caching mechanisms |
Use Case | Ideal for simple, CRUD-based projects and public APIs due to its simplicity and scalability | Best for complex systems, real-time data, microservices, and when precise control over data fetching is required and when following API-First approach |
Typing | No built-in type system | Has a strong type system, which helps with validation and autocompletion tools |
API Introspection | Not supported | Supported, which allows clients to understand what data is available |
Optimized for | Optimized for servers | Optimized for clients |
Scalability | Might suffer performance issues due to over-fetching and multiple round trips | Better performance due to minimized data transfer and single round trip. For Enterprise applications Federated schemas and Apollo Router can be used |
Debugging | Can be difficult due to the lack of specific error messages | Easier due to detailed error messages and API introspection |
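The over-/under-fetching rows can be shown without a server: a REST handler returns the whole resource, while a GraphQL-style resolver returns only the fields the client named (toy helpers, not a real GraphQL library):

```python
# Illustrative user record; in practice this would come from a data store
USER = {"id": 1, "name": "Ada", "email": "ada@example.com",
        "address": "221B Baker St", "orders": [101, 102]}

def rest_get_user(user_id):
    """REST: the server decides the shape -- the whole resource comes back."""
    return dict(USER)

def graphql_get_user(user_id, fields):
    """GraphQL-style: the client names exactly the fields it needs."""
    return {f: USER[f] for f in fields}

full = rest_get_user(1)                      # over-fetches address and orders
slim = graphql_get_user(1, ["id", "name"])   # only what the client asked for
```

The payload difference is the "Efficiency" row in miniature: `slim` carries two fields where `full` carries five.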
Aspect | Monograph | Supergraph |
---|---|---|
Visualization | ||
Definition | Single GraphQL service with its own schema, resolvers, and data sources | Collection of independent GraphQL services (subgraphs) stitched together into a unified GraphQL API |
Schema | Single, monolithic schema defining all available data | Composed of individual schemas from each subgraph, combined into a single supergraph schema by the router |
Data Sources | Connects directly to its own data sources | Can connect to its own data sources or leverage data from other subgraphs |
Resolvers | Contains resolvers for all data types defined in its schema | Each subgraph defines resolvers for its own data types |
Client Interaction | Clients interact directly with the monograph endpoint | Clients interact with a single supergraph endpoint managed by the router |
Scalability | Limited scalability as the service grows | Highly scalable by adding or removing subgraphs independently |
Flexibility | Limited flexibility as changes require modifying the entire schema | Highly flexible, allowing independent changes to subgraphs without affecting the entire API |
Use Cases | Small to medium sized applications | Enterprise applications |
Type | Concept | Implementation | Schema Design | Scalability | Use Cases |
---|---|---|---|---|---|
Schema Stitching | Manually combines schemas from multiple GraphQL servers into a single schema | Requires manual schema manipulation and merging | Can be less flexible and lead to complex schemas | Limited scalability as adding new subgraphs requires manual schema updates | Simple use cases with a limited number of subgraphs and limited need for future growth |
Apollo Federation | Leverages a specification and tooling to create a unified schema from independent subgraphs | Uses directives and extensions to define subgraphs and entity relationships | Promotes a clean separation of concerns with independent subgraphs | Highly scalable as subgraphs are independent and loosely coupled | Microservices architectures with independent development and deployment of subgraphs |
Code-First Approach | Focuses on code-driven development of subgraphs with minimal schema manipulation | Relies on code libraries and frameworks to define subgraphs and resolvers | Encourages code-centric design for subgraphs with potential schema duplication | Scalability depends on the chosen code-first framework and its ability to handle distributed queries | Preference for code-driven development and rapid prototyping of subgraphs |
- gRPC Flow
- Types
- RPC vs RESTful
- gRPC vs tRPC
Aspect | Unary | Server Streaming | Client Streaming | Bi-Directional Streaming |
---|---|---|---|---|
Visualization | ||||
Definition | Client sends request, server sends back response | Client sends request, server responds with a stream to read multiple messages | Client sends multiple messages via stream, waits for server response | Both sides send and receive messages independently via streams |
Data Flow | Single request - single response | Single request - multiple responses | Multiple requests - single response | Multiple requests - multiple responses |
Latency | High, due to round-trip time | Lower due to continuous stream from server | Lower due to continuous stream from client | Lowest, due to continuous bi-directional communication |
Resource Usage | Low, only one request and one response | Higher, due to the stream of responses | Higher, due to the stream of requests | Highest, due to the continuous bi-directional communication |
Real-time Data Handling | Not suitable, due to high latency | Server can continuously send updates | Client can continuously send updates | Both client and server can continuously send updates |
Use Cases |
|
|
|
|
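Real gRPC requires a `.proto` contract and generated stubs; as a rough sketch, the four call shapes map onto plain values and Python iterators (all names here are illustrative, not gRPC APIs):

```python
def unary(request):
    """Unary: one request in, one response out."""
    return f"echo:{request}"

def server_streaming(request, n):
    """Server streaming: one request, a stream of responses."""
    for i in range(n):
        yield f"{request}-part{i}"

def client_streaming(requests):
    """Client streaming: a stream of requests, one aggregated response."""
    return sum(requests)

def bidi_streaming(requests):
    """Bi-directional streaming: respond to each incoming message as it arrives."""
    for r in requests:
        yield r * 2
```

In generated gRPC code the streaming variants take or return iterators in exactly this way, with HTTP/2 carrying the frames.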
Aspect | RPC | RESTful |
---|---|---|
Message type | Action-oriented (invokes a procedure) | Resource-oriented |
Coupling | Strong | Weak |
Data format | binary, thrift, protobuf, avro | text, xml, json, csv |
Communication protocol | TCP | HTTP/1.1, HTTP/2 |
Performance | High | Lower |
IDL (Interface Definition Language) | Thrift, protobuf | Swagger |
Client code generation | Auto-generated stub | Auto-generated stub |
Developer experience | Not human readable and hard to debug | Human readable and easy to debug |
Comparison Criteria | gRPC (Google RPC) | tRPC (TypeScript RPC) |
---|---|---|
Language Support | Supports a wide range of programming languages including C++, Python, Ruby, and C# | Primarily supports TypeScript and JavaScript, other languages are not supported |
Protocol | Uses HTTP/2 as a default transport protocol | Uses HTTP/1.1, HTTP/2, and HTTP/3 protocols |
Data Format | Uses Protocol Buffers as the interface definition language | Uses JSON as the data format |
Streaming Support | Supports bi-directional streaming and flow control | tRPC doesn't support streaming natively but can be used with streaming libraries |
Client-Server Communication | Uses a contract-first approach to client-server communication | Uses a code-first approach to client-server communication |
API Contract | Interface definition language (IDL) is required for defining the API contract | Type safety is provided by the TypeScript compiler, and doesn't require a separate IDL |
Error Handling | Provides a rich model for handling various types of errors | Has a simpler mechanism for handling errors |
Performance | High performance due to binary data format and HTTP/2 protocol | Performance is good but not as high as gRPC due to JSON and HTTP/1.1 usage |
Use Cases | Suitable for microservices, real-time systems, and point-to-point services | Suitable for building APIs in TypeScript or JavaScript |
Server Push | Supports server push via HTTP/2 | Server push is not supported natively |
Interoperability | Can interoperate with other gRPC services out of the box | Interoperability is limited to TypeScript and JavaScript |
- Overview
- Network Types
Peer-to-peer (P2P) is a decentralized network model where nodes, or "peers", connect directly to each other instead of via a central server. Each peer shares resources like processing power or storage, enabling efficient and flexible data distribution
Aspect | Unstructured P2P | Structured P2P | Hybrid P2P |
---|---|---|---|
Topology | Random, decentralized | Overlay network, usually structured based on distributed hash tables (DHTs) | Combination of decentralized and structured elements |
Routing | Flooding, random walk | DHT-based (Chord, Kademlia, etc.) | Combination of flooding and DHT-based routing |
Scalability | Limited by flooding, can suffer from congestion and inefficiency as network size grows | Highly scalable due to structured routing, can handle large-scale networks efficiently | Offers good scalability by combining the benefits of both structured and unstructured approaches |
Search Efficiency | Low, as searches may need to traverse the entire network | High, logarithmic time complexity for routing queries | Moderate, depends on the implementation |
Fault Tolerance | Limited, as nodes may join/leave without coordination, leading to data loss or inconsistency | High, redundancy and structured routing ensure resilience against node failures | Moderately high, benefits from both decentralized nature and structured redundancy |
Data Locality | Low, data may be stored on any node, leading to increased latency for retrieval | Moderate, structured routing enables efficient data localization | Moderate, depends on the implementation |
Resource Consumption | High, due to flooding and lack of optimization in routing | Moderate, structured routing reduces resource consumption compared to unstructured P2P | Moderate, depends on the balance between structured and unstructured elements |
Security | Low, vulnerable to sybil attacks, as nodes can easily join without verification | Moderate to high, DHT-based authentication and routing protocols enhance security | Moderate, depends on the implementation and integration of security measures |
Examples | BitTorrent | Chord, Kademlia | Blockchain networks |
Use Cases | File sharing, ad hoc communication | Distributed storage, content delivery networks | Decentralized finance, decentralized applications |
Aspect | Short Polling | Long Polling | Webhook | WebSockets | Server-Sent Events (SSE) |
---|---|---|---|---|---|
Visualization | |||||
Real-time communication | No, it makes repeated requests even if no data is available | Yes, it keeps connection open until data is available | Yes, server pushes data to client when a particular event happens | Yes, it provides full-duplex communication channels | Yes, it allows a server to push updates to clients |
Efficiency | Low, as it continuously asks for data from the server, leading to high network traffic | Higher than short polling, as it reduces unnecessary network overhead | High, no need for client to continuously poll for data, reducing network traffic | High, as it only communicates when there is new data, and maintains a constant connection | High, as it allows servers to push updates without client requests |
Complexity | Low, as it uses standard HTTP requests | Higher than short polling, as it needs to maintain open connections | Medium, client needs to subscribe to events and server needs to support webhooks | High, as it requires specific protocols and server-side implementation | Medium, as it mostly requires server-side implementation |
HTTP Headers | Sent with every request, increasing overhead | Sent with every request, increasing overhead | Sent only when an event occurs, reducing overhead | Sent only at connection setup, reducing overhead | Sent only at connection setup, reducing overhead |
Data Direction | Bidirectional, but inefficient | Bidirectional, but inefficient | Unidirectional (server to client), best for delivering event notifications | Bidirectional, providing real-time interaction | Unidirectional (server to client), best for delivering updates |
Connection Persistence | No, connections are closed after each request | Yes, until the server has data to send | No, connections are established only when an event occurs | Yes, connections are kept alive until closed by either client or server | Yes, until the client closes the connection |
Use Cases | Best for when updates are infrequent and data is small | Best for when updates are sporadic but real-time delivery is required | Best for real-time notifications, when you want to be notified when a particular event happens | Best for real-time applications, gaming, chat applications etc | Best for real-time applications, especially when updates are only required to be sent to the client |
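The efficiency difference between short polling and webhooks comes down to request count, which a toy simulation makes concrete (the function names and event format are illustrative):

```python
def short_poll(server_events, interval_checks):
    """Short polling: the client asks on a schedule, even when nothing changed."""
    requests, received = 0, []
    for tick in range(interval_checks):
        requests += 1
        received.extend(e for t, e in server_events if t == tick)
    return requests, received

def webhook(server_events):
    """Webhook: the server calls out only when an event actually occurs."""
    received = [e for _, e in server_events]
    return len(received), received

events = [(3, "order-paid"), (7, "order-shipped")]
poll_reqs, _ = short_poll(events, interval_checks=10)  # 10 requests for 2 events
hook_reqs, _ = webhook(events)                         # 2 requests for 2 events
```

Long polling, WebSockets, and SSE sit between these extremes: they hold a connection open so updates arrive without per-check requests.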
- Development Styles
- Retry Patterns
Aspect | Code First | API First |
---|---|---|
Visualization | ||
Definition | Development of the application begins with coding, and the APIs are developed from the code | APIs are developed before the actual coding begins. The APIs are designed and documented first, followed by the development of the application |
Approach | Bottom-up: Focuses on the application's functionality. Serves application's needs | Top-down: Client-oriented approach. Serves client's needs |
Design Focus | Focuses primarily on the application's functionality and then on the API capabilities | Prioritizes the design and capabilities of the APIs, and then the application is built around these APIs |
Implementation Speed | Can be faster initially because the development can start immediately without having to wait for the API design | Might be slower to start because the API design needs to be completed first. However, in the long run, it can speed up the development process as it provides a clear roadmap. In addition, cross-team collaboration can be improved as well and decentralize/distribute the workflow among multiple teams without any overhead or delays |
Collaboration | Might lead to less collaboration as developers might not have a clear vision of the final product | Promotes collaboration between front-end and back-end developers, as well as other stakeholders, as everyone has a clear understanding of the APIs and their capabilities |
Scalability | Scalability can be a challenge as changes in the code might require changes in the APIs | Considers scalability from the start. APIs are designed to cater to future needs, which makes scaling more straightforward |
Consistency | May lead to inconsistency in API design as different developers might design APIs differently | Ensures consistency in API design as all APIs are designed before the coding begins, following a predefined set of standards |
Documentation | Can be a challenge in the Code First approach, as it is often treated as an afterthought | Prioritizes documentation, which is done in the initial stages of the project. This ensures that all stakeholders understand the APIs and their capabilities |
Testing | Testing is generally performed after the application has been developed | APIs can be tested independently of the application, allowing for early detection and resolution of issues |
Integration | Might present integration challenges as the APIs might not align with 3rd-party systems or components | Considers integration from the start. APIs are designed to be reusable and can be easily integrated with other systems |
Maintenance | Might lead to higher maintenance costs if changes in the code require changes in the APIs | With its focus on scalability and integration from the start, can lead to lower maintenance costs in the long run |
Use Case | Well-suited for small, simple projects where quick development is required | Ideal for complex, large-scale projects where scalability, consistency, and integration are key considerations |
Patterns
- Cancel: User can cancel the request
- Immediate retry: User immediately resends a request
- Incremental intervals: User waits for a short time for the first retry, and then incrementally increases the time for subsequent retries
- Exponential Backoff: Retry after waiting doubled from previous attempt (1s, 2s, 4s...)
- Exponential Backoff with Jitter: Adds randomness to waiting times to prevent retries from all hitting the server at once
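The last two patterns can be sketched as follows (the "full jitter" variant, where each wait is drawn uniformly from zero to the capped backoff; the function name and defaults are illustrative):

```python
import random

def backoff_delays(retries, base=1.0, cap=30.0, jitter=True, rng=random.random):
    """Exponential backoff: base * 2**attempt, capped at `cap`.
    With jitter, each delay is drawn uniformly from [0, capped] so that
    many failing clients don't all retry at the same instant."""
    delays = []
    for attempt in range(retries):
        capped = min(cap, base * (2 ** attempt))
        delays.append(capped * rng() if jitter else capped)
    return delays

# Without jitter: the classic doubling progression from the bullet above
print(backoff_delays(4, jitter=False))  # [1.0, 2.0, 4.0, 8.0]
```

In a real client each delay would be passed to a sleep call between attempts; the cap prevents waits from growing without bound.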
Cons
- Degrade/Overload the system
- Idempotency consideration (operation that produces the same outcome for the same input)
- Retry amplification (retries aren't always needed)
Solutions
- Rate limiting
- Circuit breakers
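A minimal circuit-breaker sketch (illustrative class; production breakers also add a half-open state that probes the service again after a cooldown):

```python
class CircuitBreaker:
    """After `threshold` consecutive failures the circuit opens and calls
    fail fast instead of hitting the already-struggling service."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.threshold

    def call(self, fn, *args):
        if self.open:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # a success resets the failure count
        return result

# Usage: two failures trip a breaker with threshold=2
breaker = CircuitBreaker(threshold=2)

def flaky():
    raise ValueError("service down")

for _ in range(2):
    try:
        breaker.call(flaky)
    except ValueError:
        pass  # counted as a failure
```

Once open, the breaker stops retry amplification: callers get an immediate error instead of piling more load onto the failing dependency.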
Best Practices
- Asynchronous
- REST API
- Versioning
- Communication Channels
- Resource Naming Convention
- Error Handling
- Data Filtering
- Security
- Traceability
- Performance
Semantic Versioning
Semantic Versioning (SemVer): `1.0.0`
- MAJOR: version for incompatible contract changes
- MINOR: version for adding functionality in a backward-compatible manner
- PATCH: version for backward-compatible bug fixes
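A minimal sketch of working with SemVer strings (ignores the pre-release and build-metadata parts of the full specification):

```python
def parse_semver(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints for correct comparison."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

def is_breaking(old, new):
    """A MAJOR bump signals an incompatible contract change."""
    return parse_semver(new)[0] > parse_semver(old)[0]
```

Comparing parsed tuples matters because string comparison gets it wrong: `"1.10.2" < "1.9.9"` as strings, but `(1, 10, 2) > (1, 9, 9)` as versions.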
Strategies
Strategy | Definition | Consumers Action | Pros | Cons |
---|---|---|---|---|
Schema Versioning | Explicitly include a version number in the message itself | Consumers can handle messages based on their version |
|
|
Backward Compatible Schema Evolution | Introduce new optional fields to messages without breaking existing consumers | Consumers can ignore new fields they don't understand |
|
|
Forward Compatible Schema Evolution | Mark existing fields as deprecated in older versions | Introduce new fields in newer versions for future consumers |
|
|
Content Negotiation | Producer and consumer negotiate the message format during connection establishment | Allows for dynamic adaptation and interoperability between different versions |
|
|
Considerations
- Trade-offs: Complexity vs Flexibility, Backward vs Forward Compatibility
- Start simple: Opt for schema versioning or backward-compatible evolution for most cases
- Use content negotiation: For highly dynamic environments with diverse clients
- Versioning granularity: Decide if versioning applies to entire messages or individual fields
- Version deprecation policy: Define a timeline for removing support for older versions
- Versioning tooling: Leverage tools for schema documentation, validation, and migration
- Clear communication: Document versioning policies and notify stakeholders of upcoming changes
- Foresee Future Changes: Plan for message versioning to accommodate future changes in the payload structure, and ensure backward and forward compatibility to facilitate system evolution
- Identify Use Cases:
- Processing large data sets
- Triggering workflows
- Sending notifications
- Real-time communication
- Message Routing and Filtering
- Topic-based vs Queue-based Routing
- Content-Based Routing
- Filtering Techniques (Header, Metadata, Content)
- Messaging Pattern
- Point-to-Point (P2P): Messages are delivered from a single producer to a single consumer. Ideal for one-off tasks or directed communication
- Publish-Subscribe (Pub/Sub): Messages are published to a topic, and any interested subscribers receive them. Great for real-time updates and broadcasting information
- Request-Reply: Similar to synchronous communication, but with a delayed response to handle long-running tasks
- Messaging Technology
- Message Queues (MQ): Robust and reliable, offering features like message persistence, retries, and guaranteed delivery (RabbitMQ, Apache Kafka, Amazon SQS)
- Streaming Platforms (Apache Kafka): Handle high-volume, real-time data pipelines with low latency
- Server-Sent Events (SSE) and WebSockets: Enable bi-directional communication for real-time web applications
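The difference between Point-to-Point and Publish-Subscribe delivery can be sketched with a toy in-memory broker; this is purely illustrative (the `Broker` class and its methods are hypothetical, not any real library's API):

```python
from collections import defaultdict
from queue import Queue

class Broker:
    """Toy in-memory broker illustrating P2P queues vs Pub/Sub topics."""

    def __init__(self):
        self.queues = defaultdict(Queue)      # P2P: one consumer takes each message
        self.subscribers = defaultdict(list)  # Pub/Sub: every subscriber gets a copy

    # --- Point-to-Point ---
    def send(self, queue_name, message):
        self.queues[queue_name].put(message)

    def receive(self, queue_name):
        return self.queues[queue_name].get()  # message is consumed exactly once

    # --- Publish-Subscribe ---
    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:  # fan-out to all subscribers
            callback(message)

broker = Broker()
broker.send("tasks", {"job": 1})           # exactly one receiver will get this
received = []
broker.subscribe("updates", received.append)
broker.subscribe("updates", received.append)
broker.publish("updates", "new-order")     # both subscribers get a copy
```

A real broker (RabbitMQ, Kafka, SQS) adds persistence, acknowledgments, and delivery guarantees on top of this basic routing split.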
Naming​
- Abbreviation Standards: Define a clear abbreviation policy for common elements (`prod` for production)
- Case Sensitivity: Decide on case sensitivity (uppercase, lowercase, mixed) for consistency
- Namespace Separation: Utilize separators (hyphen `-`, underscore `_`) to differentiate between elements clearly
- Naming Length: Maintain a reasonable length to avoid excessive complexity
- Structure: `{base_resource}.{environment}.{application}.{functionality}[.{version}]` (`queue.prod.order-processing.payment-confirmation.v2`)
  - Base Resource: Identify the top-level resource category (`queue`, `topic`, `exchange`)
  - Environment: Indicate the deployment environment (`dev`, `test`, `prod`)
  - Domain/Product/Application: Specify the entity using the resource (`order-processing`, `user-service`)
  - Functionality: Describe the specific functionality of the resource (`payment-confirmation`, `user-registration`)
  - Versioning Strategy: Establish a versioning approach to denote changes (semantic versioning)
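The naming structure above can be enforced with a tiny helper; a minimal sketch (the function name is hypothetical):

```python
def resource_name(base_resource, environment, application, functionality, version=None):
    """Build a messaging resource name following
    {base_resource}.{environment}.{application}.{functionality}[.{version}]."""
    parts = [base_resource, environment, application, functionality]
    if version:
        parts.append(version)  # version segment is optional
    return ".".join(parts)

name = resource_name("queue", "prod", "order-processing", "payment-confirmation", "v2")
# name == "queue.prod.order-processing.payment-confirmation.v2"
```

Centralizing name construction like this keeps every team on the same convention and makes a later change to the separator or segment order a one-line fix.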
- Abbreviations: Use abbreviations judiciously for commonly used terms, but ensure clarity for less familiar ones
- Clarity and Consistency: Use clear and concise names that accurately reflect the field's purpose
- Domain-Specificity: Use terminology relevant to the specific domain the message applies to
- Extensibility: Design messages to be easily extensible for future requirements
- Generic Terms: Avoid generic terms like "data" or "value" unless their meaning is self-evident
- Hierarchical Structure: If necessary, use a hierarchical structure for nested data
- Immutability: Prefer immutable message fields to maintain data integrity
- Maintain Consistent Naming Conventions: Use consistent naming throughout the system (`camelCase`, `snake_case`)
- Message Format: Define a clear and consistent message format (JSON, Protobuf)
- Metadata Fields: Incorporate metadata fields for tracking message processing (`retry_count`, `status`, `error_details`)
- Payload Structure: Design the payload structure to encapsulate data relevant to the specific use case
Common Fields​
- Acknowledgment: Mechanism for confirming message receipt or processing status
- Correlation ID: Links related messages for tracking and tracing purposes
- Message ID: Unique identifier for each message instance
- Message Type: Indicates the purpose or category of the message
- Priority: Importance level of the message for processing
- Source/Destination: Identify the sender and recipient of the message
- Timestamps: Creation time, expiration time, and processing time (UTC timestamp as number or in ISO-8601 format)
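The common fields above can be collected into a message envelope; a minimal sketch using a frozen dataclass (the `MessageEnvelope` name and field defaults are illustrative assumptions):

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)  # immutable fields help maintain data integrity
class MessageEnvelope:
    message_type: str   # purpose or category of the message
    source: str         # sender identifier
    destination: str    # recipient identifier
    payload: dict
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    correlation_id: Optional[str] = None  # links related messages for tracing
    priority: int = 0
    # creation time as an ISO-8601 UTC timestamp
    created_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

msg = MessageEnvelope(
    message_type="order.created",
    source="order-service",
    destination="billing-service",
    payload={"order_id": 1},
)
```

Keeping the envelope separate from the payload lets routing, tracing, and retry logic work uniformly across message types.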
- Dead Letter Queues: Implement dead letter queues for handling undeliverable messages
- Error Logging: Log errors and exceptions for troubleshooting and analysis
- Retry Policies: Define retry policies with exponential backoff to handle transient failures
- Message Delivery Acknowledgment: Implement acknowledgement mechanisms to confirm message receipt or processing status
- Message Expiration / Time-To-Live (TTL): Implement message expiration so that undelivered messages are discarded after a specified time period
- Message Ordering: Ensure message ordering to avoid race conditions
- Message Queuing: Implement message queuing for reliable delivery and load balancing
- Message Routing: Implement message routing to ensure message delivery to the appropriate destination
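Retry policies with exponential backoff and a dead letter queue can be combined in one small loop; a sketch under the assumption of an in-process handler (function and parameter names are hypothetical):

```python
import time

def process_with_retries(handler, message, max_attempts=3, base_delay=0.01, dead_letter=None):
    """Retry with exponential backoff; park the message in a dead letter queue on exhaustion."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except Exception:
            if attempt == max_attempts:
                if dead_letter is not None:
                    dead_letter.append(message)  # undeliverable: keep for inspection
                return None
            time.sleep(base_delay * 2 ** (attempt - 1))  # 0.01s, 0.02s, 0.04s, ...

dead_letters = []
calls = []

def flaky_handler(message):
    calls.append(message)
    raise RuntimeError("transient failure")

result = process_with_retries(flaky_handler, {"id": 7}, dead_letter=dead_letters)
```

Production brokers implement the same policy natively (RabbitMQ dead-letter exchanges, SQS redrive policies), so prefer broker-level configuration when it is available.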
- Access Control: Implement access controls to restrict message access based on roles and permissions
- Compliance: Ensure compliance with industry standards and regulations (GDPR, HIPAA)
- Encryption: Encrypt sensitive message data both in transit and at rest
- Schema Registry: Use schema registries to manage schema versions and enforce validation
- Auditing: Enable auditing mechanisms for tracking message access and modifications
- Monitoring: Set up monitoring for tracking message processing metrics and error rates
- Batching: Implement batching techniques to reduce the number of messages sent
- Caching: Utilize caching mechanisms for frequently accessed or static message data to enhance performance
- Compression: Compress message data to reduce network overhead and improve performance
- Concurrency: Implement concurrency patterns to handle high message volumes efficiently
- Payload Size: Minimize payload size to reduce network overhead
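Batching to reduce the number of messages sent can be sketched as a small accumulator that flushes when a threshold is reached (the `MessageBatcher` class is a hypothetical illustration, not a library API):

```python
class MessageBatcher:
    """Accumulate messages and ship them in batches to cut round trips."""

    def __init__(self, send, batch_size=3):
        self.send = send              # callable that ships one batch downstream
        self.batch_size = batch_size
        self.pending = []

    def add(self, message):
        self.pending.append(message)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(list(self.pending))  # one network call carries many messages
            self.pending.clear()

sent_batches = []
batcher = MessageBatcher(sent_batches.append, batch_size=3)
for i in range(7):
    batcher.add(i)
batcher.flush()  # ship the remainder
# sent_batches == [[0, 1, 2], [3, 4, 5], [6]]
```

A real batcher usually adds a time-based flush as well, so low-traffic periods do not delay messages indefinitely.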
- Versioning
- Resource Naming Convention
- REST Verbs
- HTTP Status Codes
- Error Handling
- Data Filtering
- Security
- Traceability
- Performance
- Request Payload
- Response Entity
Versioning | Definition | Caching | Compatibility | Example | Use Cases |
---|---|---|---|---|---|
URL Path | Version information is included in the URL path | Each version has a unique URL, making it easy to cache separately | Older versions remain accessible via their unique URLs | test.com/v1/users | Public APIs |
URL Query | Version information is included in the query parameters of the URL | Cache configuration might be complex as the URL is the same but the query parameter differs | Older versions remain accessible as long as the query parameters are supported | /users?version=1.0 | Optional versioning |
Header | Version information is included in the headers of the HTTP request | Caching with headers can be complex as most caches are URL-based | Compatibility depends on how well clients handle headers. Some may not support custom headers or may behave unpredictably | Version: 1.0 | Versioning information should not affect caching or URI structure |
Media Type | Version information is included in the media type | Caching is possible but may require additional configuration, as the URLs are same but media types differ | Compatibility is generally high, but older clients that don't support media type versioning could face issues | application/company.resource.v1+json | When backward-incompatible changes need to be communicated at the content type level |
Body | Version information is included in the request body | Body-based versioning is typically not cache-friendly as it is difficult to segregate based on body content | Compatibility could be an issue, as it depends on clients sending correct data in the body | { "version": "1.0" } | Unifies microservice communication (REST, async messaging, Protocol Buffers) |
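As a small illustration of URL-path versioning from the table above, a framework-agnostic helper can extract the version segment from a request path (the function name is hypothetical):

```python
import re

def parse_version(path):
    """Extract the API version from a URL-path-versioned route like /v1/users."""
    match = re.match(r"^/v(\d+)/", path)
    return int(match.group(1)) if match else None

parse_version("/v1/users")    # -> 1
parse_version("/v2/users/7")  # -> 2
parse_version("/users")       # -> None (unversioned path)
```

Because the version lives in the URL itself, each version gets a distinct cache key for free, which is why URL-path versioning is the common choice for public APIs.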
URL​
- URL Path and URL Query Parameters: Consider using hyphens (`-`) to separate words in your URLs, as it helps users and search engines identify concepts in the URL more easily
  - kebab-case: `/users/very-long-path?long-text=123`
  - camelCase: `/users/veryLongPath?longText=123`
Nouns vs Verbs​
- Verbs should not be used in endpoint paths. Instead, the path should contain nouns that identify the resource the endpoint accesses or alters
  - Instead of using `/getAllClients` to fetch all clients, use `/clients`
Singular vs Plural​
- Stick to one convention; the right choice depends on your domain
  - As an example:
    - a shopping cart on an e-commerce website is a 1-to-1 relationship between a client and a shopping cart, so it's confusing to have a `/carts` endpoint
    - on the other hand, a blog with many articles makes more sense with an `/articles` endpoint
Utilize Resource Nesting Efficiency​
- If a resource has a has-a relationship to another resource, it's good to use nesting when implementing REST API contracts
  - Example: a user can have orders in an e-commerce website
    - user's orders: `/users/1/orders/`
    - user's order: `/users/1/orders/1`
- HEAD
- identical to GET, except without the response body. It will do the same GET request but won't return anything
- useful for checking what a GET request would return before actually making it (e.g., before downloading a large file or response body)
- OPTIONS
- return data describing what other methods and operations the server supports at the given URL
- PATCH
- applies a partial modification to the resource (send only the username in the body), as opposed to POST, which requires the full user entity
Resource | GET (read) | POST (create) | PUT (update) | PATCH (partial update) | DELETE (delete) | OPTIONS | HEAD |
---|---|---|---|---|---|---|---|
/users | returns all users | creates a new user | bulk update of users | partial update of all users | delete all users | returns HTTP methods | returns HTTP headers |
/users/1 | returns a specific user | method not allowed (405) | updates a specific user | partial update of a specific user | deletes a specific user |
HTTP Status Codes​
- 1xx: Information
- 2xx: Success
- 3xx: Redirection
- 4xx: Client Error
- 5xx: Server Error
HTTP Code | HTTP Status | Use Cases |
---|---|---|
200 | OK | Request succeeded (REST call) |
201 | CREATED | Request succeeded and resource created (short async call) |
202 | ACCEPTED | Request has been accepted for processing (long async call) |
301 | MOVED PERMANENTLY | Resource was moved to a new place permanently |
400 | BAD REQUEST | Invalid input from the REST API client. Provide only a client-oriented message |
401 | UNAUTHORIZED | Client is not authenticated |
403 | FORBIDDEN | Client is not authorized to access the resource |
404 | NOT FOUND | Requested resource is not found |
500 | INTERNAL SERVER ERROR | Any unexpected error. Do not provide details to the client; log them instead |
503 | SERVICE UNAVAILABLE | Do not provide details to the client; log them instead |
- Do not expose any sensitive information
- Wrap any API calls and return only predefined messages for `4xx`, `5xx`
- Implement a global exception handler and provide a default response entity
  - HTTP Code 400: Provide only a client-oriented message
  - HTTP Code 500: Log the error and provide a generic message
- Standardize the response body
{
"timestamp":"2000-01-15T22:00:00.000+0000",
"status":500,
"error":"Internal Server Error",
"message":"Error while processing request",
"path":"/api/user/1"
}
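A global exception handler can produce that standardized body with a small framework-agnostic helper; a sketch (the `error_response` function and the message map are hypothetical names):

```python
from datetime import datetime, timezone

# Map status codes to client-safe messages; never leak stack traces or internals.
GENERIC_MESSAGES = {
    400: "Invalid request",
    500: "Error while processing request",
}

def error_response(status: int, error: str, path: str) -> dict:
    """Build the standardized error body; details for 5xx stay in the logs only."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": status,
        "error": error,
        "message": GENERIC_MESSAGES.get(status, "Unexpected error"),
        "path": path,
    }

body = error_response(500, "Internal Server Error", "/api/user/1")
```

In a real service this helper would be wired into the framework's global exception hook (e.g., an `@app.exception_handler` in FastAPI or a `@ControllerAdvice` in Spring) so every unhandled error returns the same shape.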
- use URL params
  - filter: `GET /users?country=US&city=NY`
  - sorting: `GET /users?sort=name:asc`
  - paging:
    - offset/limit: `GET /users?offset=3&limit=120`
      - slower solution because the database processes x rows and then returns only y rows
      - `OFFSET 90000 LIMIT 10`: reads 90010 rows and then returns only 10 rows
    - cursor/token: `GET /users?cursor=12345`
      - more efficient solution, especially for large datasets
      - token: any column or property used to pivot through the data in the table; the most common is the `created_at` column
      - `WHERE created_at >= "2024-01-01" LIMIT 10`: filters the table and reads only the 10 required rows
    - page token: `GET /users?page={base64String}`: the Base64-encoded string can represent a plain string or a JSON object for more advanced use cases
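The offset/limit vs cursor trade-off can be shown with a small sketch over an in-memory dataset (a stand-in for a database query; all names are illustrative):

```python
# Toy dataset ordered by created_at; a real implementation would query the database.
USERS = [{"id": i, "created_at": f"2024-01-{i:02d}"} for i in range(1, 31)]

def list_users(cursor=None, limit=10):
    """Cursor-based paging: jump straight to rows after the cursor value
    instead of scanning and discarding `offset` rows."""
    rows = [u for u in USERS if cursor is None or u["created_at"] > cursor]
    page = rows[:limit]
    # hand the client an opaque pointer to where the next page starts
    next_cursor = page[-1]["created_at"] if len(page) == limit else None
    return page, next_cursor

page1, cursor = list_users(limit=10)           # ids 1..10
page2, _ = list_users(cursor=cursor, limit=10) # ids 11..20, no rows skipped
```

With an index on `created_at`, the equivalent SQL (`WHERE created_at > :cursor ORDER BY created_at LIMIT 10`) touches only the rows it returns, which is the source of the efficiency gain over large offsets.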
- Use SSL/TLS for secure communication
- Add secured short-lived headers
  - JWT Token: `Authorization: Bearer <token>`
  - CORS: `Access-Control-Allow-Origin: your.domain.com`
- Root Certificate
- Rate Limiting
  - Design rate limiting rules based on user, IP, action group, etc.
- Traceability headers: `X-Request-Id: 888`, `X-Trace: 888`, or a custom header
- Async Logging
  - send logs to a lock-free ring buffer and return immediately
  - flush to disk periodically
  - higher throughput and lower latency
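The async logging idea can be sketched with a queue and a background flusher; this is a simplified stand-in (a `queue.Queue` instead of a true lock-free ring buffer, and a list instead of a disk file):

```python
import queue
import threading

class AsyncLogger:
    """Callers enqueue and return immediately; a background thread flushes."""

    def __init__(self):
        self.buffer = queue.Queue()  # stand-in for a lock-free ring buffer
        self.flushed = []            # stand-in for the log file on disk
        self.worker = threading.Thread(target=self._flush_loop, daemon=True)
        self.worker.start()

    def log(self, line):
        self.buffer.put(line)        # non-blocking for the request path

    def _flush_loop(self):
        while True:
            line = self.buffer.get()
            if line is None:         # sentinel: stop flushing
                break
            self.flushed.append(line)

    def close(self):
        self.buffer.put(None)
        self.worker.join()           # wait until everything queued is flushed

logger = AsyncLogger()
logger.log("request handled")
logger.close()
```

This is the same pattern real async appenders use (e.g., Log4j2's async loggers built on a ring buffer): the request thread pays only the cost of an enqueue.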
Caching​
- store frequently accessed data in the cache instead of database
- query database when there is a cache miss
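The two bullets above describe the cache-aside pattern; a minimal sketch with dictionaries standing in for the cache and the database (all names are illustrative):

```python
DATABASE = {"user:1": {"id": 1, "name": "John"}}  # stand-in for the database
CACHE = {}                                        # stand-in for Redis/Memcached
db_reads = []

def get_user(key):
    """Cache-aside: serve from the cache, fall back to the database on a miss."""
    if key in CACHE:
        return CACHE[key]          # cache hit: no database round trip
    db_reads.append(key)           # track how often we hit the database
    value = DATABASE[key]
    CACHE[key] = value             # populate the cache for subsequent reads
    return value

get_user("user:1")  # miss: reads the database and fills the cache
get_user("user:1")  # hit: served from the cache
```

A production version adds a TTL and an invalidation strategy so cached entries do not serve stale data after writes.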
Payload Compression​
- use gzip
- reduce the data size to speed up the download/upload
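Gzip compression of a JSON payload takes only a few lines with the standard library; a minimal sketch (the function names are hypothetical):

```python
import gzip
import json

def compress_payload(payload: dict) -> bytes:
    """Serialize and gzip-compress a payload to cut network overhead."""
    return gzip.compress(json.dumps(payload).encode("utf-8"))

def decompress_payload(blob: bytes) -> dict:
    return json.loads(gzip.decompress(blob).decode("utf-8"))

payload = {"items": ["widget"] * 1000}  # repetitive data compresses very well
blob = compress_payload(payload)        # far smaller than the raw JSON
```

In HTTP, this is typically negotiated instead of hand-rolled: the client sends `Accept-Encoding: gzip` and the server responds with `Content-Encoding: gzip`.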
Connection Pool​
- opening and closing DB connections add significant overhead
- connection pool maintains a number of open connections for applications to reuse
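A connection pool reduces to a queue of pre-opened connections that callers borrow and return; a minimal sketch with a counter standing in for the real cost of opening a database connection (class and function names are hypothetical):

```python
import queue

class ConnectionPool:
    """Keep a fixed number of open connections for callers to reuse."""

    def __init__(self, open_connection, size=2):
        self.available = queue.Queue()
        for _ in range(size):
            self.available.put(open_connection())  # pay the open cost once, up front

    def acquire(self):
        return self.available.get()                # blocks when every connection is in use

    def release(self, conn):
        self.available.put(conn)                   # hand the connection back for reuse

opened = {"count": 0}

def open_connection():
    opened["count"] += 1                           # stand-in for an expensive DB handshake
    return f"conn-{opened['count']}"

pool = ConnectionPool(open_connection, size=2)
conn = pool.acquire()
pool.release(conn)
conn_again = pool.acquire()  # served from the pool: no new connection was opened
```

Real pools (HikariCP, SQLAlchemy's pool, pgbouncer) add health checks, timeouts, and connection recycling on top of this borrow/return core.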
Batching​
- useful when you need to send multiple requests to the same resource to reduce the number of round trips
This pattern is well-suited to any messaging protocol and data format, including but not limited to REST and Pub/Sub
{
"specversion" : "1.0",
"type" : "com.github.pull_request.closed",
"source" : "https://github.com/spec/pull",
"subject" : "123",
"id" : "5f9eab6f-3d1f-4d0f-bd6d-9a8c6c8c6c6c",
"time" : "2020-01-15T22:00:00Z",
"datacontenttype" : "application/json",
"data" : { "user": { "id": 1, "name": "John" } }
}
Attribute | Description | Constraints | Examples |
---|---|---|---|
id | Producers must ensure each event has a unique source and ID. If a duplicate event is resent (e.g., due to a network error), it may have the same ID. Consumers may assume events with identical source and ID are duplicates |
|
|
source | Event details include source type, publisher, and creation process. The source URI format is defined by the creator. Each event needs a unique source + ID combo, managed by producers. Applications can assign unique sources for easier ID creation. Source identifiers can be UUIDs, URNs, DNS or custom schemes. A source can have multiple producers, requiring collaboration on unique ID creation |
|
|
specversion | Attribute includes only major and minor version numbers, allowing for patch changes without altering this property's value. Note: A suffix might be added for testing purposes during release candidate releases |
|
|
type | Describes the type of event associated with the originating occurrence. It's commonly used for routing, observability, and policy enforcement. The format is defined by the producer and may include details such as the type version |
|
|
datacontenttype | Allows data to carry various content types, independent of the event format. It informs consumers about the content's format and encoding |
|
|
dataschema | Identifies the schema that data adheres to. Incompatible changes to the schema SHOULD be reflected by a different URI |
|
|
subject | Event-driven systems use event subjects to filter relevant events for subscribers, especially when dealing with limited middleware. Subscribers can specify filters based on the subject, like file extensions (.jpg, .jpeg) or blob names within a container. The subject essentially allows targeted filtering |
|
|
time | Timestamp of the event occurrence |
|
|
data | Content that is associated with the event |
|
|
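The required context attributes from the table above can be assembled and checked with a small helper; a sketch based on the CloudEvents 1.0 required attributes (`specversion`, `id`, `source`, `type`), with hypothetical function names:

```python
REQUIRED = ("specversion", "id", "source", "type")

def make_cloud_event(event_type, source, event_id, **optional):
    """Assemble a CloudEvents 1.0 envelope with the required context attributes."""
    event = {
        "specversion": "1.0",
        "type": event_type,
        "source": source,
        "id": event_id,
    }
    event.update(optional)  # subject, time, datacontenttype, data, ...
    return event

def is_valid(event):
    """Every required attribute must be present and a non-empty string."""
    return all(isinstance(event.get(a), str) and event[a] for a in REQUIRED)

evt = make_cloud_event(
    "com.github.pull_request.closed",
    "https://github.com/spec/pull",
    "5f9eab6f-3d1f-4d0f-bd6d-9a8c6c8c6c6c",
    subject="123",
    datacontenttype="application/json",
    data={"user": {"id": 1, "name": "John"}},
)
```

Because `source` + `id` must be unique per event, producers typically generate the `id` (e.g., a UUID) at publish time rather than letting callers pass arbitrary values.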
- Don't return plain text responses
- In most cases, REST APIs should accept JSON request payloads and respond with JSON, since it is the standard format for transferring data