Resilience4j Introduction

Resilience4j

  • is a fault tolerance library for Java.
  • is commonly used when building microservices architectures.
  • provides a set of tools that help developers handle failures gracefully and keep distributed systems resilient.

Key Functions of Resilience4j:

  • Circuit Breaker
  • Rate Limiter
  • Retry
  • Bulkhead
  • Time Limiter

Circuit Breaker

The Circuit Breaker pattern prevents an application from repeatedly trying to execute an operation that is likely to fail, so the application stays stable and handles failures gracefully.

Key Concepts of Resilience4j Circuit Breaker

  • State: The Circuit Breaker works by transitioning between three states: CLOSED, OPEN and HALF-OPEN
    • CLOSED:
      • Normal Operation: In this state, the circuit breaker allows all requests to pass through and monitors the outcomes.
    • OPEN:
      • Blocking Requests: In this state, the circuit breaker blocks all requests to the service, immediately returning a failure response.
    • HALF-OPEN:
      • Limited Requests: In this state, the circuit breaker allows a limited number of test requests to pass through to the service.
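    • The state transitions flow as follows:

+------------+      (1)      +----------+       (2)       +-------------+
|   CLOSED   +-------------->|   OPEN   +---------------->|  HALF-OPEN  |
|            |               |          |<----------------+             |
+------^-----+               +----------+       (3)       +------+------+
       |                                                         |
       +---------------------------(4)---------------------------+

      (1) failure rate >= threshold
      (2) wait duration elapsed
      (3) failure rate of the test calls >= threshold
      (4) failure rate of the test calls < threshold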
  • Failure Rate and Threshold:
    • We can configure the percentage of failures that will trigger the circuit breaker to open, along with the minimum number of requests that must be made before it starts evaluating the failure rate.
  • Wait Duration:
    • This is the time the circuit breaker remains open before transitioning to the half-open state.
  • Event Monitoring:
    • Resilience4j provides event notifications that allow you to monitor and react to state changes and other circuit breaker events.
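
The concepts above can be sketched as a small state machine. This is a simplified, single-threaded illustration written for this article, not Resilience4j's implementation: the class name `SketchCircuitBreaker`, its constructor parameters, and the explicit `nowMillis` clock argument are all assumptions made to keep the example self-contained. It also assumes the minimum number of calls equals the window size.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified, single-threaded model of the three states; not Resilience4j's implementation. */
final class SketchCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;          // calls evaluated while CLOSED
    private final double thresholdPercent; // failure rate that trips the breaker
    private final long waitMillis;         // time spent OPEN before probing
    private final int halfOpenCalls;       // test calls evaluated while HALF_OPEN

    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;

    SketchCircuitBreaker(int windowSize, double thresholdPercent, long waitMillis, int halfOpenCalls) {
        this.windowSize = windowSize;
        this.thresholdPercent = thresholdPercent;
        this.waitMillis = waitMillis;
        this.halfOpenCalls = halfOpenCalls;
    }

    /** Returns true if a call may proceed at the given time. */
    boolean tryAcquire(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= waitMillis) {
            state = State.HALF_OPEN; // wait duration elapsed: start probing
            outcomes.clear();
        }
        return state != State.OPEN;
    }

    /** Records the outcome of a permitted call and applies the transition rules. */
    void record(boolean failed, long nowMillis) {
        int needed = (state == State.HALF_OPEN) ? halfOpenCalls : windowSize;
        outcomes.addLast(failed);
        if (outcomes.size() > needed) {
            outcomes.removeFirst(); // slide the window
        }
        if (outcomes.size() < needed) {
            return; // too few calls to evaluate the failure rate
        }
        long failures = outcomes.stream().filter(Boolean::booleanValue).count();
        double failureRate = 100.0 * failures / outcomes.size();
        if (failureRate >= thresholdPercent) {
            state = State.OPEN; // trip (or re-trip) the breaker
            openedAt = nowMillis;
            outcomes.clear();
        } else if (state == State.HALF_OPEN) {
            state = State.CLOSED; // test calls were healthy
            outcomes.clear();
        }
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        SketchCircuitBreaker breaker = new SketchCircuitBreaker(4, 50.0, 1_000, 2);
        boolean[] failures = {true, false, true, false};
        for (boolean failed : failures) {
            breaker.record(failed, 0);
        }
        System.out.println(breaker.state());           // OPEN: 2 of 4 calls failed
        System.out.println(breaker.tryAcquire(500));   // false: wait duration not over
        System.out.println(breaker.tryAcquire(1_000)); // true: breaker is now HALF_OPEN
    }
}
```

Passing the time in explicitly keeps the sketch deterministic; a real breaker reads a clock itself and must also be thread-safe.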

Demo of Circuit Breaker

Add Resilience4j Circuit Breaker dependencies to pom.xml file:

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-circuitbreaker</artifactId>
    <version>1.7.1</version> <!-- Use the latest version available -->
</dependency>
<dependency>
    <groupId>io.vavr</groupId>
    <artifactId>vavr</artifactId>
    <version>0.10.3</version>
</dependency>

Create a Service to Simulate a Remote Call

public class RemoteService {

    public String call() {
        if (Math.random() > 0.5) {
            throw new RuntimeException("Service unavailable");
        }
        return "Success!";
    }
}

Implement the Circuit Breaker

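// Imports needed by this snippet
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.vavr.control.Try;
import java.time.Duration;
import java.util.function.Supplier;

// Configure the Circuit Breaker
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
		.failureRateThreshold(50) // Open when at least 50% of recorded calls fail
		.waitDurationInOpenState(Duration.ofSeconds(5)) // Stay open for 5 seconds
		.permittedNumberOfCallsInHalfOpenState(3) // 3 test calls in half-open state
		.slidingWindowSize(10) // Failure rate is computed over the last 10 calls
		.build();
// The registry creates and manages CircuitBreaker instances.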
CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
CircuitBreaker circuitBreaker = registry.circuitBreaker("myCircuitBreaker");

// Simulate remote service
RemoteService remoteService = new RemoteService();

// Wrap the service call with a Circuit Breaker
Supplier<String> decoratedServiceCall = CircuitBreaker
		.decorateSupplier(circuitBreaker, remoteService::call);

// Simulate multiple calls to the remote service
for (int i = 0; i < 20; i++) {
	Try<String> result = Try.ofSupplier(decoratedServiceCall)
			.recover(throwable -> "Fallback response");

	System.out.println("Call result: " + result.get());
	System.out.println("Circuit Breaker state: " + circuitBreaker.getState());

	try {
		Thread.sleep(1000); // Pause between calls
	} catch (InterruptedException e) {
		Thread.currentThread().interrupt();
	}
}
  • output
Call result: Fallback response
Circuit Breaker state: CLOSED
Call result: Success!
...
Call result: Fallback response
Circuit Breaker state: OPEN
Call result: Fallback response
...
Circuit Breaker state: OPEN
Call result: Fallback response
Circuit Breaker state: HALF_OPEN
Call result: Fallback response
Circuit Breaker state: HALF_OPEN
Call result: Success!
Circuit Breaker state: OPEN
Call result: Fallback response
...

Rate Limiter

Resilience4j's Rate Limiter is designed to control the rate at which requests are sent to a service or function.
It is useful when you need to ensure that your application does not overload a downstream service or an external API by sending too many requests in a short period.
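
To make the idea concrete, here is a minimal sketch of a fixed-window limiter: a fixed number of permits per period, reset at the start of each new period. It is an illustration under stated assumptions, not Resilience4j's algorithm: the class `SketchRateLimiter` and the explicit `nowMillis` clock argument are hypothetical, and unlike the real module the sketch never waits for a permit.

```java
/** Simplified fixed-window limiter with an explicit clock; not Resilience4j's implementation. */
final class SketchRateLimiter {
    private final int limitForPeriod; // permits available in each period
    private final long periodMillis;  // length of one refresh period
    private long cycleStart;          // start time of the current period
    private int used;                 // permits handed out in the current period

    SketchRateLimiter(int limitForPeriod, long periodMillis, long startMillis) {
        this.limitForPeriod = limitForPeriod;
        this.periodMillis = periodMillis;
        this.cycleStart = startMillis;
    }

    /** Tries to take one permit at the given time, without waiting. */
    synchronized boolean tryAcquire(long nowMillis) {
        long elapsedCycles = (nowMillis - cycleStart) / periodMillis;
        if (elapsedCycles > 0) {
            cycleStart += elapsedCycles * periodMillis; // move to the current period
            used = 0;                                   // permits are refreshed
        }
        if (used < limitForPeriod) {
            used++;
            return true;
        }
        return false; // limit for this period is exhausted
    }

    public static void main(String[] args) {
        SketchRateLimiter limiter = new SketchRateLimiter(5, 1_000, 0);
        // One call every 100 ms: the first five succeed, the next five are
        // refused, and calls succeed again once the second period starts.
        for (int i = 0; i < 12; i++) {
            long now = i * 100L;
            System.out.println(now + " ms -> " + limiter.tryAcquire(now));
        }
    }
}
```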

Key Features of Resilience4j Rate Limiter

  • Thread Safety:
    • The Rate Limiter is designed to be thread-safe, ensuring that it can handle concurrent requests in a multi-threaded environment without issues.
  • Asynchronous Support:
    • It supports both synchronous and asynchronous programming models, making it versatile for different types of applications.
  • Seamless Integration:
    • The Rate Limiter can be easily integrated into existing Java applications and works well with other Resilience4j modules like Circuit Breaker, Retry, and Bulkhead.

Demo of Rate Limiter

Add Resilience4j Rate Limiter dependency to pom.xml

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-ratelimiter</artifactId>
    <version>1.7.1</version> <!-- Use the latest version available -->
</dependency>
<dependency>
    <groupId>io.vavr</groupId>
    <artifactId>vavr</artifactId>
    <version>0.10.3</version>
</dependency>

Create the Service to be Limited

import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Service {

    public String process() {
        DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        return "Processing request at " + LocalDateTime.now().format(dateTimeFormatter);
    }
}

Implement the Rate Limiter

Use Resilience4j's Rate Limiter to wrap calls to this service.

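// Imports needed by this snippet
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;
import io.vavr.control.Try;
import java.time.Duration;
import java.util.function.Supplier;

// Configure the Rate Limiter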
RateLimiterConfig config = RateLimiterConfig.custom()
	.limitForPeriod(5) // Allow 5 requests
	.limitRefreshPeriod(Duration.ofSeconds(1)) // Per second
	.timeoutDuration(Duration.ofMillis(100)) // Wait up to 100ms for permission
	.build();

RateLimiterRegistry registry = RateLimiterRegistry.of(config);
RateLimiter rateLimiter = registry.rateLimiter("myRateLimiter");

// Service to be rate-limited
Service service = new Service();

// Decorate the service call with Rate Limiter
Supplier<String> restrictedSupplier = RateLimiter
	.decorateSupplier(rateLimiter, service::process);

// Simulate multiple calls to the service
for (int i = 0; i < 10; i++) {
	Try<String> result = Try.ofSupplier(restrictedSupplier)
			.recover(throwable -> "Rate limit exceeded");
	
	System.out.println("Call result: " + result.get());
	
	try {
		Thread.sleep(100); // Pause between calls
	} catch (InterruptedException e) {
		Thread.currentThread().interrupt();
	}
}
  • output
Call result: Processing request at 2024-11-13 00:17:40
Call result: Processing request at 2024-11-13 00:17:40
Call result: Processing request at 2024-11-13 00:17:40
Call result: Processing request at 2024-11-13 00:17:40
Call result: Processing request at 2024-11-13 00:17:40
Call result: Processing request at 2024-11-13 00:17:41
Call result: Processing request at 2024-11-13 00:17:41
Call result: Processing request at 2024-11-13 00:17:41
Call result: Processing request at 2024-11-13 00:17:41
Call result: Processing request at 2024-11-13 00:17:41
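  • The limiter grants at most five permits per one-second refresh period, so at most five calls succeed in any given period. A call that cannot obtain a permit within the 100 ms timeout throws RequestNotPermitted and falls back to "Rate limit exceeded".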

Retry

Resilience4j's Retry module provides a robust mechanism for retrying failed operations in a controlled and configurable manner.
This is especially useful in distributed systems where transient faults, such as network issues or temporary unavailability of a service, can cause operations to fail.

Key Features of Resilience4j Retry

  • Interval and Backoff Strategies: Customize the interval between retries, including options for exponential backoff, to manage retry timing effectively.
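
As a worked example of exponential backoff, the sketch below computes the wait before each retry as initial interval × multiplier^(retry − 1). The class `BackoffSketch` and its method names are hypothetical helpers written for this article. In Resilience4j itself, a comparable schedule can be configured by passing `IntervalFunction.ofExponentialBackoff(initialIntervalMillis, multiplier)` to `RetryConfig.custom().intervalFunction(...)`.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that computes an exponential backoff schedule. */
final class BackoffSketch {

    /** Wait before retry number `retry` (1-based): initial * multiplier^(retry - 1). */
    static Duration delay(Duration initial, double multiplier, int retry) {
        double millis = initial.toMillis() * Math.pow(multiplier, retry - 1);
        return Duration.ofMillis((long) millis);
    }

    /** All waits for a call with maxAttempts attempts; the first attempt has no wait. */
    static List<Duration> schedule(Duration initial, double multiplier, int maxAttempts) {
        List<Duration> waits = new ArrayList<>();
        for (int retry = 1; retry < maxAttempts; retry++) {
            waits.add(delay(initial, multiplier, retry));
        }
        return waits;
    }

    public static void main(String[] args) {
        // 4 attempts in total means 3 waits: 500 ms, 1 s, 2 s
        System.out.println(schedule(Duration.ofMillis(500), 2.0, 4)); // [PT0.5S, PT1S, PT2S]
    }
}
```

Doubling the wait gives a struggling service progressively more time to recover instead of hitting it at a constant rate.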

Demo of Resilience4j Retry

Add Resilience4j Retry dependency to pom.xml

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-retry</artifactId>
    <version>1.7.1</version> <!-- Use the latest version available -->
</dependency>
<dependency>
    <groupId>io.vavr</groupId>
    <artifactId>vavr</artifactId>
    <version>0.10.3</version>
</dependency>

Create the Unreliable Service

import java.util.Random;

public class UnreliableService {

    private final Random random = new Random();

    public String fetchData() {
        if (random.nextInt(3) != 0) { // Simulate a failure 2 out of 3 times
            throw new RuntimeException("Transient Error: Failed to fetch data");
        }
        return "Data fetched successfully";
    }
}

Implement the Retry Logic

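// Imports needed by this snippet
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;
import io.github.resilience4j.retry.RetryRegistry;
import io.vavr.control.Try;
import java.time.Duration;
import java.util.function.Supplier;

// Configure the Retry
RetryConfig config = RetryConfig.custom()
		.maxAttempts(3) // Up to 3 attempts in total, including the first call
		.waitDuration(Duration.ofSeconds(1)) // Wait 1 second between attempts
		.retryExceptions(RuntimeException.class) // Retry on RuntimeException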
		.build();

RetryRegistry registry = RetryRegistry.of(config);
Retry retry = registry.retry("unreliableServiceRetry");

// Unreliable service instance
UnreliableService service = new UnreliableService();

// Decorate the service call with Retry
Supplier<String> retryableSupplier = Retry.decorateSupplier(retry, service::fetchData);

// Execute the operation with retry logic
Try<String> result = Try.ofSupplier(retryableSupplier)
		.recover(throwable -> "Fallback: Unable to fetch data.");

System.out.println("Result: " + result.get());

Bulkhead

Resilience4j's Bulkhead module is a crucial component for building resilient applications, particularly in distributed systems where resource isolation and protection are essential. Bulkheads limit the number of concurrent calls to a particular service or component, thereby preventing resource exhaustion and ensuring that failures in one part of the system do not cascade to others.

Key Features of Resilience4j Bulkhead

  • Concurrency Limits: Bulkhead allows you to set a maximum number of concurrent calls to a service. This helps prevent overloading and ensures that system resources are used efficiently.
  • Isolation: By capping the number of concurrent executions, Bulkhead keeps a slow or failing dependency from consuming all available threads, which helps contain failures and reduces the impact on other parts of the application.
  • Future-Based API: Bulkhead works seamlessly with Java CompletableFuture, making it easy to integrate into asynchronous programming models.
  • Semaphore-based and Thread Pool-based Bulkheads: Resilience4j supports two types of bulkheads:
    • Semaphore-based Bulkhead: Restricts the number of concurrent calls using semaphores.
    • Thread Pool-based Bulkhead: Uses a fixed size thread pool to limit concurrent executions, providing more control over execution isolation.
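
The semaphore-based variant can be illustrated with a plain `java.util.concurrent.Semaphore`. This is a minimal sketch, assuming a hypothetical class `SketchBulkhead`; it is not how Resilience4j is implemented, and it signals a full bulkhead with `IllegalStateException` rather than the library's `BulkheadFullException`.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Minimal semaphore-based bulkhead; a sketch, not Resilience4j's implementation. */
final class SketchBulkhead {
    private final Semaphore permits;
    private final long maxWaitMillis;

    SketchBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs the call if a permit becomes free within maxWaitMillis, otherwise rejects it. */
    <T> T execute(Supplier<T> call) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a permit", e);
        }
        if (!acquired) {
            throw new IllegalStateException("Bulkhead is full");
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always free the slot, even if the call throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        SketchBulkhead bulkhead = new SketchBulkhead(1, 10);
        // The outer call holds the only permit, so the nested call is rejected.
        String result = bulkhead.execute(() -> {
            try {
                return bulkhead.execute(() -> "inner call ran");
            } catch (IllegalStateException e) {
                return "inner call rejected: " + e.getMessage();
            }
        });
        System.out.println(result);
    }
}
```

The thread pool-based variant differs in that the call runs on a dedicated pool instead of the caller's thread, so the caller is not blocked by a slow dependency.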

Demo of Resilience4j Bulkhead

Add Resilience4j Bulkhead dependency to pom.xml

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-bulkhead</artifactId>
    <version>1.7.1</version> <!-- Use the latest version available -->
</dependency>

Create a Simulated Service

public class SimulatedService {

    public String performTask(String taskName) {
        try {
            System.out.println("Executing task: " + taskName);
            Thread.sleep(1000); // Simulate a delay in processing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "Completed task: " + taskName;
    }
}

Implement the Bulkhead

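// Imports needed by this snippet
import io.github.resilience4j.bulkhead.Bulkhead;
import io.github.resilience4j.bulkhead.BulkheadConfig;
import io.github.resilience4j.bulkhead.BulkheadRegistry;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Configure the Bulkhead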
BulkheadConfig config = BulkheadConfig.custom()
		.maxConcurrentCalls(3) // Limit to 3 concurrent calls
		.maxWaitDuration(Duration.ofMillis(500)) // Wait max 500ms for a permit
		.build();

// Create a Bulkhead registry
BulkheadRegistry registry = BulkheadRegistry.of(config);

// Create a Bulkhead instance
Bulkhead bulkhead = registry.bulkhead("serviceBulkhead");

// Simulated service instance
SimulatedService service = new SimulatedService();

// Decorate the service call with Bulkhead
Supplier<String> decoratedServiceCall = Bulkhead.decorateSupplier(bulkhead, () -> service.performTask("Task"));

// Use an ExecutorService to simulate concurrent executions
ExecutorService executorService = Executors.newFixedThreadPool(5);

for (int i = 1; i <= 10; i++) {
	final String taskName = "Task-" + i;
	executorService.submit(() -> {
		try {
			CompletableFuture<String> result = CompletableFuture.supplyAsync(decoratedServiceCall);
			System.out.println(taskName + " - " + result.join());
		} catch (Exception e) {
			System.out.println("Failed to execute: " + taskName + " - " + e.getMessage());
		}
	});
}

executorService.shutdown();

  • output
Executing task: Task
Executing task: Task
Executing task: Task
Failed to execute: Task-5 - io.github.resilience4j.bulkhead.BulkheadFullException: Bulkhead 'serviceBulkhead' is full and does not permit further calls
Failed to execute: Task-2 - io.github.resilience4j.bulkhead.BulkheadFullException: Bulkhead 'serviceBulkhead' is full and does not permit further calls
Executing task: Task
Executing task: Task
Task-1 - Completed task: Task
Task-3 - Completed task: Task
Task-4 - Completed task: Task
Executing task: Task
Failed to execute: Task-8 - io.github.resilience4j.bulkhead.BulkheadFullException: Bulkhead 'serviceBulkhead' is full and does not permit further calls
Failed to execute: Task-10 - io.github.resilience4j.bulkhead.BulkheadFullException: Bulkhead 'serviceBulkhead' is full and does not permit further calls
Task-9 - Completed task: Task
Task-7 - Completed task: Task
Task-6 - Completed task: Task

TimeLimiter

Resilience4j's TimeLimiter module adds a timeout to the execution of a method. This is particularly useful when you want your system to remain responsive even if a service call or a piece of code takes longer than expected to complete.

Key Features of Resilience4j TimeLimiter

  • Timeout Handling:
    • It allows us to specify a maximum execution duration for a given task. If the task doesn't complete within this period, the caller receives a TimeoutException, and the running future can optionally be cancelled.
  • Future-Based API:
    • TimeLimiter works seamlessly with Java CompletableFuture, making it easy to integrate into asynchronous programming models.
  • Customizable Timeout Configuration:
    • You can configure the timeout duration and specify how to handle timeouts. This flexibility helps tailor the behavior to match the requirements of different parts of your application.
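
The core idea can be shown with the standard library alone: wait on a `Future` for a bounded time and decide what to do when the deadline passes. The helper `TimeoutSketch.callWithTimeout` below is a hypothetical illustration, not part of Resilience4j; its `cancelOnTimeout` flag is an assumption that loosely mirrors the `cancelRunningFuture` setting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper showing a bounded wait on a Future. */
final class TimeoutSketch {

    /** Waits at most timeoutMillis; on timeout, optionally cancels the future and rethrows. */
    static <T> T callWithTimeout(Future<T> future, long timeoutMillis, boolean cancelOnTimeout)
            throws TimeoutException, ExecutionException, InterruptedException {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            if (cancelOnTimeout) {
                future.cancel(true); // stop waiting for a result nobody will read
            }
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        try {
            callWithTimeout(neverCompletes, 100, true);
        } catch (TimeoutException e) {
            System.out.println("Timed out, cancelled = " + neverCompletes.isCancelled());
        }
        System.out.println(callWithTimeout(CompletableFuture.completedFuture("done"), 100, true));
    }
}
```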

Demo of Resilience4j TimeLimiter

Add Resilience4j TimeLimiter dependency to pom.xml

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-timelimiter</artifactId>
    <version>1.7.1</version> <!-- Use the latest version available -->
</dependency>

Create a Long-Running Task

public class LongRunningService {

    public String executeWithPossibleDelay() {
        try {
            System.out.println("Task started");
            Thread.sleep(3000); // Simulate a long-running task (3000 ms)
            return "Task completed successfully";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "Task interrupted";
        }
    }
}

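Implement the Time Limiter

// Imports needed by this snippet
import io.github.resilience4j.timelimiter.TimeLimiter;
import io.github.resilience4j.timelimiter.TimeLimiterConfig;
import io.github.resilience4j.timelimiter.TimeLimiterRegistry;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.function.Supplier;

// Configure the TimeLimiter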
TimeLimiterConfig config = TimeLimiterConfig.custom()
		.timeoutDuration(Duration.ofSeconds(2)) // Set a timeout of 2 seconds
		.cancelRunningFuture(true) // Cancel the task if it times out
		.build();

// Create a TimeLimiter registry and instance
TimeLimiterRegistry registry = TimeLimiterRegistry.of(config);
TimeLimiter timeLimiter = registry.timeLimiter("timeLimiter");

// ScheduledExecutorService for managing asynchronous tasks
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);

// Simulated service
LongRunningService service = new LongRunningService();

// Supplier for the long-running task
Supplier<CompletableFuture<String>> longRunningTaskSupplier = () -> CompletableFuture.supplyAsync(service::executeWithPossibleDelay, scheduler);

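// Apply TimeLimiter to the task. executeCompletionStage returns a CompletionStage and
// uses the scheduler to fail it with a TimeoutException once the timeout elapses.
// (executeFutureSupplier blocks and returns the result itself, not a future.)
CompletableFuture<String> future = timeLimiter
		.executeCompletionStage(scheduler, longRunningTaskSupplier)
		.toCompletableFuture();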

// Handle the result
try {
	String result = future.get();
	System.out.println(result);
} catch (ExecutionException e) {
	System.out.println("Execution failed: " + e.getCause().getMessage());
} catch (InterruptedException e) {
	Thread.currentThread().interrupt();
	System.out.println("Thread was interrupted");
}

// Shutdown the executor service
scheduler.shutdown();
  • output
Task started
Execution failed: TimeLimiter 'timeLimiter' recorded a timeout exception.

Spring Boot Integration with Resilience4j

Add Resilience4j Spring Boot Dependency

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId> <!-- for Spring Boot 3 -->
    <version>2.0.2</version> <!-- Use the latest version -->
</dependency>

Configure Resilience4j

Configure Resilience4j in application.yml to set parameters for Circuit Breaker, Rate Limiter, Retry, Bulkhead, and Time Limiter.

# application.yml
resilience4j:
  # Circuit Breaker configuration
  circuitbreaker:
    instances:
      backendService:
        # The size of the sliding window used to calculate failure rate and other metrics.
        # In this case, it's set to 10 calls.
        slidingWindowSize: 10
        # The minimum number of calls required before the circuit breaker can start calculating
        # failure rate and making decisions about opening or closing the circuit. Here it's 5 calls.
        minimumNumberOfCalls: 5
        # The threshold for the failure rate. If the failure rate of calls to the backend service
        # exceeds this percentage (50% in this case), the circuit breaker will trip and enter the open state.
        failureRateThreshold: 50
        # The amount of time the circuit breaker will wait in the open state before attempting
        # to transition to the half-open state to test if the backend service has recovered.
        waitDurationInOpenState: 10s

  # Rate Limiter configuration
  ratelimiter:
    instances:
      backendService:
        # The maximum number of requests allowed within the specified limitRefreshPeriod.
        # Here, only 5 requests are allowed per period.
        limitForPeriod: 5
        # The time period after which the rate limit resets. In this case, it resets every 1 second.
        limitRefreshPeriod: 1s

  # Retry configuration
  retry:
    instances:
      backendService:
        # The maximum number of retry attempts that will be made if a call to the backend service fails.
        # Here, it will retry up to 3 times.
        maxAttempts: 3
        # The amount of time to wait between each retry attempt. In this case, it waits 500 milliseconds.
        waitDuration: 500ms

  # Bulkhead configuration
  bulkhead:
    instances:
      backendService:
        # The maximum number of concurrent calls allowed to the backend service.
        # If this limit is reached, further calls wait up to maxWaitDuration for a free slot and are rejected otherwise.
        maxConcurrentCalls: 5
        # The maximum amount of time a call will wait in the bulkhead queue if the maxConcurrentCalls
        # limit has been reached. Here, it will wait up to 100 milliseconds.
        maxWaitDuration: 100ms

  # Time Limiter configuration
  timelimiter:
    instances:
      backendService:
        # The maximum amount of time a call to the backend service is allowed to take.
        # If the call exceeds this timeout duration (2 seconds in this case), it will be aborted.
        timeoutDuration: 2s

Create Service with Resilience4j Annotations

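Create a service class that uses Resilience4j annotations to implement the different resilience patterns. The annotations are applied through Spring AOP, so spring-boot-starter-aop must also be on the classpath.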

@Service
public class BackendService {

    @TimeLimiter(name = "backendService")
    @CircuitBreaker(name = "backendService", fallbackMethod = "fallback")
    @Retry(name = "backendService")
    @RateLimiter(name = "backendService")
    @Bulkhead(name = "backendService")
    public CompletableFuture<String> performTask() {
        return CompletableFuture.supplyAsync(() -> {
            // Simulate a long-running task
            try {
                // Simulating a random failure
                if (Math.random() > 0.7) {
                    throw new RuntimeException("Simulated failure");
                }
                Thread.sleep(3000); // Simulate delay
                return "Task completed successfully";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "Task interrupted";
            }
        });
    }

    // Fallback method in case of failures
    private CompletableFuture<String> fallback(Throwable throwable) {
        return CompletableFuture.completedFuture("Fallback response due to: " + throwable.getMessage());
    }
}

Create a REST Controller

Create a REST controller to expose an endpoint that triggers the service method.

@RestController
@RequestMapping("/api")
public class BackendController {

    private final BackendService backendService;

    public BackendController(BackendService backendService) {
        this.backendService = backendService;
    }

    @GetMapping("/task")
    public CompletableFuture<String> performTask() {
        return backendService.performTask();
    }
}
posted @ 2024-11-13 12:09  Jacob-Chen