High usage
Troubleshooting Guide: High Resource Usage in R2R Docker Deployments
High resource usage in R2R Docker deployments can lead to performance issues, system instability, or even service outages. This guide will help you identify the cause of high resource usage and provide steps to mitigate the problem.
1. Identifying the Issue
First, determine which resources are being overused:
1.1 Check Overall System Resources
Use the top
or htop
command to get an overview of system resource usage:
Look for high CPU usage, memory consumption, or swap usage.
1.2 Check Docker-specific Resource Usage
Use Docker’s built-in commands to check resource usage for containers:
This will show CPU, memory, and I/O usage for each container.
2. Common Causes and Solutions
2.1 High CPU Usage
Possible Causes:
- Inefficient queries or data processing
- Continuous background tasks
- Improperly configured LLM inference
Solutions:
-
Optimize queries:
- Review and optimize database queries, especially those involving large datasets.
-
Adjust background task frequency:
- Review Hatchet workflows and adjust the frequency of recurring tasks.
- Implement rate limiting for resource-intensive operations.
-
LLM configuration:
- If using local LLMs via Ollama, consider adjusting model parameters or switching to a lighter model.
- For cloud LLMs, implement caching to reduce redundant API calls.
-
Scale horizontally:
- Consider distributing the workload across multiple R2R instances.
2.2 High Memory Usage
Possible Causes:
- Memory leaks in custom code
- Inefficient caching
- Large in-memory datasets
Solutions:
-
Identify memory-hungry containers:
-
Analyze R2R application logs:
- Look for patterns of increasing memory usage over time.
-
Optimize memory usage:
- Implement proper garbage collection in custom code.
- Review and optimize caching strategies.
- Consider using streaming for large data processing tasks instead of loading entire datasets into memory.
-
Adjust container memory limits:
- Update the Docker Compose file to set appropriate memory limits for containers:
- Update the Docker Compose file to set appropriate memory limits for containers:
2.3 High Disk I/O
Possible Causes:
- Frequent logging
- Inefficient database operations
- Large file ingestion processes
Solutions:
-
Monitor disk I/O:
-
Optimize logging:
- Reduce log verbosity for non-critical information.
- Implement log rotation to manage file sizes.
-
Database optimizations:
- Ensure proper indexing in Postgres.
- Optimize query patterns to reduce full table scans.
-
File ingestion improvements:
- Implement chunking for large file ingestion.
- Consider using a dedicated storage service for large files.
3. Monitoring and Prevention
Implement proactive monitoring to catch resource issues early:
-
Set up Docker monitoring:
- Use tools like Prometheus and Grafana to monitor Docker metrics.
- Set up alerts for when resource usage exceeds certain thresholds.
-
Implement application-level metrics:
- Use libraries like
prometheus_client
in Python to expose custom metrics. - Monitor key performance indicators specific to your R2R usage.
- Use libraries like
-
Regular performance audits:
- Periodically review resource usage patterns.
- Conduct load testing to identify potential bottlenecks before they impact production.
4. Advanced Debugging
For persistent issues:
-
Profile the R2R application:
- Use Python profiling tools like cProfile or memory_profiler to identify resource-intensive code sections.
-
Analyze Docker logs:
-
Inspect container details:
-
Review orchestration logs:
- Check Hatchet logs for insights into task execution and resource allocation.
5. Scaling Considerations
If high resource usage persists despite optimizations:
-
Vertical scaling:
- Increase resources (CPU, RAM) for the Docker host.
-
Horizontal scaling:
- Implement load balancing across multiple R2R instances.
- Consider using Docker Swarm or Kubernetes for orchestration.
-
Service separation:
- Move resource-intensive components (e.g., database, LLM inference) to dedicated hosts.
Conclusion
High resource usage in R2R Docker deployments can be challenging but is often resolvable through careful analysis and optimization. Always ensure you have proper monitoring in place, and regularly review your deployment’s performance to catch issues early. If problems persist, don’t hesitate to reach out to the R2R community or consider consulting with cloud infrastructure experts.
Was this page helpful?