Commit Graph

184 Commits

Author SHA1 Message Date
e39668e480 Implement Smart Recommendations Engine with dashboard and modals 2025-10-02 08:17:22 -03:00
f5ef2132e5 Update documentation with PromQL query display and latest features 2025-10-02 07:53:07 -03:00
f6de5a5f30 Add PromQL queries display in historical analysis
- Include PromQL queries in API response for workload metrics
- Display queries in historical analysis modal with copy functionality
- Add professional styling for query display sections
- Enable users to copy and validate queries in OpenShift Console
- Organize queries by category: cluster totals, usage, requests, limits
- Add copy-to-clipboard functionality with visual feedback
2025-10-02 07:34:02 -03:00
cf92f0121b Fix conflicting insufficient_historical_data and historical_analysis
- Check both CPU and Memory data availability before historical analysis
- If either CPU or Memory has insufficient data, add warning and skip analysis
- Prevent conflicting insufficient_historical_data and historical_analysis
- Ensure consistent data availability requirements for workload analysis
- Only proceed with P95/P99 calculations when both resources have sufficient data
2025-10-01 16:36:42 -03:00
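
Note on cf92f0121b: the commit describes a gate that checks both CPU and memory data before running P95/P99 calculations. A minimal sketch of that idea follows. The function names, the result layout, and the threshold of 3 points are assumptions for illustration (3df8d6bd42 mentions a 3-point threshold); this is not the project's actual code.

```python
import statistics
from typing import Dict, Optional, Sequence

# Assumed threshold; the real value lives in the project's analysis service.
MIN_DATA_POINTS = 3


def percentile_summary(samples: Sequence[float]) -> Dict[str, float]:
    """Return P95 and P99 using the standard library's quantile cut points."""
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points: P1..P99
    return {"p95": cuts[94], "p99": cuts[98]}


def analyze_usage(
    cpu: Sequence[float],
    memory: Sequence[float],
    min_points: int = MIN_DATA_POINTS,
) -> Dict[str, Optional[object]]:
    """Run the percentile analysis only when BOTH series have enough points."""
    if len(cpu) < min_points or len(memory) < min_points:
        # Early return: the warning and the analysis can never coexist.
        return {"warning": "insufficient_historical_data", "historical_analysis": None}
    return {
        "warning": None,
        "historical_analysis": {
            "cpu": percentile_summary(cpu),
            "memory": percentile_summary(memory),
        },
    }
```

Returning early from a single function is what keeps the two outcomes mutually exclusive: a result carries either the warning or the analysis, never both.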
4721a1ef37 Fix historical analysis contradictions and implement workload-based analysis
- Fix insufficient_historical_data vs historical_analysis contradiction
- Add return statement when insufficient data to prevent P99 calculation
- Implement workload-based historical analysis instead of pod-based
- Add _extract_workload_name() to identify workload from pod names
- Add analyze_workload_historical_usage() for workload-level analysis
- Add _analyze_workload_metrics() with Prometheus workload queries
- Add validate_workload_resources_with_historical_analysis() method
- Update /cluster/status endpoint to use workload analysis by namespace
- Improve reliability by analyzing workloads instead of individual pods
- Maintain fallback to pod-level analysis if workload analysis fails
2025-10-01 16:32:12 -03:00
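
Note on 4721a1ef37: the commit adds `_extract_workload_name()` to derive a workload from a pod name, but the log does not show its body. The sketch below is one plausible heuristic, assuming the usual Kubernetes naming patterns (Deployment pods end in `-<replicaset-hash>-<5 chars>`, StatefulSet pods in `-<ordinal>`, DaemonSet pods in `-<5 chars>`). The regular expressions and the public function name are illustrative assumptions.

```python
import re

# Deployment pod: <workload>-<replicaset hash>-<5-char suffix>
_DEPLOYMENT_SUFFIX = re.compile(r"-[a-z0-9]{8,10}-[a-z0-9]{5}$")
# StatefulSet ordinal (-0, -1, ...) or DaemonSet-style 5-char suffix
_SIMPLE_SUFFIX = re.compile(r"-(?:\d+|[a-z0-9]{5})$")


def extract_workload_name(pod_name: str) -> str:
    """Best-effort guess of the owning workload from a pod name."""
    stripped, replaced = _DEPLOYMENT_SUFFIX.subn("", pod_name)
    if replaced:
        return stripped
    return _SIMPLE_SUFFIX.sub("", pod_name)


if __name__ == "__main__":
    print(extract_workload_name("resource-governance-78b77cc868-gchx7"))
    print(extract_workload_name("postgres-0"))
```

Name-based extraction is only a heuristic: a workload whose own name ends in hash-like segments can be over-trimmed. Reading the pod's owner references from the API is more reliable where that is available, which may be why the commit keeps a pod-level fallback.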
6f5c8b0cac Fix duplicate validations in cluster status
- Remove duplicate static validations from /cluster/status endpoint
- Use only historical analysis which includes static validations
- Add fallback to static validations only if historical analysis fails
- Eliminate duplicate invalid_ratio and container_metrics validations
- Improve validation efficiency and reduce redundancy
2025-10-01 16:25:38 -03:00
0686749aa8 Fix validation field name in namespace analysis
- Change validation.validation_type to validation.rule_name
- API returns rule_name field, not validation_type
- Fix undefined display in namespace analysis modal
- Ensure proper validation type display in detailed view
2025-10-01 16:17:09 -03:00
162af739e4 Fix Namespace Analysis code duplication
- Remove duplicate createNamespaceDetails() function
- Fix validation.rule_name to validation.validation_type
- Keep only showNamespaceDetailsSimple() function
- Eliminate redundant code in namespace analysis modal
- Improve code maintainability and reduce duplication
2025-10-01 16:12:21 -03:00
560fa69a3b Add Resource Utilization explanation modal
- Add info icon next to Resource Utilization metric
- Create showResourceUtilizationDetails() function
- Explain placeholder implementation status
- Show formula and purpose of Resource Utilization
- Indicate Phase 2 implementation plan
- Provide clear next steps for development
2025-10-01 15:58:43 -03:00
e0bb80865d Update documentation with current project state
- Update README.md with Cluster Overcommit Analysis features
- Add Podman preference over Docker in requirements
- Update DOCUMENTATION.md with Phase 1 completion status
- Update AIAgents-Support.md with Phase 1.3 completion
- Add Cluster Overcommit Analysis to completed features
- Update version to 1.2.0 and dates to 2025-10-01
- Reflect current implementation status across all docs
2025-10-01 15:49:38 -03:00
0ea86ef22f Fix modal close functionality
- Add proper closeModal() function
- Fix close button (X) click handler
- Fix click outside modal to close
- Remove modal from DOM instead of just hiding
- Improve modal user experience
2025-10-01 15:44:50 -03:00
2bb5266753 Improve overcommit UI with info icons and modals
- Replace tooltips with info icons (ℹ️) next to CPU/Memory Overcommit
- Add modal dialogs showing detailed overcommit calculations
- Change Resource Quota Coverage to Resource Utilization
- Add CSS styling for overcommit details modals
- Improve UX with clickable info icons instead of hover tooltips
- Show capacity, requests, overcommit percentage, and available resources
2025-10-01 15:41:43 -03:00
8984701bf3 Add detailed tooltips for overcommit metrics
- Add tooltips showing capacity, requests, and calculation details
- Include CPU and Memory capacity/requests in API response
- Add CSS styling for tooltip hover effects
- Show detailed breakdown: Capacity Total, Requests Total, and calculation formula
- Improve user experience with transparent overcommit information
2025-10-01 15:33:39 -03:00
b7bfd33a28 Add debug logging for overcommit calculation 2025-10-01 15:29:43 -03:00
b83c55bf08 Fix Cluster Overcommit Summary display
- Add overcommit data processing in /cluster/status endpoint
- Extract CPU/Memory capacity and requests from Prometheus
- Calculate overcommit percentages and resource quota coverage
- Update frontend to use new overcommit data structure
- Fix issue where Cluster Overcommit Summary was showing all zeros
2025-10-01 15:13:04 -03:00
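
Note on b83c55bf08: the log says overcommit percentages are calculated from capacity and requests but does not give the formula. The sketch below assumes the common definition, total requests divided by total capacity, expressed as a percentage. The function and field names are hypothetical.

```python
from typing import Dict


def overcommit_summary(requests_total: float, capacity_total: float) -> Dict[str, float]:
    """Summarise overcommit for one resource (CPU cores or memory bytes).

    Assumed definition: overcommit % = requests / capacity * 100.
    """
    if capacity_total <= 0:
        # Missing or zero capacity: report zeros rather than divide by zero.
        return {"overcommit_percent": 0.0, "available": 0.0}
    return {
        "overcommit_percent": requests_total / capacity_total * 100.0,
        "available": capacity_total - requests_total,
    }


if __name__ == "__main__":
    # 12 cores requested on an 8-core cluster
    print(overcommit_summary(12.0, 8.0))
```

Under this definition a value above 100 means more has been requested than the cluster can provide, and `available` goes negative.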
fae1d6fb18 Fix workload metrics API pod name matching
- Use regex pattern pod=~"{workload}.*" in workload metrics API
- This matches the fix applied to historical analysis
- Should resolve issue where resource-governance workload data was not being retrieved
- Both historical analysis and workload metrics now use consistent pod name matching
2025-10-01 14:57:27 -03:00
35fed5eb01 Fix Prometheus queries for pod name matching
- Use regex pattern pod=~"{pod.name}.*" instead of exact match
- This allows matching pods with suffixes like resource-governance-78b77cc868-gchx7
- Apply fix to both CPU and Memory queries for usage, requests, and limits
- Should resolve issue where resource-governance pod data was not being retrieved
2025-10-01 14:53:40 -03:00
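
Note on 35fed5eb01 and fae1d6fb18: PromQL regex matchers (`=~`) are fully anchored, so a bare workload name behaves like an exact match and the trailing `.*` is what lets suffixed pod names through. The sketch below shows the selector change and reproduces the anchoring with Python's `re.fullmatch`. The helper names, the `container_cpu_usage_seconds_total` metric, and the 5m rate window are illustrative assumptions, not the project's actual query.

```python
import re


def pod_selector(workload: str, exact: bool = False) -> str:
    """Build the pod label matcher; the regex form tolerates pod suffixes."""
    if exact:
        return f'pod="{workload}"'
    return f'pod=~"{workload}.*"'


def cpu_usage_query(namespace: str, workload: str) -> str:
    """Assemble an example CPU usage query for all pods of a workload."""
    return (
        "sum(rate(container_cpu_usage_seconds_total{"
        f'namespace="{namespace}", {pod_selector(workload)}'
        "}[5m]))"
    )


if __name__ == "__main__":
    pod = "resource-governance-78b77cc868-gchx7"
    # Anchored matching, as in PromQL: the bare name does not match the pod.
    print(re.fullmatch("resource-governance", pod) is not None)
    print(re.fullmatch("resource-governance.*", pod) is not None)
    print(cpu_usage_query("resource-governance", "resource-governance"))
```

The workload name is interpolated verbatim here. A name containing regex metacharacters would need escaping, and a prefix match also picks up any other workload whose name starts with the same string.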
3df8d6bd42 Fix historical data retrieval
- Revert step calculation to 60s for better data retrieval
- Reduce threshold to 3 data points for insufficient data detection
- Add detailed logging for Prometheus query debugging
- Ensure historical data is properly retrieved from Prometheus
2025-10-01 14:51:37 -03:00
9e4f66052c Fix insufficient historical data detection
- Adjust Prometheus query step based on time range (5min for 24h)
- Reduce threshold from 10 to 5 data points for insufficient data detection
- Add debug logging to understand data point counts
- Improve step calculation: 30s for 1h, 5min for 24h, 30min for 7d
2025-10-01 14:48:05 -03:00
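
Note on 9e4f66052c: the commit maps each time range to a query step (30s for 1h, 5min for 24h, 30min for 7d). The sketch below encodes that mapping and shows roughly how many data points each choice yields, which is what the insufficient-data threshold is compared against. The table and function names are hypothetical, and the later commit 3df8d6bd42 reverts to a fixed 60s step.

```python
# Time range -> length in seconds
RANGE_SECONDS = {"1h": 3600, "24h": 86400, "7d": 604800}
# Time range -> query step in seconds, as listed in the commit message
STEP_SECONDS = {"1h": 30, "24h": 300, "7d": 1800}


def step_for_range(time_range: str) -> int:
    """Return the query step (seconds) for a supported time range."""
    return STEP_SECONDS[time_range]


def expected_points(time_range: str) -> int:
    """Approximate number of samples a fully populated series would return."""
    return RANGE_SECONDS[time_range] // step_for_range(time_range)


if __name__ == "__main__":
    for name in RANGE_SECONDS:
        print(name, step_for_range(name), expected_points(name))
```

With these steps a complete series has on the order of a hundred to a few hundred points. A threshold of 5 or 3 points therefore only flags series that are almost empty, not ones with partial coverage.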
ee20a09147 Fix data unification and efficiency calculations
- Unify Prometheus queries between namespace analysis and historical analysis
- Fix efficiency calculations to prevent division by zero
- Remove duplicate validations in validation service
- Improve frontend data display with clear numerical values
- Add proper error handling for missing data
2025-10-01 14:43:43 -03:00
6ad1997afd Remove simulated data and enable real Prometheus metrics 2025-09-30 21:13:46 -03:00
6f914c9404 Fix: update Prometheus URL to use HTTPS instead of HTTP 2025-09-30 21:05:40 -03:00
20ae326158 Fix: historical analysis implementation with OpenShift-specific Prometheus queries 2025-09-30 21:01:00 -03:00
f3dff8be76 Fix JavaScript error in historical analysis modal - preserve DOM elements when loading metrics 2025-09-30 20:50:25 -03:00
3445f58a11 Update Prometheus queries to use OpenShift-specific metrics
- Use node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate for CPU usage
- Use container_memory_working_set_bytes with kubelet job for memory usage
- Use kube_pod_container_resource_requests/limits with kube-state-metrics job
- Add workload-specific filtering to match OpenShift dashboard behavior
- This should resolve the 'insufficient data' issue by using the same metrics as OpenShift
2025-09-30 20:42:59 -03:00
0068db5a9e Fix remaining indentation error in routes.py 2025-09-30 18:07:05 -03:00
7efbd94b50 Fix indentation errors in routes.py 2025-09-30 18:06:48 -03:00
5f3f737b3a Add simulated data fallback for historical analysis when Prometheus is not accessible 2025-09-30 18:06:10 -03:00
2b2b3c23b2 Fix: Historical analysis now shows real consumption numbers and percentages relative to cluster totals 2025-09-30 18:03:17 -03:00
5c5643576f Fix: Add query_range method to PrometheusClient for historical metrics 2025-09-30 17:45:18 -03:00
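
Note on 5c5643576f: the `query_range` method itself is not shown in this log. As a rough illustration of what such a method has to send, the sketch below builds a request URL for Prometheus's HTTP API range-query endpoint (`/api/v1/query_range`, with `query`, `start`, `end`, and `step` parameters). The function name and the example base URL are assumptions. Authentication and TLS handling, which an OpenShift cluster would require, are left out.

```python
from urllib.parse import urlencode


def build_query_range_url(base_url: str, query: str, start: int, end: int, step: int) -> str:
    """Build a range-query URL; start/end are Unix timestamps, step is seconds."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Hypothetical host, one hour of data at 60-second resolution
    print(build_query_range_url("https://prometheus.example/", "up", 1759300000, 1759303600, 60))
```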
a847f0cd92 Fix: Add missing PrometheusClient import for workload metrics endpoint 2025-09-30 17:43:22 -03:00
f0d3831263 Feature: Add real Prometheus metrics visualization for historical analysis 2025-09-30 17:41:39 -03:00
d683704593 Fix: Integrate historical analysis validations in cluster status endpoint 2025-09-30 17:37:09 -03:00
a986e10b0a Phase 1: Mark Enhanced Validation & Categorization as COMPLETED 2025-09-30 16:49:41 -03:00
f3b8022224 Phase 1.2: Complete Historical Analysis Integration - Add insufficient data detection, seasonal patterns, and integrate in main dashboard 2025-09-30 16:48:31 -03:00
c2d2b46b11 Revert: Put AIAgents-Support.md back in .gitignore as it's for AI agent context only 2025-09-30 16:41:56 -03:00
1abe4c9f09 Fix: Remove AIAgents-Support.md from .gitignore and update with current file structure 2025-09-30 16:31:44 -03:00
459e46e2b0 Update: Documentation files with current implementation status and achievements 2025-09-30 16:29:15 -03:00
1052740f21 Remove: Duplicate functions with alerts that were overriding the modal functionality 2025-09-30 13:49:13 -03:00
bd2af094e6 Fix: Remove alert from analyzeNamespace function and use proper modal 2025-09-30 13:46:16 -03:00
4ce538c35c Replace alerts with proper modals for namespace analysis and fix functionality 2025-09-30 13:44:17 -03:00
3bf0c99fd6 Add: Simple namespace analysis with detailed pod and container information 2025-09-30 13:40:55 -03:00
e2311b6967 Add: Detailed namespace analysis with modal popup showing pod and container details 2025-09-30 13:37:55 -03:00
96c29d4179 Fix: Problem Summary table displaying namespace names and pod counts correctly 2025-09-30 13:34:36 -03:00
c91b517138 Fix: dict object has no attribute name error 2025-09-30 12:27:01 -03:00
9f8cad6803 Fix: dict object has no attribute resources error 2025-09-30 12:25:20 -03:00
ea309e8ef0 Fix: ContainerResource object is not subscriptable error 2025-09-30 12:23:46 -03:00
af42204897 Fix API endpoint - use correct /api/v1/cluster/status 2025-09-30 12:14:17 -03:00
883c50a104 Simplify dashboard - remove redundancies and create pragmatic interface 2025-09-30 10:45:07 -03:00
fa8f3a41e5 Implement simplified UI/UX with health scores and grouped validations 2025-09-30 09:37:49 -03:00