Commit Graph

43 Commits

SHA1 Message Date
260d8114c5 Fix container data structure access in SmartRecommendationsService 2025-10-02 08:20:44 -03:00
cf92f0121b Fix conflicting insufficient_historical_data and historical_analysis
- Check both CPU and Memory data availability before historical analysis
- If either CPU or Memory has insufficient data, add warning and skip analysis
- Prevent conflicting insufficient_historical_data and historical_analysis
- Ensure consistent data availability requirements for workload analysis
- Only proceed with P95/P99 calculations when both resources have sufficient data
2025-10-01 16:36:42 -03:00
4721a1ef37 Fix historical analysis contradictions and implement workload-based analysis
- Fix insufficient_historical_data vs historical_analysis contradiction
- Add return statement when insufficient data to prevent P99 calculation
- Implement workload-based historical analysis instead of pod-based
- Add _extract_workload_name() to identify workload from pod names
- Add analyze_workload_historical_usage() for workload-level analysis
- Add _analyze_workload_metrics() with Prometheus workload queries
- Add validate_workload_resources_with_historical_analysis() method
- Update /cluster/status endpoint to use workload analysis by namespace
- Improve reliability by analyzing workloads instead of individual pods
- Maintain fallback to pod-level analysis if workload analysis fails
2025-10-01 16:32:12 -03:00
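
The log names `_extract_workload_name()` but does not show its body. As a minimal sketch of the idea, assuming Deployment-style pod names of the form `<workload>-<replicaset-hash>-<pod-suffix>`, the workload can be recovered by stripping the last two segments. The function name and the regular expression below are illustrative, not the project's actual code.

```python
import re

# Hypothetical pattern: <workload>-<replicaset-hash>-<5-char pod suffix>.
# StatefulSet pods (e.g. "db-0") and bare pods do not match and are returned unchanged.
_DEPLOYMENT_POD = re.compile(r"^(?P<workload>.+)-[a-z0-9]{5,10}-[a-z0-9]{5}$")


def extract_workload_name(pod_name: str) -> str:
    """Return the workload part of a pod name, or the pod name itself if no pattern matches."""
    match = _DEPLOYMENT_POD.match(pod_name)
    return match.group("workload") if match else pod_name


print(extract_workload_name("resource-governance-78b77cc868-gchx7"))
```

Falling back to the unchanged pod name mirrors the commit's stated fallback to pod-level analysis when workload analysis is not possible.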
35fed5eb01 Fix Prometheus queries for pod name matching
- Use regex pattern pod=~"{pod.name}.*" instead of exact match
- This allows matching pods with suffixes like resource-governance-78b77cc868-gchx7
- Apply fix to both CPU and Memory queries for usage, requests, and limits
- Should resolve issue where resource-governance pod data was not being retrieved
2025-10-01 14:53:40 -03:00
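
To illustrate the prefix match described in this commit, here is a small sketch that builds a PromQL query with `pod=~"<name>.*"`. The helper names, the `namespace` label, and the 5m rate window are assumptions for illustration; the metric name is only an example. Pod names are assumed to contain no regex metacharacters other than `.`, which only widens the match slightly.

```python
def pod_selector(pod_name: str, namespace: str) -> str:
    """Label selector matching every pod whose name starts with pod_name."""
    # PromQL regex matchers are fully anchored, so the trailing ".*" is what
    # allows suffixes such as "-78b77cc868-gchx7" to match.
    return f'{{namespace="{namespace}", pod=~"{pod_name}.*"}}'


def cpu_usage_query(pod_name: str, namespace: str, window: str = "5m") -> str:
    """Example CPU usage query using the prefix selector (illustrative metric)."""
    selector = pod_selector(pod_name, namespace)
    return f"rate(container_cpu_usage_seconds_total{selector}[{window}])"


print(cpu_usage_query("resource-governance", "resource-governance"))
```

Note that a prefix match can also pick up pods of a differently named workload that shares the prefix, which is one reason to prefer a stricter pattern where pod naming is predictable.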
3df8d6bd42 Fix historical data retrieval
- Revert step calculation to 60s for better data retrieval
- Reduce threshold to 3 data points for insufficient data detection
- Add detailed logging for Prometheus query debugging
- Ensure historical data is properly retrieved from Prometheus
2025-10-01 14:51:37 -03:00
9e4f66052c Fix insufficient historical data detection
- Adjust Prometheus query step based on time range (5min for 24h)
- Reduce threshold from 10 to 5 data points for insufficient data detection
- Add debug logging to understand data point counts
- Improve step calculation: 30s for 1h, 5min for 24h, 30min for 7d
2025-10-01 14:48:05 -03:00
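
The step and threshold rules in this commit can be sketched as follows. The function names and the fallback step for unlisted ranges are assumptions; the values 30s/5min/30min and the threshold of 5 come from the commit message (the later commit 3df8d6bd42 changes them to a 60s step and a threshold of 3).

```python
# Step per time range, as listed in the commit message.
STEP_BY_RANGE = {"1h": "30s", "24h": "5m", "7d": "30m"}


def choose_step(time_range: str, default: str = "5m") -> str:
    """Return the Prometheus query step for a time range (default is an assumption)."""
    return STEP_BY_RANGE.get(time_range, default)


def has_sufficient_data(points: list, minimum: int = 5) -> bool:
    """True when the series has at least `minimum` data points."""
    return len(points) >= minimum


print(choose_step("24h"), has_sufficient_data([0.1, 0.2, 0.3]))
```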
ee20a09147 Fix data unification and efficiency calculations
- Unify Prometheus queries between namespace analysis and historical analysis
- Fix efficiency calculations to prevent division by zero
- Remove duplicate validations in validation service
- Improve frontend data display with clear numerical values
- Add proper error handling for missing data
2025-10-01 14:43:43 -03:00
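
One way to guard an efficiency calculation against division by zero is to return no value when the denominator is missing or zero. This sketch assumes efficiency means usage divided by requests, which the log does not state; the function name is hypothetical.

```python
from typing import Optional


def efficiency(usage: float, requested: Optional[float]) -> Optional[float]:
    """Return usage / requested, or None when there is no positive request to compare against."""
    if requested is None or requested <= 0:
        return None  # caller can render this as "no data" instead of crashing
    return usage / requested


print(efficiency(0.25, 0.5), efficiency(0.25, 0))
```

Returning `None` rather than `0.0` keeps "no request set" distinguishable from "nothing used" in the UI.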
f3b8022224 Phase 1.2: Complete Historical Analysis Integration - Add insufficient data detection, seasonal patterns, and integrate in main dashboard 2025-09-30 16:48:31 -03:00
c91b517138 Fix: dict object has no attribute name error 2025-09-30 12:27:01 -03:00
9f8cad6803 Fix: dict object has no attribute resources error 2025-09-30 12:25:20 -03:00
fa8f3a41e5 Implement simplified UI/UX with health scores and grouped validations 2025-09-30 09:37:49 -03:00
16827e1084 Fix: corrected syntax error (elif without if) 2025-09-29 21:21:52 -03:00
ee4b22693e Fix: added detailed container metrics and removed duplicate validations 2025-09-29 21:21:34 -03:00
e7a5afafe7 Fix: corrected excessive tolerance in CPU/Memory ratio validation 2025-09-29 21:15:05 -03:00
b4190a9e97 MAJOR: fixed hardcoded values and implemented smart unit display (millicores/MiB) 2025-09-29 20:15:56 -03:00
fefe65f586 CRITICAL FIX: corrected memory overcommit calculation (bytes/GiB) 2025-09-29 18:44:34 -03:00
bd3ab16f5d Fix: corrected attribute access on ContainerResource as an object 2025-09-29 18:07:46 -03:00
2237e15534 Fix: corrected handling of ContainerResource as a Pydantic object 2025-09-29 18:05:57 -03:00
952ca042a2 Fix: added missing Optional import 2025-09-29 17:55:53 -03:00
525c1b28a0 Fix: added missing _validate_qos_class method 2025-09-29 17:55:37 -03:00
cdf13b4e2b Fix: added missing _determine_qos_class method 2025-09-29 17:53:58 -03:00
3a5af8ce67 Feat: implement cluster health dashboard with QoS and Resource Quotas
- Add models for QoSClassification, ResourceQuota, and ClusterHealth
- Implement automatic QoS classification (Guaranteed, Burstable, BestEffort)
- Create Resource Quota analysis with automatic recommendations
- Add main dashboard with a cluster overview
- Implement overcommit analysis with visual metrics
- Add top resource consumers with ranking
- Create QoS distribution with statistics
- Add new API endpoints for cluster health and QoS
- Improve the interface with a responsive, intuitive design
- Align with Red Hat practices for resource management
2025-09-29 16:35:07 -03:00
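
The three QoS classes follow the standard Kubernetes rules, which can be sketched as below. This is not the project's implementation: the function name and the plain-dict container shape are assumptions, and quantities are compared as strings, so `"1"` and `"1000m"` would be treated as different values.

```python
from typing import Dict, List

# Assumed shape per container: {"requests": {"cpu": "100m"}, "limits": {"cpu": "100m"}}
Resources = Dict[str, Dict[str, str]]


def classify_qos(containers: List[Resources]) -> str:
    """Classify a pod as Guaranteed, Burstable, or BestEffort from its containers."""
    has_any = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests") or {}
        limits = container.get("limits") or {}
        if requests or limits:
            has_any = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit, so only the limit is mandatory.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not has_any:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(classify_qos([{"requests": {"cpu": "100m"}}]))
```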
afc7462b40 Feat: implement smart recommendations system and workload categorization 2025-09-29 15:26:09 -03:00
63a284f4b2 Fix pod_count handling - it's already an integer from Kubernetes API 2025-09-29 14:22:03 -03:00
6376a9e15e Fix array access errors - add proper length validation before accessing array indices 2025-09-29 14:20:14 -03:00
94ca6543a1 Add debug logging to identify array access error 2025-09-29 14:17:52 -03:00
3632f88c8d Fix array access validation - add length checks before accessing array indices 2025-09-29 14:15:55 -03:00
523da8168a Fix pod count error - add proper validation for Prometheus query results 2025-09-29 14:11:52 -03:00
514ea60274 Fix namespace historical analysis - use Kubernetes API for accurate pod count and remove duplicate function 2025-09-29 14:07:49 -03:00
09ee5e009d Fix JSON serialization issues with safe float conversion 2025-09-29 13:50:47 -03:00
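
A safe float conversion of the kind this commit mentions might look like the sketch below. The helper name and the default of 0.0 are assumptions. The underlying issue is that Python's `json.dumps` emits `NaN` and `Infinity` tokens by default, which are not valid JSON, and raises `ValueError` when called with `allow_nan=False`.

```python
import json
import math


def safe_float(value, default: float = 0.0) -> float:
    """Convert value to a finite float, falling back to default for None, junk, NaN, or infinity."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return default
    return number if math.isfinite(number) else default


# Prometheus sample values arrive as strings and may be "NaN".
print(json.dumps({"cpu": safe_float("NaN"), "memory": safe_float("1.5")}, allow_nan=False))
```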
8307eeb646 Fix Prometheus SSL and authentication in historical analysis 2025-09-29 13:47:58 -03:00
6b2f8de6b6 Fix Prometheus queries using correct OpenShift metrics from console dashboard
- Updated CPU usage query to use node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate
- Updated memory usage query to use container_memory_working_set_bytes with correct job and metrics_path
- Updated requests/limits queries to use kube_resourcequota with correct cluster and type parameters
- Applied fixes to both get_workload_historical_analysis and get_namespace_historical_analysis functions
- Queries now match the working queries from OpenShift console dashboard
2025-09-29 13:33:48 -03:00
32ef5d859c Fix: Remove prometheus_client parameter from historical analysis functions 2025-09-29 13:25:13 -03:00
39b6a06de7 Fix: Remove incorrect prometheus_client parameter from _query_prometheus calls 2025-09-29 13:10:51 -03:00
fd2a2f45a4 Enhance: Show specific request and limit values in ratio validation messages 2025-09-29 12:20:35 -03:00
0a5b8a03c6 Implement workload-based historical analysis with timeline buttons 2025-09-26 13:50:44 -03:00
0132a90387 Move Historical Analysis button to individual pod cards with pod-specific Prometheus queries 2025-09-26 10:01:51 -03:00
3511e1cd41 Implement individual namespace historical analysis with modal UI 2025-09-26 09:07:58 -03:00
f38689d9dd Translate all Portuguese text to English 2025-09-25 21:05:41 -03:00
f8279933d6 Fix: Translate all remaining Portuguese text to English in routes, services and frontend 2025-09-25 20:40:52 -03:00
89a7ee41de Fix: Translate all validation messages and UI text from Portuguese to English 2025-09-25 20:08:13 -03:00
3a6875a80e Add CI/CD with GitHub Actions and migrate to Deployment
- Migrate from DaemonSet to Deployment for better efficiency
- Add GitHub Actions for automatic build and deploy
- Add Blue-Green deployment strategy with health checks
- Add scripts for development and production workflows
- Update documentation with CI/CD flow
2025-09-25 17:20:38 -03:00
4d60c0e039 Initial commit: OpenShift Resource Governance Tool
- Implements a complete resource governance tool
- Python backend with FastAPI for data collection
- Validations following Red Hat best practices
- Integration with Prometheus and VPA
- Interactive web UI for visualization
- Reports in JSON, CSV, and PDF
- Deployed as a DaemonSet with RBAC
- Automation scripts for build and deploy
2025-09-25 14:26:24 -03:00