
Daily Civic Intelligence Refinement Engine #407

Open
RohanExploit wants to merge 2 commits into main from civic-intelligence-engine-538990691647937483

Conversation

@RohanExploit
Owner

@RohanExploit RohanExploit commented Feb 16, 2026

Implemented a self-improving AI infrastructure that runs daily to analyze civic issues, detect trends, adapt severity scoring and duplicate detection parameters, and generate a daily Civic Intelligence Index snapshot. This replaces hardcoded rules with dynamic configurations stored in data/modelWeights.json.


PR created automatically by Jules for task 538990691647937483 started by @RohanExploit


Summary by cubic

Adds a daily Civic Intelligence Refinement engine that detects trends, adapts model weights, and generates a daily Civic Intelligence Index snapshot. Also fixes a Render deployment import error. Implements task 538990691647937483.

  • New Features

    • Daily engine orchestrates trend analysis, weight updates (severity mapping, keywords, duplicate radius), and snapshot output.
    • AdaptiveWeights persists dynamic config (severity keywords, urgency patterns, categories, severity_mapping, duplicate_search_radius).
    • IntelligenceIndex computes a 0–100 daily score; TrendAnalyzer finds top keywords, category spikes, and hotspots.
    • PriorityEngine now uses AdaptiveWeights; duplicate detection and grievance severity mapping read dynamic values.
    • Scheduler job to run the refinement; tests cover index scoring, weight upgrades, keyword additions, and radius tuning.
  • Bug Fixes

    • Removed PYTHONPATH from render.yaml to fix backend imports during Render deploy.

Written for commit c4ec97d. Summary will update on new commits.

Summary by CodeRabbit

Release Notes

New Features

  • Implemented dynamic adaptive weight system for intelligent issue classification that evolves based on patterns
  • Added Civic Intelligence Engine providing daily trend analysis, hotspot detection, and intelligence scoring
  • Automated daily refinement process optimizes severity mappings and keywords from recent civic activity
  • Enhanced duplicate detection with adaptive spatial radius based on issue density

Tests

  • Added comprehensive test coverage for civic intelligence workflows and trend analysis

- Created `AdaptiveWeights` for dynamic model weights (severity, keywords, duplicate radius).
- Created `TrendAnalyzer` for daily trend detection.
- Created `IntelligenceIndex` for calculating daily civic score.
- Created `CivicIntelligenceEngine` to orchestrate daily refinement.
- Refactored `PriorityEngine` and `tasks.py` to use adaptive weights.
- Updated `issues.py` to use dynamic duplicate search radius.
- Added scheduled job `backend/scheduler/daily_refinement_job.py`.
- Added tests in `tests/test_civic_intelligence.py`.
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings February 16, 2026 18:56
@netlify

netlify bot commented Feb 16, 2026

Deploy Preview for fixmybharat canceled.

🔨 Latest commit: c4ec97d
🔍 Latest deploy log: https://app.netlify.com/projects/fixmybharat/deploys/69936c1204a9c90008e77556

@github-actions

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.

@coderabbitai

coderabbitai bot commented Feb 16, 2026

📝 Walkthrough

Walkthrough

This pull request introduces an adaptive weights management system and civic intelligence refinement engine. New modules handle weight persistence, trend analysis, intelligence scoring, and daily refinement jobs. Existing systems are updated to use dynamic weights instead of hardcoded values. A test suite validates the end-to-end workflow.

Changes

  • Adaptive Weights System (backend/adaptive_weights.py, data/modelWeights.json): New AdaptiveWeights class manages dynamic configuration for severity keywords, urgency patterns, categories, and severity mappings with JSON persistence. Default schemas provided for instantiation and fallback.
  • Civic Intelligence Engine (backend/civic_intelligence.py, backend/scheduler/daily_refinement_job.py): New singleton CivicIntelligenceEngine orchestrates the daily refinement workflow, including trend analysis, weight optimization, keyword optimization, and duplicate radius adjustment. A scheduled job executes the workflow and reports results.
  • Trend Analysis & Scoring (backend/trend_analyzer.py, backend/intelligence_index.py): TrendAnalyzer detects trends, keywords, and geographic hotspots from 24-hour activity. IntelligenceIndex computes a civic intelligence score (0-100) based on resolutions, verifications, engagement, and category spikes.
  • Integration with Existing Systems (backend/priority_engine.py, backend/routers/issues.py, backend/tasks.py): PriorityEngine refactored to load dynamic weights via reload_weights() instead of hardcoded mappings. The deduplication radius in the issues router now derives from AdaptiveWeights. The task module uses dynamic severity mapping for grievance creation.
  • Configuration & Environment (render.yaml): Removed the PYTHONPATH environment variable from the backend service configuration.
  • Test Coverage (tests/test_civic_intelligence.py): Comprehensive integration test validating end-to-end civic intelligence refinement, weight optimization, keyword updates, duplicate radius adjustment, and snapshot persistence, with specific assertions on computed scores and state changes.

Sequence Diagram(s)

sequenceDiagram
    participant Job as Daily Refinement Job
    participant Engine as CivicIntelligenceEngine
    participant Trends as TrendAnalyzer
    participant Index as IntelligenceIndex
    participant Weights as AdaptiveWeights
    participant DB as Database

    Job->>Engine: refine_daily(db)
    
    Engine->>Trends: analyze_trends(db)
    Trends->>DB: query last 48h issues
    DB-->>Trends: issue records
    Trends-->>Engine: trend_data (keywords, hotspots, spikes)
    
    Engine->>Index: calculate_score(db, trend_data)
    Index->>DB: query 24h metrics
    DB-->>Index: activity data
    Index-->>Engine: intelligence_score
    
    Engine->>Engine: _optimize_weights(db)
    Engine->>DB: query critical grievances
    DB-->>Engine: grievance records
    Engine->>Weights: update_severity_mapping()
    Weights-->>Engine: updated mappings
    
    Engine->>Engine: _optimize_keywords(db, trend_data)
    Engine->>Weights: add_keyword_to_category()
    
    Engine->>Engine: _optimize_duplicates(trend_data)
    Engine->>Weights: update duplicate_search_radius
    
    Engine->>Engine: _save_snapshot(snapshot)
    
    Engine-->>Job: snapshot result
    Job->>Job: output report

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested labels

ECWoC26, ECWoC26-L3

Poem

🐰 New weights dance in JSON delight,
Trends spin hotspots through day and night,
Intelligence blooms from grievances deep,
While rabbits refine what the city shall reap! 🌱

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 46.67%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

  • Title check: ✅ Passed. The title directly and clearly describes the main change: a daily Civic Intelligence Refinement Engine, which is the core feature introduced across all modified files.
  • Description check: ✅ Passed. The description is comprehensive and well-structured, covering the main implementation, key components, bug fixes, and testing. It exceeds the template requirements by including detailed summaries and auto-generated context.
  • Merge Conflict Detection: ✅ Passed. No merge conflicts detected when merging into main.


@cubic-dev-ai cubic-dev-ai bot left a comment

13 issues found across 10 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/trend_analyzer.py">

<violation number="1" location="backend/trend_analyzer.py:86">
P2: Bug: Falsy check on float coordinates will skip valid `0.0` values. `0.0` is a valid coordinate but evaluates as falsy in Python. Use explicit `is not None` checks instead.</violation>
</file>

<file name="backend/routers/issues.py">

<violation number="1" location="backend/routers/issues.py:98">
P2: No validation on the dynamically loaded `duplicate_search_radius`. The value is read from a JSON file and used directly without any type check or bounds enforcement. If the JSON is corrupted or the value is set to 0, negative, or extremely large (e.g., by the daily refinement engine), deduplication will silently break. Consider adding a guard, e.g., `max(10.0, min(search_radius, 500.0))`.</violation>

<violation number="2" location="backend/routers/issues.py:98">
P2: Synchronous file I/O blocking the async event loop. `AdaptiveWeights()` reads `data/modelWeights.json` from disk in its constructor (`load_weights()`), and this is called directly in the async handler without `run_in_threadpool`. The rest of this endpoint is careful to offload all blocking I/O to a thread pool — this should be consistent. Consider either: (1) wrapping this in `await run_in_threadpool(lambda: AdaptiveWeights().duplicate_search_radius)`, or (2) caching the `AdaptiveWeights` instance (e.g., as a module-level singleton) so the file is not re-read on every request.</violation>
</file>

<file name="backend/civic_intelligence.py">

<violation number="1" location="backend/civic_intelligence.py:23">
P2: Thread-unsafe singleton: the `__new__` check-then-assign on `_instance` is not protected by a lock. Under concurrent access (e.g., FastAPI with multiple workers/threads), this can produce multiple instances with divergent state. Use `threading.Lock` to guard initialization.</violation>
</file>

<file name="backend/tasks.py">

<violation number="1" location="backend/tasks.py:51">
P2: Severity values loaded from the external JSON file are not validated before use. Downstream, `SeverityLevel(severity)` in `grievance_service.py` will raise `ValueError` if the value isn't one of `'low'`, `'medium'`, `'high'`, `'critical'`. The old hardcoded mapping was safe by construction; the new dynamic source should validate values before passing them along.</violation>
</file>

<file name="tests/test_civic_intelligence.py">

<violation number="1" location="tests/test_civic_intelligence.py:73">
P2: Singleton test isolation issue: `engine.weights_manager` is replaced but never restored after the test. Since `CivicIntelligenceEngine` is a singleton, the modified `weights_manager` leaks to any subsequent test that uses `get_civic_intelligence_engine()`. Consider resetting `_instance = None` on `CivicIntelligenceEngine` in a fixture teardown, or saving/restoring the original `weights_manager`.</violation>

<violation number="2" location="tests/test_civic_intelligence.py:141">
P2: Missing `import sys` — `sys.exit()` on this line will raise `NameError` at runtime when the file is executed directly.</violation>
</file>

<file name="backend/priority_engine.py">

<violation number="1" location="backend/priority_engine.py:17">
P1: `reload_weights()` doesn't actually reload from disk. `get_weights()` only returns cached in-memory state from the `AdaptiveWeights` instance. You need to call `self.weights_manager.load_weights()` first to re-read `modelWeights.json`, otherwise this method is a no-op after the initial construction — defeating the purpose of dynamic weight refinement.</violation>
</file>

<file name="backend/scheduler/daily_refinement_job.py">

<violation number="1" location="backend/scheduler/daily_refinement_job.py:22">
P2: Inconsistent use of `print()` vs `logger.info()`. This is a scheduled job where stdout may not be captured by log aggregation. All output on lines 22–36 should use `logger.info()` to ensure consistent observability — especially since errors already go through `logger.error()`.</violation>
</file>

<file name="backend/adaptive_weights.py">

<violation number="1" location="backend/adaptive_weights.py:126">
P1: Shallow `.copy()` on nested dicts causes shared mutable state: `add_keyword_to_category` will mutate the class-level `DEFAULT_CATEGORIES` (and `DEFAULT_SEVERITY_KEYWORDS`) because inner lists are still shared references. Use `copy.deepcopy()` instead.</violation>

<violation number="2" location="backend/adaptive_weights.py:163">
P2: `os.makedirs(os.path.dirname(self.weights_file))` crashes with `FileNotFoundError` if `weights_file` has no directory component (e.g., `'modelWeights.json'`). Guard against an empty dirname.</violation>
</file>

<file name="data/modelWeights.json">

<violation number="1" location="data/modelWeights.json:51">
P1: Duplicate keywords across severity levels cause ambiguous scoring. `"smoke"` appears in both `critical` (line 51) and `high` (line 66), and `"attack"` appears in both `critical` (line 19) and `high` (line 81). When a report contains one of these words, the assigned severity will depend on implementation-specific iteration order rather than an intentional classification. Remove the duplicates from the lower-priority level, or if both levels should match, document the precedence rule.</violation>

<violation number="2" location="data/modelWeights.json:217">
P2: Duplicate keywords across non-adjacent severity levels. `"leaning"` is in both `high` (line 118) and `low` (line 217), and `"dirty"` is in both `medium` (line 134) and `low` (line 216). The `high`↔`low` gap for "leaning" is especially problematic — a leaning structure could be scored as low-priority instead of high. Remove from the less appropriate level.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
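
For the thread-safety finding on backend/civic_intelligence.py above (violation 1 for that file), a minimal lock-guarded singleton sketch; only the _instance attribute comes from the review, the lock and everything else are illustrative:

import threading

class CivicIntelligenceEngine:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: the outer check avoids taking the lock on the hot path,
        # the inner check ensures only one thread ever assigns _instance
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance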

Comment on lines +17 to +19
def reload_weights(self):
"""Reloads weights from the AdaptiveWeights manager."""
weights = self.weights_manager.get_weights()
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P1: reload_weights() doesn't actually reload from disk. get_weights() only returns cached in-memory state from the AdaptiveWeights instance. You need to call self.weights_manager.load_weights() first to re-read modelWeights.json, otherwise this method is a no-op after the initial construction — defeating the purpose of dynamic weight refinement.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/priority_engine.py, line 17:

<comment>`reload_weights()` doesn't actually reload from disk. `get_weights()` only returns cached in-memory state from the `AdaptiveWeights` instance. You need to call `self.weights_manager.load_weights()` first to re-read `modelWeights.json`, otherwise this method is a no-op after the initial construction — defeating the purpose of dynamic weight refinement.</comment>

<file context>
@@ -1,105 +1,25 @@
-            "Environment": ["tree", "cutting", "deforestation", "forest", "nature"],
-            "Flooding": ["flood", "waterlogging", "water logged", "rain", "drainage"]
-        }
+    def reload_weights(self):
+        """Reloads weights from the AdaptiveWeights manager."""
+        weights = self.weights_manager.get_weights()
</file context>
Suggested change
    def reload_weights(self):
        """Reloads weights from the AdaptiveWeights manager."""
-       weights = self.weights_manager.get_weights()
+       self.weights_manager.load_weights()
+       weights = self.weights_manager.get_weights()


def __init__(self, weights_file: str = DEFAULT_WEIGHTS_FILE):
self.weights_file = weights_file
self.severity_keywords = self.DEFAULT_SEVERITY_KEYWORDS.copy()
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P1: Shallow .copy() on nested dicts causes shared mutable state: add_keyword_to_category will mutate the class-level DEFAULT_CATEGORIES (and DEFAULT_SEVERITY_KEYWORDS) because inner lists are still shared references. Use copy.deepcopy() instead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/adaptive_weights.py, line 126:

<comment>Shallow `.copy()` on nested dicts causes shared mutable state: `add_keyword_to_category` will mutate the class-level `DEFAULT_CATEGORIES` (and `DEFAULT_SEVERITY_KEYWORDS`) because inner lists are still shared references. Use `copy.deepcopy()` instead.</comment>

<file context>
@@ -0,0 +1,197 @@
+
+    def __init__(self, weights_file: str = DEFAULT_WEIGHTS_FILE):
+        self.weights_file = weights_file
+        self.severity_keywords = self.DEFAULT_SEVERITY_KEYWORDS.copy()
+        self.urgency_patterns = list(self.DEFAULT_URGENCY_PATTERNS)
+        self.categories = self.DEFAULT_CATEGORIES.copy()
</file context>
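
A minimal sketch of the deep-copy fix, showing only the constructor body and assuming the DEFAULT_* class attributes named in the file context above:

import copy

def __init__(self, weights_file: str = DEFAULT_WEIGHTS_FILE):
    self.weights_file = weights_file
    # Deep copies keep the nested lists independent, so add_keyword_to_category
    # can append without mutating the class-level defaults
    self.severity_keywords = copy.deepcopy(self.DEFAULT_SEVERITY_KEYWORDS)
    self.urgency_patterns = copy.deepcopy(self.DEFAULT_URGENCY_PATTERNS)
    self.categories = copy.deepcopy(self.DEFAULT_CATEGORIES)
    self.severity_mapping = copy.deepcopy(self.DEFAULT_SEVERITY_MAPPING)
    self.duplicate_search_radius = self.DEFAULT_DUPLICATE_RADIUS  # plain number, no copy needed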

"open electrical box",
"burning",
"flame",
"smoke",
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P1: Duplicate keywords across severity levels cause ambiguous scoring. "smoke" appears in both critical (line 51) and high (line 66), and "attack" appears in both critical (line 19) and high (line 81). When a report contains one of these words, the assigned severity will depend on implementation-specific iteration order rather than an intentional classification. Remove the duplicates from the lower-priority level, or if both levels should match, document the precedence rule.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At data/modelWeights.json, line 51:

<comment>Duplicate keywords across severity levels cause ambiguous scoring. `"smoke"` appears in both `critical` (line 51) and `high` (line 66), and `"attack"` appears in both `critical` (line 19) and `high` (line 81). When a report contains one of these words, the assigned severity will depend on implementation-specific iteration order rather than an intentional classification. Remove the duplicates from the lower-priority level, or if both levels should match, document the precedence rule.</comment>

<file context>
@@ -0,0 +1,466 @@
+            "open electrical box",
+            "burning",
+            "flame",
+            "smoke",
+            "crack",
+            "fissure"
</file context>

# Group by approximate location (0.01 degree ~ 1.1km)
loc_counter = Counter()
for issue in issues:
if issue.latitude and issue.longitude:
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: Bug: Falsy check on float coordinates will skip valid 0.0 values. 0.0 is a valid coordinate but evaluates as falsy in Python. Use explicit is not None checks instead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/trend_analyzer.py, line 86:

<comment>Bug: Falsy check on float coordinates will skip valid `0.0` values. `0.0` is a valid coordinate but evaluates as falsy in Python. Use explicit `is not None` checks instead.</comment>

<file context>
@@ -0,0 +1,99 @@
+        # Group by approximate location (0.01 degree ~ 1.1km)
+        loc_counter = Counter()
+        for issue in issues:
+            if issue.latitude and issue.longitude:
+                # Round to 2 decimal places (approx 1.1km resolution)
+                key = (round(issue.latitude, 2), round(issue.longitude, 2))
</file context>

try:
# Find existing open issues within 50 meters
# Get dynamic radius from AdaptiveWeights
search_radius = AdaptiveWeights().duplicate_search_radius
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: No validation on the dynamically loaded duplicate_search_radius. The value is read from a JSON file and used directly without any type check or bounds enforcement. If the JSON is corrupted or the value is set to 0, negative, or extremely large (e.g., by the daily refinement engine), deduplication will silently break. Consider adding a guard, e.g., max(10.0, min(search_radius, 500.0)).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/issues.py, line 98:

<comment>No validation on the dynamically loaded `duplicate_search_radius`. The value is read from a JSON file and used directly without any type check or bounds enforcement. If the JSON is corrupted or the value is set to 0, negative, or extremely large (e.g., by the daily refinement engine), deduplication will silently break. Consider adding a guard, e.g., `max(10.0, min(search_radius, 500.0))`.</comment>

<file context>
@@ -93,9 +94,12 @@ async def create_issue(
         try:
-            # Find existing open issues within 50 meters
+            # Get dynamic radius from AdaptiveWeights
+            search_radius = AdaptiveWeights().duplicate_search_radius
+
+            # Find existing open issues within dynamic radius
</file context>
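
A hedged sketch of that guard; the 10-500 m bounds and the 50.0 fallback are the reviewer's example values, not project constants:

# Clamp the dynamically loaded radius to a sane range before using it
try:
    search_radius = max(10.0, min(float(AdaptiveWeights().duplicate_search_radius), 500.0))
except (TypeError, ValueError):
    search_radius = 50.0  # fall back to the previous hardcoded default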

engine._save_snapshot = original_save

if __name__ == "__main__":
sys.exit(pytest.main(["-v", __file__]))
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: Missing `import sys` — `sys.exit()` on this line will raise `NameError` at runtime when the file is executed directly.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_civic_intelligence.py, line 141:

<comment>Missing `import sys` — `sys.exit()` on this line will raise `NameError` at runtime when the file is executed directly.</comment>

<file context>
@@ -0,0 +1,141 @@
+    engine._save_snapshot = original_save
+
+if __name__ == "__main__":
+    sys.exit(pytest.main(["-v", __file__]))
</file context>


# 2. Setup Engine with Test Weights
engine = get_civic_intelligence_engine()
engine.weights_manager = AdaptiveWeights(weights_file=str(test_weights_file))
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: Singleton test isolation issue: engine.weights_manager is replaced but never restored after the test. Since CivicIntelligenceEngine is a singleton, the modified weights_manager leaks to any subsequent test that uses get_civic_intelligence_engine(). Consider resetting _instance = None on CivicIntelligenceEngine in a fixture teardown, or saving/restoring the original weights_manager.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_civic_intelligence.py, line 73:

<comment>Singleton test isolation issue: `engine.weights_manager` is replaced but never restored after the test. Since `CivicIntelligenceEngine` is a singleton, the modified `weights_manager` leaks to any subsequent test that uses `get_civic_intelligence_engine()`. Consider resetting `_instance = None` on `CivicIntelligenceEngine` in a fixture teardown, or saving/restoring the original `weights_manager`.</comment>

<file context>
@@ -0,0 +1,141 @@
+
+    # 2. Setup Engine with Test Weights
+    engine = get_civic_intelligence_engine()
+    engine.weights_manager = AdaptiveWeights(weights_file=str(test_weights_file))
+    # Reset severity mapping to default to test update
+    engine.weights_manager.severity_mapping["pothole"] = "medium"
</file context>
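
A small pytest fixture sketch for the suggested teardown; it assumes CivicIntelligenceEngine stores its singleton in _instance, as described above:

import pytest
from backend.civic_intelligence import CivicIntelligenceEngine

@pytest.fixture
def fresh_engine():
    # Request this fixture in any test that swaps engine.weights_manager
    yield
    # Drop the cached singleton so later tests rebuild the engine with default state
    CivicIntelligenceEngine._instance = None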

engine = get_civic_intelligence_engine()
result = engine.refine_daily(db)

print("\n--- Civic Intelligence Report ---")
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: Inconsistent use of print() vs logger.info(). This is a scheduled job where stdout may not be captured by log aggregation. All output on lines 22–36 should use logger.info() to ensure consistent observability — especially since errors already go through logger.error().

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/scheduler/daily_refinement_job.py, line 22:

<comment>Inconsistent use of `print()` vs `logger.info()`. This is a scheduled job where stdout may not be captured by log aggregation. All output on lines 22–36 should use `logger.info()` to ensure consistent observability — especially since errors already go through `logger.error()`.</comment>

<file context>
@@ -0,0 +1,47 @@
+        engine = get_civic_intelligence_engine()
+        result = engine.refine_daily(db)
+
+        print("\n--- Civic Intelligence Report ---")
+        print(f"Date: {result['date']}")
+        print(f"Index Score: {result['civic_intelligence_index']['score']}")
</file context>
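
A sketch of the logger-based report block, with field names copied from the file context; the logger object itself is assumed to already exist, as implied by the existing logger.error calls:

logger.info("--- Civic Intelligence Report ---")
logger.info("Date: %s", result["date"])
logger.info("Index Score: %s", result["civic_intelligence_index"]["score"])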

}

# Ensure directory exists
os.makedirs(os.path.dirname(self.weights_file), exist_ok=True)
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: os.makedirs(os.path.dirname(self.weights_file)) crashes with FileNotFoundError if weights_file has no directory component (e.g., 'modelWeights.json'). Guard against an empty dirname.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/adaptive_weights.py, line 163:

<comment>`os.makedirs(os.path.dirname(self.weights_file))` crashes with `FileNotFoundError` if `weights_file` has no directory component (e.g., `'modelWeights.json'`). Guard against an empty dirname.</comment>

<file context>
@@ -0,0 +1,197 @@
+        }
+
+        # Ensure directory exists
+        os.makedirs(os.path.dirname(self.weights_file), exist_ok=True)
+
+        try:
</file context>

"old",
"rusty",
"dirty",
"leaning"
@cubic-dev-ai cubic-dev-ai bot Feb 16, 2026

P2: Duplicate keywords across non-adjacent severity levels. "leaning" is in both high (line 118) and low (line 217), and "dirty" is in both medium (line 134) and low (line 216). The high↔low gap for "leaning" is especially problematic — a leaning structure could be scored as low-priority instead of high. Remove from the less appropriate level.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At data/modelWeights.json, line 217:

<comment>Duplicate keywords across non-adjacent severity levels. `"leaning"` is in both `high` (line 118) and `low` (line 217), and `"dirty"` is in both `medium` (line 134) and `low` (line 216). The `high`↔`low` gap for "leaning" is especially problematic — a leaning structure could be scored as low-priority instead of high. Remove from the less appropriate level.</comment>

<file context>
@@ -0,0 +1,466 @@
+            "old",
+            "rusty",
+            "dirty",
+            "leaning"
+        ]
+    },
</file context>

Copilot AI left a comment

Pull request overview

Adds a “Daily Civic Intelligence Refinement Engine” that analyzes recent issues daily, adapts configurable weights (severity mapping, keywords, duplicate radius) persisted in data/modelWeights.json, and saves a daily Civic Intelligence Index snapshot for reporting.

Changes:

  • Introduces CivicIntelligenceEngine + supporting modules (TrendAnalyzer, IntelligenceIndex) to compute trends, adjust weights, and save daily snapshots.
  • Replaces hardcoded severity mapping and fixed dedupe radius with AdaptiveWeights loaded from data/modelWeights.json.
  • Adds a scheduler entrypoint and an integration-style test for the daily refinement flow.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.

Show a summary per file

  • backend/adaptive_weights.py: New weights manager that loads/saves dynamic configuration to JSON.
  • data/modelWeights.json: Baseline dynamic configuration (keywords, patterns, categories, severity mapping).
  • backend/priority_engine.py: Switches priority analysis from hardcoded rules to AdaptiveWeights-backed config.
  • backend/routers/issues.py: Uses the adaptive duplicate radius for spatial deduplication.
  • backend/tasks.py: Uses adaptive severity mapping when creating grievances in background jobs.
  • backend/trend_analyzer.py: New trend detection (keywords, category spikes, hotspots).
  • backend/intelligence_index.py: New daily index score calculator derived from trend + DB metrics.
  • backend/civic_intelligence.py: New orchestration engine for daily refinement + snapshot persistence.
  • backend/scheduler/daily_refinement_job.py: New runnable script to execute the daily refinement and print a report.
  • tests/test_civic_intelligence.py: New test covering trend analysis, index scoring, weight updates, and snapshot output.


Comment on lines +5 to +10
# Ensure project root is in sys.path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../../')))

from backend.database import SessionLocal
from backend.civic_intelligence import get_civic_intelligence_engine

Copilot AI Feb 16, 2026

This script and the refinement engine rely on relative paths (e.g., data/dailySnapshots and data/modelWeights.json) and only tweak sys.path. When run via cron/systemd, the working directory is often not the repo root, so snapshots/weights may be written to an unexpected location or fail. Prefer resolving paths relative to the project root (based on __file__/Path.resolve()) and/or accept an explicit base directory via env/config instead of assuming CWD.
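
A minimal sketch of resolving paths from the script's own location instead of the CWD; the parents[2] hop assumes the file sits at backend/scheduler/daily_refinement_job.py, and the data-directory names are taken from the comment above:

from pathlib import Path

# backend/scheduler/daily_refinement_job.py -> repo root is two levels up
PROJECT_ROOT = Path(__file__).resolve().parents[2]
WEIGHTS_FILE = PROJECT_ROOT / "data" / "modelWeights.json"
SNAPSHOT_DIR = PROJECT_ROOT / "data" / "dailySnapshots"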

Comment on lines +124 to +131
def __init__(self, weights_file: str = DEFAULT_WEIGHTS_FILE):
self.weights_file = weights_file
self.severity_keywords = self.DEFAULT_SEVERITY_KEYWORDS.copy()
self.urgency_patterns = list(self.DEFAULT_URGENCY_PATTERNS)
self.categories = self.DEFAULT_CATEGORIES.copy()
self.severity_mapping = self.DEFAULT_SEVERITY_MAPPING.copy()
self.duplicate_search_radius = self.DEFAULT_DUPLICATE_RADIUS

Copilot AI Feb 16, 2026

AdaptiveWeights initializes nested default dict/list structures with shallow copies (dict.copy() / list(...)). This means the inner lists are shared across instances and can also mutate the class-level DEFAULT_* constants when add_keyword_to_category() appends, causing cross-request/test contamination. Use copy.deepcopy(...) (or build fresh dicts/lists) for severity_keywords, urgency_patterns, and categories (and any other nested defaults) during initialization to ensure instance isolation.

Comment on lines +163 to +170
os.makedirs(os.path.dirname(self.weights_file), exist_ok=True)

try:
with open(self.weights_file, 'w') as f:
json.dump(data, f, indent=4)
logger.info(f"Saved weights to {self.weights_file}")
except Exception as e:
logger.error(f"Failed to save weights to {self.weights_file}: {e}")
Copilot AI Feb 16, 2026

save_weights() writes JSON directly to the target file. If the app reads data/modelWeights.json concurrently (e.g., API requests creating AdaptiveWeights while the daily refinement job is saving), readers can observe a partially-written file and hit JSON decode errors. Consider writing to a temp file and atomically replacing (e.g., write to weights_file.tmp, flush/fsync, then os.replace), and optionally add a file lock for multi-process safety.

Suggested change
-os.makedirs(os.path.dirname(self.weights_file), exist_ok=True)
-try:
-    with open(self.weights_file, 'w') as f:
-        json.dump(data, f, indent=4)
-    logger.info(f"Saved weights to {self.weights_file}")
-except Exception as e:
-    logger.error(f"Failed to save weights to {self.weights_file}: {e}")
+directory = os.path.dirname(self.weights_file)
+if directory:
+    os.makedirs(directory, exist_ok=True)
+temp_file = f"{self.weights_file}.tmp"
+try:
+    # Write to a temporary file first
+    with open(temp_file, 'w') as f:
+        json.dump(data, f, indent=4)
+        f.flush()
+        os.fsync(f.fileno())
+    # Atomically replace the target file with the temp file
+    os.replace(temp_file, self.weights_file)
+    logger.info(f"Saved weights to {self.weights_file}")
+except Exception as e:
+    logger.error(f"Failed to save weights to {self.weights_file}: {e}")
+    # Best-effort cleanup of temp file on failure
+    try:
+        if os.path.exists(temp_file):
+            os.remove(temp_file)
+    except Exception:
+        # Swallow cleanup errors to avoid masking the original exception
+        pass

Comment on lines +97 to 103
# Get dynamic radius from AdaptiveWeights
search_radius = AdaptiveWeights().duplicate_search_radius

# Find existing open issues within dynamic radius
# Optimization: Use bounding box to filter candidates in SQL
min_lat, max_lat, min_lon, max_lon = get_bounding_box(latitude, longitude, 50.0)
min_lat, max_lat, min_lon, max_lon = get_bounding_box(latitude, longitude, search_radius)

Copilot AI Feb 16, 2026

This instantiates AdaptiveWeights() inside the request path, which always hits the filesystem to load data/modelWeights.json. Since create_issue is a hot endpoint, this adds guaranteed per-request disk I/O and JSON parsing. Consider caching the weights manager (module-level singleton with explicit reload/TTL) or pulling the radius from a shared in-memory config that the daily refinement job updates.
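
One possible shape for the caching option mentioned above, using functools.lru_cache; the helper name is illustrative, and when (or whether) to invalidate the cache after the daily refinement is left open:

from functools import lru_cache
from backend.adaptive_weights import AdaptiveWeights

@lru_cache(maxsize=1)
def get_adaptive_weights() -> AdaptiveWeights:
    # Reads data/modelWeights.json once per process;
    # call get_adaptive_weights.cache_clear() to force a re-read after refinement
    return AdaptiveWeights()

# In create_issue:
# search_radius = get_adaptive_weights().duplicate_search_radius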

Comment on lines 13 to +22
def __init__(self):
# Keyword dictionaries for Severity Classification
self.severity_keywords = {
"critical": [
"fire", "explosion", "blood", "death", "collapse", "gas leak",
"electric shock", "spark", "electrocution", "drowning",
"flood", "landslide", "earthquake", "cyclone", "hurricane",
"attack", "assault", "rabid", "deadly", "fatal", "emergency",
"blocked road", "ambulance", "hospital", "school", "child",
"exposed wire", "transformer", "chemical", "toxic", "poison",
"weapon", "gun", "bomb", "terror", "riot", "stampede",
"structural failure", "pillar", "bridge", "flyover",
"open manhole", "live wire", "gas smell", "open electrical box",
"burning", "flame", "smoke", "crack", "fissure"
],
"high": [
"accident", "injury", "broken", "bleeding", "hazard", "risk",
"dangerous", "unsafe", "threat", "pollution", "smoke",
"sewage", "overflow", "contamination", "infection", "disease",
"mosquito", "dengue", "malaria", "typhoid", "cholera",
"rat", "snake", "stray dog", "bite", "attack", "cattle",
"theft", "robbery", "burglary", "harassment", "abuse",
"illegal", "crime", "violation", "bribe", "corruption",
"traffic jam", "congestion", "gridlock", "delay",
"no water", "power cut", "blackout", "load shedding",
"pothole", "manhole", "open drain", "water logging",
"dead", "animal", "fish", "stuck",
"not working", "signal", "traffic light", "fallen tree",
"water leakage", "leakage", "burst", "pipe burst", "damage",
"leaning", "tilted", "unstable", "waterlogging"
],
"medium": [
"garbage", "trash", "waste", "litter", "rubbish", "dustbin",
"smell", "odor", "stink", "foul", "dirty", "unclean",
"messy", "ugly", "eyesore", "bad", "poor",
"leak", "drip", "seepage", "moisture", "damp",
"noise", "loud", "sound", "music", "party", "barking",
"encroachment", "hawker", "vendor", "stall", "shop",
"parking", "parked", "vehicle", "car", "bike", "scooter",
"construction", "debris", "material", "sand", "cement",
"graffiti", "poster", "banner", "hoarding", "advertisement",
"slippery", "muddy", "path", "pavement", "sidewalk",
"crowd", "gathering", "tap", "wasting", "running water",
"speed breaker", "hump", "bump"
],
"low": [
"light", "lamp", "bulb", "flicker", "dim", "dark",
"sign", "board", "paint", "color", "faded",
"bench", "chair", "seat", "grass", "plant", "tree",
"leaf", "branch", "garden", "park", "playground",
"cosmetic", "look", "appearance", "aesthetic",
"old", "rusty", "dirty", "leaning"
]
}
self.weights_manager = AdaptiveWeights()
self.reload_weights()

# Regex patterns for Urgency Scoring
self.urgency_patterns = [
(r"\b(now|immediately|urgent|emergency|critical|danger|help)\b", 20),
(r"\b(today|tonight|morning|evening|afternoon)\b", 10),
(r"\b(yesterday|last night|week|month)\b", 5),
(r"\b(blood|bleeding|injury|hurt|pain|dead)\b", 25),
(r"\b(fire|smoke|flame|burn|gas|leak|explosion)\b", 30),
(r"\b(blocked|stuck|trapped|jam)\b", 15),
(r"\b(school|hospital|clinic)\b", 15), # Sensitive locations
(r"\b(child|kid|baby|elderly|senior)\b", 10) # Vulnerable groups
]

# Category mapping
self.categories = {
"Fire": ["fire", "smoke", "flame", "burn", "explosion", "burning"],
"Pothole": ["pothole", "hole", "crater", "road damage", "broken road"],
"Street Light": ["light", "lamp", "bulb", "dark", "street light", "flicker"],
"Garbage": ["garbage", "trash", "waste", "litter", "rubbish", "dump", "dustbin"],
"Water Leak": ["water", "leak", "pipe", "burst", "flood", "seepage", "drip", "leakage", "tap", "running"],
"Stray Animal": ["dog", "cat", "cow", "cattle", "monkey", "bite", "stray", "animal", "rabid", "dead animal"],
"Construction Safety": ["construction", "debris", "material", "cement", "sand", "building"],
"Illegal Parking": ["parking", "parked", "blocking", "vehicle", "car", "bike"],
"Vandalism": ["graffiti", "paint", "broken", "destroy", "damage", "poster"],
"Infrastructure": ["bridge", "flyover", "pillar", "crack", "collapse", "structure", "manhole", "drain", "wire", "cable", "pole", "electrical box", "electric box", "transformer", "sidewalk", "pavement", "tile", "speed breaker", "road"],
"Traffic Sign": ["sign", "signal", "light", "traffic", "board", "direction", "stop sign"],
"Public Facilities": ["toilet", "washroom", "bench", "seat", "park", "garden", "playground", "slide", "swing"],
"Tree Hazard": ["tree", "branch", "fallen", "root", "leaf"],
"Accessibility": ["ramp", "wheelchair", "step", "stair", "access", "disability"],
"Noise Pollution": ["noise", "loud", "sound", "music", "speaker"],
"Air Pollution": ["smoke", "dust", "fume", "smell", "pollution", "air"],
"Water Pollution": ["river", "lake", "pond", "chemical", "oil", "poison", "fish"],
"Health Hazard": ["mosquito", "dengue", "malaria", "rat", "disease", "health"],
"Crowd": ["crowd", "gathering", "mob", "people", "protest"],
"Gas Leak": ["gas", "leak", "smell", "cylinder", "pipeline"],
"Environment": ["tree", "cutting", "deforestation", "forest", "nature"],
"Flooding": ["flood", "waterlogging", "water logged", "rain", "drainage"]
}
def reload_weights(self):
"""Reloads weights from the AdaptiveWeights manager."""
weights = self.weights_manager.get_weights()
self.severity_keywords = weights["severity_keywords"]
self.urgency_patterns = weights["urgency_patterns"]
self.categories = weights["categories"]
Copilot AI Feb 16, 2026

PriorityEngine loads weights once at import time via AdaptiveWeights(). After the daily refinement job updates modelWeights.json, the /api/analyze-issue endpoint will continue using stale in-memory weights until the process is restarted (since reload_weights() is never called). If the intent is “daily self-improving”, add a reload strategy (e.g., reload on each request with a cheap mtime check/TTL, or expose a refresh hook called by the scheduler).
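
A hedged sketch of the mtime-check idea on PriorityEngine; it assumes reload_weights() actually re-reads the file (i.e., after the P1 fix above) and that weights_manager exposes a weights_file path:

import os

def maybe_reload_weights(self) -> None:
    """Re-read modelWeights.json only when the file changed since the last load."""
    try:
        mtime = os.path.getmtime(self.weights_manager.weights_file)
    except OSError:
        return  # file missing or unreadable: keep the current in-memory weights
    if mtime != getattr(self, "_weights_mtime", None):
        self._weights_mtime = mtime
        self.reload_weights()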

# Group by approximate location (0.01 degree ~ 1.1km)
loc_counter = Counter()
for issue in issues:
if issue.latitude and issue.longitude:
Copilot AI Feb 16, 2026

Hotspot detection uses if issue.latitude and issue.longitude:, which will incorrectly skip valid coordinates when either value is 0.0 (falsy). Use explicit is not None checks so issues at/near the equator or prime meridian are included in hotspot/duplicate optimization logic.

Suggested change
-if issue.latitude and issue.longitude:
+if issue.latitude is not None and issue.longitude is not None:

from backend.adaptive_weights import AdaptiveWeights
from backend.trend_analyzer import TrendAnalyzer
from backend.intelligence_index import IntelligenceIndex
from backend.models import EscalationAudit, Issue, SeverityLevel, Grievance
Copilot AI Feb 16, 2026

EscalationAudit is imported but never used in this module. Please remove the unused import to avoid lint failures and keep dependencies clear.

Suggested change
-from backend.models import EscalationAudit, Issue, SeverityLevel, Grievance
+from backend.models import Issue, SeverityLevel, Grievance

Comment on lines +77 to +88
for category, count in high_severity_grievances:
# If we see many high severity issues for a category, ensure mapping reflects it
if count >= 3: # Threshold
current_mapping = self.weights_manager.severity_mapping.get(category.lower())

# If current mapping is lower (medium/low), upgrade it
# Logic: If it's mapped to 'low' or 'medium', but we see 3+ high/critical, upgrade to 'high'
# If we see 5+ critical, upgrade to 'critical' (simplified)

if current_mapping in ['low', 'medium']:
logger.info(f"Auto-adjusting severity for category '{category}' to 'high' due to {count} high/critical reports.")
self.weights_manager.update_severity_mapping(category, 'high')
Copilot AI Feb 16, 2026

The inline comment mentions upgrading to 'critical' when there are 5+ critical reports, but the current logic only upgrades to 'high' and never sets 'critical'. Either implement the critical-upgrade branch (e.g., count critical separately) or remove/adjust the comment to match behavior so future readers don’t assume it’s happening.

Suggested change
-for category, count in high_severity_grievances:
-    # If we see many high severity issues for a category, ensure mapping reflects it
-    if count >= 3: # Threshold
-        current_mapping = self.weights_manager.severity_mapping.get(category.lower())
-        # If current mapping is lower (medium/low), upgrade it
-        # Logic: If it's mapped to 'low' or 'medium', but we see 3+ high/critical, upgrade to 'high'
-        # If we see 5+ critical, upgrade to 'critical' (simplified)
-        if current_mapping in ['low', 'medium']:
-            logger.info(f"Auto-adjusting severity for category '{category}' to 'high' due to {count} high/critical reports.")
-            self.weights_manager.update_severity_mapping(category, 'high')
+# Count critical grievances separately so we can upgrade mappings to 'critical' when appropriate
+critical_grievances = db.query(Grievance.category, func.count(Grievance.id))\
+    .filter(Grievance.created_at >= one_day_ago)\
+    .filter(Grievance.severity == SeverityLevel.CRITICAL)\
+    .group_by(Grievance.category).all()
+critical_counts = {category: count for category, count in critical_grievances}
+for category, count in high_severity_grievances:
+    # If we see many high severity issues for a category, ensure mapping reflects it
+    current_mapping = self.weights_manager.severity_mapping.get(category.lower())
+    # Logic:
+    # - If we see 5+ critical reports, upgrade mapping to 'critical'
+    # - Else, if it's mapped to 'low' or 'medium' and we see 3+ high/critical, upgrade to 'high'
+    critical_count = critical_counts.get(category, 0)
+    if critical_count >= 5 and current_mapping != 'critical':
+        logger.info(
+            f"Auto-adjusting severity for category '{category}' to 'critical' due to {critical_count} critical reports."
+        )
+        self.weights_manager.update_severity_mapping(category, 'critical')
+        continue
+    if count >= 3 and current_mapping in ['low', 'medium']: # Threshold for upgrading to 'high'
+        logger.info(
+            f"Auto-adjusting severity for category '{category}' to 'high' due to {count} high/critical reports."
+        )
+        self.weights_manager.update_severity_mapping(category, 'high')

# but we can look at upvotes on issues created in last 24h)
engagement_score = 0
if total_issues > 0:
engagement_score = db.query(func.sum(Issue.upvotes)).filter(
Copilot AI Feb 16, 2026

The extra indentation on this line is inconsistent with the surrounding block and can trip strict linters/formatters. Align engagement_score = ... to one indent level under the if total_issues > 0: block.

Suggested change
-             engagement_score = db.query(func.sum(Issue.upvotes)).filter(
+            engagement_score = db.query(func.sum(Issue.upvotes)).filter(

Comment on lines +72 to +92
engine = get_civic_intelligence_engine()
engine.weights_manager = AdaptiveWeights(weights_file=str(test_weights_file))
# Reset severity mapping to default to test update
engine.weights_manager.severity_mapping["pothole"] = "medium"
engine.weights_manager.save_weights()

# 3. Run Refinement
# Patch save_snapshot to use tmp_path
snapshot_dir = tmp_path / "dailySnapshots"
os.makedirs(snapshot_dir, exist_ok=True)

original_save = engine._save_snapshot
def mock_save_snapshot(snapshot):
filename = f"{snapshot['date']}.json"
filepath = snapshot_dir / filename
with open(filepath, 'w') as f:
json.dump(snapshot, f, indent=4)

engine._save_snapshot = mock_save_snapshot

result = engine.refine_daily(test_db)
Copilot AI Feb 16, 2026

This test mutates a singleton engine instance (get_civic_intelligence_engine()) and monkey-patches _save_snapshot without a try/finally. If an assertion fails, the patched method (and temp-file-backed weights_manager) can leak into subsequent tests in the same process. Use monkeypatch (pytest fixture) and/or a try/finally to always restore _save_snapshot, and consider resetting the engine singleton between tests.

Removed the `PYTHONPATH: backend` environment variable from `render.yaml`. This setting was causing import errors for `backend.*` modules because it prepended `.../backend` to `sys.path`, leading Python to look for packages inside the `backend` directory instead of the root directory. `start-backend.py` already handles path manipulation correctly.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/priority_engine.py (1)

124-125: ⚠️ Potential issue | 🟠 Major

Module-level singleton never refreshes weights after daily refinement.

The priority_engine singleton is instantiated at import time and creates its own AdaptiveWeights instance. After the daily refinement job completes and updates modelWeights.json, this singleton continues using stale in-memory weights because reload_weights() is never called. All issue-analysis requests will use outdated weights until the worker process restarts.

The daily job invokes CivicIntelligenceEngine.refine_daily() (which has its separate AdaptiveWeights instance and saves updates to disk) but does not trigger any refresh on priority_engine.

Call priority_engine.reload_weights() after the daily refinement completes, or implement lazy reloading when the weights file's mtime changes.

🤖 Fix all issues with AI agents
In `@backend/adaptive_weights.py`:
- Around line 124-132: The constructor is making shallow copies of nested
mutable defaults causing instance mutations to corrupt class-level constants;
update __init__ to use deep copies (e.g., copy.deepcopy) for
DEFAULT_SEVERITY_KEYWORDS, DEFAULT_URGENCY_PATTERNS, DEFAULT_CATEGORIES, and
DEFAULT_SEVERITY_MAPPING instead of .copy() or list() so each instance gets
independent nested structures, import the copy module, and keep
DEFAULT_DUPLICATE_RADIUS as-is (it's immutable); ensure load_weights() behavior
still safely replaces keys but rely on deep-copied instance attributes to avoid
shared references.
- Around line 152-170: The save_weights method currently writes directly to
self.weights_file which can lead to partial/corrupt JSON under concurrent
access; change AdaptiveWeights.save_weights to perform an atomic write by
serializing data to a temporary file in the same directory (e.g.,
self.weights_file + ".tmp" or use tempfile.NamedTemporaryFile), fsync the temp
file and directory, then atomically replace the target via
os.replace(self.weights_file, temp_path); alternatively (or additionally) use an
inter-process file lock around read/write operations (e.g., with fcntl.flock)
keyed to self.weights_file to prevent concurrent writers/reader races when
loading/saving.

In `@backend/civic_intelligence.py`:
- Around line 123-145: _optimize_duplicates only increases
weights_manager.duplicate_search_radius and never reduces it; modify this method
to add a decay path: when top_hotspot count is below a low-density threshold
(e.g., count < 3) and duplicate_search_radius is above a defined baseline (e.g.,
weights_manager.duplicate_search_radius_baseline or a constant), reduce
duplicate_search_radius by a small step (e.g., -5.0) bounded at the baseline,
then call weights_manager.save_weights(); ensure the logic references
_optimize_duplicates, top_hotspot.get("count"),
weights_manager.duplicate_search_radius and weights_manager.save_weights so the
adjustment is reversible and avoids permanent ratcheting to 100m.
- Around line 63-88: The auto-upgrade in _optimize_weights is comparing display
names from Grievance.category to slug keys in
self.weights_manager.severity_mapping (so "Street Light".lower() -> "street
light" won't match "streetlight"); fix by normalizing the category into the same
slug form used by severity_mapping before lookups and updates (e.g., lowercasing
and removing whitespace/normalizing punctuation or using the project's slugify
utility if available), and use that normalized key for severity_mapping.get(...)
and when calling self.weights_manager.update_severity_mapping(category, ...)
ensure you pass both the original display name where needed and the normalized
slug when updating the mapping store.

In `@backend/routers/issues.py`:
- Around line 97-98: AdaptiveWeights() is being instantiated on every request
(blocking sync file I/O in its __init__), so replace per-request construction
with a cached module-level instance or memoized factory and read its
duplicate_search_radius from that instance (e.g., create a single
AdaptiveWeights instance similar to priority_engine in
backend/priority_engine.py or wrap AdaptiveWeights construction with
functools.lru_cache/functools.cache), or if you must read fresh-from-disk on
each call, perform AdaptiveWeights() creation/load inside an async-safe thread
via fastapi.concurrency.run_in_threadpool before accessing
duplicate_search_radius; update the code that reads duplicate_search_radius to
use the cached/memoized instance or the run_in_threadpool-wrapped construction
instead of calling AdaptiveWeights() directly on each request.

In `@backend/scheduler/daily_refinement_job.py`:
- Around line 17-44: The code creates db = SessionLocal() before the try block
so if SessionLocal() raises, the finally block's db.close() will raise a
NameError and hide the original exception; move SessionLocal() into the try (or
initialize db = None before try) and in the finally guard the close with if db:
db.close() so that db is only closed when successfully created; update the block
surrounding get_civic_intelligence_engine() and engine.refine_daily(db)
(references: SessionLocal, db, get_civic_intelligence_engine,
engine.refine_daily) accordingly to prevent masking the original error.

In `@backend/tasks.py`:
- Around line 48-54: The severity lookup is failing because issue.category
(e.g., "Street Light") doesn't match compact mapping keys (e.g., "streetlight")
and .lower() will also crash if category is None; fix by normalizing both sides:
update AdaptiveWeights.get_weights (or its loader) to normalize mapping keys
(strip whitespace, remove internal spaces/normalize punctuation, and lowercase)
and in backend/tasks.py normalize the incoming category similarly (safe against
None by using a fallback like ''), then perform the lookup against the
normalized mapping (use AdaptiveWeights, get_weights, and the severity_mapping
variable names to locate changes); this ensures _optimize_weights can take
effect and prevents AttributeError on None categories (a sketch of this normalization follows this list).

In `@backend/trend_analyzer.py`:
- Around line 82-98: The hotspot function _find_hotspots is dropping valid
coordinates at 0.0 because the guard uses "if issue.latitude and
issue.longitude" which treats 0.0 as falsy; change the check to explicitly test
for None (e.g., "is not None") so loc_counter still aggregates points at
latitude 0.0 or longitude 0.0; update the loop that builds key =
(round(issue.latitude, 2), round(issue.longitude, 2)) to only execute when
issue.latitude is not None and issue.longitude is not None and leave the rest of
the logic (loc_counter, most_common, hotspots) unchanged.

In `@tests/test_civic_intelligence.py`:
- Line 29: The local variable one_day_ago is assigned but never used; either
remove the unused assignment one_day_ago = now - timedelta(hours=24) from
tests/test_civic_intelligence.py or replace its use where a 24-hour boundary is
required (for example pass one_day_ago into the TrendAnalyzer or test setup
instead of a hardcoded time) so the variable is actually referenced (look for
uses of now and TrendAnalyzer in this test to determine the intended placement).
- Line 141: The test module calls sys.exit(pytest.main(...)) but never imports
the sys module; add an import for sys near the top of the file (alongside other
imports) so that the call to sys.exit(...) resolves correctly; ensure the import
appears before the module-level invocation of sys.exit in
tests/test_civic_intelligence.py.
- Around line 118-131: The test is missing an assertion that verifies keyword
optimization added "severe" to the Pothole category: after the optimization run
that produced updated_weights (and used AdaptiveWeights loaded from the temp
defaults), add an assertion that "severe" is present in
updated_weights['keywords']['Pothole'] (or the equivalent keywords structure
stored on updated_weights) to ensure the keyword optimization path is covered;
keep the existing duplicate radius assertion for 'duplicate_search_radius'
intact.
🧹 Nitpick comments (11)
data/modelWeights.json (1)

1-466: Duplicate keywords across severity levels create dead entries.

Several keywords appear in multiple severity tiers (e.g., "smoke" at lines 51 and 66, "attack" at lines 19 and 81, "dirty" at lines 135 and 216, "leaning" at lines 118 and 217). Since PriorityEngine._calculate_severity checks critical first, then high, etc., the lower-tier duplicates will never influence scoring — they are effectively dead entries. This also means the self-learning engine could add a keyword to a lower tier while it already exists at a higher tier, silently wasting effort.

Consider deduplicating across tiers so the intent is clear and future automated keyword additions don't produce confusion.
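
A minimal one-off dedup pass could keep each keyword only in its highest tier, e.g. (the tier names and helper name below are assumptions based on the priority order PriorityEngine checks, not project code):

TIER_ORDER = ["critical", "high", "medium", "low"]

def dedupe_severity_keywords(severity_keywords: dict) -> dict:
    """Keep each keyword only in the highest tier where it appears."""
    seen = set()
    deduped = {}
    for tier in TIER_ORDER:
        kept = []
        for kw in severity_keywords.get(tier, []):
            if kw not in seen:  # first (highest-priority) occurrence wins
                seen.add(kw)
                kept.append(kw)
        deduped[tier] = kept
    return deduped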

backend/intelligence_index.py (1)

41-45: Indentation inconsistency.

Line 43 has an extra leading space compared to line 42. While Python allows this (the if block is still syntactically correct), it breaks visual alignment and could trip up future maintainers.

         engagement_score = 0
         if total_issues > 0:
-             engagement_score = db.query(func.sum(Issue.upvotes)).filter(
+            engagement_score = db.query(func.sum(Issue.upvotes)).filter(
                 Issue.created_at >= one_day_ago
             ).scalar() or 0
backend/trend_analyzer.py (1)

66-80: Keyword extraction is limited to single words — multi-word terms from severity/category configs are invisible.

re.findall(r'\b[a-z]{3,}\b', ...) only captures individual words. Compound terms like "gas leak", "blocked road", "pipe burst", "stray dog" (which are configured in severity keywords and categories) will never surface as trending keywords. This limits the value of the trend→keyword→category auto-association in _optimize_keywords. Consider adding n-gram extraction (bigrams at minimum) if the intent is to catch these multi-word patterns.
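
A rough sketch of bigram-aware extraction (the helper name is illustrative, not the analyzer's actual code):

import re
from collections import Counter

def extract_terms(text: str, top_n: int = 5):
    """Count single words plus adjacent-word bigrams so terms like
    'gas leak' or 'pipe burst' can surface as trending keywords."""
    words = re.findall(r"\b[a-z]{3,}\b", text.lower())
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return Counter(words + bigrams).most_common(top_n)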

backend/adaptive_weights.py (2)

146-147: Use logger.exception to preserve the traceback in error logs.

logger.error(...) discards the stack trace. logger.exception(...) (or logger.error(..., exc_info=True)) inside an except block automatically attaches the traceback, which is critical for diagnosing file I/O failures in production.

Proposed fix
         except Exception as e:
-            logger.error(f"Failed to load weights from {self.weights_file}: {e}")
+            logger.exception(f"Failed to load weights from {self.weights_file}: {e}")
 ...
         except Exception as e:
-            logger.error(f"Failed to save weights to {self.weights_file}: {e}")
+            logger.exception(f"Failed to save weights to {self.weights_file}: {e}")

Also applies to: 169-170


182-187: Placeholder method update_category_weight is a no-op — consider removing or raising NotImplementedError.

The method body is pass, which silently does nothing. Callers (or future callers) won't know their call had no effect. If it's intentionally deferred, raise NotImplementedError makes the contract explicit.
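
For example (parameter names are assumed; the point is to fail loudly rather than silently do nothing):

    def update_category_weight(self, category: str, weight: float):
        raise NotImplementedError("Category weight updates are not implemented yet.")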

backend/civic_intelligence.py (3)

21-29: Singleton via __new__ is fragile — adding __init__ later will silently re-run on every call.

Python calls __init__ after __new__ on every instantiation, even when __new__ returns an existing instance. If someone adds an __init__ in the future, it will reset weights_manager, trend_analyzer, and intelligence_index on every CivicIntelligenceEngine() call, breaking the singleton contract. A safer pattern is to guard initialization in __init__ itself.

Safer singleton pattern
 class CivicIntelligenceEngine:
     _instance = None
+    _initialized = False

-    def __new__(cls):
-        if cls._instance is None:
-            cls._instance = super(CivicIntelligenceEngine, cls).__new__(cls)
-            cls._instance.weights_manager = AdaptiveWeights()
-            cls._instance.trend_analyzer = TrendAnalyzer()
-            cls._instance.intelligence_index = IntelligenceIndex()
-        return cls._instance
+    def __new__(cls):
+        if cls._instance is None:
+            cls._instance = super().__new__(cls)
+        return cls._instance
+
+    def __init__(self):
+        if not self._initialized:
+            self.weights_manager = AdaptiveWeights()
+            self.trend_analyzer = TrendAnalyzer()
+            self.intelligence_index = IntelligenceIndex()
+            type(self)._initialized = True

49-51: Two separate datetime.now() calls for the same snapshot — use a single timestamp.

Lines 50 and 51 each call datetime.now(timezone.utc). Under normal conditions the difference is negligible, but for correctness the date and timestamp in a single snapshot should be derived from one instant.

Proposed fix
+        now = datetime.now(timezone.utc)
         snapshot = {
-            "date": datetime.now(timezone.utc).date().isoformat(),
-            "timestamp": datetime.now(timezone.utc).isoformat(),
+            "date": now.date().isoformat(),
+            "timestamp": now.isoformat(),

90-121: Datetime computation is repeated inside the loop — hoist outside.

now and one_day_ago are recomputed on every keyword iteration (lines 103–104) but are effectively constant within a single refine_daily call.

Proposed fix
     def _optimize_keywords(self, db: Session, trend_data: dict):
         top_keywords = trend_data.get("top_keywords", [])
+        now = datetime.now(timezone.utc)
+        one_day_ago = now - timedelta(hours=24)

         for kw_obj in top_keywords:
             keyword = kw_obj["keyword"]
-            now = datetime.now(timezone.utc)
-            one_day_ago = now - timedelta(hours=24)

             issues_with_kw = db.query(Issue.category, func.count(Issue.id))\
tests/test_civic_intelligence.py (1)

83-90: Consider using pytest's monkeypatch fixture instead of manual patching.

Directly assigning to engine._save_snapshot and restoring in a cleanup block (line 138) is fragile — if an assertion fails before line 138, the original is never restored. Using monkeypatch.setattr handles teardown automatically even on failure.

Proposed fix
-def test_civic_intelligence_refinement(test_db, tmp_path):
+def test_civic_intelligence_refinement(test_db, tmp_path, monkeypatch):
     ...
-    original_save = engine._save_snapshot
-    def mock_save_snapshot(snapshot):
+    def mock_save_snapshot(snapshot):
         filename = f"{snapshot['date']}.json"
         filepath = snapshot_dir / filename
         with open(filepath, 'w') as f:
             json.dump(snapshot, f, indent=4)
 
-    engine._save_snapshot = mock_save_snapshot
+    monkeypatch.setattr(engine, "_save_snapshot", mock_save_snapshot)
     ...
-    # Cleanup
-    engine._save_snapshot = original_save
backend/scheduler/daily_refinement_job.py (2)

22-38: Prefer logger over print for a scheduled/background job.

Scheduled jobs typically run unattended. print output may be lost depending on the deployment environment, whereas logger.info integrates with the already-configured logging and provides timestamps, levels, and structured output.
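
A sketch of the shape this could take (the function name and surrounding structure are assumed; only the logging pattern is the point):

import logging

logger = logging.getLogger(__name__)

def run_daily_refinement():
    logger.info("Starting daily civic intelligence refinement")
    try:
        ...  # existing SessionLocal / engine.refine_daily(db) logic
        logger.info("Daily refinement completed")
    except Exception:
        logger.exception("Daily refinement failed")  # preserves the traceback for unattended runs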


5-6: sys.path manipulation is fragile.

The commit message notes that a conflicting PYTHONPATH was removed from render.yaml because start-backend.py already handles path manipulation. Having yet another sys.path.append here creates a second, potentially conflicting, path entry. Consider relying on the existing path setup or using -m invocation (python -m backend.scheduler.daily_refinement_job) from the project root instead.

Comment on lines +124 to +132
def __init__(self, weights_file: str = DEFAULT_WEIGHTS_FILE):
self.weights_file = weights_file
self.severity_keywords = self.DEFAULT_SEVERITY_KEYWORDS.copy()
self.urgency_patterns = list(self.DEFAULT_URGENCY_PATTERNS)
self.categories = self.DEFAULT_CATEGORIES.copy()
self.severity_mapping = self.DEFAULT_SEVERITY_MAPPING.copy()
self.duplicate_search_radius = self.DEFAULT_DUPLICATE_RADIUS

self.load_weights()

⚠️ Potential issue | 🟠 Major

Shallow copy of mutable class defaults — mutations can corrupt the class-level constants.

.copy() on a dict of lists is a shallow copy. If load_weights() doesn't replace a key (e.g., JSON file exists but omits severity_keywords), the instance attribute still points to the same inner lists as DEFAULT_SEVERITY_KEYWORDS. Any later mutation (e.g., self.severity_keywords["critical"].append(...)) would permanently corrupt the class-level default for all future instances.

Proposed fix — use deep copies
+import copy
+
 def __init__(self, weights_file: str = DEFAULT_WEIGHTS_FILE):
     self.weights_file = weights_file
-    self.severity_keywords = self.DEFAULT_SEVERITY_KEYWORDS.copy()
-    self.urgency_patterns = list(self.DEFAULT_URGENCY_PATTERNS)
-    self.categories = self.DEFAULT_CATEGORIES.copy()
-    self.severity_mapping = self.DEFAULT_SEVERITY_MAPPING.copy()
+    self.severity_keywords = copy.deepcopy(self.DEFAULT_SEVERITY_KEYWORDS)
+    self.urgency_patterns = copy.deepcopy(self.DEFAULT_URGENCY_PATTERNS)
+    self.categories = copy.deepcopy(self.DEFAULT_CATEGORIES)
+    self.severity_mapping = copy.deepcopy(self.DEFAULT_SEVERITY_MAPPING)
     self.duplicate_search_radius = self.DEFAULT_DUPLICATE_RADIUS
 
     self.load_weights()
🤖 Prompt for AI Agents
In `@backend/adaptive_weights.py` around lines 124 - 132, The constructor is
making shallow copies of nested mutable defaults causing instance mutations to
corrupt class-level constants; update __init__ to use deep copies (e.g.,
copy.deepcopy) for DEFAULT_SEVERITY_KEYWORDS, DEFAULT_URGENCY_PATTERNS,
DEFAULT_CATEGORIES, and DEFAULT_SEVERITY_MAPPING instead of .copy() or list() so
each instance gets independent nested structures, import the copy module, and
keep DEFAULT_DUPLICATE_RADIUS as-is (it's immutable); ensure load_weights()
behavior still safely replaces keys but rely on deep-copied instance attributes
to avoid shared references.

Comment on lines +152 to +170
def save_weights(self):
"""Saves current weights to JSON file."""
data = {
"severity_keywords": self.severity_keywords,
"urgency_patterns": self.urgency_patterns,
"categories": self.categories,
"severity_mapping": self.severity_mapping,
"duplicate_search_radius": self.duplicate_search_radius
}

# Ensure directory exists
os.makedirs(os.path.dirname(self.weights_file), exist_ok=True)

try:
with open(self.weights_file, 'w') as f:
json.dump(data, f, indent=4)
logger.info(f"Saved weights to {self.weights_file}")
except Exception as e:
logger.error(f"Failed to save weights to {self.weights_file}: {e}")

⚠️ Potential issue | 🟠 Major

No file locking on read/write — concurrent access can corrupt modelWeights.json.

Multiple processes (e.g., web workers, scheduler, and background tasks) can all instantiate AdaptiveWeights and call save_weights() concurrently. Without file locking (e.g., fcntl.flock or an atomic-write pattern like write-to-temp + rename), a concurrent read during a partial write will produce corrupt JSON. This is especially relevant since the scheduler job, background tasks, and request handlers all touch this file.

Consider at minimum using atomic writes (write to a temp file, then os.replace) to prevent partial-read corruption.

Proposed fix — atomic write pattern
+import tempfile
+
 def save_weights(self):
     """Saves current weights to JSON file."""
     data = { ... }

     os.makedirs(os.path.dirname(self.weights_file), exist_ok=True)

     try:
-        with open(self.weights_file, 'w') as f:
-            json.dump(data, f, indent=4)
+        dir_name = os.path.dirname(self.weights_file)
+        with tempfile.NamedTemporaryFile('w', dir=dir_name, suffix='.tmp', delete=False) as f:
+            json.dump(data, f, indent=4)
+            tmp_path = f.name
+        os.replace(tmp_path, self.weights_file)
         logger.info(f"Saved weights to {self.weights_file}")
     except Exception as e:
         logger.exception(f"Failed to save weights to {self.weights_file}: {e}")
+        if 'tmp_path' in locals() and os.path.exists(tmp_path):
+            os.unlink(tmp_path)
🧰 Tools
🪛 Ruff (0.15.0)

[warning] 169-169: Do not catch blind exception: Exception

(BLE001)


[warning] 170-170: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In `@backend/adaptive_weights.py` around lines 152 - 170, The save_weights method
currently writes directly to self.weights_file which can lead to partial/corrupt
JSON under concurrent access; change AdaptiveWeights.save_weights to perform an
atomic write by serializing data to a temporary file in the same directory
(e.g., self.weights_file + ".tmp" or use tempfile.NamedTemporaryFile), fsync the
temp file and directory, then atomically replace the target via
os.replace(temp_path, self.weights_file); alternatively (or additionally) use an
inter-process file lock around read/write operations (e.g., with fcntl.flock)
keyed to self.weights_file to prevent concurrent writers/reader races when
loading/saving.

Comment on lines +63 to +88
def _optimize_weights(self, db: Session):
"""
Adjusts weights based on manual feedback (escalations and high severity reports).
"""
now = datetime.now(timezone.utc)
one_day_ago = now - timedelta(hours=24)

# Find critical/high grievances from last 24h
# We look for patterns where a category is consistently marked high/critical
high_severity_grievances = db.query(Grievance.category, func.count(Grievance.id))\
.filter(Grievance.created_at >= one_day_ago)\
.filter(Grievance.severity.in_([SeverityLevel.CRITICAL, SeverityLevel.HIGH]))\
.group_by(Grievance.category).all()

for category, count in high_severity_grievances:
# If we see many high severity issues for a category, ensure mapping reflects it
if count >= 3: # Threshold
current_mapping = self.weights_manager.severity_mapping.get(category.lower())

# If current mapping is lower (medium/low), upgrade it
# Logic: If it's mapped to 'low' or 'medium', but we see 3+ high/critical, upgrade to 'high'
# If we see 5+ critical, upgrade to 'critical' (simplified)

if current_mapping in ['low', 'medium']:
logger.info(f"Auto-adjusting severity for category '{category}' to 'high' due to {count} high/critical reports.")
self.weights_manager.update_severity_mapping(category, 'high')

⚠️ Potential issue | 🟡 Minor

_optimize_weights shares the same category-key mismatch as tasks.py.

Grievance.category stores display names (e.g., "Street Light") while severity_mapping uses slug-like keys ("streetlight"). The .lower() on line 80 produces "street light", which won't match "streetlight", so the auto-upgrade logic will silently skip most categories.
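
A small normalization helper would resolve this, assuming the slug keys are simply lowercase with spaces and punctuation removed (if the project already has a slugify utility, prefer that; this helper is hypothetical):

import re
from typing import Optional

def normalize_category(category: Optional[str]) -> str:
    """Collapse a display name like 'Street Light' to the slug form 'streetlight'."""
    return re.sub(r"[^a-z0-9]", "", (category or "").lower())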

🤖 Prompt for AI Agents
In `@backend/civic_intelligence.py` around lines 63 - 88, The auto-upgrade in
_optimize_weights is comparing display names from Grievance.category to slug
keys in self.weights_manager.severity_mapping (so "Street Light".lower() ->
"street light" won't match "streetlight"); fix by normalizing the category into
the same slug form used by severity_mapping before lookups and updates (e.g.,
lowercasing and removing whitespace/normalizing punctuation or using the
project's slugify utility if available) so use that normalized key for
severity_mapping.get(...) and when calling
self.weights_manager.update_severity_mapping(category, ...) ensure you pass both
the original display name where needed and the normalized slug when updating the
mapping store.

Comment on lines +123 to +145
def _optimize_duplicates(self, trend_data: dict):
"""
Adjusts duplicate search radius based on issue density.
"""
hotspots = trend_data.get("hotspots", [])
if not hotspots:
return

# Get density of top hotspot
top_hotspot = hotspots[0]
count = top_hotspot.get("count", 0)

current_radius = self.weights_manager.duplicate_search_radius

# Logic: If many issues in one spot (high density), increase radius to catch duplicates better
if count >= 10 and current_radius < 100:
new_radius = min(current_radius + 10.0, 100.0)
logger.info(f"High issue density detected. Increasing duplicate search radius to {new_radius}m")
self.weights_manager.duplicate_search_radius = new_radius
self.weights_manager.save_weights()

# Logic: If density is very low but we have many issues, maybe decrease radius?
# (Omitted for safety to avoid missing duplicates)

⚠️ Potential issue | 🟡 Minor

Duplicate search radius can only grow — no mechanism to shrink it back.

_optimize_duplicates increases the radius when hotspot density is high but never decreases it when density drops. Over time, the radius will ratchet up to 100 m and stay there permanently, potentially over-merging distinct issues in sparse areas. The comment on line 144 acknowledges this.

Consider adding a decay path (e.g., decrease by a small step when no hotspot exceeds a density threshold) or resetting to a baseline periodically.
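
One possible decay step, sketched as a free function for clarity (the baseline, threshold, and step values are illustrative assumptions):

BASELINE_RADIUS = 50.0
LOW_DENSITY_THRESHOLD = 3
DECAY_STEP = 5.0

def maybe_decay_radius(weights_manager, top_hotspot_count: int) -> None:
    """Shrink the duplicate search radius back toward the baseline when density drops."""
    current = weights_manager.duplicate_search_radius
    if top_hotspot_count < LOW_DENSITY_THRESHOLD and current > BASELINE_RADIUS:
        weights_manager.duplicate_search_radius = max(current - DECAY_STEP, BASELINE_RADIUS)
        weights_manager.save_weights()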

🤖 Prompt for AI Agents
In `@backend/civic_intelligence.py` around lines 123 - 145, _optimize_duplicates
only increases weights_manager.duplicate_search_radius and never reduces it;
modify this method to add a decay path: when top_hotspot count is below a
low-density threshold (e.g., count < 3) and duplicate_search_radius is above a
defined baseline (e.g., weights_manager.duplicate_search_radius_baseline or a
constant), reduce duplicate_search_radius by a small step (e.g., -5.0) bounded
at the baseline, then call weights_manager.save_weights(); ensure the logic
references _optimize_duplicates, top_hotspot.get("count"),
weights_manager.duplicate_search_radius and weights_manager.save_weights so the
adjustment is reversible and avoids permanent ratcheting to 100m.

Comment on lines +97 to +98
# Get dynamic radius from AdaptiveWeights
search_radius = AdaptiveWeights().duplicate_search_radius

⚠️ Potential issue | 🟠 Major

AdaptiveWeights() instantiated on every issue-creation request — synchronous file I/O on the async event loop.

AdaptiveWeights().__init__ calls load_weights(), which performs synchronous open() + json.load() on every POST /api/issues request. This blocks the async event loop and adds per-request disk I/O on the hot path.

Consider caching a module-level instance (similar to priority_engine in backend/priority_engine.py) or using a lru_cache/functools.cache wrapper, and reloading periodically or on signal.

Proposed fix — use a cached instance
-from backend.adaptive_weights import AdaptiveWeights
+from backend.adaptive_weights import AdaptiveWeights
+
+# Module-level cached instance; reloaded by the daily refinement job
+_adaptive_weights = AdaptiveWeights()
 ...
             # Get dynamic radius from AdaptiveWeights
-            search_radius = AdaptiveWeights().duplicate_search_radius
+            search_radius = _adaptive_weights.duplicate_search_radius

If you need fresh-from-disk reads, wrap the instantiation in run_in_threadpool to avoid blocking the event loop.

🤖 Prompt for AI Agents
In `@backend/routers/issues.py` around lines 97 - 98, AdaptiveWeights() is being
instantiated on every request (blocking sync file I/O in its __init__), so
replace per-request construction with a cached module-level instance or memoized
factory and read its duplicate_search_radius from that instance (e.g., create a
single AdaptiveWeights instance similar to priority_engine in
backend/priority_engine.py or wrap AdaptiveWeights construction with
functools.lru_cache/functools.cache), or if you must read fresh-from-disk on
each call, perform AdaptiveWeights() creation/load inside an async-safe thread
via fastapi.concurrency.run_in_threadpool before accessing
duplicate_search_radius; update the code that reads duplicate_search_radius to
use the cached/memoized instance or the run_in_threadpool-wrapped construction
instead of calling AdaptiveWeights() directly on each request.

Comment on lines +48 to 54
# Map issue category to grievance severity using AdaptiveWeights
# Load fresh weights to ensure we use the latest optimized mapping
weights_manager = AdaptiveWeights()
severity_mapping = weights_manager.get_weights().get("severity_mapping", {})

# Default fallback if mapping fails
severity = severity_mapping.get(issue.category.lower(), 'medium')

⚠️ Potential issue | 🟡 Minor

Category name / severity-mapping key mismatch — many categories will silently fall through to 'medium'.

issue.category holds values like "Street Light", "Stray Animal", "Water Leak", etc., but the severity_mapping keys are short-hand tokens ("streetlight", "animal", "flood", …). After .lower(), "street light" != "streetlight", so most categories miss the lookup and land on the 'medium' fallback. This was likely the same behaviour before the refactor, but now that the mapping is "adaptive" and intended to be auto-tuned, the mismatch will prevent _optimize_weights from having any real effect for those categories.

Additionally, issue.category could be None (the column has no explicit NOT NULL constraint), which would cause an AttributeError on .lower().

Proposed defensive lookup
-        severity = severity_mapping.get(issue.category.lower(), 'medium')
+        cat = (issue.category or "").lower().replace(" ", "")
+        severity = severity_mapping.get(cat, 'medium')

A more robust approach would be to normalize the mapping keys and the category names to the same form (e.g., strip spaces, lowercase) in AdaptiveWeights itself.

🤖 Prompt for AI Agents
In `@backend/tasks.py` around lines 48 - 54, The severity lookup is failing
because issue.category (e.g., "Street Light") doesn't match compact mapping keys
(e.g., "streetlight") and .lower() will also crash if category is None; fix by
normalizing both sides: update AdaptiveWeights.get_weights (or its loader) to
normalize mapping keys (strip whitespace, remove internal spaces/normalize
punctuation, and lowercase) and in backend/tasks.py normalize the incoming
category similarly (safe against None by using a fallback like ''), then perform
the lookup against the normalized mapping (use AdaptiveWeights, get_weights, and
the severity_mapping variable names to locate changes); this ensures
_optimize_weights can take effect and prevents AttributeError on None
categories.

Comment on lines +82 to +98
def _find_hotspots(self, issues: List[Any]) -> List[Dict[str, Any]]:
# Group by approximate location (0.01 degree ~ 1.1km)
loc_counter = Counter()
for issue in issues:
if issue.latitude and issue.longitude:
# Round to 2 decimal places (approx 1.1km resolution)
key = (round(issue.latitude, 2), round(issue.longitude, 2))
loc_counter[key] += 1

hotspots = []
for (lat, lon), count in loc_counter.most_common(5):
if count > 1: # Only report if more than 1 issue
hotspots.append({
"latitude": lat,
"longitude": lon,
"count": count
})

⚠️ Potential issue | 🟡 Minor

Bug: coordinates at 0.0 (equator / prime meridian) are silently excluded from hotspot analysis.

if issue.latitude and issue.longitude is falsy when either value is 0.0, which is a valid geographic coordinate. Issues located along the equator or prime meridian will be dropped from clustering.

🐛 Proposed fix
         for issue in issues:
-            if issue.latitude and issue.longitude:
+            if issue.latitude is not None and issue.longitude is not None:
                 # Round to 2 decimal places (approx 1.1km resolution)
                 key = (round(issue.latitude, 2), round(issue.longitude, 2))
                 loc_counter[key] += 1
🤖 Prompt for AI Agents
In `@backend/trend_analyzer.py` around lines 82 - 98, The hotspot function
_find_hotspots is dropping valid coordinates at 0.0 because the guard uses "if
issue.latitude and issue.longitude" which treats 0.0 as falsy; change the check
to explicitly test for None (e.g., "is not None") so loc_counter still
aggregates points at latitude 0.0 or longitude 0.0; update the loop that builds
key = (round(issue.latitude, 2), round(issue.longitude, 2)) to only execute when
issue.latitude is not None and issue.longitude is not None and leave the rest of
the logic (loc_counter, most_common, hotspots) unchanged.


# 1. Create Dummy Issues
now = datetime.now(timezone.utc)
one_day_ago = now - timedelta(hours=24)

⚠️ Potential issue | 🟡 Minor

Unused variable one_day_ago.

one_day_ago is assigned but never referenced. It looks like it was intended for constructing time-window boundaries (e.g., the TrendAnalyzer 24h window) but ended up unused. Remove it or use it where appropriate to avoid confusion.

🧰 Tools
🪛 Ruff (0.15.0)

[error] 29-29: Local variable one_day_ago is assigned to but never used

Remove assignment to unused variable one_day_ago

(F841)

🤖 Prompt for AI Agents
In `@tests/test_civic_intelligence.py` at line 29, The local variable one_day_ago
is assigned but never used; either remove the unused assignment one_day_ago =
now - timedelta(hours=24) from tests/test_civic_intelligence.py or replace its
use where a 24-hour boundary is required (for example pass one_day_ago into the
TrendAnalyzer or test setup instead of a hardcoded time) so the variable is
actually referenced (look for uses of now and TrendAnalyzer in this test to
determine the intended placement).

Comment on lines +118 to +131
# Check Keyword Optimization
# "severe" should be added to Pothole category (appeared 3 times)
# Note: logic checks if keyword is NOT in current keywords. Default keywords don't have "severe".
# Wait, 'AdaptiveWeights' in test is loaded from temp file which starts with defaults.
# Default Pothole keywords: ["pothole", "hole", "crater", "road damage", "broken road"]
# "severe" is not there.
# But "severe" might be a stopword? No.
# Also need to check if "severe" is extracted as top keyword. TrendAnalyzer: top_n=5.
# Keywords: pothole(3+1+1=5), severe(3), crater(1), road(1), damage(1), garbage(2), streetlight(1), broken(1)
# "severe" should be in top 5.

# Check duplicate radius optimization
# Density was 10. Should increase radius by 10 (default 50 -> 60).
assert updated_weights['duplicate_search_radius'] == 60.0

⚠️ Potential issue | 🟠 Major

Missing assertion for keyword optimization.

Lines 118–127 contain detailed comments explaining that "severe" should be added to Pothole category keywords, but no assert statement actually validates this. This means the keyword optimization path has zero test coverage here.

+    # "severe" should have been added to Pothole keywords
+    pothole_keywords = updated_weights['categories'].get('Pothole', [])
+    assert 'severe' in pothole_keywords, "Expected 'severe' to be added to Pothole keywords"
+
     # Check duplicate radius optimization
🤖 Prompt for AI Agents
In `@tests/test_civic_intelligence.py` around lines 118 - 131, The test is missing
an assertion that verifies keyword optimization added "severe" to the Pothole
category: after the optimization run that produced updated_weights (and used
AdaptiveWeights loaded from the temp defaults), add an assertion that "severe"
is present in updated_weights['keywords']['Pothole'] (or the equivalent keywords
structure stored on updated_weights) to ensure the keyword optimization path is
covered; keep the existing duplicate radius assertion for
'duplicate_search_radius' intact.

engine._save_snapshot = original_save

if __name__ == "__main__":
sys.exit(pytest.main(["-v", __file__]))

⚠️ Potential issue | 🔴 Critical

sys is not imported — this will raise NameError at runtime.

Line 141 references sys.exit(...) but sys is never imported in this file. This block will crash if the file is executed directly.

Proposed fix

Add the import at the top of the file:

 import pytest
 import os
 import json
+import sys
 from datetime import datetime, timedelta, timezone
🧰 Tools
🪛 Ruff (0.15.0)

[error] 141-141: Undefined name sys

(F821)

🤖 Prompt for AI Agents
In `@tests/test_civic_intelligence.py` at line 141, The test module calls
sys.exit(pytest.main(...)) but never imports the sys module; add an import for
sys near the top of the file (alongside other imports) so that the call to
sys.exit(...) resolves correctly; ensure the import appears before the
module-level invocation of sys.exit in tests/test_civic_intelligence.py.
