
⚡ Bolt: Optimize issue retrieval and implement blockchain integrity #348

Merged
RohanExploit merged 1 commit into main from
bolt-optimize-db-blockchain-1135829609348711867
Feb 6, 2026

Conversation

RohanExploit (Owner) commented Feb 6, 2026

This PR introduces significant performance improvements to the VishwaGuru backend by optimizing how issues are retrieved and processed. By switching from full ORM model loading to targeted column projection, we reduce the memory footprint and database transfer size for common endpoints. Additionally, as requested, a "blockchain" feature has been implemented in the form of a cryptographic integrity seal that chains reports together using SHA-256 hashes, providing a robust audit trail with minimal performance overhead.

Key Changes:

  • Column projection in backend/routers/issues.py for retrieval endpoints.
  • Atomic update() for deduplication upvotes.
  • integrity_hash column added to Issue model.
  • Lightweight hashing logic in create_issue.
  • Verified with regression tests.

PR created automatically by Jules for task 1135829609348711867 started by @RohanExploit

Summary by CodeRabbit

  • New Features

    • Added integrity verification mechanism to issues using hash-based validation.
  • Bug Fixes

    • Optimized issue queries and searches for improved response performance.
    • Modified user issues endpoint to return streamlined data responses.

- Optimized `get_user_issues`, `get_nearby_issues`, and spatial deduplication queries using SQLAlchemy column projection (the pattern is sketched below).
- Reduced database I/O and memory usage by avoiding full ORM object loading for list views.
- Implemented a lightweight SHA-256 integrity seal (blockchain) for issues to ensure data tamper-resistance.
- Improved atomic upvote increments using bulk `update()` queries.
- Updated database migration logic to handle the new `integrity_hash` column.
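
A minimal sketch of the column-projection pattern described above (column names are illustrative; the exact query in backend/routers/issues.py may select a different set):

# Sketch only: selecting columns yields lightweight Row tuples instead of
# full ORM instances, cutting memory use and identity-map bookkeeping.
rows = (
    db.query(
        Issue.id,
        Issue.category,
        Issue.description,
        Issue.created_at,
        Issue.status,
    )
    .filter(Issue.user_email == user_email)
    .order_by(Issue.created_at.desc())
    .all()
)
# Each row supports attribute access: row.id, row.category, ...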

Co-authored-by: RohanExploit <178623867+RohanExploit@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 6, 2026 14:17
google-labs-jules (Contributor) commented

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

netlify bot commented Feb 6, 2026

Deploy Preview for fixmybharat canceled.

🔨 Latest commit: 969e852
🔍 Latest deploy log: https://app.netlify.com/projects/fixmybharat/deploys/6985f7ec4eacf000085764bc

@RohanExploit RohanExploit temporarily deployed to bolt-optimize-db-blockchain-1135829609348711867 - vishwaguru-backend PR #348 February 6, 2026 14:17 — with Render Destroyed
github-actions bot commented Feb 6, 2026

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

This pull request adds blockchain-based integrity verification to issues via a new integrity_hash column computed using SHA-256, optimizes list-view endpoints using column projection instead of full ORM loading, and implements atomic SQL updates for deduplication logic.

Changes

  • Documentation & Learning Notes (.jules/bolt.md): Added learning notes on the performance implications of column projection vs. full ORM loading for list views and spatial queries.
  • Database Schema (backend/init_db.py): Introduces a migration to add an integrity_hash VARCHAR column to the issues table, with error handling for idempotency.
  • Model Definition (backend/models.py): Added an optional integrity_hash string field to the Issue model for storing blockchain-based integrity seals.
  • Router Optimization & Integrity (backend/routers/issues.py): Implemented SHA-256 integrity_hash computation on issue creation; converted get_nearby_issues and get_user_issues to use column projection instead of full ORM instances; updated the deduplication path to use atomic SQL updates.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • #342: Implements column projection optimization for list-view endpoints (get_recent_issues), following the same performance pattern applied here.
  • #346: Modifies backend/routers/issues.py to replace full ORM loading with column projection across issue-list endpoints.
  • #343: Alters Issue data loading and serialization in backend/routers/issues.py toward lighter, projected responses.

Suggested labels

size/m

Poem

🐰 A hash is born, integrity's crown,
SHA-256 seals each issue down,
Columns dance, no full loads weigh,
Swift queries light the faster way,
Deduplication's atomic gleam! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 60.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title accurately reflects the primary changes, database optimization via column projection and the integrity_hash blockchain implementation.




coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/routers/issues.py`:
- Around line 168-177: The current create_issue flow (where prev_issue is read
via run_in_threadpool and integrity_hash computed) is vulnerable to race
conditions and omits important fields; fix by serializing the create path (e.g.,
acquire a DB-level advisory lock or run create_issue inside a
SERIALIZABLE/REPEATABLE READ transaction) so reading prev_issue and inserting
the new Issue happen atomically, and change the integrity_hash computation to
include all material fields (e.g., description, category, latitude, longitude,
user_email, location, created_at and prev_hash) so tampering is detectable;
ensure the lock/transaction surrounds the query of Issue.integrity_hash, hash
computation, and the subsequent insert to guarantee a linear chain.
🧹 Nitpick comments (2)
backend/init_db.py (1)

100-105: Inconsistent logging: print() vs logger.info().

Other migration steps in this file (e.g., lines 19, 27, 35) use logger.info() for success messages, while this block (and the latitude/longitude/location/action_plan blocks) use print(). Consider using logger.info() here for consistency with the structured logging approach used elsewhere.

Also, per the Ruff S110 hint, the bare except Exception: pass silently swallows errors. The existing pattern throughout this file has the same issue, but for new code it's worth logging at debug level to aid troubleshooting unexpected migration failures (e.g., permission errors that aren't "column already exists").

Suggested fix
             # Add integrity_hash column for blockchain feature
             try:
                 conn.execute(text("ALTER TABLE issues ADD COLUMN integrity_hash VARCHAR"))
-                print("Migrated database: Added integrity_hash column.")
-            except Exception:
-                pass
+                logger.info("Migrated database: Added integrity_hash column.")
+            except Exception:
+                logger.debug("integrity_hash column likely already exists.")
backend/routers/issues.py (1)

546-565: Inconsistent created_at serialization across endpoints.

In get_user_issues, created_at is passed as a raw datetime object (Line 556), relying on Pydantic to serialize it. In get_recent_issues (Line 604), .isoformat() is called explicitly because the data is also cached as JSON. This inconsistency could lead to different date formats in API responses if the serialization paths diverge (e.g., a Pydantic config change).

Consider aligning the approach — either always let Pydantic handle it (preferred) or always pre-format.
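
A minimal sketch of one way to align the two paths, using a shared serializer so the cached-JSON and Pydantic responses emit the same shape (function and field names are illustrative):

# Sketch: pre-format once so every endpoint emits the same date format.
def serialize_issue_row(row) -> dict:
    return {
        "id": row.id,
        "created_at": row.created_at.isoformat() if row.created_at else None,
        # ... remaining projected columns ...
    }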

Comment on lines +168 to +177
# Blockchain feature: calculate integrity hash for the report
# Optimization: Fetch only the last hash to maintain the chain with minimal overhead
prev_issue = await run_in_threadpool(
    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
)
prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""

# Simple but effective SHA-256 chaining
hash_content = f"{description}|{category}|{prev_hash}"
integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()

⚠️ Potential issue | 🟠 Major

Race condition: concurrent creates break the hash chain.

The "blockchain" integrity chain fetches the previous hash (Line 171) and computes the new hash (Line 177) without any serialization. Two concurrent create_issue requests will both read the same prev_hash, producing two issues that chain off the same predecessor — breaking the chain's linear integrity guarantee. This is the core value proposition of chaining, so it's a significant gap.

Additionally, the hash only covers description|category|prev_hash. Fields like latitude, longitude, user_email, location, and created_at are excluded, meaning those fields could be tampered with without breaking the chain. If the goal is tamper-evident audit, consider including all material fields.

Possible mitigations:

  • Use a database-level advisory lock or serializable isolation for the create path to enforce sequential chaining (see the sketch after this list).
  • Include all material issue fields in the hash input.
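
A minimal sketch of the advisory-lock mitigation, assuming a PostgreSQL backend; pg_advisory_xact_lock holds the lock until the surrounding transaction ends, and the lock key is an arbitrary app-level constant (an assumption, not taken from this PR):

import hashlib

from sqlalchemy import text

ISSUE_CHAIN_LOCK_KEY = 42  # hypothetical app-wide key shared by all chain writers

def create_issue_chained(db, description: str, category: str):
    # Assumes the Issue model and a Session `db` from the app context.
    # Serialize all writers: the advisory lock is released automatically when
    # the transaction ends, so reading prev_hash and inserting are atomic.
    db.execute(text("SELECT pg_advisory_xact_lock(:key)"), {"key": ISSUE_CHAIN_LOCK_KEY})
    prev = db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
    prev_hash = prev[0] if prev and prev[0] else ""
    content = f"{description}|{category}|{prev_hash}"
    issue = Issue(
        description=description,
        category=category,
        integrity_hash=hashlib.sha256(content.encode()).hexdigest(),
    )
    db.add(issue)
    db.commit()  # commit ends the transaction and releases the lock
    return issue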

Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces performance optimizations and a blockchain integrity feature to the VishwaGuru backend. The optimization work focuses on replacing full ORM model loading with targeted column projection in issue retrieval endpoints, which reduces memory footprint and database transfer costs. The blockchain feature adds cryptographic hash chaining to create an audit trail for issue creation.

Changes:

  • Column projection optimization for get_nearby_issues, get_user_issues, and spatial deduplication queries to reduce memory usage
  • Atomic update() query pattern for upvote deduplication to prevent race conditions
  • Blockchain integrity hash implementation using SHA-256 chaining with previous issue hash

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

  • backend/routers/issues.py: Implements column projection for issue queries, atomic upvote updates, and blockchain integrity hash calculation during issue creation
  • backend/models.py: Adds integrity_hash column to Issue model for blockchain feature
  • backend/init_db.py: Database migration to add integrity_hash column
  • .jules/bolt.md: Documents the column projection optimization pattern and best practices


Comment on lines +168 to +177
# Blockchain feature: calculate integrity hash for the report
# Optimization: Fetch only the last hash to maintain the chain with minimal overhead
prev_issue = await run_in_threadpool(
    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
)
prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""

# Simple but effective SHA-256 chaining
hash_content = f"{description}|{category}|{prev_hash}"
integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()
Copilot AI commented Feb 6, 2026

Missing test coverage for the newly introduced blockchain integrity feature. The integrity_hash calculation and chaining logic is not covered by any tests. Given that this is a critical security feature for audit trails, it should have comprehensive test coverage including: sequential issue creation to verify proper chaining, concurrent creation to test race conditions, and hash verification to ensure data integrity.

Add tests to verify the blockchain functionality works correctly under both normal and concurrent conditions.
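
A hedged sketch of a sequential-chaining test, assuming a pytest fixture db_session and the hypothetical create_issue_chained helper from the earlier comment (the real creation path may differ):

import hashlib

def test_integrity_chain_links_sequential_issues(db_session):
    first = create_issue_chained(db_session, "pothole on main road", "roads")
    second = create_issue_chained(db_session, "broken streetlight", "electricity")

    # The second hash must commit to the first: recompute it and compare.
    expected = hashlib.sha256(
        f"broken streetlight|electricity|{first.integrity_hash}".encode()
    ).hexdigest()
    assert second.integrity_hash == expected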

Comment on lines +148 to 157
# Atomic update for upvotes to prevent race conditions
# Use query update to avoid fetching the full model instance
await run_in_threadpool(
    lambda: db.query(Issue).filter(Issue.id == linked_issue_id).update({
        Issue.upvotes: func.coalesce(Issue.upvotes, 0) + 1
    }, synchronize_session=False)
)

# Update the database with the upvote
# Commit the upvote
await run_in_threadpool(db.commit)
Copilot AI commented Feb 6, 2026

The atomic upvote update using query().update() is correct, but the commit is done in a separate threadpool call without error handling for the update operation itself. If the update() call succeeds but the commit fails, the update is rolled back but no error is raised to the caller. The deduplication flow continues as if the upvote succeeded.

Consider wrapping both the update and commit in the same lambda to ensure transactional consistency, or add error handling after the update to verify it affected exactly one row before committing.
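
A sketch of the combined pattern with a rowcount check; session handling and names here are assumptions:

from fastapi import HTTPException
from sqlalchemy import func

def _upvote_and_commit(db, linked_issue_id: int) -> int:
    # Running update and commit in one threadpool call lets a failed commit
    # surface as an exception instead of a silent rollback.
    updated = (
        db.query(Issue)
        .filter(Issue.id == linked_issue_id)
        .update(
            {Issue.upvotes: func.coalesce(Issue.upvotes, 0) + 1},
            synchronize_session=False,
        )
    )
    db.commit()
    return updated

# Inside the async endpoint:
#   rows = await run_in_threadpool(_upvote_and_commit, db, linked_issue_id)
#   if rows != 1:
#       raise HTTPException(status_code=404, detail="Linked issue not found")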

Comment on lines +549 to +550
desc = row.description or ""
short_desc = desc[:100] + "..." if len(desc) > 100 else desc
Copilot AI commented Feb 6, 2026

Description truncation depends on a None guard: if row.description were None, slicing it directly would raise a TypeError. Line 549 guards against this with desc = row.description or "", so the code here is correct; just ensure the same guard is applied consistently wherever descriptions are truncated. Consider extracting the pattern into a helper function to avoid duplication and potential bugs (a sketch follows below).
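
A possible helper along those lines (name and limit are illustrative):

def truncate_description(description, limit: int = 100) -> str:
    # Guard against None before slicing, then truncate with an ellipsis.
    value = description or ""
    return value[:limit] + "..." if len(value) > limit else value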

# Blockchain feature: calculate integrity hash for the report
# Optimization: Fetch only the last hash to maintain the chain with minimal overhead
prev_issue = await run_in_threadpool(
    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
Copilot AI commented Feb 6, 2026

The blockchain implementation assumes Issue.id is a reliable ordering mechanism, but this may not be safe in all database configurations. Some databases don't guarantee that auto-increment IDs are strictly sequential across concurrent transactions (e.g., if a transaction with ID 100 commits before a transaction with ID 99).

Consider adding a dedicated created_at or sequence column with a database-level constraint to ensure proper ordering, or add a chain_index column that's explicitly managed. Alternatively, document that this blockchain feature requires single-writer guarantees or accept that the chain ordering might not perfectly match chronological order under high concurrency.

Suggested change
-    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
+    lambda: db.query(Issue.integrity_hash)
+    .order_by(Issue.created_at.desc(), Issue.id.desc())
+    .first()

Comment on lines +168 to +177
# Blockchain feature: calculate integrity hash for the report
# Optimization: Fetch only the last hash to maintain the chain with minimal overhead
prev_issue = await run_in_threadpool(
    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
)
prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""

# Simple but effective SHA-256 chaining
hash_content = f"{description}|{category}|{prev_hash}"
integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()
Copilot AI commented Feb 6, 2026

Critical race condition in blockchain integrity implementation. When multiple issues are created concurrently, they can read the same previous hash before any of them commits, resulting in multiple issues with the same previous hash and breaking the blockchain chain integrity. This defeats the purpose of the blockchain audit trail.

Consider using database-level locking (SELECT FOR UPDATE) or a transaction-level sequence to ensure each issue gets a unique position in the chain. Alternatively, calculate the hash after commit and update it, or use a dedicated sequence/counter column to enforce ordering independently of the hash chain.

Comment on lines +546 to +565
# Convert results to dictionaries for faster serialization and schema compliance
data = []
for row in results:
    desc = row.description or ""
    short_desc = desc[:100] + "..." if len(desc) > 100 else desc

    data.append({
        "id": row.id,
        "category": row.category,
        "description": short_desc,
        "created_at": row.created_at,
        "image_path": row.image_path,
        "status": row.status,
        "upvotes": row.upvotes if row.upvotes is not None else 0,
        "location": row.location,
        "latitude": row.latitude,
        "longitude": row.longitude
    })

return data
Copilot AI commented Feb 6, 2026

API response format inconsistency with declared response_model. This endpoint declares response_model=List[IssueSummaryResponse] but returns a list of plain dictionaries instead of Pydantic model instances. This bypasses FastAPI's automatic validation and serialization, which could lead to runtime errors if the response structure doesn't match the schema.

The correct approach is to either: (1) Return IssueSummaryResponse instances constructed from the rows, or (2) Continue returning dictionaries but ensure they're properly validated. Note that FastAPI with response_model will automatically validate and serialize dictionaries if they match the schema structure, but it's more explicit and maintainable to use Pydantic instances.
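
A sketch of option (1), constructing validated model instances; this assumes IssueSummaryResponse accepts these keyword fields, matching the dictionary keys above:

# Sketch: build Pydantic instances so validation happens explicitly.
data = [
    IssueSummaryResponse(
        id=row.id,
        category=row.category,
        description=truncate_description(row.description),  # helper sketched earlier
        created_at=row.created_at,
        image_path=row.image_path,
        status=row.status,
        upvotes=row.upvotes or 0,
        location=row.location,
        latitude=row.latitude,
        longitude=row.longitude,
    )
    for row in results
]
return data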

prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""

# Simple but effective SHA-256 chaining
hash_content = f"{description}|{category}|{prev_hash}"
Copilot AI commented Feb 6, 2026

Incomplete blockchain integrity: The integrity hash only includes description and category, but excludes other important fields like location (latitude/longitude), user_email, and timestamps. This means two issues with the same description and category but different locations would produce identical hash chain entries, making it impossible to verify the full integrity of the data.

Consider including all immutable fields in the hash calculation: description, category, latitude, longitude, location, user_email, and optionally a timestamp. This provides stronger integrity guarantees and makes the blockchain audit trail more meaningful.

Suggested change
-hash_content = f"{description}|{category}|{prev_hash}"
+# Include all relevant immutable fields to strengthen integrity guarantees
+hash_parts = [
+    description or "",
+    category or "",
+    user_email or "",
+    "" if latitude is None else str(latitude),
+    "" if longitude is None else str(longitude),
+    location or "",
+    prev_hash,
+]
+hash_content = "|".join(hash_parts)

try:
    conn.execute(text("ALTER TABLE issues ADD COLUMN integrity_hash VARCHAR"))
    print("Migrated database: Added integrity_hash column.")
except Exception:
Copilot AI commented Feb 6, 2026

'except' clause does nothing but pass and there is no explanatory comment.
