Cloudflare Workers/D1 implementation of the Mass Murder Canada site, migrated from the original Go/Echo app.
Original project: github.com/darron/ff
- Original public URL structure preserved.
- Admin dashboard for records and linked news stories.
- REST-style admin APIs for CRUD + AI queue operations.
- AI summarization pipeline:
  - Per-story extraction and summary.
  - Record-level synthesis across linked sources.
  - Source typing (official, news, social, other), with social-only incidents treated as alleged.
  - Chunked queue processing for large records.
- AI summaries are rendered as HTML from Markdown on record pages.
- Sentry error monitoring (fetch + queue) via @sentry/cloudflare.
See docs/SETUP.md for full setup.
Quick start:
- npm install
- Configure admin auth (see docs/ADMIN_SETUP.md)
- npm run dev
- Deploy as needed:
  - Staging: npx wrangler deploy --env staging
  - Production: npx wrangler deploy --env production
Documentation:
- docs/README.md - Docs index
- docs/SETUP.md - Setup/deployment
- docs/ADMIN_SETUP.md - Admin dashboard + API
- docs/SECURITY.md - Security notes
- docs/NVM_GUIDE.md - Node/NVM guidance
- docs/CHANGELOG.md - Change history
Project layout:
ff-workers/
├── src/
│ ├── index.js # Worker entrypoint (routes + queue + Sentry wrapper)
│ ├── admin.js # Admin API handlers
│ ├── admin-ui.js # Admin dashboard HTML/JS
│ ├── ai-summary.js # Queue-driven AI summarization pipeline
│ ├── source-classification.js # URL/source credibility typing
│ ├── db.js # Record/story queries
│ ├── auth.js # Admin authentication/session helpers
│ └── templates.js # Public page templates + markdown renderer
├── scripts/
│ └── deploy-production-with-sentry.sh
├── migrations/
│ ├── 0001_initial.sql
│ ├── 0002_data.sql
│ ├── data/
│ └── prod-data/
├── wrangler.toml
├── package.json
├── migrate-data.cjs
├── import-prod-dump.cjs
└── database_dump.sql
Public:
- /
- /records/group/:group
- /records/provinces/:province
- /records/:id
Admin:
- /admin
- /admin/api/records/*
- /admin/api/stories/*
- /admin/api/sentry-test
Configured in wrangler.toml:
- compatibility_flags = ["nodejs_compat"] (required for the Sentry SDK)
- Queue binding name in code: SUMMARY_QUEUE
Staging (--env staging):
- Worker: massmurdercanada-staging
- AI: enabled, manual on save (AI_SUMMARY_AUTO_ON_SAVE=false), AI_SUMMARY_STORIES_PER_JOB=10
- Queue: massmurdercanada-staging-summary
- Queue consumer: max_batch_size=5, max_batch_timeout=10
Production (--env production):
- Worker/routes: massmurdercanada on massmurdercanada.org/*
- AI: enabled, auto on save (AI_SUMMARY_AUTO_ON_SAVE=true), AI_SUMMARY_STORIES_PER_JOB=5
- Queue: massmurdercanada-production-summary
- Queue consumer: max_batch_size=1, max_batch_timeout=5
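The per-environment AI_SUMMARY_STORIES_PER_JOB values control how a large record's linked stories are split into queue jobs. A minimal sketch of that chunking, under the assumption that the last chunk carries a flag telling the consumer to run record-level synthesis (the function name and `final` field are illustrative, not the actual implementation):

```javascript
// Split a record's story IDs into queue jobs of at most `perJob` stories.
// The last chunk is flagged so the consumer knows to run final synthesis.
function chunkStories(recordId, storyIds, perJob) {
  const jobs = [];
  for (let i = 0; i < storyIds.length; i += perJob) {
    jobs.push({
      recordId,
      storyIds: storyIds.slice(i, i + perJob),
      final: i + perJob >= storyIds.length, // synthesis runs on the last chunk
    });
  }
  return jobs;
}

// Example: 12 stories with AI_SUMMARY_STORIES_PER_JOB=5 -> 3 jobs (5, 5, 2).
const jobs = chunkStories(42, Array.from({ length: 12 }, (_, i) => i + 1), 5);
```

With production's perJob of 5 and max_batch_size=1, each queue batch processes one chunk at a time.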
Trigger paths:
- Manual per-record: POST /admin/api/records/:id/summarize
- Bulk backfill: POST /admin/api/records/summarize-all
- Auto-on-save (when enabled): record/story create/update operations enqueue a job
Bulk backfill request options:
- limit (1-100, default 25)
- offset (default 0)
- only_missing (default true)
- include_fallback (default true)
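A sketch of how the bulk-backfill body could be normalized against those documented defaults and bounds (the helper name is hypothetical; the actual handler may validate differently):

```javascript
// Hypothetical helper: apply the documented defaults and clamp `limit` to 1-100.
function parseBackfillOptions(body = {}) {
  // Fall back to the default when a value is missing or not a finite number.
  const num = (v, dflt) => (Number.isFinite(Number(v)) ? Number(v) : dflt);
  return {
    limit: Math.min(100, Math.max(1, num(body.limit, 25))),
    offset: Math.max(0, num(body.offset, 0)),
    only_missing: body.only_missing ?? true,
    include_fallback: body.include_fallback ?? true,
  };
}
```

For example, `parseBackfillOptions({ limit: 500 })` clamps the limit down to 100 while keeping the other defaults.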
Extraction flow per story:
- Reuse stored body_text when sufficient.
- Direct fetch + structured extraction (JSON-LD/article/main/meta).
- Optional summarize daemon fallback (AI_FETCH_SUMMARIZE_DAEMON_URL).
- Optional fallback readers: r.jina.ai, markdown.new.
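The ordered fallback above amounts to trying extractors in sequence and accepting the first sufficient result. A minimal sketch, assuming each step is an async function returning text or null (the names and length threshold are illustrative):

```javascript
// Try each extractor in order; accept the first result with enough text.
// `extractors` might be [reuseStoredBodyText, fetchAndExtract,
// summarizeDaemon, fallbackReaders] -- all hypothetical names here.
async function extractStoryText(story, extractors, minLength = 200) {
  for (const extract of extractors) {
    try {
      const text = await extract(story);
      if (text && text.length >= minLength) return text;
    } catch {
      // A failed extractor simply falls through to the next one.
    }
  }
  return null; // nothing sufficient was found
}
```

The try/catch per step matters: a reader that times out or throws should not abort the chain, only advance it.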
Additional behavior:
- RCMP URLs are normalized from rcmp-grc.gc.ca to rcmp.ca.
- Unsafe URLs (non-public/localhost/private IP) are blocked.
- Large records are processed in chunks; final synthesis runs on the last chunk.
- Source selection for synthesis favors official/news sources and de-emphasizes social links unless social is all that exists.
- Structured logs are emitted as ai_summary_queue_job.
- The record metadata date is treated as year-only for synthesis context.
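The unsafe-URL block above can be sketched roughly as follows; the exact blocklist in source-classification.js may differ, and the patterns here are illustrative:

```javascript
// Reject URLs that point at non-public targets: localhost, private or
// link-local IP ranges, or non-HTTP(S) schemes. A rough sketch only.
function isUnsafeUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable -> treat as unsafe
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname;
  if (host === "localhost" || host === "127.0.0.1" || host === "[::1]") return true;
  // RFC 1918 private ranges and 169.254/16 link-local addresses.
  if (/^(10\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.|169\.254\.)/.test(host)) return true;
  return false;
}
```

Blocking before fetch keeps the extraction pipeline from being used to probe internal services (SSRF).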
Optional summarize daemon token secret:
npx wrangler secret put AI_FETCH_SUMMARIZE_DAEMON_TOKEN --env production
npx wrangler secret put AI_FETCH_SUMMARIZE_DAEMON_TOKEN --env staging
Create queues once (use latest Wrangler):
npx wrangler@latest queues create massmurdercanada-staging-summary \
  --message-retention-period-secs 86400 \
  --delivery-delay-secs 0
npx wrangler@latest queues create massmurdercanada-production-summary \
  --message-retention-period-secs 86400 \
  --delivery-delay-secs 0
Then deploy the Worker for each environment.
Sentry is wired through @sentry/cloudflare and reads runtime config from env/secrets:
- SENTRY_DSN (secret)
- SENTRY_RELEASE (optional var)
- SENTRY_ENVIRONMENT (optional var)
Set DSN secret (production):
npx wrangler secret put SENTRY_DSN --env productionAdmin Sentry test:
- Dashboard button calls
POST /admin/api/sentry-test. - If
SENTRY_DSNis not set (e.g., staging), endpoint returns412instead of failing deployment/runtime.
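That 412 behavior can be sketched like this (the handler name is illustrative; the real handler would also capture a test event via Sentry when a DSN is present):

```javascript
// Return 412 Precondition Failed when no DSN is configured, so staging
// deployments without a SENTRY_DSN secret don't error out at runtime.
function handleSentryTest(env) {
  if (!env.SENTRY_DSN) {
    return new Response(
      JSON.stringify({ error: "SENTRY_DSN is not configured" }),
      { status: 412, headers: { "Content-Type": "application/json" } }
    );
  }
  // With a DSN present, the real handler would report a test event here.
  return new Response(JSON.stringify({ ok: true }), {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
}
```

Returning a distinct 4xx (rather than a 500) lets the dashboard show "not configured" instead of a generic error.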
Release + deploy workflow:
- One-command flow: npm run deploy:production:sentry
- Script: scripts/deploy-production-with-sentry.sh
- Requires env vars in the shell: SENTRY_AUTH_TOKEN, SENTRY_ORG, SENTRY_PROJECT
Notes:
- Dates are stored in mixed formats, but the UI and synthesis treat canonical record dates as year-level context.
- AI backfill targets missing summaries by default and can also include existing fallback summaries.
- Story summaries and record synthesis are stored in D1 (news_stories.ai_summary, records.ai_summary).