147 Commits

Author SHA1 Message Date
1932931221 Additional bug verification in INVESTIGATION.md
More bugs verified:
- BUG-021: Chunk validation exists via yield_spans 
- BUG-027: N/A - defaulting to 0.0 is reasonable fallback
- BUG-055: collection_model now returns None instead of "unknown" 

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 20:47:43 +00:00
74e26e1892 Continue bug verification in INVESTIGATION.md
Additional bugs verified as fixed:
- BUG-017: collection_name index exists 
- BUG-020: server_id index exists 
- BUG-039: Email folder errors handled gracefully 
- BUG-041: N/A - reasonable behavior (backup disabled without key)

Updated bug count to 30+ confirmed fixed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 20:45:43 +00:00
c2be97aad5 Verify and document fixed bugs in INVESTIGATION.md
Updated bug statuses after verification:
- BUG-014: CORS now uses settings.SERVER_URL 
- BUG-015: Celery has global retry config 
- BUG-016: safe_task_execution re-raises exceptions 
- BUG-019: embed_status properly set to STORED 
- BUG-031: SearchConfig enforces limits 
- BUG-033: No print statements in src/memory 
- BUG-035: Task time limits configured 
- BUG-037: Timezone handling fixed 
- BUG-040: Resource limits added (via BUG-067) 
- BUG-043: Health check validates dependencies 
- BUG-054: OAuthToken is intentional mixin design
- BUG-060: Print changed to logger.debug 

Updated executive summary with fix count (25+)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 20:40:55 +00:00
0da4f03656 Mark BUG-003, 008, 009, 013 as already fixed
Verified in code review:
- BUG-003: BM25 applies all SearchFilters
- BUG-008: yield_spans() guarantees token limits
- BUG-009: Uses FOR UPDATE SKIP LOCKED
- BUG-013: Has retry logic with exponential backoff
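
To illustrate the retry behaviour noted for BUG-013, here is a minimal sketch of retrying with exponential backoff. The names `retry_with_backoff` and `flaky_embed`, the attempt count, and the delay values are hypothetical and are not taken from the repository.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with delay base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2**attempt)
    raise RuntimeError("max_attempts must be at least 1")


# Hypothetical flaky call that fails twice before succeeding.
calls = {"count": 0}


def flaky_embed() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Record the delays instead of sleeping so the example runs instantly.
delays: list[float] = []
result = retry_with_backoff(flaky_embed, sleep=delays.append)
```

Passing `sleep` as a parameter is only a convenience for this sketch: it lets the delays be inspected without waiting.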

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-19 20:28:34 +00:00
2e3371ec4e Update INVESTIGATION.md with verified bug fixes
- Mark BUG-010 (MCP servers) as already fixed
- Mark BUG-011 (User ID type) as already fixed
- Document BUG-061 to BUG-068 fixes from commit 1c43f1a

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-19 20:25:09 +00:00
1c43f1ae62 Fix 7 critical security and code quality bugs (BUG-061 to BUG-068)
Security Fixes:
- BUG-061: Replace insecure SHA-256 password hashing with bcrypt
- BUG-065: Add constant-time comparison for password verification
- BUG-062: Remove full OAuth token logging
- BUG-064: Remove shell=True from subprocess calls

Code Quality:
- BUG-063: Update 24+ deprecated SQLAlchemy .get() calls

Infrastructure:
- BUG-067: Add resource limits to Docker services
- BUG-068: Enable Redis persistence (AOF)
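
Two of the security items above (constant-time comparison and dropping `shell=True`) can be sketched with the standard library alone. The helper names `tokens_match` and `run_without_shell` are hypothetical, and this is not the repository's code; the bcrypt change is not shown because it relies on a third-party package.

```python
import hmac
import subprocess
import sys


def tokens_match(expected: str, provided: str) -> bool:
    """Compare two secrets without short-circuiting on the first differing byte."""
    return hmac.compare_digest(expected.encode(), provided.encode())


def run_without_shell(args: list[str]) -> str:
    """Run a command from an argument list, so no shell parses the arguments."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# The argument containing `;` is passed through literally, not interpreted.
output = run_without_shell(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "hello; echo injected"]
)
```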

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-19 20:22:46 +00:00
Daniel O'Connell
adff8662bb Add investigation findings documentation
Documents 100+ issues found across:
- Data layer (10 issues)
- Content processing (12 issues)
- Search system (14 issues)
- API layer (12 issues)
- Worker tasks (20 issues)
- Infrastructure (12 issues)
- Code quality (20+ issues)

Includes database statistics and improvement suggestions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 19:24:17 +01:00
Daniel O'Connell
52274f82a6 Fix 19 bugs from investigation
Critical/High severity fixes:
- BUG-001: Path traversal vulnerabilities (3 endpoints)
- BUG-003: BM25 filters now apply size/observation_types
- BUG-006: Remove API key from log messages
- BUG-008: Chunk size validation before yielding
- BUG-009: Race condition fix with FOR UPDATE SKIP LOCKED
- BUG-010: Add mcp_servers property to MessageProcessor
- BUG-011: Fix user_id type (BigInteger→Integer)
- BUG-012: Swap inverted score thresholds
- BUG-013: Add retry logic to embedding pipeline
- BUG-014: Fix CORS to use specific origin
- BUG-015: Add Celery retry/timeout defaults
- BUG-016: Re-raise exceptions for Celery retries

Medium severity fixes:
- BUG-017: Add collection_name index on Chunk
- BUG-031: Add SearchConfig limits (max 1000/300s)
- BUG-033: Replace debug prints with logger calls
- BUG-037: Clarify timezone handling in scheduler
- BUG-043: Health check now validates DB + Qdrant
- BUG-055: collection_model returns None not "unknown"
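
For a path traversal fix of the kind listed under BUG-001, one common approach is to resolve the requested path and check that it stays under an allowed base directory. The following is a minimal sketch under that assumption; `resolve_inside` and the example directory are hypothetical and do not come from the repository.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes base directory: {user_path}")
    return candidate


base = Path("/srv/files")  # hypothetical storage root
safe = resolve_inside(base, "notes/a.txt")

try:
    resolve_inside(base, "../../etc/passwd")
    rejected = False
except ValueError:
    rejected = True
```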

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 19:07:10 +01:00
Daniel O'Connell
a1444efaac Add database and file restore tools
tools/restore_databases.sh: Script to restore PostgreSQL and Qdrant
backups from encrypted backup files.

tools/restore_files.py: Python script to restore Fernet-encrypted
file backups.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:38:25 +01:00
Daniel O'Connell
116d0362a2 Fix REGISTER_ENABLED always evaluating to True (BUG-005)
Logic error: `boolean_env(...) or True` always evaluates to True,
making the environment variable useless.

Fixed by removing `or True`. Note: This setting is currently unused
in the codebase but the fix ensures correct behavior when it's used.
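
The `or True` pitfall described above can be reproduced in a few lines. The `boolean_env` below is a hypothetical stand-in; the real helper's signature and parsing rules are not shown in this log.

```python
import os


def boolean_env(key: str, default: bool = False) -> bool:
    """Hypothetical stand-in: parse an environment variable as a boolean."""
    raw = os.getenv(key)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


os.environ["REGISTER_ENABLED"] = "false"

buggy = boolean_env("REGISTER_ENABLED", False) or True  # always True
fixed = boolean_env("REGISTER_ENABLED", False)  # honours the variable
```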

Also updates DISCORD_MODEL default to claude-haiku-4-5 for faster
and cheaper responses.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:33:05 +01:00
Daniel O'Connell
28bc10df92 Fix break_chunk() appending wrong object (BUG-007)
The function was appending the entire DataChunk object instead of
the individual item when processing non-string data (e.g., images).

Bug: `result.append(chunk)` should have been `result.append(c)`

This caused:
- Type mismatches (returning DataChunk instead of MulitmodalChunk)
- Potential circular references
- Embedding failures for mixed content

Fixed by appending the individual item `c` instead of the parent `chunk`.
Updated existing test and added new test to verify behavior.
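
A simplified sketch of the corrected loop is shown below. The `DataChunk` dataclass and the `split_text` helper are hypothetical stand-ins for illustration only; the real function splits text by tokens rather than by characters.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataChunk:
    """Hypothetical stand-in for the project's DataChunk."""

    data: list[Any] = field(default_factory=list)


def split_text(text: str, size: int) -> list[str]:
    """Naive fixed-width splitter, used only for this illustration."""
    return [text[i : i + size] for i in range(0, len(text), size)]


def break_chunk(chunk: DataChunk, size: int) -> list[Any]:
    result: list[Any] = []
    for c in chunk.data:
        if isinstance(c, str):
            result.extend(split_text(c, size))
        else:
            result.append(c)  # the bug appended `chunk`, the parent object
    return result


image = object()  # placeholder for a non-string item such as an image
parts = break_chunk(DataChunk(data=["abcdef", image]), size=4)
```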

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:27:13 +01:00
Daniel O'Connell
21dedbeb61 Fix search score aggregation to use mean instead of sum
BUG-004: Score aggregation was broken - documents with more chunks
would always rank higher regardless of relevance because scores were
summed instead of averaged.

Changes:
- Changed score calculation from sum() to mean()
- Added comprehensive tests for SearchResult.from_source_item()
- Added tests for elide_content helper

This ensures search results are ranked by actual relevance rather
than by the number of chunks in the document.
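
The effect of the change can be seen with two made-up documents. The scores and the `rank` helper below are hypothetical and only illustrate why summing favours documents with many chunks.

```python
from statistics import mean

# Hypothetical per-chunk relevance scores.
docs = {
    "long": [0.2, 0.2, 0.2, 0.2, 0.2],  # many weakly matching chunks
    "short": [0.9],  # one strongly matching chunk
}


def rank(scores_by_doc, aggregate):
    """Order document names by their aggregated chunk score, best first."""
    return sorted(
        scores_by_doc,
        key=lambda name: aggregate(scores_by_doc[name]),
        reverse=True,
    )


ranked_by_sum = rank(docs, sum)  # the long document wins on chunk count
ranked_by_mean = rank(docs, mean)  # the short document wins on relevance
```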

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:25:15 +01:00
Daniel O'Connell
93b77a16d6 Add pytest markers for fast/slow test separation
- Add --run-slow flag to optionally include slow tests
- Auto-detect tests that use db_session, test_db, db_engine, or qdrant fixtures
- Skip slow tests by default for faster development iteration
- Usage: pytest (fast only) or pytest --run-slow (all tests)
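
A `conftest.py` along these lines would give the behaviour described above. This is a sketch built on standard pytest hooks, not the repository's actual file; the helper `uses_slow_fixture`, the `slow` marker name, and the skip message are assumptions.

```python
import pytest

# Fixture names treated as slow, as listed in the commit message above.
SLOW_FIXTURES = {"db_session", "test_db", "db_engine", "qdrant"}


def uses_slow_fixture(fixturenames) -> bool:
    """True if a test requests any fixture that needs a database or Qdrant."""
    return bool(SLOW_FIXTURES.intersection(fixturenames))


def pytest_addoption(parser):
    parser.addoption(
        "--run-slow",
        action="store_true",
        default=False,
        help="also run tests marked as slow",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "slow: test needs external services")


def pytest_collection_modifyitems(config, items):
    skip_slow = pytest.mark.skip(reason="needs --run-slow")
    for item in items:
        if uses_slow_fixture(getattr(item, "fixturenames", ())):
            item.add_marker(pytest.mark.slow)
            if not config.getoption("--run-slow"):
                item.add_marker(skip_slow)
```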

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:21:41 +01:00
56ed7b7d8f fix scheduler 2025-11-04 12:46:38 +00:00
470061bd43 fix health checks 2025-11-03 22:53:19 +00:00
Daniel O'Connell
ad6510bd17 add a bunch of tests 2025-11-03 23:23:41 +01:00
56c0df9761 tweaks 2025-11-03 19:42:13 +00:00
b568222e88 list discord command 2025-11-03 19:19:55 +00:00
8893018af1 multiple mcp servers 2025-11-03 16:41:26 +00:00
2d3dc06fdf tool to set up discord bot 2025-11-03 11:11:19 +00:00
2944a0bce1 properly handle mcp redirects 2025-11-03 00:00:02 +00:00
Daniel O'Connell
0d9f8beec3 handle mcp servers in discord 2025-11-02 23:49:50 +01:00
Daniel O'Connell
64bb926eba mcp servers for discord bots 2025-11-02 23:49:44 +01:00
Daniel O'Connell
6250586d1f prompt from bot user 2025-11-02 23:49:35 +01:00
9182f15c45 properly handle bot prompts 2025-11-02 15:51:30 +00:00
Daniel O'Connell
afdff1708b prompt from bot user 2025-11-02 16:46:26 +01:00
Daniel O'Connell
64e84b1c89 basic tools 2025-11-02 16:34:38 +01:00
798b4779da unify discord callers 2025-11-02 14:46:43 +00:00
69192f834a handle discord threads 2025-11-02 11:23:31 +00:00
Daniel O'Connell
6bd7df8ee3 properly handle images by anthropic 2025-11-02 12:08:46 +01:00
a4f42e656a save images 2025-11-02 10:25:23 +00:00
e95a082147 allow discord tools 2025-11-02 00:50:12 +00:00
c42513100b db backups 2025-11-02 00:24:35 +00:00
a5bc53326d backups 2025-11-02 00:01:35 +00:00
131427255a fix typing indicator 2025-11-01 20:27:57 +00:00
Daniel O'Connell
ff3ca4f109 show typing 2025-11-01 21:13:39 +01:00
3b216953ab better docker compose 2025-11-01 19:51:41 +00:00
Daniel O'Connell
d7e403fb83 optional chattiness 2025-11-01 20:39:15 +01:00
57145ac7b4 fix bugs 2025-11-01 19:35:20 +00:00
Daniel O'Connell
814090dccb use db bots 2025-11-01 18:52:37 +01:00
Daniel O'Connell
9639fa3dd7 use usage tracker 2025-11-01 18:49:06 +01:00
Daniel O'Connell
8af07f0dac add slash commands for discord 2025-11-01 18:04:38 +01:00
Daniel O'Connell
c296f3b533 extract usage 2025-11-01 17:56:20 +01:00
Daniel O'Connell
07852f9ee7 Base usage tracker 2025-11-01 16:22:40 +01:00
Daniel O'Connell
bcb470db9b use redis for celery backend 2025-11-01 15:55:59 +01:00
EC2 Default User
4fedd8fe04 fix admin 2025-10-20 22:09:06 +00:00
Daniel O'Connell
aaa0c2c3cd better discord integration 2025-10-20 23:08:34 +02:00
Daniel O'Connell
1a3cf9c931 add tests 2025-10-20 21:10:39 +02:00
Daniel O'Connell
1606348d8b discord integration 2025-10-20 03:47:13 +02:00
Daniel O'Connell
e68671deb4 handle openai 2025-10-13 11:59:23 +02:00