125 Commits

Author SHA1 Message Date
Daniel O'Connell
52274f82a6 Fix 19 bugs from investigation
Critical/High severity fixes:
- BUG-001: Path traversal vulnerabilities (3 endpoints)
- BUG-003: BM25 filters now apply size/observation_types
- BUG-006: Remove API key from log messages
- BUG-008: Chunk size validation before yielding
- BUG-009: Race condition fix with FOR UPDATE SKIP LOCKED
- BUG-010: Add mcp_servers property to MessageProcessor
- BUG-011: Fix user_id type (BigInteger→Integer)
- BUG-012: Swap inverted score thresholds
- BUG-013: Add retry logic to embedding pipeline
- BUG-014: Fix CORS to use specific origin
- BUG-015: Add Celery retry/timeout defaults
- BUG-016: Re-raise exceptions for Celery retries

Medium severity fixes:
- BUG-017: Add collection_name index on Chunk
- BUG-031: Add SearchConfig limits (max 1000/300s)
- BUG-033: Replace debug prints with logger calls
- BUG-037: Clarify timezone handling in scheduler
- BUG-043: Health check now validates DB + Qdrant
- BUG-055: collection_model returns None not "unknown"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 19:07:10 +01:00
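
For BUG-001 above, the commit message does not show how the three endpoints were hardened. One common guard against path traversal is to resolve the requested path and confirm it still lies under the allowed base directory. The following is a minimal sketch of that general technique, not the project's actual fix; `safe_resolve` and the base directory are hypothetical names.

```python
from pathlib import Path


def safe_resolve(base_dir: Path, user_path: str) -> Path:
    """Hypothetical helper: resolve user_path under base_dir or raise.

    Assumption: callers pass an untrusted, user-supplied path string.
    """
    base = base_dir.resolve()
    # Joining with an absolute user_path discards base, and ".." segments
    # can climb out of it, so check the resolved result, not the raw input.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

With this guard, a request for `a/b.txt` resolves inside the base directory, while `../../etc/passwd` or an absolute path such as `/etc/passwd` raises `ValueError`.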
Daniel O'Connell
116d0362a2 Fix REGISTER_ENABLED always evaluating to True (BUG-005)
Logic error: `boolean_env(...) or True` always evaluates to True,
making the environment variable useless.

Fixed by removing `or True`. Note: This setting is currently unused
in the codebase but the fix ensures correct behavior when it's used.

Also updates DISCORD_MODEL default to claude-haiku-4-5 for faster
and cheaper responses.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:33:05 +01:00
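
The logic error described in BUG-005 can be reproduced in a few lines. This is a sketch under assumptions: the `boolean_env` below is a hypothetical stand-in written for illustration, and the project's real helper may parse values differently.

```python
import os


def boolean_env(key: str, default: bool = False) -> bool:
    """Hypothetical stand-in: treat common truthy strings as True."""
    raw = os.getenv(key)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


# Explicitly disable the setting through the environment.
os.environ["REGISTER_ENABLED"] = "false"

# Buggy form: `x or True` is True for every x, so the variable is ignored.
buggy = boolean_env("REGISTER_ENABLED", False) or True

# Fixed form: use the parsed value directly.
fixed = boolean_env("REGISTER_ENABLED", False)
```

Here `buggy` stays `True` even though the variable is set to `false`, while `fixed` reflects the environment.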
Daniel O'Connell
28bc10df92 Fix break_chunk() appending wrong object (BUG-007)
The function was appending the entire DataChunk object instead of
the individual item when processing non-string data (e.g., images).

Bug: `result.append(chunk)` should have been `result.append(c)`

This caused:
- Type mismatches (returning DataChunk instead of MulitmodalChunk)
- Potential circular references
- Embedding failures for mixed content

Fixed by appending the individual item `c` instead of the parent `chunk`.
Updated existing test and added new test to verify behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:27:13 +01:00
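
The difference between `result.append(chunk)` and `result.append(c)` in BUG-007 is easiest to see side by side. The sketch below is heavily simplified and assumes a hypothetical `DataChunk` with a `data` list; the real `break_chunk()` also splits long text, which is omitted here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class DataChunk:
    """Hypothetical, simplified container for mixed text/image items."""
    data: list[Any]


def break_chunk_buggy(chunk: DataChunk) -> list[Any]:
    result: list[Any] = []
    for c in chunk.data:
        if isinstance(c, str):
            result.append(c)
        else:
            # Bug: appends the parent container instead of the item.
            result.append(chunk)
    return result


def break_chunk_fixed(chunk: DataChunk) -> list[Any]:
    result: list[Any] = []
    for c in chunk.data:
        # Strings and non-string items (e.g. images) are both appended
        # individually; real code would additionally split long strings.
        result.append(c)
    return result


image = b"\x89PNG..."  # placeholder bytes standing in for an image object
chunk = DataChunk(data=["a caption", image])
```

With this input, `break_chunk_fixed(chunk)` returns the caption and the image, whereas `break_chunk_buggy(chunk)` returns the caption followed by the whole `DataChunk`, which is the type mismatch the commit message describes.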
Daniel O'Connell
21dedbeb61 Fix search score aggregation to use mean instead of sum
BUG-004: Score aggregation was broken - documents with more chunks
would always rank higher regardless of relevance because scores were
summed instead of averaged.

Changes:
- Changed score calculation from sum() to mean()
- Added comprehensive tests for SearchResult.from_source_item()
- Added tests for elide_content helper

This ensures search results are ranked by actual relevance rather
than by the number of chunks in the document.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:25:15 +01:00
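
The ranking bias described in BUG-004 shows up with two made-up documents: one with many weakly matching chunks and one with a single strong match. This is an illustrative sketch only; the scores and the `rank` helper are hypothetical and are not taken from `SearchResult.from_source_item()`.

```python
from statistics import mean
from typing import Callable

# Hypothetical per-chunk relevance scores for two documents.
chunk_scores: dict[str, list[float]] = {
    "long_doc": [0.3, 0.3, 0.3, 0.3],  # many chunks, each weakly relevant
    "short_doc": [0.9],                # one chunk, highly relevant
}


def rank(scores: dict[str, list[float]],
         aggregate: Callable[[list[float]], float]) -> list[str]:
    """Order document names by aggregated chunk score, best first."""
    return sorted(scores, key=lambda name: aggregate(scores[name]), reverse=True)


ranked_by_sum = rank(chunk_scores, sum)    # chunk count dominates
ranked_by_mean = rank(chunk_scores, mean)  # relevance dominates
```

Summing puts `long_doc` first purely because it has more chunks; averaging puts `short_doc` first. Note that a mean can undervalue a long document with one excellent chunk, so other aggregations (such as the maximum) are also plausible design choices.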
56ed7b7d8f fix scheduler 2025-11-04 12:46:38 +00:00
Daniel O'Connell
ad6510bd17 add a bunch of tests 2025-11-03 23:23:41 +01:00
56c0df9761 tweaks 2025-11-03 19:42:13 +00:00
b568222e88 list discord command 2025-11-03 19:19:55 +00:00
8893018af1 multiple mcp servers 2025-11-03 16:41:26 +00:00
2d3dc06fdf tool to set up discord bot 2025-11-03 11:11:19 +00:00
2944a0bce1 properly handle mcp redirects 2025-11-03 00:00:02 +00:00
Daniel O'Connell
0d9f8beec3 handle mcp servers in discord 2025-11-02 23:49:50 +01:00
Daniel O'Connell
64bb926eba mcp servers for discord bots 2025-11-02 23:49:44 +01:00
Daniel O'Connell
6250586d1f prompt from bot user 2025-11-02 23:49:35 +01:00
9182f15c45 properly handle bot prompts 2025-11-02 15:51:30 +00:00
Daniel O'Connell
afdff1708b prompt from bot user 2025-11-02 16:46:26 +01:00
Daniel O'Connell
64e84b1c89 basic tools 2025-11-02 16:34:38 +01:00
798b4779da unify discord callers 2025-11-02 14:46:43 +00:00
69192f834a handle discord threads 2025-11-02 11:23:31 +00:00
Daniel O'Connell
6bd7df8ee3 properly handle images by anthropic 2025-11-02 12:08:46 +01:00
a4f42e656a save images 2025-11-02 10:25:23 +00:00
e95a082147 allow discord tools 2025-11-02 00:50:12 +00:00
a5bc53326d backups 2025-11-02 00:01:35 +00:00
131427255a fix typing indicator 2025-11-01 20:27:57 +00:00
Daniel O'Connell
ff3ca4f109 show typing 2025-11-01 21:13:39 +01:00
3b216953ab better docker compose 2025-11-01 19:51:41 +00:00
Daniel O'Connell
d7e403fb83 optional chattiness 2025-11-01 20:39:15 +01:00
57145ac7b4 fix bugs 2025-11-01 19:35:20 +00:00
Daniel O'Connell
814090dccb use db bots 2025-11-01 18:52:37 +01:00
Daniel O'Connell
9639fa3dd7 use usage tracker 2025-11-01 18:49:06 +01:00
Daniel O'Connell
8af07f0dac add slash commands for discord 2025-11-01 18:04:38 +01:00
Daniel O'Connell
c296f3b533 extract usage 2025-11-01 17:56:20 +01:00
Daniel O'Connell
07852f9ee7 Base usage tracker 2025-11-01 16:22:40 +01:00
Daniel O'Connell
bcb470db9b use redis for celery backend 2025-11-01 15:55:59 +01:00
EC2 Default User
4fedd8fe04 fix admin 2025-10-20 22:09:06 +00:00
Daniel O'Connell
aaa0c2c3cd better discord integration 2025-10-20 23:08:34 +02:00
Daniel O'Connell
1a3cf9c931 add tests 2025-10-20 21:10:39 +02:00
Daniel O'Connell
1606348d8b discord integration 2025-10-20 03:47:13 +02:00
Daniel O'Connell
e68671deb4 handle openai 2025-10-13 11:59:23 +02:00
Daniel O'Connell
99d3843f47 move to general LLM providers 2025-10-13 03:23:20 +02:00
Daniel O'Connell
08d17c28dd run discord collector 2025-10-12 23:43:44 +02:00
Daniel O'Connell
e086b4a3a6 add Discord ingester 2025-10-12 23:13:30 +02:00
Daniel O'Connell
f454aa9afa change schedule call signature 2025-10-12 10:17:22 +02:00
Daniel O'Connell
a3544222e7 add scheduled calls 2025-08-12 23:37:54 +00:00
EC2 Default User
a2d107fad7 command to add blog 2025-08-09 00:31:54 +00:00
Daniel O'Connell
b68e15d3ab add blogs 2025-08-09 02:07:49 +02:00
EC2 Default User
862251fedb fix oauth 2025-07-26 14:57:41 +00:00
EC2 Default User
cf456c04d6 handle books 2025-07-24 21:33:15 +00:00
EC2 Default User
907375eee5 fix approx tokens call 2025-07-24 17:57:39 +00:00
EC2 Default User
86c96da1b9 add send_response tool 2025-07-16 15:42:13 +00:00