#243 — Enhance error messages to simplify alerts handling.

Repo: Twill-AI/facade | State: closed | Status: done | Assignee: Unassigned

Created: 2024-12-12 · Updated: 2025-03-24

Description

Caused by https://twill-network.slack.com/archives/C07TPN6FCBX/p1733928926330279

The issue concerns support for the exclude_message_substrings field in https://github.com/Twill-AI/alerting.

The initial log query usually contains where Log_s has "ERROR" in order to find entries at the “ERROR” level, and exclude_message_substrings is applied to the results of that query. So if exclude_message_substrings contains “uvicorn.protocols.utils.ClientDisconnected” but the log entries returned by the query do not include that string, the exclusion (silencing) does not work. This happens because the code has

                    error_message = "Failed to handle LLM Engine response"
                    # Log error with all details.
                    log.error(
                        "%s: %s in LLM Chat %s:",
                        session_id,
                        error_message,
                        llmchat_id,
                        exc_info=True,
                    )

which produces the following logs (queried without where Log_s has "ERROR"):

2024-12-11 14:54:16.729|ERROR|LOCALTENAN|websockets_router:503|266b75ad-3fed-440e-a0c6-c487b65ffa90: Failed to handle LLM Engine response in LLM Chat 55:
Traceback (most recent call last):
...
websockets.exceptions.ConnectionClosedOK: received 1001 (going away); then sent 1001 (going away)
...
    raise ClientDisconnected from exc
uvicorn.protocols.utils.ClientDisconnected
...

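Azure’s logging splits multi-line error log records into separate lines, so where Log_s has "ERROR" returns only the first line of each record and not the traceback lines that follow it, including the one with uvicorn.protocols.utils.ClientDisconnected.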
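
One possible direction, sketched here under assumptions rather than taken from the actual fix: put the exception class names on the first line of the record, so the line that carries “ERROR” also carries the substring that exclude_message_substrings looks for. The helper names (exception_chain_names, build_error_message, log_llm_engine_failure, is_silenced) and the logger name are hypothetical, and is_silenced is only a simplified stand-in for the alerting check (plain substring match, not the real query semantics).

```python
import logging

# Logger name is an assumption; the real module may configure it differently.
log = logging.getLogger("websockets_router")


def exception_chain_names(exc):
    """Return qualified class names of exc and of each exception it was raised from."""
    names = []
    seen = set()  # guards against a cyclic __context__ chain
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        cls = type(current)
        names.append(f"{cls.__module__}.{cls.__qualname__}")
        # Prefer the explicit cause ("raise X from Y"), else the implicit context.
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return names


def build_error_message(session_id, llmchat_id, exc):
    """Build a single-line message that names the exception chain."""
    error_message = "Failed to handle LLM Engine response"
    chain = " caused by ".join(exception_chain_names(exc))
    return f"{session_id}: {error_message} in LLM Chat {llmchat_id}: {chain}"


def log_llm_engine_failure(session_id, llmchat_id, exc):
    """Log the enriched first line and keep the full traceback after it."""
    log.error("%s", build_error_message(session_id, llmchat_id, exc), exc_info=exc)


def is_silenced(error_lines, exclude_message_substrings):
    """Hypothetical stand-in for the alerting check: only lines returned by
    the "ERROR" query are visible, and any excluded substring silences them."""
    return any(
        substring in line
        for line in error_lines
        for substring in exclude_message_substrings
    )
```

With a message built this way, the first line would end with something like "uvicorn.protocols.utils.ClientDisconnected caused by websockets.exceptions.ConnectionClosedOK", so an exclusion on “uvicorn.protocols.utils.ClientDisconnected” could match even though the traceback lines are stored as separate entries.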
