AIAny - TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs

Introduction

Clinical early warning from irregularly sampled EHR time series demands both well-calibrated risk estimates and explanations clinicians can verify. The core insight of this work is that asking an LLM to commit to a single outcome before scoring fosters overconfident, polarized predictions; instead, inspecting alternative outcomes and eliciting a dedicated rationale per outcome lets the model express graded, comparable risk via its implicit probabilities.

Key Findings

Dialectical supervision: train an LLM to produce outcome-specific rationales (one rationale per candidate outcome) and derive a continuous risk score from the model’s implicit probabilities conditioned on those rationales. This reduces the tendency to collapse to extreme binary predictions.
Empirical gains: across three irregularly sampled medical time-series benchmarks, TRIAGE yields an average AUPRC improvement of 3.3% and reduces calibration error by 81% versus competitive baselines. An LLM-as-judge evaluation rated TRIAGE rationales ~20% higher in clinical reasoning quality than baseline post-hoc explanations.
Practical result: a single, relatively small open-source LLM can deliver both discriminative, better-calibrated risk estimates and explicit, outcome-grounded natural language explanations in one pass.
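The scoring step described above, turning the model's implicit probabilities into a continuous risk score, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the model yields one log-probability per candidate outcome after reading that outcome's rationale, and that risk is the normalized share assigned to the adverse outcome. The function name `risk_from_logprobs`, the outcome labels, and the numbers are all hypothetical.

```python
import math


def risk_from_logprobs(logp_by_outcome, positive="adverse"):
    """Normalize per-outcome log-probabilities into a risk score in [0, 1].

    Assumption: each value is the log-probability the model assigns to an
    outcome label, conditioned on the rationale written for that outcome.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logp_by_outcome.values())
    weights = {k: math.exp(v - m) for k, v in logp_by_outcome.items()}
    total = sum(weights.values())
    return weights[positive] / total


# Hypothetical log-probabilities for two candidate outcomes (not from the paper).
logps = {"adverse": math.log(0.3), "stable": math.log(0.6)}
risk = risk_from_logprobs(logps)
print(round(risk, 3))
```

Because the score is a ratio over all candidate outcomes rather than a committed label, two patients the model would both call "stable" can still receive different, comparable risk values.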

Who it's for and trade-offs

Great fit if you need clinically grounded, inspectable risk estimates from irregular EHR time series and want explanations tied to alternative outcomes rather than post-hoc salience. Look elsewhere if you cannot supply any supervision for outcome-specific rationales, if deployment requires a fully validated clinical-grade pipeline (TRIAGE is a research prototype requiring external clinical validation), or if strict latency/compute limits prohibit running an LLM-based reasoning step. The method improves explainability and calibration but adds the cost of generating and supervising multiple outcome-specific rationales per example.

Where it fits

TRIAGE sits between opaque numerical time-series predictors and post-hoc explanation systems: it uses language-model reasoning (the rationales) as the primary evidentiary interface, and it extracts calibrated probabilities instead of relying on a single committed prediction.
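To see why graded probabilities can be better calibrated than committed predictions, consider a small sketch using expected calibration error (ECE) with equal-width bins. The summary does not say which calibration metric the authors report, so ECE is an assumption here, and the toy labels and risk values are invented purely for illustration.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |mean confidence - observed rate|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(p for p, _ in b) / len(b)
        event_rate = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(mean_conf - event_rate)
    return ece


# Toy data (hypothetical): four higher-risk and four lower-risk patients.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
graded = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]
# Committing to a single outcome collapses every score to 0 or 1.
committed = [1.0 if p >= 0.5 else 0.0 for p in graded]

ece_graded = expected_calibration_error(graded, labels)
ece_committed = expected_calibration_error(committed, labels)
print(ece_graded, ece_committed)
```

In this toy case the graded scores match the observed event rates in each group, while the committed predictions claim certainty that the outcomes do not support, which is the kind of polarization the method is designed to avoid.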

Implementation note

The authors provide training recipes (dialectical reasoning supervision plus self-refinement) and have released code to reproduce the experiments on public benchmarks. Clinical deployment still requires dataset-specific validation, privacy safeguards, and regulatory review.
