Sincere thanks to all my coauthors for their great work and support. I hope we can build more great things together.
Streaming Hallucination Detection in Long Chain-of-Thought Reasoning
One-Eval: An Agentic System for Automated and Traceable LLM Evaluation
Reallocating Attention Across Layers to Reduce Multimodal Hallucination
Corrector: An Execute-to-Correct Paradigm for Efficient LLM Secure Inference
LoopLM: A Self-Iterative Reasoning Framework for Large Language Models
A subtree of DataFlow, focusing on automated evaluation of LLMs.
An entropy-based script that solves the CS2 Friberg Game in the minimum number of guesses.