FLARE: Active Retrieval Augmented Generation
FLARE (EMNLP 2023) improves on standard RAG by triggering retrieval mid-generation using token-probability confidence thresholds, reaching 51.0 EM on 2WikiMultihopQA versus 39.4 for single-retrieval — but calibration failures in instruction-tuned chat models limit its reliability for production finance agents.
