Appends the Boolean Python

arxiv.org
https://arxiv.org › html
Agentic AI Security: Threats, Defenses, Evaluation, and Open …
In this survey, we focus on this very timely and pertinent problem of agentic AI security, and present the current state-of-the-art in novel AI agent attack methodologies, defense strategies, …
arxiv.org
https://arxiv.org › abs
[2502.19328] Agentic Reward Modeling: Integrating Human …
Feb 26, 2025 · In this paper, we propose agentic reward modeling, a reward system that combines reward models with verifiable correctness signals from different aspects to provide …
arxiv.org
https://arxiv.org › pdf
[PDF]
Detecting and Mitigating Reward Hacking in Reinforcement …
Jul 9, 2025 · Despite growing awareness of this problem, systematic detection and mitigation approaches remain limited. This paper presents a large-scale empirical study of reward …
arxiv.org
https://arxiv.org › html
Beyond Accuracy: A Multi-Dimensional Framework for Evaluating ...
Abstract Current agentic AI benchmarks predominantly evaluate task completion accuracy, while overlooking critical enterprise requirements such as cost-efficiency, reliability, and operational …
arxiv.org
https://arxiv.org › html
Adaptive Monitoring and Real‑World Evaluation of Agentic AI …
Abstract Agentic artificial intelligence (AI) — multi‑agent systems that combine large language models with external tools and autonomous planning — are rapidly transitioning from research …
arxiv.org
https://arxiv.org › html
The Real Barrier to LLM Agent Usability is Agentic ROI - arXiv.org
May 23, 2025 · We outline the roadmap across different development stages to bridge the current usability gaps, aiming to make LLM agents truly scalable, accessible, and effective in real …
github.com
https://github.com › junhua › awesome-llm-agents
GitHub - junhua/awesome-llm-agents: A Collection of High …
The detailed thought process of forming this project is documented at this Medium Post. It's put behind a paywall to prevent the evil LLMs' crawling. The full category breakdown. Retrieval …
aclanthology.org
https://aclanthology.org
[PDF]
Agentic Reward Modeling: Integrating Human Preferences …
Reward models (RMs) are crucial for the training and inference-time scaling up of large language models (LLMs). However, existing reward models primarily focus …
github.com
https://github.com › THU-KEG › Agentic-Reward-Modeling
GitHub - THU-KEG/Agentic-Reward-Modeling: [ACL 2025] Agentic Reward …
We empirically implement a reward agent in this repo, named RewardAgent, that combines human preference rewards with two verifiable signals: factuality and instruction following, to …
openreview.net
https://openreview.net › pdf
[PDF]
SAFEEVALAGENT: TOWARD AGENTIC AND SELF EVOLVING …
to generate and perpetually evolve a comprehensive safety benchmark. SafeEvalAgent leverages a synergistic pipeline of specialized agents and incorporates a Self-evolving …

