Automatically update arXiv papers about LLM Reasoning, LLM Evaluation, LLM & MLLM and Video Understanding using GitHub Actions.

Xuchen-Li/llm-arxiv-daily

Updated on 2025.02.28

Table of Contents
  1. LLM Reasoning
  2. LLM Evaluation
  3. LLM MLLM
  4. Video Understanding

LLM Reasoning

Publish Date Title Authors PDF Code
2025-02-27 FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving Guizhen Chen et.al. 2502.20238 null
2025-02-27 Collaborative Stance Detection via Small-Large Language Model Consistency Verification Yu Yan et.al. 2502.19954 null
2025-02-27 Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models Yuan Sui et.al. 2502.19918 null
2025-02-27 Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation Qianxi He et.al. 2502.19907 null
2025-02-27 Towards Multimodal Large-Language Models for Parent-Child Interaction: A Focus on Joint Attention Weiyan Shi et.al. 2502.19877 null
2025-02-26 Weaker LLMs' Opinions Also Matter: Mixture of Opinions Enhances LLM's Mathematical Reasoning Yanan Chen et.al. 2502.19622 null
2025-02-26 General Reasoning Requires Learning to Reason from the Get-go Seungwook Han et.al. 2502.19402 null
2025-02-26 BIG-Bench Extra Hard Mehran Kazemi et.al. 2502.19187 null
2025-02-25 Scalable Best-of-N Selection for Large Language Models via Self-Certainty Zhewei Kang et.al. 2502.18581 null
2025-02-25 SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Yuxiang Wei et.al. 2502.18449 null
2025-02-25 Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning Wenkai Yang et.al. 2502.18080 null
2025-02-21 Improving Value-based Process Verifier via Structural Prior Injection Zetian Sun et.al. 2502.17498 null
2025-02-24 Making LLMs Reason? The Intermediate Language Problem in Neurosymbolic Approaches Alexander Beiser et.al. 2502.17216 null
2025-02-24 Shakti-VLMs: Scalable Vision-Language Models for Enterprise AI Syed Abdul Gaffar Shakhadri et.al. 2502.17092 null
2025-02-24 Understanding the Uncertainty of LLM Explanations: A Perspective Based on Reasoning Topology Longchao Da et.al. 2502.17026 null
2025-02-24 All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark Davide Testa et.al. 2502.16989 null
2025-02-24 AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models Qin Zhu et.al. 2502.16906 link
2025-02-24 The Blessing of Reasoning: LLM-Based Contrastive Explanations in Black-Box Recommender Systems Yuyan Wang et.al. 2502.16759 null
2025-02-23 Reasoning about Affordances: Causal and Compositional Reasoning in LLMs Magnus F. Gjerde et.al. 2502.16606 null
2025-02-22 ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning Shulin Huang et.al. 2502.16268 null
2025-02-27 Dynamic Parallel Tree Search for Efficient LLM Reasoning Yifu Ding et.al. 2502.16235 null
2025-02-22 Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations Chunyang Li et.al. 2502.16169 link
2025-02-22 Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models Qianqi Yan et.al. 2502.16033 null
2025-02-21 MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use Zaid Khan et.al. 2502.15872 null
2025-02-21 Do Multilingual LLMs Think In English? Lisa Schut et.al. 2502.15603 null
2025-02-21 Evaluating Social Biases in LLM Reasoning Xuyang Wu et.al. 2502.15361 null
2025-02-21 Stepwise Informativeness Search for Improving LLM Reasoning Siyuan Wang et.al. 2502.15335 null
2025-02-21 Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision Zhouhang Xie et.al. 2502.15147 null
2025-02-19 SIFT: Grounding LLM Reasoning in Contexts via Stickers Zihao Zeng et.al. 2502.14922 null
2025-02-18 Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence Bhavik Agarwal et.al. 2502.14905 null
2025-02-20 Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison Aiswarya Baby et.al. 2502.14827 null
2025-02-20 Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Tian Xie et.al. 2502.14768 link
2025-02-19 Enhancing LLM-Based Recommendations Through Personalized Reasoning Jiahao Liu et.al. 2502.13845 null
2025-02-19 MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering Guanming Xiong et.al. 2502.13428 null
2025-02-19 MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification Linzhuang Sun et.al. 2502.13383 link
2025-02-22 Grounding LLM Reasoning with Knowledge Graphs Alfonso Amayuelas et.al. 2502.13247 null
2025-02-18 Theorem Prover as a Judge for Synthetic Data Generation Joshua Ong Jun Leang et.al. 2502.13137 null
2025-02-18 Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options Lakshmi Nair et.al. 2502.12929 link
2025-02-18 S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Ruotian Ma et.al. 2502.12853 link
2025-02-18 CutPaste&Find: Efficient Multimodal Hallucination Detector with Visual-aid Knowledge Base Cong-Duy Nguyen et.al. 2502.12591 null
2025-02-18 Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights Shubham Parashar et.al. 2502.12521 null
2025-02-18 HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation Hao Liu et.al. 2502.12442 null
2025-02-17 Evaluating Step-by-step Reasoning Traces: A Survey Jinu Lee et.al. 2502.12289 null
2025-02-17 SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs Yige Xu et.al. 2502.12134 null
2025-02-17 TokenSkip: Controllable Chain-of-Thought Compression in LLMs Heming Xia et.al. 2502.12067 link
2025-02-17 Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models Hyunwoo Kim et.al. 2502.11881 null
2025-02-17 Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities Hanbin Wang et.al. 2502.11829 link
2025-02-17 Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning Yuqi Pang et.al. 2502.11751 link
2025-02-17 DeFiScope: Detecting Various DeFi Price Manipulations with LLM Reasoning Juantao Zhong et.al. 2502.11521 null
2025-02-16 Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls Ante Wang et.al. 2502.11183 null
2025-02-16 LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning Tianshi Zheng et.al. 2502.11176 null
2025-02-15 A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1 Jun Wang et.al. 2502.10867 null
2025-02-15 USER-VLM 360: Personalized Vision Language Models with User-aware Tuning for Social Human-Robot Interactions Hamed Rahimi et.al. 2502.10636 null
2025-02-14 Do Large Language Models Reason Causally Like Us? Even Better? Hanna M. Dettki et.al. 2502.10215 null
2025-02-14 MathConstruct: Challenging LLM Reasoning with Constructive Proofs Mislav Balunović et.al. 2502.10197 null
2025-02-13 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Dongzhi Jiang et.al. 2502.09621 null
2025-02-14 EnigmaEval: A Benchmark of Long Multimodal Reasoning Challenges Clinton J. Wang et.al. 2502.08859 null
2025-02-11 CIRCUIT: A Benchmark for Circuit Interpretation and Reasoning Capabilities of LLMs Lejla Skelic et.al. 2502.07980 null
2025-02-05 Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment Cheryl Li et.al. 2502.07803 null
2025-02-17 Bag of Tricks for Inference-time Computation of LLM Reasoning Fan Liu et.al. 2502.07191 null
2025-02-15 Self-Supervised Prompt Optimization Jinyu Xiang et.al. 2502.06855 link
2025-02-06 Vision-Integrated LLMs for Autonomous Driving Assistance : Human Performance Comparison and Trust Evaluation Namhee Kim et.al. 2502.06843 null
2025-02-04 Policy Guided Tree Search for Enhanced LLM Reasoning Yang Li et.al. 2502.06813 null
2025-02-10 ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Ling Yang et.al. 2502.06772 link
2025-02-10 Resurrecting saturated LLM benchmarks with adversarial encoding Igor Ivanov et.al. 2502.06738 null
2025-02-13 LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM Zhi Zhou et.al. 2502.06572 link
2025-02-09 A Generative Framework for Bidirectional Image-Report Understanding in Chest Radiography Nicholas Evans et.al. 2502.05926 null
2025-02-08 Evaluating Vision-Language Models for Emotion Recognition Sree Bhattacharyya et.al. 2502.05660 null
2025-02-07 GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity? Yang Zhou et.al. 2502.05252 link
2025-02-07 Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures Tushar Pandey et.al. 2502.05078 link
2025-02-07 Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research Junde Wu et.al. 2502.04644 link
2025-02-05 Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications Bo Wen et.al. 2502.04384 link
2025-02-05 Limitations of Large Language Models in Clinical Problem-Solving Arising from Inflexible Reasoning Jonathan Kim et.al. 2502.04381 null
2025-02-04 Investigating the Robustness of Deductive Reasoning with Large Language Models Fabian Hoppe et.al. 2502.04352 null
2025-02-04 Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search Maohao Shen et.al. 2502.02508 null
2025-02-04 CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning Jianfeng Pan et.al. 2502.02390 null
2025-02-08 Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking Jinyang Wu et.al. 2502.02339 null
2025-02-04 Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration Younan Zhu et.al. 2502.01969 null
2025-01-31 Improving Rule-based Reasoning in LLMs via Neurosymbolic Representations Varun Dhanraj et.al. 2502.01657 null
2025-02-03 Position: Empowering Time Series Reasoning with Multimodal LLMs Yaxuan Kong et.al. 2502.01477 null
2025-02-03 ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning Bill Yuchen Lin et.al. 2502.01100 null
2025-02-16 Learning Autonomous Code Integration for Math Language Models Haozhe Wang et.al. 2502.00691 null
2025-02-13 Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning Zhi Zhou et.al. 2502.00511 null
2025-02-14 Reward-Guided Speculative Decoding for Efficient LLM Reasoning Baohao Liao et.al. 2501.19324 null
2025-01-31 Efficient Reasoning with Hidden Thinking Xuan Shen et.al. 2501.19201 link
2025-01-31 BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning Han Zhong et.al. 2501.18858 null
2025-01-28 A Stochastic Dynamical Theory of LLM Self-Adversariality: Modeling Severity Drift as a Critical Process Jack David Carson et.al. 2501.16783 null
2025-01-27 Explaining GitHub Actions Failures with Large Language Models: Challenges, Insights, and Limitations Pablo Valenzuela-Toledo et.al. 2501.16495 null
2025-01-27 Large Models in Dialogue for Active Perception and Anomaly Detection Tzoulio Chamiti et.al. 2501.16300 link
2025-01-26 TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs Yuxuan Gu et.al. 2501.15674 null
2025-01-28 Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning Zeyu Gan et.al. 2501.15602 link
2025-01-26 Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework Yuhong Sun et.al. 2501.15581 null
2025-02-15 Option-ID Based Elimination For Multiple Choice Questions Zhenhao Zhu et.al. 2501.15175 null
2025-01-24 Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains Xu Chu et.al. 2501.14431 null
2025-02-12 GraphSOS: Graph Sampling and Order Selection to Help LLMs Understand Graphs Better Xu Chu et.al. 2501.14427 null
2025-01-23 Pseudocode-Injection Magic: Enabling LLMs to Tackle Graph Computational Tasks Chang Gong et.al. 2501.13731 null
2025-02-10 Cognitive Paradigms for Evaluating VLMs on Visual Reasoning Task Mohit Vaishnav et.al. 2501.13620 null
2025-01-22 EvidenceMap: Unleashing the Power of Small Language Models with Evidence Analysis for Biomedical Question Answering Chang Zong et.al. 2501.12746 null
2025-01-17 LLM Reasoner and Automated Planner: A new NPC approach Israel Puerta-Merino et.al. 2501.10106 null
2025-01-22 FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs Zengyi Gao et.al. 2501.09957 null
2025-01-17 Evolving Deeper LLM Thinking Kuang-Huei Lee et.al. 2501.09891 null
2025-01-23 Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Fengli Xu et.al. 2501.09686 null
2025-01-15 Multimodal LLMs Can Reason about Aesthetics in Zero-Shot Ruixiang Jiang et.al. 2501.09012 link
2025-02-10 Ensemble of Large Language Models for Curated Labeling and Rating of Free-text Data Jiaxing Qiu et.al. 2501.08413 link
2025-01-14 Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning Haoyu Han et.al. 2501.07845 null
2025-01-09 Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark Yunzhuo Hao et.al. 2501.05444 link
2025-01-08 Enhancing Financial VQA in Vision Language Models using Intermediate Structured Representations Archita Srivastava et.al. 2501.04675 null
2025-01-08 DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests Charles Corbière et.al. 2501.04671 null
2025-01-08 Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting Dong-Hai Zhu et.al. 2501.04341 link
2025-01-07 Reasoning-Enhanced Self-Training for Long-Form Personalized Text Generation Alireza Salemi et.al. 2501.04167 null
2025-01-07 Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild Wanpeng Hu et.al. 2501.02964 link
2025-01-06 KG-CF: Knowledge Graph Completion with Context Filtering under the Guidance of Large Language Models Zaiyi Zheng et.al. 2501.02711 null
2025-01-04 Table as Thought: Exploring Structured Thoughts in LLM Reasoning Zhenjie Sun et.al. 2501.02152 null
2025-01-03 Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models Kaleem Ullah Qasim et.al. 2501.02026 null
2025-01-02 Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search Shuangtao Li et.al. 2501.01478 null
2025-01-02 HetGCoT-Rec: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Journal Recommendation Runsong Jia et.al. 2501.01203 null
2025-01-03 Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents Chengbo He et.al. 2501.00430 null
2024-12-31 EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta Raymond Bernard et.al. 2501.00257 null
2024-12-30 Efficiently Serving LLM Reasoning Programs with Certaindex Yichao Fu et.al. 2412.20993 null
2024-12-28 LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning Shuguang Chen et.al. 2412.20227 null
2025-02-17 Token-Budget-Aware LLM Reasoning Tingxu Han et.al. 2412.18547 link
2024-12-23 StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs Hailin Chen et.al. 2412.18011 null
2025-02-09 Evaluating LLM Reasoning in the Operations Research Domain with ORQA Mahdi Mostajabdaveh et.al. 2412.17874 link
2024-12-23 Diving into Self-Evolving Training for Multimodal Reasoning Wei Liu et.al. 2412.17451 null
2024-12-21 SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization Tan-Hanh Pham et.al. 2412.16771 null
2024-12-20 PruneVid: Visual Token Pruning for Efficient Video Large Language Models Xiaohu Huang et.al. 2412.16117 link
2024-12-19 Eliciting Causal Abilities in Large Language Models for Reasoning Tasks Yajing Wang et.al. 2412.15314 link
2024-12-19 Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying Federico Castagna et.al. 2412.15177 link
2024-12-19 Progressive Multimodal Reasoning via Active Retrieval Guanting Dong et.al. 2412.14835 null
2024-12-19 FiVL: A Framework for Improved Vision-Language Alignment Estelle Aflalo et.al. 2412.14672 null
2024-12-19 FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis Abdullah Khan et.al. 2412.14492 link
2024-12-18 Cognition Chain for Explainable Psychological Stress Detection on Social Media Xin Wang et.al. 2412.14009 null
2024-12-27 Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence Jinghan He et.al. 2412.13949 null
2025-02-16 Do Language Models Understand Time? Xi Ding et.al. 2412.13845 link
2024-12-18 Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games Wenye Lin et.al. 2412.13602 null
2024-12-17 ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models Yuxi Sun et.al. 2412.12848 null
2024-12-12 A NotSo Simple Way to Beat Simple Bench Soham Sane et.al. 2412.12173 null
2024-12-11 What Makes In-context Learning Effective for Mathematical Reasoning: A Theoretical Analysis Jiayu Liu et.al. 2412.12157 null
2025-02-18 A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges Yibo Yan et.al. 2412.11936 null
2024-12-24 Stepwise Reasoning Error Disruption Attack of LLMs Jingyu Peng et.al. 2412.11934 null
2024-12-16 Leveraging Retrieval-Augmented Tags for Large Vision-Language Understanding in Complex Scenes Antonio Carlos Rivera et.al. 2412.11396 null
2024-12-15 SceneLLM: Implicit Language Reasoning in LLM for Dynamic Scene Graph Generation Hang Zhang et.al. 2412.11026 null
2024-12-15 Entropy-Regularized Process Reward Model Hanning Zhang et.al. 2412.11006 link
2024-12-14 Optimizing Vision-Language Interactions Through Decoder-Only Models Kaito Tanaka et.al. 2412.10758 null
2024-12-14 Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation Sukai Huang et.al. 2412.10675 null
2024-12-14 Thinking with Knowledge Graphs: Enhancing LLM Reasoning Through Structured Data Xue Wu et.al. 2412.10654 null
2024-12-13 EVLM: Self-Reflective Multimodal Reasoning for Cross-Dimensional Visual Editing Umar Khalid et.al. 2412.10566 null
2024-12-13 Atomic Learning Objectives Labeling: A High-Resolution Approach for Physics Education Naiming Liu et.al. 2412.09914 null
2025-01-18 Neptune: The Long Orbit to Benchmarking Long Video Understanding Arsha Nagrani et.al. 2412.09582 link
2025-02-14 Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning Zhenni Bi et.al. 2412.09078 link
2024-12-11 Training Large Language Models to Reason in a Continuous Latent Space Shibo Hao et.al. 2412.06769 link
2025-01-23 GameArena: Evaluating LLM Reasoning through Live Computer Games Lanxiang Hu et.al. 2412.06394 null
2024-12-08 Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt Damien de Mijolla et.al. 2412.05967 null
2024-12-06 MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Jarvis Guo et.al. 2412.05237 null
2024-12-05 Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Yiheng Xu et.al. 2412.04454 null
2024-12-05 SocialMind: LLM-based Proactive AR Social Assistive System with Human-like Perception for In-situ Live Interactions Bufang Yang et.al. 2412.04036 null
2024-12-04 DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation Qingdong He et.al. 2412.03255 null
2024-12-03 Explainable CTR Prediction via LLM Reasoning Xiaohan Yu et.al. 2412.02588 null
2025-02-12 NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers Angel Yahir Loredo Lopez et.al. 2412.01621 null
2025-01-13 Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Zicheng Lin et.al. 2411.19943 link
2024-11-29 TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension Zipeng Qiu et.al. 2411.19504 link
2024-11-29 COLD: Causal reasOning in cLosed Daily activities Abhinav Joshi et.al. 2411.19500 link
2024-12-16 Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning Di Zhang et.al. 2411.18203 null
2024-11-26 NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects? Jiaxuan Li et.al. 2411.17794 null
2024-11-25 Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision Zhiheng Xi et.al. 2411.16579 null
2024-11-22 On the Impact of Fine-Tuning on Chain-of-Thought Reasoning Elita Lobo et.al. 2411.15382 null
2024-11-21 Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Yuhao Dong et.al. 2411.14432 link
2024-11-20 BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Davide Paglieri et.al. 2411.13543 null
2024-11-20 Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous Driving Hao Zhou et.al. 2411.13076 null
2024-11-15 Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination Haojie Zheng et.al. 2411.12591 link
2024-12-23 Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus Terufumi Morishita et.al. 2411.12498 link
2024-11-18 Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation Mingchao Qi et.al. 2411.11714 link
2024-12-31 Enhancing LLM Reasoning with Reward-guided Tree Search Jinhao Jiang et.al. 2411.11694 null
2024-12-15 A dataset of questions on decision-theoretic reasoning in Newcomb-like problems Caspar Oesterheld et.al. 2411.10588 link
2024-11-15 Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Weiyun Wang et.al. 2411.10442 null
2025-01-09 LLaVA-CoT: Let Vision Language Models Reason Step-by-Step Guowei Xu et.al. 2411.10440 link
2024-11-15 Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level Andong Deng et.al. 2411.09921 null
2024-11-14 Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering Nghia Trung Ngo et.al. 2411.09213 null
2024-11-13 Tree-of-Table: Unleashing the Power of LLMs for Enhanced Large-Scale Table Understanding Deyi Ji et.al. 2411.08516 null
2024-11-18 What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? Katie Kang et.al. 2411.07681 link
2024-11-27 Self-Training Meets Consistency: Improving LLMs' Reasoning With Consistency-Driven Rationale Evaluation Jaehyeok Lee et.al. 2411.06387 link
2024-11-09 A Picture is Worth A Thousand Numbers: Enabling LLMs Reason about Time Series via Visualization Haoxin Liu et.al. 2411.06018 null
2024-11-11 LLMs as Method Actors: A Model for Prompt Engineering and Architecture Colin Doyle et.al. 2411.05778 link
2024-11-12 Kwai-STaR: Transform LLMs into State-Transition Reasoners Xingyu Lu et.al. 2411.04799 null
2024-11-21 Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding Haolin Chen et.al. 2411.04282 link
2024-11-05 CrowdGenUI: Enhancing LLM-Based UI Widget Generation with a Crowdsourced Preference Library Yimeng Liu et.al. 2411.03477 null
2025-01-27 MetRex: A Benchmark for Verilog Code Metric Reasoning Using LLMs Manar Abdelatty et.al. 2411.03471 link
2024-11-04 RuAG: Learned-rule-augmented Generation for Large Language Models Yudi Zhang et.al. 2411.03349 null
2024-10-30 Vision-Language Models Can Self-Improve Reasoning via Reflection Kanzhi Cheng et.al. 2411.00855 null
2024-11-01 Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling Yiwen Ding et.al. 2411.00750 link
2024-11-01 STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing Jiaru Zou et.al. 2411.00387 null
2024-11-08 GRS-QA -- Graph Reasoning-Structured Question Answering Dataset Anish Pahilajani et.al. 2411.00369 null
2024-10-31 Thought Space Explorer: Navigating and Expanding Thought Space for Large Language Model Reasoning Jinghan Zhang et.al. 2410.24155 null
2024-10-31 RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner Fu-Chieh Chang et.al. 2410.23912 null
2024-10-31 OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models Junda Wu et.al. 2410.23703 null
2024-10-30 ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning Millennium Bismay et.al. 2410.23180 link
2024-10-30 On Memorization of Large Language Models in Logical Reasoning Chulin Xie et.al. 2410.23123 null
2024-10-28 Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics Isabelle Lee et.al. 2410.21353 null
2024-10-28 Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments Sangmim Song et.al. 2410.20666 null
2024-10-25 Cooperative Strategic Planning Enhances Reasoning Capabilities in Large Language Models Danqing Wang et.al. 2410.20007 null
2024-10-25 Can Stories Help LLMs Reason? Curating Information Space Through Narrative Vahid Sadiri Javadi et.al. 2410.19221 null
2024-10-18 Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning Pengfei He et.al. 2410.19000 link
2024-10-25 CLR-Bench: Evaluating Large Language Models in College-level Reasoning Junnan Dong et.al. 2410.17558 null
2024-10-28 Non-myopic Generation of Language Models for Reasoning and Planning Chang Ma et.al. 2410.17195 link
2024-11-06 Improving Causal Reasoning in Large Language Models: A Survey Longxuan Yu et.al. 2410.16676 link
2024-10-22 A Statistical Analysis of LLMs' Self-Evaluation Using Proverbs Ryosuke Sonoda et.al. 2410.16640 null
2024-10-21 Rulebreakers Challenge: Revealing a Blind Spot in Large Language Models' Reasoning with Formal Logic Jason Chan et.al. 2410.16502 null
2024-11-27 On Designing Effective RL Reward at Training Time for LLM Reasoning Jiaxuan Gao et.al. 2410.15115 null
2025-01-28 Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning Xingyu Tan et.al. 2410.14211 null
2024-10-21 Unconstrained Model Merging for Enhanced LLM Reasoning Yiming Zhang et.al. 2410.13699 null
2024-10-16 Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models Linhao Luo et.al. 2410.13080 link
2024-10-16 KcMF: A Knowledge-compliant Framework for Schema and Entity Matching with Fine-tuning-free LLMs Yongqin Xu et.al. 2410.12480 null
2024-10-17 Enhancing LLM Trading Performance with Fact-Subjectivity Aware Reasoning Qian Wang et.al. 2410.12464 null
2024-10-16 Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up Jiahao Yuan et.al. 2410.12323 link
2024-10-16 Exploiting LLMs' Reasoning Capability to Infer Implicit Concepts in Legal Information Retrieval Hai-Long Nguyen et.al. 2410.12154 null
2024-10-15 Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming Yilun Hao et.al. 2410.12112 null
2024-10-12 OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models Jun Wang et.al. 2410.09671 null
2024-10-11 P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains Simeng Han et.al. 2410.09207 null
2024-10-11 Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning Yunpeng Gao et.al. 2410.08500 null
2024-10-10 SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation Hang Yin et.al. 2410.08189 null
2024-10-10 Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning Amrith Setlur et.al. 2410.08146 null
2024-10-10 Automatic Curriculum Expert Iteration for Reliable LLM Reasoning Zirui Zhao et.al. 2410.07627 null
2024-10-09 Boosting Few-Shot Detection with Large Language Models and Layout-to-Image Synthesis Ahmed Abdullah et.al. 2410.06841 null
2024-10-09 Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning Xiyao Wang et.al. 2410.06508 null
2025-01-02 Filtering Discomforting Recommendations with Large Language Models Jiahao Liu et.al. 2410.05411 null
2024-10-05 Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification Zhenwen Liang et.al. 2410.05318 null
2024-10-06 Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval Pengcheng Jiang et.al. 2410.04585 link
2024-10-03 The Role of Deductive and Inductive Reasoning in Large Language Models Chengkun Cai et.al. 2410.02892 null
2024-10-02 Not All LLM Reasoners Are Created Equal Arian Hosseini et.al. 2410.01748 null
2024-12-25 Interpretable Contrastive Monte Carlo Tree Search Reasoning Zitian Gao et.al. 2410.01707 link
2024-10-02 VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment Amirhossein Kazemnejad et.al. 2410.01679 link
2024-10-02 AHP-Powered LLM Reasoning for Multi-Criteria Evaluation of Open-Ended Responses Xiaotian Lu et.al. 2410.01246 null
2024-10-01 Self-controller: Controlling LLMs with Multi-round Step-by-step Self-awareness Xiao Peng et.al. 2410.00359 null
2024-10-01 Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis Chun-Hsiao Yeh et.al. 2410.00292 null
2024-10-08 GUNDAM: Aligning Large Language Models with Graph Understanding Sheng Ouyang et.al. 2409.20053 null
2024-09-27 Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs Yanyuan Qiao et.al. 2409.18794 null
2024-10-23 Proof of Thought : Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning Debargha Ganguly et.al. 2409.17270 null
2024-09-20 CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Casual Significance and Consistency Kangsheng Wang et.al. 2409.17174 null
2024-09-20 Mufu: Multilingual Fused Learning for Low-Resource Translation with LLM Zheng Wei Lim et.al. 2409.13949 null
2024-09-19 SituationAdapt: Contextual UI Optimization in Mixed Reality with Situation Awareness via LLM Reasoning Zhipeng Li et.al. 2409.12836 null
2024-10-04 Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning Jiaxin Wen et.al. 2409.12452 link
2024-12-16 Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data Jiaming Zhou et.al. 2409.12437 link
2024-09-18 MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning Justin Chih-Yao Chen et.al. 2409.12147 link
2024-11-05 Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent Fatemeh Haji et.al. 2409.11527 link
2024-09-16 Enhancing RL Safety with Counterfactual LLM Reasoning Dennis Gross et.al. 2409.10188 link
2024-09-11 Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation SeongYeub Chu et.al. 2409.07355 link

(back to top)

LLM Evaluation

Publish Date Title Authors PDF Code
2025-02-26 Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation Yuxiang Wang et.al. 2502.18771 link
2025-02-23 Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation Simin Chen et.al. 2502.17521 link
2025-02-24 Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective Chengyin Xu et.al. 2502.17262 null
2025-02-24 Detecting Benchmark Contamination Through Watermarking Tom Sander et.al. 2502.17259 null
2025-02-24 Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation Jaskaran Singh Walia et.al. 2502.17011 null
2025-02-24 AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay Ziyi Tang et.al. 2502.16789 null
2025-01-30 Retrieval Augmented Generation Based LLM Evaluation For Protocol State Machine Inference With Chain-of-Thought Reasoning Youssef Maklad et.al. 2502.15727 null
2025-02-20 Prompt-to-Leaderboard Evan Frick et.al. 2502.14855 link
2025-02-27 SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines M-A-P Team et.al. 2502.14739 null
2025-02-20 SEA-HELM: Southeast Asian Holistic Evaluation of Language Models Yosephine Susanto et.al. 2502.14301 null
2025-02-20 Transfer-Prompting: Enhancing Cross-Task Adaptation in Large Language Models via Dual-Stage Prompts Optimization Yupeng Chang et.al. 2502.14211 link
2025-02-19 Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above Nishant Balepur et.al. 2502.14127 null
2025-02-19 STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models Narun Raman et.al. 2502.13119 null
2025-02-18 HPSS: Heuristic Prompting Strategy Search for LLM Evaluators Bosi Wen et.al. 2502.13031 null
2025-02-18 None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks Eva Sánchez Salido et.al. 2502.12896 null
2025-02-18 Safe at the Margins: A General Approach to Safety Alignment in Low-Resource English Languages -- A Singlish Case Study Isaac Lim et.al. 2502.12485 null
2025-02-17 Deviation Ratings: A General, Clone-Invariant Rating Method Luke Marris et.al. 2502.11645 null
2025-02-21 TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking Shahriar Kabir Nahin et.al. 2502.11187 null
2025-02-15 Rule-Bottleneck Reinforcement Learning: Joint Explanation and Decision Optimization for Resource Allocation with Language Agents Mauricio Tec et.al. 2502.10732 null
2025-02-15 An Empirical Analysis of Uncertainty in Large Language Model Evaluations Qiujie Xie et.al. 2502.10709 link
2025-02-25 Accelerating Unbiased LLM Evaluation via Synthetic Feedback Zhaoyi Zhou et.al. 2502.10563 link
2025-02-14 MathConstruct: Challenging LLM Reasoning with Constructive Proofs Mislav Balunović et.al. 2502.10197 null
2025-02-13 Enhancing Jailbreak Attacks via Compliance-Refusal-Based Initialization Amit Levi et.al. 2502.09755 null
2025-02-13 NestQuant: Nested Lattice Quantization for Matrix Products and LLMs Semyon Savkin et.al. 2502.09720 null
2025-02-12 The Science of Evaluating Foundation Models Jiayi Yuan et.al. 2502.09670 null
2025-02-13 Copilot Arena: A Platform for Code LLM Evaluation in the Wild Wayne Chi et.al. 2502.09328 null
2025-02-12 Revisiting 3D LLM Benchmarks: Are We Really Testing 3D Capabilities? Jiahe Jin et.al. 2502.08503 link
2025-02-11 Forget What You Know about LLMs Evaluations -- LLMs are Like a Chameleon Nurit Cohen-Inger et.al. 2502.07445 link
2025-02-10 Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring Alex Heyman et.al. 2502.07087 link
2025-02-10 Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models Lujain Ibrahim et.al. 2502.07077 null
2025-02-07 LLM-Supported Natural Language to Bash Translation Finnian Westenfelder et.al. 2502.06858 link
2025-02-15 Self-Supervised Prompt Optimization Jinyu Xiang et.al. 2502.06855 link
2025-02-10 Resurrecting saturated LLM benchmarks with adversarial encoding Igor Ivanov et.al. 2502.06738 null
2025-02-10 Automatic Evaluation of Healthcare LLMs Beyond Question-Answering Anna Arias-Duart et.al. 2502.06666 null
2025-02-10 Unbiased Evaluation of Large Language Models from a Causal Perspective Meilin Chen et.al. 2502.06655 null
2025-02-10 LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks Xin Zhou et.al. 2502.06215 null
2025-02-05 Aero-LLM: A Distributed Framework for Secure UAV Communication and Intelligent Decision-Making Balakrishnan Dharmalingam et.al. 2502.05220 null
2025-02-06 TruthFlow: Truthful LLM Generation via Representation Flow Correction Hanyu Wang et.al. 2502.04556 null
2025-02-05 How do Humans and Language Models Reason About Creativity? A Comparative Analysis Antonio Laverghetta Jr. et.al. 2502.03253 null
2025-02-05 On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation Nghiem T. Diep et.al. 2502.03029 null
2025-02-02 LLM-Powered Benchmark Factory: Reliable, Generic, and Efficient Peiwen Yuan et.al. 2502.01683 link
2025-02-02 HASSLE-free: A unified Framework for Sparse plus Low-Rank Matrix Decomposition for LLMs Mehdi Makni et.al. 2502.00899 null
2025-02-01 DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks Zhiliang Chen et.al. 2502.00270 null
2025-01-30 Overestimation in LLM Evaluation: A Controlled Large-Scale Study on Data Contamination's Impact on Machine Translation Muhammed Yusuf Kocyigit et.al. 2501.18771 null
2025-01-31 ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation Minghua He et.al. 2501.18460 null
2025-02-01 LLM Evaluation Based on Aerospace Manufacturing Expertise: Automated Generation and Multi-Model Question Answering Beiming Liu et.al. 2501.17183 null
2025-01-28 An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue Koji Inoue et.al. 2501.16643 null
2025-01-26 HardML: A Benchmark For Evaluating Data Science And Machine Learning knowledge and reasoning in AI Tidor-Vlad Pricope et.al. 2501.15627 null
2025-01-23 Question Answering on Patient Medical Records with Private Fine-Tuned LLMs Sara Kothari et.al. 2501.13687 null
2025-01-10 CodEv: An Automated Grading Framework Leveraging Large Language Models for Consistent and Constructive Feedback En-Qi Tseng et.al. 2501.10421 null
2025-01-15 Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History Yevhen Kostiuk et.al. 2501.09154 null
2025-01-13 Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles Samia Touileb et.al. 2501.07718 null
2025-01-03 FLAME: Financial Large-Language Model Assessment and Metrics Evaluation Jiayu Guo et.al. 2501.06211 link
2025-01-07 MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems Yannis Katsis et.al. 2501.03468 link
2025-01-05 Evaluating Large Language Models Against Human Annotators in Latent Content Analysis: Sentiment, Political Leaning, Emotional Intensity, and Sarcasm Ljubisa Bojic et.al. 2501.02532 null
2025-01-04 LLMzSzŁ: a comprehensive LLM benchmark for Polish Krzysztof Jassem et.al. 2501.02266 null
2025-01-08 VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Yuqian Yuan et.al. 2501.00599 link
2025-01-04 Setting Standards in Turkish NLP: TR-MMLU for Large Language Model Evaluation M. Ali Bayram et.al. 2501.00593 null
2024-12-31 Echoes in AI: Quantifying Lack of Plot Diversity in LLM Outputs Weijia Xu et.al. 2501.00273 null
2024-12-30 EVOLVE: Emotion and Visual Output Learning via LLM Evaluation Jordan Sinclair et.al. 2412.20632 null
2024-12-24 Muse: A Multimodal Conversational Recommendation Dataset with Scenario-Grounded User Profiles Zihan Wang et.al. 2412.18416 null
2024-12-24 A Statistical Framework for Ranking LLM-Based Chatbots Siavash Ameli et.al. 2412.18407 link
2025-01-25 DeepCRCEval: Revisiting the Evaluation of Code Review Comment Generation Junyi Lu et.al. 2412.18291 null
2024-12-23 CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models Ruibo Tu et.al. 2412.17970 link
2025-01-02 Baichuan4-Finance Technical Report Hanyu Zhang et.al. 2412.15270 null
2024-12-19 ObjVariantEnsemble: Advancing Point Cloud LLM Evaluation in Challenging Scenes with Subtly Distinguished Objects Qihang Cao et.al. 2412.14837 null
2024-12-18 AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge Xiaobao Wu et.al. 2412.13670 link
2025-02-16 Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning Eitan Wagner et.al. 2412.13631 null
2025-02-17 OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain Shuting Wang et.al. 2412.13018 link
2024-12-10 How to Choose a Threshold for an Evaluation Metric for Large Language Models Bhaskarjit Sarmah et.al. 2412.12148 null
2024-12-15 Dual Traits in Probabilistic Reasoning of Large Language Models Shenxiong Li et.al. 2412.11009 link
2024-12-30 LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation Eunsu Kim et.al. 2412.10424 null
2024-12-13 Cultural Evolution of Cooperation among LLM Agents Aron Vallinder et.al. 2412.10270 null
2024-12-12 Towards Understanding the Robustness of LLM-based Evaluations under Perturbations Manav Chaudhary et.al. 2412.09269 null
2024-12-10 BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities Sahal Shaji Mullappilly et.al. 2412.07769 link
2024-12-12 PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models Qian Zhang et.al. 2412.06287 link
2024-12-02 AI Benchmarks and Datasets for LLM Evaluation Todor Ivanov et.al. 2412.01020 null
2024-11-30 Evaluating the Consistency of LLM Evaluators Noah Lee et.al. 2412.00543 null
2024-11-29 MIMDE: Exploring the Use of Synthetic vs Human Data for Evaluating Multi-Insight Multi-Document Extraction Tasks John Francis et.al. 2411.19689 null
2024-11-29 Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability Yujin Han et.al. 2411.19456 link
2024-11-27 Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator Frederic Kirstein et.al. 2411.18444 null
2025-01-17 CS-Eval: A Comprehensive Large Language Model Benchmark for CyberSecurity Zhengmin Yu et.al. 2411.16239 link
2024-11-25 SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text Reshmi Ghosh et.al. 2411.16077 null
2024-11-26 Do LLMs Agree on the Creativity Evaluation of Alternative Uses? Abdullah Al Rabeyah et.al. 2411.15560 null
2025-02-17 Ranking Unraveled: Recipes for LLM Rankings in Head-to-Head AI Combat Roland Daynauth et.al. 2411.14483 link
2024-11-21 Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models Lovish Madaan et.al. 2411.14103 null
2024-11-21 An Evaluation-Driven Approach to Designing LLM Agents: Process and Architecture Boming Xia et.al. 2411.13768 null
2024-11-21 A Framework for Evaluating LLMs Under Task Indeterminacy Luke Guerdan et.al. 2411.13760 null
2024-11-12 Large Language Models as Neurolinguistic Subjects: Identifying Internal Representations for Form and Meaning Linyang He et.al. 2411.07533 null
2024-11-13 Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models Yancheng He et.al. 2411.07140 null
2024-11-09 Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models Xiaojun Wu et.al. 2411.06272 link
2025-02-09 ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding Israel Abebe Azime et.al. 2411.05049 null
2024-11-07 Bayesian Calibration of Win Rate Estimation with LLM Evaluators Yicheng Gao et.al. 2411.04424 link
2024-11-05 Enhancing LLM Evaluations: The Garbling Trick William F. Bradley et.al. 2411.01533 null
2025-02-19 Varco Arena: A Tournament Approach to Reference-Free Benchmarking Large Language Models Seonil Son et.al. 2411.01281 null
2025-02-07 Mastering the Craft of Data Synthesis for CodeLLMs Meng Chen et.al. 2411.00005 link
2024-10-28 Project MPG: towards a generalized performance benchmark for LLM capabilities Lucas Spangher et.al. 2410.22368 null
2024-10-29 Self-Preference Bias in LLM-as-a-Judge Koki Wataoka et.al. 2410.21819 null
2024-10-28 Unveiling Context-Aware Criteria in Self-Assessing LLMs Taneesh Gupta et.al. 2410.21545 null
2024-10-27 LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization Jui-Nan Yen et.al. 2410.20625 null
2024-10-26 Limitations of the LLM-as-a-Judge Approach for Evaluating LLM Outputs in Expert Knowledge Tasks Annalisa Szymanski et.al. 2410.20266 null
2024-10-23 MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning Jingfan Zhang et.al. 2410.18035 null
2025-02-21 Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements Isamu Isozaki et.al. 2410.17141 link
2024-10-21 CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution Maosong Cao et.al. 2410.16256 link
2025-01-26 mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation Nishat Raihan et.al. 2410.15037 link
2024-10-19 CAP: Data Contamination Detection via Consistency Amplification Yi Zhao et.al. 2410.15005 null
2024-10-18 Enabling Scalable Evaluation of Bias Patterns in Medical LLMs Hamed Fayyaz et.al. 2410.14763 link
2024-11-06 Diverging Preferences: When do Annotators Disagree and do Models Know? Michael JQ Zhang et.al. 2410.14632 null
2024-10-18 Combining Entropy and Matrix Nuclear Norm for Enhanced Evaluation of Language Models James Vo et.al. 2410.14480 null
2024-10-21 BenTo: Benchmark Task Reduction with In-Context Transferability Hongyu Zhao et.al. 2410.13804 link
2024-10-16 BenchmarkCards: Large Language Model and Risk Reporting Anna Sokol et.al. 2410.12974 null
2025-02-01 Language Model Preference Evaluation with Multiple Weak Evaluators Zhengyu Hu et.al. 2410.12869 link
2024-10-11 Enterprise Benchmarks for Large Language Model Evaluation Bing Zhang et.al. 2410.12857 link
2024-10-16 An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation Junjie Chen et.al. 2410.12265 null
2024-10-15 Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers Lorenzo Pacchiardi et.al. 2410.11672 link
2024-10-15 Black-box Uncertainty Quantification Method for LLM-as-a-Judge Nico Wagner et.al. 2410.11594 null
2024-10-14 Jailbreak Instruction-Tuned LLMs via end-of-sentence MLP Re-weighting Yifan Luo et.al. 2410.10150 null
2024-12-13 HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics Jingxuan Fan et.al. 2410.09988 link
2024-10-15 LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models Han Qiu et.al. 2410.09962 link
2024-10-17 Towards Multilingual LLM Evaluation for European Languages Klaudia Thellmann et.al. 2410.08928 null
2024-10-11 Test-driven Software Experimentation with LASSO: an LLM Benchmarking Example Marcus Kessel et.al. 2410.08911 null
2024-10-10 Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks Mathis Pink et.al. 2410.08133 null
2025-02-03 COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act Philipp Guldimann et.al. 2410.07959 link
2024-11-06 News Reporter: A Multi-lingual LLM Framework for Broadcast T.V News Tarun Jain et.al. 2410.07520 null
2024-10-09 Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates Xiaosen Zheng et.al. 2410.07137 link
2024-10-09 ReIFE: Re-evaluating Instruction-Following Evaluation Yixin Liu et.al. 2410.07069 link
2024-10-08 Active Evaluation Acquisition for Efficient LLM Benchmarking Yang Li et.al. 2410.05952 null
2024-10-07 TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles Qingchen Yu et.al. 2410.05262 link
2024-10-01 Language Enhanced Model for Eye (LEME): An Open-Source Ophthalmology-Specific Large Language Model Aidan Gilson et.al. 2410.03740 null
2024-10-04 TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation Jonathan Cook et.al. 2410.03608 null
2024-10-04 Towards Reproducible LLM Evaluation: Quantifying Uncertainty in LLM Benchmark Scores Robert E. Blackwell et.al. 2410.03492 null
2024-10-29 AIME: AI System Optimization via Multiple LLM Evaluators Bhrij Patel et.al. 2410.03131 null
2024-10-02 Comparing Criteria Development Across Domain Experts, Lay Users, and Models in Large Language Model Evaluation Annalisa Szymanski et.al. 2410.02054 null
2024-10-02 Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models Joseph Lee et.al. 2410.01795 link
2024-10-03 Extending Context Window of Large Language Models from a Distributional Perspective Yingsheng Wu et.al. 2410.01490 null
2024-10-02 ConServe: Harvesting GPUs for Low-Latency and High-Throughput Large Language Model Serving Yifan Qiao et.al. 2410.01228 null
2024-10-01 ViDAS: Vision-based Danger Assessment and Scoring Pranav Gupta et.al. 2410.00477 null
2024-10-01 PclGPT: A Large Language Model for Patronizing and Condescending Language Detection Hongbo Wang et.al. 2410.00361 link
2024-11-26 LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models Haitao Li et.al. 2409.20288 link
2024-09-29 Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems Xuyang Wu et.al. 2409.19804 null
2024-10-19 Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models Xin Li et.al. 2409.19667 link
2024-10-05 IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation Fan Lin et.al. 2409.18892 link
2024-12-13 A Character-Centric Creative Story Generation via Imagination Kyeongman Park et.al. 2409.16667 null
2024-09-25 Judgment of Thoughts: Courtroom of the Binary Logical Reasoning in Large Language Models Sungjune Park et.al. 2409.16635 null
2024-12-18 Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino Jann Railey Montalan et.al. 2409.15380 link
2024-12-16 MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators Qingyu Lu et.al. 2409.14335 link
2024-09-21 ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models Yuqing Huang et.al. 2409.13989 link
2024-12-17 AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs Basel Mousi et.al. 2409.11404 null
2024-10-02 LLM-as-a-Judge & Reward Model: What They Can and Cannot Do Guijin Son et.al. 2409.11239 null
2024-12-08 Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges Vinay Samuel et.al. 2409.09927 link
2024-09-13 Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia Fajri Koto et.al. 2409.08564 null
2024-09-09 Assessing SPARQL capabilities of Large Language Models Lars-Peter Meyer et.al. 2409.05925 link
2024-10-08 LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs Yuhao Wu et.al. 2409.02076 link
2024-10-14 Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation Jasper Dekoninck et.al. 2409.00696 null
2024-08-26 Evaluating ChatGPT on Nuclear Domain-Specific Data Muhammad Anwar et.al. 2409.00090 null
2024-08-28 LLMSecCode: Evaluating Large Language Models for Secure Coding Anton Rydén et.al. 2408.16100 link
2024-08-26 LLM-3D Print: Large Language Models To Monitor and Control 3D Printing Yayati Jadhav et.al. 2408.14307 null
2024-08-26 Epidemic Information Extraction for Event-Based Surveillance using Large Language Models Sergio Consoli et.al. 2408.14277 null
2024-10-04 MobileQuant: Mobile-friendly Quantization for On-device Language Models Fuwen Tan et.al. 2408.13933 link
2024-08-23 LalaEval: A Holistic Human Evaluation Framework for Domain-Specific Large Language Models Chongyan Sun et.al. 2408.13338 null
2024-08-23 Open Llama2 Model for the Lithuanian Language Artūras Nakvosas et.al. 2408.12963 null
2024-08-23 LIMP: Large Language Model Enhanced Intent-aware Mobility Prediction Songwei Li et.al. 2408.12832 link
2024-12-20 Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts Jiaqing Liu et.al. 2408.09688 null
2024-08-20 Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge Ravi Raju et.al. 2408.08808 null
2024-10-16 The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization Dataset Generation Samee Arif et.al. 2408.08688 link
2024-10-19 Persona is a Double-edged Sword: Mitigating the Negative Impact of Role-playing Prompts in Zero-shot Reasoning Tasks Junseok Kim et.al. 2408.08631 null

LLM MLLM

Publish Date Title Authors PDF Code
2025-02-27 R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts Zhongyang Li et.al. 2502.20395 null
2025-02-27 InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions Sirui Xu et.al. 2502.20390 null
2025-02-27 Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation Sucheng Ren et.al. 2502.20388 null
2025-02-27 Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis Jeffrey Yang Fan Chiang et.al. 2502.20383 null
2025-02-27 Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers Shalev Lifshitz et.al. 2502.20379 null
2025-02-27 PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation Albert Gong et.al. 2502.20377 null
2025-02-27 Constrained Generative Modeling with Manually Bridged Diffusion Models Saeid Naderiparizi et.al. 2502.20371 null
2025-02-27 Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization Ryan C. Barron et.al. 2502.20364 null
2025-02-27 Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs Kuan Lok Zhou et.al. 2502.20356 null
2025-02-27 KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model Kai Zhang et.al. 2502.20350 null
2025-02-27 Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models Yi Jing et.al. 2502.20344 null
2025-02-27 Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners Daniele Paliotta et.al. 2502.20339 null
2025-02-27 Expertise Is What We Want Alan Ashworth et.al. 2502.20335 null
2025-02-27 Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models Yukang Yang et.al. 2502.20332 null
2025-02-27 Long-Context Inference with Retrieval-Augmented Speculative Decoding Guanzheng Chen et.al. 2502.20330 null
2025-02-27 EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants Franck Cappello et.al. 2502.20309 null
2025-02-27 M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging Jinghao Feng et.al. 2502.20301 null
2025-02-27 An exploration of features to improve the generalisability of fake news detection models Nathaniel Hoy et.al. 2502.20299 null
2025-02-27 Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription Benjamin Gutteridge et.al. 2502.20295 null
2025-02-27 Conformal Tail Risk Control for Large Language Model Alignment Catherine Yu-Chi Chen et.al. 2502.20285 null
2025-02-27 Evaluating Human Trust in LLM-Based Planners: A Preliminary Study Shenghui Chen et.al. 2502.20284 null
2025-02-27 Large Language Models as Attribution Regularizers for Efficient Model Training Davor Vukadin et.al. 2502.20268 null
2025-02-27 Vector-Quantized Vision Foundation Models for Object-Centric Learning Rongzhen Zhao et.al. 2502.20263 null
2025-02-27 LLM as a Broken Telephone: Iterative Generation Distorts Information Amr Mohamed et.al. 2502.20258 null
2025-02-27 Do computer vision foundation models learn the low-level characteristics of the human visual system? Yancheng Cai et.al. 2502.20256 null
2025-02-27 Beyond Natural Language Perplexity: Detecting Dead Code Poisoning in Code Generation Datasets Chichien Tsai et.al. 2502.20246 null
2025-02-27 From Retrieval to Generation: Comparing Different Approaches Abdelrahman Abdallah et.al. 2502.20245 null
2025-02-27 FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving Guizhen Chen et.al. 2502.20238 null
2025-02-27 AI Will Always Love You: Studying Implicit Biases in Romantic AI Companions Clare Grogan et.al. 2502.20231 null
2025-02-27 Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars Tobias Kirschstein et.al. 2502.20220 null
2025-02-27 ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models Haibin Chen et.al. 2502.20196 null
2025-02-27 Model Checking Linear Temporal Logic with Standpoint Modalities Rajab Aghamov et.al. 2502.20193 null
2025-02-27 Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge Yan-Lun Chen et.al. 2502.20186 null
2025-02-27 DGFM: Full Body Dance Generation Driven by Music Foundation Models Xinran Liu et.al. 2502.20176 null
2025-02-27 An Extensive Evaluation of PDDL Capabilities in off-the-shelf LLMs Kaustubh Vyas et.al. 2502.20175 null
2025-02-27 Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Liang Chen et.al. 2502.20172 null
2025-02-27 Re-evaluating Open-ended Evaluation of Large Language Models Siqi Liu et.al. 2502.20170 null
2025-02-27 Adaptive H&E-IHC information fusion staining framework based on feature extra Yifan Jia et.al. 2502.20156 null
2025-02-27 Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale Max M. Lang et.al. 2502.20140 null
2025-02-27 Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking Yifan Zhang et.al. 2502.20129 null
2025-02-27 Self-Training Elicits Concise Reasoning in Large Language Models Tergel Munkhbat et.al. 2502.20122 null
2025-02-27 LongRoPE2: Near-Lossless LLM Context Window Scaling Ning Shang et.al. 2502.20082 null
2025-02-27 Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents Haochen Sun et.al. 2502.20073 null
2025-02-27 A Generative Model Enhanced Multi-Agent Reinforcement Learning Method for Electric Vehicle Charging Navigation Tianyang Qi et.al. 2502.20068 null
2025-02-27 Polish-ASTE: Aspect-Sentiment Triplet Extraction Datasets for Polish Marta Lango et.al. 2502.20046 null
2025-02-27 3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds Hengshuo Chu et.al. 2502.20041 null
2025-02-27 AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs Xuyang Wei et.al. 2502.20035 null
2025-02-27 Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models Huazheng Wang et.al. 2502.19982 null
2025-02-27 The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs Tanja Baeumel et.al. 2502.19981 null
2025-02-27 Can Large Language Models Unveil the Mysteries? An Exploration of Their Ability to Unlock Information in Complex Scenarios Chao Wang et.al. 2502.19973 null
2025-02-27 Deterministic or probabilistic? The psychology of LLMs as random number generators Javier Coronado-Blázquez et.al. 2502.19965 null
2025-02-27 SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model Xinghao Wang et.al. 2502.19960 link
2025-02-27 Collaborative Stance Detection via Small-Large Language Model Consistency Verification Yu Yan et.al. 2502.19954 null
2025-02-27 GeoEdit: Geometric Knowledge Editing for Large Language Models Yujie Feng et.al. 2502.19953 null
2025-02-27 Algebraic Machine Learning: Learning as computing an algebraic decomposition of a task Fernando Martin-Maroto et.al. 2502.19944 null
2025-02-27 Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation Xiang Geng et.al. 2502.19941 null
2025-02-27 Playing Pokémon Red via Deep Reinforcement Learning Marco Pleines et.al. 2502.19920 null
2025-02-27 Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models Yuan Sui et.al. 2502.19918 null
2025-02-27 Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents Zhenyu Liu et.al. 2502.19917 null
2025-02-27 LLM-driven Effective Knowledge Tracing by Integrating Dual-channel Difficulty Jiahui Cen et.al. 2502.19915 null
2025-02-27 SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks Nikolay Blagoev et.al. 2502.19913 null
2025-02-27 Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation Qianxi He et.al. 2502.19907 null
2025-02-27 Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy Zaijing Li et.al. 2502.19902 null
2025-02-27 GenPC: Zero-shot Point Cloud Completion via 3D Generative Priors An Li et.al. 2502.19896 null
2025-02-27 Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models Sibo Yi et.al. 2502.19883 null
2025-02-27 Towards Multimodal Large-Language Models for Parent-Child Interaction: A Focus on Joint Attention Weiyan Shi et.al. 2502.19877 null
2025-02-27 MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge Yuntao Du et.al. 2502.19870 link
2025-02-27 MIND: Towards Immersive Psychological Healing with Multi-agent Inner Dialogue Yujia Chen et.al. 2502.19860 null
2025-02-27 ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments Hojae Han et.al. 2502.19852 null
2025-02-27 One-for-More: Continual Diffusion Model for Anomaly Detection Xiaofan Li et.al. 2502.19848 null
2025-02-27 ProAPO: Progressively Automatic Prompt Optimization for Visual Classification Xiangyan Qu et.al. 2502.19844 null
2025-02-27 Shared Stochastic Gaussian Process Latent Variable Models: A Multi-modal Generative Model for Quasar Spectra Vidhi Lalchand et.al. 2502.19824 null
2025-02-27 Foot-In-The-Door: A Multi-turn Jailbreak for LLMs Zixuan Weng et.al. 2502.19820 null
2025-02-27 Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts Shulai Zhang et.al. 2502.19811 null
2025-02-27 Implicit Search via Discrete Diffusion: A Study on Chess Jiacheng Ye et.al. 2502.19805 null
2025-02-27 Developmental Support Approach to AI's Autonomous Growth: Toward the Realization of a Mutually Beneficial Stage Through Experiential Learning Taichiro Endo et.al. 2502.19798 null
2025-02-27 ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model Chuanliu Fan et.al. 2502.19794 null
2025-02-27 Mixtera: A Data Plane for Foundation Model Training Maximilian Böther et.al. 2502.19790 null
2025-02-27 Advancements in Natural Language Processing for Automatic Text Summarization Nevidu Jayatilleke et.al. 2502.19773 null
2025-02-27 Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models Heeseung Kim et.al. 2502.19759 null
2025-02-27 PolyPrompt: Automating Knowledge Extraction from Multilingual Language Models with Dynamic Prompt Generation Nathan Roll et.al. 2502.19756 null
2025-02-27 Beneath the Surface: How Large Language Models Reflect Hidden Bias Jinhao Pan et.al. 2502.19749 null
2025-02-27 HaLoRA: Hardware-aware Low-Rank Adaptation for Large Language Models Based on Hybrid Compute-in-Memory Architecture Taiqiang Wu et.al. 2502.19747 null
2025-02-27 R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning Minggui He et.al. 2502.19735 null
2025-02-27 Preference Learning Unlocks LLMs' Psycho-Counseling Skills Mian Zhang et.al. 2502.19731 null
2025-02-27 Do Expressions Change Decisions? Exploring the Impact of AI's Explanation Tone on Decision-Making Ayano Okoso et.al. 2502.19730 null
2025-02-27 Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training Toan Tran et.al. 2502.19726 null
2025-02-27 Few-Shot Multilingual Open-Domain QA from 5 Examples Fan Jiang et.al. 2502.19722 null
2025-02-27 Sensing and Steering Stereotypes: Extracting and Applying Gender Representation Vectors in LLMs Hannah Cyberey et.al. 2502.19721 null
2025-02-27 Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation Manveer Singh Tamber et.al. 2502.19712 null
2025-02-27 AoECR: AI-ization of Elderly Care Robot Linkun Zhou et.al. 2502.19706 null
2025-02-27 You Only Click Once: Single Point Weakly Supervised 3D Instance Segmentation for Autonomous Driving Guangfeng Jiang et.al. 2502.19698 null
2025-02-27 M-LLM Based Video Frame Selection for Efficient Video Understanding Kai Hu et.al. 2502.19680 null
2025-02-27 Old Experience Helps: Leveraging Survey Methodology to Improve AI Text Annotation Reliability in Social Sciences Linzhuo Li et.al. 2502.19679 null
2025-02-27 Improving Adversarial Transferability in MLLMs via Dynamic Vision-Language Alignment Attack Chenhe Gu et.al. 2502.19672 null
2025-02-27 SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning Mingsheng Cai et.al. 2502.19668 null
2025-02-27 Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models Jan Wehner et.al. 2502.19649 null
2025-02-27 cMIM: A Contrastive Mutual Information Framework for Unified Generative and Discriminative Representation Learning Micha Livne et.al. 2502.19642 null
2025-02-26 Agentic Mixture-of-Workflows for Multi-Modal Chemical Search Tiffany J. Callahan et.al. 2502.19629 null
2025-02-26 Treatment Non-Adherence Bias in Clinical Machine Learning: A Real-World Study on Hypertension Medication Zhongyuan Liang et.al. 2502.19625 null
2025-02-26 Norm Growth and Stability Challenges in Localized Sequential Knowledge Editing Akshat Gupta et.al. 2502.19416 null
2025-02-26 Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs Dayu Yang et.al. 2502.19411 null
2025-02-26 Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices Xinru Wang et.al. 2502.19410 null
2025-02-26 ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models Danae Sánchez Villegas et.al. 2502.19409 null
2025-02-26 Learning Code-Edit Embedding to Model Student Debugging Behavior Hasnain Heickal et.al. 2502.19407 null
2025-02-26 General Reasoning Requires Learning to Reason from the Get-go Seungwook Han et.al. 2502.19402 null
2025-02-26 TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding Max Ku et.al. 2502.19400 null
2025-02-26 Multi-modal Contrastive Learning for Tumor-specific Missing Modality Synthesis Minjoo Lim et.al. 2502.19390 null
2025-02-26 LiDAR Registration with Visual Foundation Models Niclas Vödisch et.al. 2502.19374 null
2025-02-26 Deep Learning For Time Series Analysis With Application On Human Motion Ali Ismail-Fawaz et.al. 2502.19364 null
2025-02-26 DataMan: Data Manager for Pre-training Large Language Models Ru Peng et.al. 2502.19363 null
2025-02-26 Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Yancheng He et.al. 2502.19361 null
2025-02-26 Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets Tohida Rehman et.al. 2502.19339 null
2025-02-26 Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems Hao Peng et.al. 2502.19328 null
2025-02-26 Shh, don't say that! Domain Certification in LLMs Cornelius Emde et.al. 2502.19320 null
2025-02-26 Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond Qizhou Wang et.al. 2502.19301 null
2025-02-26 Agent-centric Information Access Evangelos Kanoulas et.al. 2502.19298 null
2025-02-26 Complex LLM Planning via Automated Heuristics Discovery Hongyi Ling et.al. 2502.19295 null
2025-02-26 Efficient Federated Search for Retrieval-Augmented Generation Rachid Guerraoui et.al. 2502.19280 null
2025-02-26 ArtInsight: Enabling AI-Powered Artwork Engagement for Mixed Visual-Ability Families Arnavi Chheda-Kothary et.al. 2502.19263 null
2025-02-26 AI-Powered Bayesian Inference Veronika Ročková et.al. 2502.19231 null
2025-02-26 Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time Jiazheng Li et.al. 2502.19230 null
2025-02-26 A Lightweight and Extensible Cell Segmentation and Classification Model for Whole Slide Images Nikita Shvetsov et.al. 2502.19217 null
2025-02-26 A Hybrid Transformer Architecture with a Quantized Self-Attention Mechanism Applied to Molecular Generation Anthony M. Smaldone et.al. 2502.19214 null
2025-02-26 Negation-Induced Forgetting in LLMs Francesca Capuano et.al. 2502.19211 null
2025-02-26 Bi'an: A Bilingual Benchmark and Model for Hallucination Detection in Retrieval-Augmented Generation Zhouyu Jiang et.al. 2502.19209 null
2025-02-26 Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms Jinyu Cai et.al. 2502.19193 null
2025-02-26 BIG-Bench Extra Hard Mehran Kazemi et.al. 2502.19187 null
2025-02-26 INFO-SEDD: Continuous Time Markov Chains as Scalable Information Metrics Estimators Alberto Foresti et.al. 2502.19183 null
2025-02-26 UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering Langming Liu et.al. 2502.19178 null
2025-02-26 MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis Daniel Rose et.al. 2502.19175 null
2025-02-26 A Model-Centric Review of Deep Learning for Protein Design Gregory W. Kyro et.al. 2502.19173 null
2025-02-26 CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation Kaiwen Yan et.al. 2502.19166 null
2025-02-26 TestNUC: Enhancing Test-Time Computing Approaches through Neighboring Unlabeled Data Consistency Henry Peng Zou et.al. 2502.19163 null
2025-02-26 Detecting Linguistic Indicators for Stereotype Assessment with Large Language Models Rebekka Görge et.al. 2502.19160 null
2025-02-26 A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs Xuan Ding et.al. 2502.19159 null
2025-02-26 When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning Yijiang River Dong et.al. 2502.19158 null
2025-02-26 Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval Jiarong Wu et.al. 2502.19149 null
2025-02-26 Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs Zhaowei Zhang et.al. 2502.19148 null
2025-02-26 Identification Under the Semantic Effective Secrecy Constraint Abdalla Ibrahim et.al. 2502.19142 null
2025-02-26 A Temporal Planning Framework for Multi-Agent Systems via LLM-Aided Knowledge Base Management Enrico Saccon et.al. 2502.19135 null
2025-02-26 Self-Memory Alignment: Mitigating Factual Hallucinations with Generalized Improvement Siyuan Zhang et.al. 2502.19127 null
2025-02-26 A Survey on Foundation-Model-Based Industrial Defect Detection Tianle Yang et.al. 2502.19106 null
2025-02-26 Evaluating Gender Bias in German Machine Translation Michelle Kappl et.al. 2502.19104 null
2025-02-26 LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm Siwei Wu et.al. 2502.19103 null
2025-02-26 Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation Humza Sami et.al. 2502.19091 link
2025-02-26 EndoMamba: An Efficient Foundation Model for Endoscopic Videos Qingyao Tian et.al. 2502.19090 null
2025-02-26 Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs Yiheng Yang et.al. 2502.19078 null
2025-02-26 IndicEval-XL: Bridging Linguistic Diversity in Code Generation Across Indic Languages Ujjwal Singh et.al. 2502.19067 null
2025-02-26 Can Large Language Models Outperform Non-Experts in Poetry Evaluation? A Comparative Study Using the Consensual Assessment Technique Piotr Sawicki et.al. 2502.19064 null
2025-02-26 MathClean: A Benchmark for Synthetic Mathematical Data Cleaning Hao Liang et.al. 2502.19058 null
2025-02-26 Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs Shiyu Xiang et.al. 2502.19041 null
2025-02-26 FungalZSL: Zero-Shot Fungal Classification with Image Captioning Using a Synthetic Data Approach Anju Rani et.al. 2502.19038 null
2025-02-26 InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model Fengbin Guan et.al. 2502.19026 null
2025-02-26 Binary Neural Networks for Large Language Model: A Survey Liangdong Liu et.al. 2502.19008 null
2025-02-26 The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training Jinbo Wang et.al. 2502.19002 null
2025-02-26 MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering Teng Lin et.al. 2502.18993 null
2025-02-26 OntologyRAG: Better and Faster Biomedical Code Mapping with Retrieval-Augmented Generation (RAG) Leveraging Ontology Knowledge Graphs and Large Language Models Hui Feng et.al. 2502.18992 null
2025-02-26 GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation Jie He et.al. 2502.18990 null
2025-02-26 PEToolLLM: Towards Personalized Tool Learning in Large Language Models Qiancheng Xu et.al. 2502.18980 null
2025-02-26 Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning Hongyi Cal et.al. 2502.18978 null
2025-02-26 (Mis)Fitting: A Survey of Scaling Laws Margaret Li et.al. 2502.18969 null
2025-02-26 Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles Kuang Wang et.al. 2502.18968 link
2025-02-26 OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment Jiaxin Deng et.al. 2502.18965 null
2025-02-26 DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model Lei Zhao et.al. 2502.18952 null
2025-02-26 Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models Yu He et.al. 2502.18943 null
2025-02-26 JailBench: A Comprehensive Chinese Security Assessment Benchmark for Large Language Models Shuyi Liu et.al. 2502.18935 null
2025-02-26 Talking like Piping and Instrumentation Diagrams (P&IDs) Achmad Anggawirya Alimin et.al. 2502.18928 null
2025-02-26 ClassInvGen: Class Invariant Synthesis using Large Language Models Chuyue Sun et.al. 2502.18917 null
2025-02-26 END: Early Noise Dropping for Efficient and Effective Context Denoising Hongye Jin et.al. 2502.18915 null
2025-02-26 CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning Ping Zhang et.al. 2502.18910 null
2025-02-26 An Empirical Study on Commit Message Generation using LLMs via In-Context Learning Yifan Wu et.al. 2502.18904 null
2025-02-26 From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens Tong Wu et.al. 2502.18890 null
2025-02-26 Letters from Future Self: Augmenting the Letter-Exchange Exercise with LLM-based Agents to Enhance Young Adults' Career Exploration Hayeon Jeon et.al. 2502.18881 null
2025-02-26 Learning to Generate Structured Output with Schema Reinforcement Learning Yaxi Lu et.al. 2502.18878 null
2025-02-26 Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework Kaishuai Xu et.al. 2502.18874 null
2025-02-26 Multi-LLM Collaborative Search for Complex Problem Solving Sen Yang et.al. 2502.18873 null
2025-02-26 A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops Shi Fu et.al. 2502.18865 null
2025-02-26 Sherlock: Towards Multi-scene Video Abnormal Event Extraction and Localization via a Global-local Spatial-sensitive LLM Junxiao Ma et.al. 2502.18863 null
2025-02-26 A Causal Lens for Evaluating Faithfulness Metrics Kerem Zaman et.al. 2502.18848 null
2025-02-26 Sliding Window Attention Training for Efficient Large Language Models Zichuan Fu et.al. 2502.18845 null
2025-02-26 Evidence-Driven Marker Extraction for Social Media Suicide Risk Detection Carter Adams et.al. 2502.18823 null
2025-02-26 Data-Efficient Multi-Agent Spatial Planning with LLMs Huangyuan Su et.al. 2502.18822 null
2025-02-26 CAMEx: Curvature-aware Merging of Experts Dung V. Nguyen et.al. 2502.18821 null
2025-02-26 Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models Shuliang Liu et.al. 2502.18817 null
2025-02-26 Holistic Audit Dataset Generation for LLM Unlearning via Knowledge Graph Traversal and Redundancy Removal Weipeng Jiang et.al. 2502.18810 null
2025-02-26 Optimal Stochastic Trace Estimation in Generative Modeling Xinyang Liu et.al. 2502.18808 null
2025-02-26 SolEval: Benchmarking Large Language Models for Repository-level Solidity Code Generation Zhiyuan Peng et.al. 2502.18793 null
2025-02-26 Active Few-Shot Learning for Text Classification Saeed Ahmadnia et.al. 2502.18782 null
2025-02-26 Towards Optimal Multi-draft Speculative Decoding Zhengmian Hu et.al. 2502.18779 null
2025-02-26 M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance Qingpei Guo et.al. 2502.18778 null
2025-02-26 Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance Xueqing Peng et.al. 2502.18772 null
2025-02-26 Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation Yuxiang Wang et.al. 2502.18771 link
2025-02-26 Reward Shaping to Mitigate Reward Hacking in RLHF Jiayi Fu et.al. 2502.18770 null
2025-02-26 CommGPT: A Graph and Retrieval-Augmented Multimodal Communication Foundation Model Feibo Jiang et.al. 2502.18763 null
2025-02-26 Training Large Recommendation Models via Graph-Language Token Alignment Mingdai Yang et.al. 2502.18757 null
2025-02-26 M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type Weiming Hu et.al. 2502.18755 null
2025-02-26 AgentSociety Challenge: Designing LLM Agents for User Modeling and Recommendation on Web Platforms Yuwei Yan et.al. 2502.18754 null
2025-02-26 Spectral-Enhanced Transformers: Leveraging Large-Scale Pretrained Models for Hyperspectral Object Tracking Shaheer Mohamed et.al. 2502.18748 null
2025-02-26 Automatic Prompt Optimization via Heuristic Search: A Survey Wendi Cui et.al. 2502.18746 null
2025-02-25 DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers Xueguang Ma et.al. 2502.18460 null
2025-02-25 LLM-Based Design Pattern Detection Christian Schindler et.al. 2502.18458 null
2025-02-25 FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response Mollie Shichman et.al. 2502.18452 null
2025-02-25 SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Yuxiang Wei et.al. 2502.18449 null
2025-02-25 MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning Chanwoo Park et.al. 2502.18439 null
2025-02-25 TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning Frederikus Hudi et.al. 2502.18431 null
2025-02-25 OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Xiangyu Zhao et.al. 2502.18411 null
2025-02-25 Enhancing DNA Foundation Models to Address Masking Inefficiencies Monireh Safari et.al. 2502.18405 null
2025-02-25 Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods Nicola Cecere et.al. 2502.18389 null
2025-02-25 How Far are LLMs from Real Search? A Comprehensive Study on Efficiency, Completeness, and Inherent Capabilities Minhua Lin et.al. 2502.18387 null
2025-02-25 MindMem: Multimodal for Predicting Advertisement Memorability Using LLMs and Deep Learning Sepehr Asgarian et.al. 2502.18371 null
2025-02-25 Sparse Bayesian Generative Modeling for Joint Parameter and Channel Estimation Benedikt Böck et.al. 2502.18369 null
2025-02-25 ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Yifan Pu et.al. 2502.18364 null
2025-02-25 Responsible AI Agents Deven R. Desai et.al. 2502.18359 null
2025-02-25 Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation Jessica He et.al. 2502.18357 null
2025-02-25 BRIDO: Bringing Democratic Order to Abstractive Summarization Junhyun Lee et.al. 2502.18342 null
2025-02-25 Mapping of Subjective Accounts into Interpreted Clusters (MOSAIC): Topic Modelling and LLM applied to Stroboscopic Phenomenology Romy Beauté et.al. 2502.18318 null
2025-02-25 GCDance: Genre-Controlled 3D Full Body Dance Generation Driven By Music Xinran Liu et.al. 2502.18309 null
2025-02-25 RefuteBench 2.0 -- Agentic Benchmark for Dynamic Evaluation of LLM Responses to Refutation Instruction Jianhao Yan et.al. 2502.18308 null
2025-02-25 LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation Pengzhi Li et.al. 2502.18302 null
2025-02-25 Bayesian Computation in Deep Learning Wenlong Chen et.al. 2502.18300 null
2025-02-25 DeepCircuitX: A Comprehensive Repository-Level Dataset for RTL Code Understanding, Generation, and PPA Analysis Zeju Li et.al. 2502.18297 null
2025-02-25 AMPO: Active Multi-Preference Optimization Taneesh Gupta et.al. 2502.18293 null
2025-02-25 Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases Shanshan Xu et.al. 2502.18282 null
2025-02-25 Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support Guoxin Wang et.al. 2502.18274 null
2025-02-25 Imperfect Knowledge Management (IKM) in GEFRED (GENeralized model for Fuzzy RElational Databases) Leoncio Jimenez et.al. 2502.18255 null
2025-02-25 Iterative Counterfactual Data Augmentation Mitchell Plyler et.al. 2502.18249 null
2025-02-25 Unveiling and Causalizing CoT: A Causal Pespective Jiarun Fu et.al. 2502.18239 null
2025-02-25 Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints Mihaela Cătălina Stoian et.al. 2502.18237 null
2025-02-25 Debt Collection Negotiations with Large Language Models: An Evaluation System and Optimizing Decision Making with Multi-Agent Xiaofeng Wang et.al. 2502.18228 null
2025-02-25 From ChatGPT to DeepSeek: Can LLMs Simulate Humanity? Qian Wang et.al. 2502.18210 null
2025-02-25 LAG: LLM agents for Leaderboard Auto Generation on Demanding Jian Wu et.al. 2502.18209 null
2025-02-25 Grandes modelos de lenguaje: de la predicción de palabras a la comprensión? Carlos Gómez-Rodríguez et.al. 2502.18205 null
2025-02-25 Intersubjective Model of AI-mediated Communication: Augmenting Human-Human Text Chat through LLM-based Adaptive Agent Pair Shutaro Aoyama et.al. 2502.18201 null
2025-02-25 Task-Agnostic Semantic Communication with Multimodal Foundation Models Jiangjing Hu et.al. 2502.18200 null
2025-02-25 Agnostic calculation of atomic free energies with the descriptor density of states Thomas D Swinburne et.al. 2502.18191 null
2025-02-25 ChatMotion: A Multimodal Multi-Agent for Human Motion Analysis Li Lei et.al. 2502.18180 null
2025-02-25 Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs Gaye Colakoglu et.al. 2502.18179 null
2025-02-25 CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification Mingkun Zhang et.al. 2502.18176 null
2025-02-25 SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models Zhang Yuxuan et.al. 2502.18168 null
2025-02-25 Can LLMs Explain Themselves Counterfactually? Zahra Dehghanighobadi et.al. 2502.18156 null
2025-02-25 Carbon and Silicon, Coexist or Compete? A Survey on Human-AI Interactions in Agent-based Modeling and Simulation Ziyue Lin et.al. 2502.18145 null
2025-02-25 LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers Zhuocheng Zhang et.al. 2502.18139 null
2025-02-25 Large Language Model Driven Agents for Simulating Echo Chamber Formation Chenhao Gu et.al. 2502.18138 null
2025-02-25 Inverse Materials Design by Large Language Model-Assisted Generative Framework Yun Hao et.al. 2502.18127 null
2025-02-25 HyperG: Hypergraph-Enhanced LLMs for Structured Knowledge Sirui Huang et.al. 2502.18125 null
2025-02-25 Bayesian Optimization for Controlled Image Editing via LLMs Chengkun Cai et.al. 2502.18116 null
2025-02-25 PromptMID: Modal Invariant Descriptors Based on Diffusion and Vision Foundation Models for Optical-SAR Image Matching Han Nie et.al. 2502.18104 null
2025-02-25 Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models Cao Yuxuan et.al. 2502.18101 link
2025-02-25 Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning Wenkai Yang et.al. 2502.18080 null
2025-02-25 Examining the Threat Landscape: Foundation Models and Model Stealing Ankita Raj et.al. 2502.18077 null
2025-02-25 MRBTP: Efficient Multi-Robot Behavior Tree Planning and Collaboration Yishuai Cai et.al. 2502.18072 null
2025-02-25 Golden Ratio Mixing of Real and Synthetic Data for Stabilizing Generative Model Training Hengzhi He et.al. 2502.18049 null
2025-02-25 AutoCas: Autoregressive Cascade Predictor in Social Networks via Large Language Models Yuhao Zheng et.al. 2502.18040 null
2025-02-25 Harnessing Multiple Large Language Models: A Survey on LLM Ensemble Zhijun Chen et.al. 2502.18036 null
2025-02-25 Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference Zhuo Chen et.al. 2502.18023 null
2025-02-25 AfroXLMR-Comet: Multilingual Knowledge Distillation with Attention Matching for Low-Resource languages Joshua Sakthivel Raju et.al. 2502.18020 null
2025-02-25 NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms Yashan Wang et.al. 2502.18008 null
2025-02-25 Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning Xinghao Chen et.al. 2502.18001 null
2025-02-25 Model-Free Adversarial Purification via Coarse-To-Fine Tensor Network Representation Guang Lin et.al. 2502.17972 null
2025-02-25 LLM Knows Geometry Better than Algebra: Numerical Understanding of LLM-Based Agents in A Trading Arena Tianmi Ma et.al. 2502.17967 null
2025-02-25 Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments Patomporn Payoungkhamdee et.al. 2502.17956 null
2025-02-25 DeepSeek-R1 Outperforms Gemini 2.0 Pro, OpenAI o1, and o3-mini in Bilingual Complex Ophthalmology Reasoning Pusheng Xu et.al. 2502.17947 null
2025-02-25 Assessing Large Language Models in Agentic Multilingual National Bias Qianying Liu et.al. 2502.17945 null
2025-02-25 CaseGen: A Benchmark for Multi-Stage Legal Case Documents Generation Haitao Li et.al. 2502.17943 null
2025-02-25 Advantage-Guided Distillation for Preference Alignment in Small Language Models Shiping Gao et.al. 2502.17927 null
2025-02-25 LeanProgress: Guiding Search for Neural Theorem Proving via Proof Progress Prediction Suozhi Huang et.al. 2502.17925 null
2025-02-25 FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models Hongzhan Lin et.al. 2502.17924 null
2025-02-25 Towards Sustainable Web Agents: A Plea for Transparency and Dedicated Metrics for Energy Consumption Lars Krupp et.al. 2502.17903 null
2025-02-25 Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs Che Liu et.al. 2502.17900 null
2025-02-25 Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation Tong Li et.al. 2502.17899 null
2025-02-25 FetchBot: Object Fetching in Cluttered Shelves via Zero-Shot Sim2Real Weiheng Liu et.al. 2502.17894 null
2025-02-25 RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts Mingyan Wu et.al. 2502.17888 null
2025-02-25 Science Across Languages: Assessing LLM Multilingual Translation of Scientific Papers Hannah Calzi Kleidermacher et.al. 2502.17882 null
2025-02-25 EEGM2: An Efficient Mamba-2-Based Self-Supervised Framework for Long-Sequence EEG Modeling Jiazhen Hong et.al. 2502.17873 null
2025-02-25 ASurvey: Spatiotemporal Consistency in Video Generation Zhiyu Yin et.al. 2502.17863 null
2025-02-25 HRR: Hierarchical Retrospection Refinement for Generated Image Detection Peipei Yuan et.al. 2502.17862 null
2025-02-25 LR$^{2}$Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems Jianghao Chen et.al. 2502.17848 null
2025-02-25 Quantifying interdisciplinary synergy in higher STEM education Gahyoun Gim et.al. 2502.17841 null
2025-02-25 A Combinatorial Identities Benchmark for Theorem Proving via Automated Theorem Generation Beibei Xiong et.al. 2502.17840 null
2025-02-25 TagGAN: A Generative Model for Data Tagging Muhammad Nawaz et.al. 2502.17836 null
2025-02-25 MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks Hyeonjeong Ha et.al. 2502.17832 null
2025-02-25 A General Framework to Enhance Fine-tuning-based LLM Unlearning Jie Ren et.al. 2502.17823 null
2025-02-25 An Overview of Large Language Models for Statisticians Wenlong Ji et.al. 2502.17814 null
2025-02-25 Can Multimodal LLMs Perform Time Series Anomaly Detection? Xiongxiao Xu et.al. 2502.17812 null
2025-02-25 URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models Ruiqi Yan et.al. 2502.17810 null
2025-02-25 DocPuzzle: A Process-Aware Benchmark for Evaluating Realistic Long-Context Reasoning Capabilities Tianyi Zhuang et.al. 2502.17807 null
2025-02-25 Your Language Model May Think Too Rigidly: Achieving Reasoning Consistency with Symmetry-Enhanced Training Yihang Yao et.al. 2502.17800 null
2025-02-25 AIR: Complex Instruction Generation via Automatic Iterative Refinement Wei Liu et.al. 2502.17787 null
2025-02-25 Exploring the Potential of Large Language Models for Estimating the Reading Comprehension Question Difficulty Yoshee Jain et.al. 2502.17785 null
2025-02-25 Tip of the Tongue Query Elicitation for Simulated Evaluation Yifan He et.al. 2502.17776 null
2025-02-25 FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks Tanawan Premsri et.al. 2502.17775 null
2025-02-25 Uncertainty Quantification for LLM-Based Survey Simulations Chengpiao Huang et.al. 2502.17773 null
2025-02-25 DeepSeek vs. ChatGPT: A Comparative Study for Scientific Computing and Scientific Machine Learning Tasks Qile Jiang et.al. 2502.17764 null
2025-02-25 Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM Yuqing Wang et.al. 2502.17763 null
2025-02-25 Detection of LLM-Paraphrased Code and Identification of the Responsible LLM Using Coding Style Features Shinwoo Park et.al. 2502.17749 null
2025-02-24 LLM Inference Acceleration via Efficient Operation Fusion Mahsa Salmani et.al. 2502.17728 null
2025-02-24 Can Score-Based Generative Modeling Effectively Handle Medical Image Classification? Sushmita Sarker et.al. 2502.17727 null
2025-02-24 Spontaneous Giving and Calculated Greed in Language Models Yuxuan Li et.al. 2502.17720 null
2025-02-24 Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures Akhila Yerukola et.al. 2502.17710 null
2025-02-24 Fractal Generative Models Tianhong Li et.al. 2502.17437 link
2025-02-24 Introducing Visual Perception Token into Multimodal Large Language Model Runpeng Yu et.al. 2502.17425 link
2025-02-24 MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs Jiarui Zhang et.al. 2502.17422 link
2025-02-24 LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification Penghui Yang et.al. 2502.17421 link
2025-02-24 The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence Tom Wollschläger et.al. 2502.17420 null
2025-02-24 From System 1 to System 2: A Survey of Reasoning Large Language Models Zhong-Zhi Li et.al. 2502.17419 link
2025-02-24 Reasoning with Latent Thoughts: On the Power of Looped Transformers Nikunj Saunshi et.al. 2502.17416 null
2025-02-24 COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs Liming Liu et.al. 2502.17410 link
2025-02-24 Large Language Models are Powerful EHR Encoders Stefan Hegselmann et.al. 2502.17403 null
2025-02-24 What is a Good Question? Utility Estimation with LLM-based Simulations Dong-Ho Lee et.al. 2502.17383 null
2025-02-24 KV-Edit: Training-Free Image Editing for Precise Background Preservation Tianrui Zhu et.al. 2502.17363 link
2025-02-24 A Closer Look at TabPFN v2: Strength, Limitation, and Extension Han-Jia Ye et.al. 2502.17361 null
2025-02-24 RELICT: A Replica Detection Framework for Medical Image Generation Orhun Utku Aydin et.al. 2502.17360 null
2025-02-24 On Relation-Specific Neurons in Large Language Models Yihong Liu et.al. 2502.17355 link
2025-02-24 How Scientists Use Large Language Models to Program Gabrielle O'Brien et.al. 2502.17348 null
2025-02-24 Time series forecasting based on optimized LLM for fault prediction in distribution power grid insulators João Pedro Matos-Carvalho et.al. 2502.17341 null
2025-02-24 HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization Zhenghao Liu et.al. 2502.17315 link
2025-02-24 Delta Decompression for MoE-based LLMs Compression Hao Gu et.al. 2502.17298 link
2025-02-24 Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts Zhenghao Liu et.al. 2502.17297 null
2025-02-24 Integrating protein sequence embeddings with structure via graph-based deep learning for the prediction of single-residue properties Kevin Michalewicz et.al. 2502.17294 null
2025-02-24 Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing Yi-Kai Zhang et.al. 2502.17282 link
2025-02-24 MonoTODia: Translating Monologue Requests to Task-Oriented Dialogues Sebastian Steindl et.al. 2502.17268 null
2025-02-24 Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective Chengyin Xu et.al. 2502.17262 null
2025-02-24 Detecting Benchmark Contamination Through Watermarking Tom Sander et.al. 2502.17259 null
2025-02-24 REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective Simon Geisler et.al. 2502.17254 null
2025-02-24 Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search Boyan Li et.al. 2502.17248 null
2025-02-24 Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction Tianpeng Li et.al. 2502.17239 link
2025-02-24 Making LLMs Reason? The Intermediate Language Problem in Neurosymbolic Approaches Alexander Beiser et.al. 2502.17216 null
2025-02-24 CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought Boxuan Zhang et.al. 2502.17214 link
2025-02-24 Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following Jie Zeng et.al. 2502.17204 link
2025-02-24 IGDA: Interactive Graph Discovery through Large Language Model Agents Alex Havrilla et.al. 2502.17189 null
2025-02-24 Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks Andrei Chernov et.al. 2502.17187 null
2025-02-24 Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric Yuming Yang et.al. 2502.17184 link
2025-02-24 Unsupervised Accelerated MRI Reconstruction via Ground-Truth-Free Flow Matching Xinzhe Luo et.al. 2502.17174 null
2025-02-24 Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch Xueru Wen et.al. 2502.17173 null
2025-02-24 Logic Haystacks: Probing LLMs Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding) Damien Sileo et.al. 2502.17169 null
2025-02-24 JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning Huanghai Liu et.al. 2502.17166 link
2025-02-24 MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation María Andrea Cruz Blandón et.al. 2502.17163 null
2025-02-24 Real-time Monitoring of Economic Shocks using Company Websites Michael Koenig et.al. 2502.17161 null
2025-02-24 A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis Yuli Wu et.al. 2502.17160 null
2025-02-24 Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation Fanhu Zeng et.al. 2502.17159 null
2025-02-24 CodeSwift: Accelerating LLM Inference for Efficient Code Generation Qianhui Zhao et.al. 2502.17139 null
2025-02-24 Evaluating the Effectiveness of Large Language Models in Automated News Article Summarization Lionel Richy Panlap Houamegni et.al. 2502.17136 null
2025-02-24 Applications of Large Models in Medicine YunHe Su et.al. 2502.17132 null
2025-02-24 Thus Spake Long-Context Large Language Model Xiaoran Liu et.al. 2502.17129 null
2025-02-24 Adversarial Training for Defense Against Label Poisoning Attacks Melis Ilayda Bal et.al. 2502.17121 link
2025-02-24 Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions Zhong Li et.al. 2502.17119 link
2025-02-24 SFLD: Reducing the content bias for AI-generated Image Detection Seoyeon Gye et.al. 2502.17105 null
2025-02-24 Generative Models in Decision Making: A Survey Yinchuan Li et.al. 2502.17100 null
2025-02-24 Improved Diffusion-based Generative Model with Better Adversarial Robustness Zekun Wang et.al. 2502.17099 link
2025-02-24 Conditional Diffusion-Flow models for generating 3D cosmic density fields: applications to f(R) cosmologies Julieth Katherine Riveros et.al. 2502.17087 null
2025-02-24 Automatically Evaluating the Paper Reviewing Capability of Large Language Models Hyungyu Shin et.al. 2502.17086 null
2025-02-24 Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence Bolin Chen et.al. 2502.17085 null
2025-02-24 Systematic Weight Evaluation for Pruning Large Language Models: Enhancing Performance and Sustainability Ashhadul Islam et.al. 2502.17071 null
2025-02-24 LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences Sijia Yao et.al. 2502.17057 link
2025-02-24 PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance Haoran Li et.al. 2502.17041 null
2025-02-24 Evolution 6.0: Evolving Robotic Capabilities Through Generative Design Muhammad Haris Khan et.al. 2502.17034 null
2025-02-24 Understanding the Uncertainty of LLM Explanations: A Perspective Based on Reasoning Topology Longchao Da et.al. 2502.17026 null
2025-02-24 Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization Zixuan Gong et.al. 2502.17024 null
2025-02-24 Quantifying Logical Consistency in Transformers via Query-Key Alignment Eduard Tulchinskii et.al. 2502.17017 null
2025-02-24 Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation Jaskaran Singh Walia et.al. 2502.17011 null
2025-02-24 Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators Shixin Zhao et.al. 2502.17006 null
2025-02-24 An Enhanced Large Language Model For Cross Modal Query Understanding System Using DL-KeyBERT Based CAZSSCL-MPGPT Shreya Singh et.al. 2502.17000 null
2025-02-24 Active Learning for Conditional Inverse Design with Crystal Generation and Foundation Atomic Models Zhuoyuan Li et.al. 2502.16984 null
2025-02-24 LongSafety: Evaluating Long-Context Safety of Large Language Models Yida Lu et.al. 2502.16971 link
2025-02-24 Autoregressive Image Generation Guided by Chains of Thought Miaomiao Cai et.al. 2502.16965 null
2025-02-24 Make LLM Inference Affordable to Everyone: Augmenting GPU Memory with NDP-DIMM Lian Liu et.al. 2502.16963 null
2025-02-24 UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings Layba Fiaz et.al. 2502.16961 null
2025-02-24 Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance Chenghua Huang et.al. 2502.16944 null
2025-02-24 Reasoning Does Not Necessarily Improve Role-Playing Ability Xiachong Feng et.al. 2502.16940 null
2025-02-24 BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference Zewen Jin et.al. 2502.16927 null
2025-02-24 FilterLLM: Text-To-Distribution LLM for Billion-Scale Cold-Start Recommendation Ruochen Liu et.al. 2502.16924 null
2025-02-24 A Systematic Survey of Automatic Prompt Optimization Techniques Kiran Ramnath et.al. 2502.16923 null
2025-02-24 Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties Zhenglin Wang et.al. 2502.16922 null
2025-02-24 SS-MPC: A Sequence-Structured Multi-Party Conversation System Yoonjin Jang et.al. 2502.16920 null
2025-02-24 Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model Kang Fu et.al. 2502.16915 null
2025-02-24 SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models Kevin Miller et.al. 2502.16911 null
2025-02-24 AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models Qin Zhu et.al. 2502.16906 null
2025-02-24 GuidedBench: Equipping Jailbreak Evaluation with Guidelines Ruixuan Huang et.al. 2502.16903 null
2025-02-24 Culture-TRIP: Culturally-Aware Text-to-Image Generation with Iterative Prompt Refinment Suchae Jeong et.al. 2502.16902 null
2025-02-24 Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs Himanshu Beniwal et.al. 2502.16901 link
2025-02-24 Zero-shot Load Forecasting for Integrated Energy Systems: A Large Language Model-based Framework with Multi-task Learning Jiaheng Li et.al. 2502.16896 null
2025-02-24 Unlocking Scientific Concepts: How Effective Are LLM-Generated Analogies for Student Understanding and Classroom Practice? Zekai Shao et.al. 2502.16895 null
2025-02-24 Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Chenghao Fan et.al. 2502.16894 null
2025-02-24 Applying LLMs to Active Learning: Towards Cost-Efficient Cross-Task Text Classification without Manually Labeled Data Yejian Zhang et.al. 2502.16892 null
2025-02-24 Unveiling Institution-Specific Bias in Pathology Foundation Models: Detriments, Causes, and Potential Solutions Weiping Lin et.al. 2502.16889 null
2025-02-24 DBudgetKV: Dynamic Budget in KV Cache Compression for Ensuring Optimal Performance Xuanfan Ni et.al. 2502.16886 null
2025-02-24 CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter Yepeng Weng et.al. 2502.16880 null
2025-02-24 A Multi-LLM-Agent-Based Framework for Economic and Public Policy Analysis Yuzhi Hao et.al. 2502.16879 null
2025-02-24 Graphy'our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data Longbin Lai et.al. 2502.16868 null
2025-02-24 Leveraging Large Language Models for Effective and Explainable Multi-Agent Credit Assignment Kartik Nagpal et.al. 2502.16863 null
2025-02-24 LongAttn: Selecting Long-context Training Data via Token-level Attention Longyun Wu et.al. 2502.16860 null
2025-02-24 Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models Avinash Trivedi et.al. 2502.16857 null
2025-02-24 Improving LLM General Preference Alignment via Optimistic Online Mirror Descent Yuheng Zhang et.al. 2502.16852 null
2025-02-24 Exploring Causes and Mitigation of Hallucinations in Large Vision Language Models Yaqi Sun et.al. 2502.16842 null
2025-02-24 Fair Foundation Models for Medical Image Analysis: Challenges and Perspectives Dilermando Queiroz et.al. 2502.16841 null
2025-02-24 In-context learning of evolving data streams with tabular foundational models Afonso Lourenço et.al. 2502.16840 null
2025-02-24 "Actionable Help" in Crises: A Novel Dataset and Resource-Efficient Models for Identifying Request and Offer Social Media Posts Rabindra Lamsal et.al. 2502.16839 null
2025-02-24 REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction Omar Sharif et.al. 2502.16838 null
2025-02-24 Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization Yao Xiao et.al. 2502.16825 null
2025-02-21 ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval Guanqi Zhan et.al. 2502.15682 null
2025-02-21 Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training Jaydeep Borkar et.al. 2502.15680 null
2025-02-21 FLEKE: Federated Locate-then-Edit Knowledge Editing Zongkai Zhao et.al. 2502.15677 null
2025-02-21 AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind Zhining Zhang et.al. 2502.15676 null
2025-02-21 VaViM and VaVAM: Autonomous Driving through Video Generative Modeling Florent Bartoccioni et.al. 2502.15672 link
2025-02-21 Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing Shoumik Saha et.al. 2502.15666 null
2025-02-21 Machine-generated text detection prevents language model collapse George Drayson et.al. 2502.15654 null
2025-02-21 Empowering LLMs with Logical Reasoning: A Comprehensive Survey Fengxiang Cheng et.al. 2502.15652 null
2025-02-21 Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models Anirudh Sundar et.al. 2502.15639 null
2025-02-21 Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification Vasilii Feofanov et.al. 2502.15637 null
2025-02-21 The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer Marthe Ballon et.al. 2502.15631 null
2025-02-21 Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing Qi Le et.al. 2502.15618 null
2025-02-21 On the Robustness of Transformers against Context Hijacking for Linear Classification Tianle Li et.al. 2502.15609 null
2025-02-21 Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance Akos Nagy et.al. 2502.15604 null
2025-02-21 Do Multilingual LLMs Think In English? Lisa Schut et.al. 2502.15603 null
2025-02-21 WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents Xinhang Liu et.al. 2502.15601 null
2025-02-21 SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention Jiaqi Wu et.al. 2502.15594 null
2025-02-21 Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning Wenhao Zhu et.al. 2502.15592 null
2025-02-21 LightThinker: Thinking Step-by-Step Compression Jintian Zhang et.al. 2502.15589 null
2025-02-21 Chats-Grid: An Iterative Retrieval Q&A Optimization Scheme Leveraging Large Model and Retrieval Enhancement Generation in smart grid Yunfeng Li et.al. 2502.15583 null
2025-02-21 Fine-tuning foundation models of materials interatomic potentials with frozen transfer learning Mariia Radova et.al. 2502.15582 null
2025-02-21 Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders Xuansheng Wu et.al. 2502.15576 null
2025-02-21 DReSD: Dense Retrieval for Speculative Decoding Milan Gritta et.al. 2502.15572 null
2025-02-21 A Cautionary Tale About "Neutrally" Informative AI Tools Ahead of the 2025 Federal Elections in Germany Ina Dormuth et.al. 2502.15568 null
2025-02-21 PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning Pengcheng Huang et.al. 2502.15543 null
2025-02-21 Accurate and efficient machine learning interatomic potentials for finite temperature modeling of molecular crystals Flaviano Della Pia et.al. 2502.15530 null
2025-02-21 Scaling Sparse and Dense Retrieval in Decoder-Only LLMs Hansi Zeng et.al. 2502.15526 null
2025-02-21 Towards Swift Serverless LLM Cold Starts with ParaServe Chiheng Lou et.al. 2502.15524 null
2025-02-21 Activation Steering in Neural Theorem Provers Shashank Kirtania et.al. 2502.15507 null
2025-02-21 Construction and Evaluation of LLM-based agents for Semi-Autonomous penetration testing Masaya Kobayashi et.al. 2502.15506 null
2025-02-21 Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models Ya Wang et.al. 2502.15499 null
2025-02-21 Programmers Aren't Obsolete Yet: A Syllabus for Teaching CS Students to Responsibly Use Large Language Models for Code Generation Bruno Pereira Cipriano et.al. 2502.15493 null
2025-02-21 ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models Martina Miliani et.al. 2502.15487 null
2025-02-21 Enhancing RWKV-based Language Models for Long-Sequence Text Generation Xinghan Pan et.al. 2502.15485 null
2025-02-21 FaultGPT: Industrial Fault Diagnosis Question Answering System by Vision Language Models Jiao Chen et.al. 2502.15481 null
2025-02-21 PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System Yintao He et.al. 2502.15470 null
2025-02-21 Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation Wenxuan Wang et.al. 2502.15466 null
2025-02-21 Memory Helps, but Confabulation Misleads: Understanding Streaming Events in Videos with MLLMs Gengyuan Zhang et.al. 2502.15457 null
2025-02-21 R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning Jinda Liu et.al. 2502.15455 null
2025-02-21 A fast convergence algorithm based on binary integer programming for expert load balancing in MoE LLMs Yuan Sun et.al. 2502.15451 null
2025-02-21 When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models Weilan Wang et.al. 2502.15443 null
2025-02-21 On the Effectiveness of Large Language Models in Writing Alloy Formulas Yang Hong et.al. 2502.15441 null
2025-02-21 Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning Raghav Singhal et.al. 2502.15436 link
2025-02-21 Single-pass Detection of Jailbreaking Input in Large Language Models Leyla Naz Candogan et.al. 2502.15435 null
2025-02-21 Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation Yue Zhou et.al. 2502.15434 null
2025-02-21 Pub-Guard-LLM: Detecting Fraudulent Biomedical Articles with Reliable Explanations Lihu Chen et.al. 2502.15429 null
2025-02-21 Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs Giulio Zizzo et.al. 2502.15427 null
2025-02-21 Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking Yi-Ling Chung et.al. 2502.15419 null
2025-02-21 MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models Suraj Racha et.al. 2502.15418 null
2025-02-21 HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings Rasmus Aavang et.al. 2502.15411 null
2025-02-21 Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning Xuetao Ma et.al. 2502.15401 null
2025-02-21 Beyond Tools: Understanding How Heavy Users Integrate LLMs into Everyday Tasks and Decision-Making Eunhye Kim et.al. 2502.15395 null
2025-02-21 Chitrarth: Bridging Vision and Language for a Billion People Shaharukh Khan et.al. 2502.15392 null
2025-02-21 MOVE: A Mixture-of-Vision-Encoders Approach for Domain-Focused Vision-Language Processing Matvey Skripkin et.al. 2502.15381 null
2025-02-21 Weakly Supervised Video Scene Graph Generation via Natural Language Supervision Kibum Kim et.al. 2502.15370 null
2025-02-21 Identifying Features that Shape Perceived Consciousness in Large Language Model-based AI: A Quantitative Study of Human Responses Kang Bongsu et.al. 2502.15365 null
2025-02-21 Evaluating Social Biases in LLM Reasoning Xuyang Wu et.al. 2502.15361 null
2025-02-21 ARS: Automatic Routing Solver with Large Language Models Kai Li et.al. 2502.15359 null
2025-02-21 AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms Feiyang Chen et.al. 2502.15349 null
2025-02-21 Constructing a Norm for Children's Scientific Drawing: Distribution Features Based on Semantic Similarity of Large Language Models Yi Zhang et.al. 2502.15348 null
2025-02-21 Efficiently Solving Discounted MDPs with Predictions on Transition Matrices Lixing Lyu et.al. 2502.15345 null
2025-02-21 Exploring Embodied Multimodal Large Models: Development, Datasets, and Future Directions Shoubin Chen et.al. 2502.15336 null
2025-02-21 Stepwise Informativeness Search for Improving LLM Reasoning Siyuan Wang et.al. 2502.15335 null
2025-02-21 Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment Pedram Zaree et.al. 2502.15334 null
2025-02-21 Detecting Future-related Contexts of Entity Mentions Puneet Prashar et.al. 2502.15332 null
2025-02-21 DynamicGSG: Dynamic 3D Gaussian Scene Graphs for Environment Adaptation Luzhou Ge et.al. 2502.15309 link
2025-02-21 SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention Hong Yankun et.al. 2502.15304 null
2025-02-21 Round Attention: A Novel Round-Level Attention Mechanism to Accelerate LLM Inference Yaohua Tang et.al. 2502.15294 null
2025-02-21 Bridging Bug Localization and Issue Fixing: A Hierarchical Localization Framework Leveraging Large Language Models Jianming Chang et.al. 2502.15292 null
2025-02-21 BundleFlow: Deep Menus for Combinatorial Auctions by Diffusion-Based Optimization Tonghan Wang et.al. 2502.15283 null
2025-02-21 A Training-free LLM-based Approach to General Chinese Character Error Correction Houquan Zhou et.al. 2502.15266 null
2025-02-21 Retrieval-Augmented Speech Recognition Approach for Domain Challenges Peng Shen et.al. 2502.15264 null
2025-02-21 LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design Renjie Wei et.al. 2502.15260 null
2025-02-21 An approach for API synthesis using large language models Hua Zhong et.al. 2502.15246 null
2025-02-21 Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework Hang Zhang et.al. 2502.15243 null
2025-02-21 From Documents to Dialogue: Building KG-RAG Enhanced AI Assistants Manisha Mukherjee et.al. 2502.15237 null
2025-02-21 A General Pseudonymization Framework for Cloud-Based LLMs: Replacing Privacy Information in Controlled Text Generation Shilong Hou et.al. 2502.15233 null
2025-02-21 User Experience with LLM-powered Conversational Recommendation Systems: A Case of Music Recommendation Sojeong Yun et.al. 2502.15229 null
2025-02-21 Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews Mengqiao Liu et.al. 2502.15226 null
2025-02-21 Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs Tingting Chen et.al. 2502.15224 null
2025-02-21 FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs Madhurima Chakraborty et.al. 2502.15217 link
2025-02-21 The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning Sheila Schoepp et.al. 2502.15214 null
2025-02-21 Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing Zhilin Wang et.al. 2502.15208 null
2025-02-21 Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis Yifan Jiang et.al. 2502.15204 null
2025-02-21 TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding Zhaoxuan Wu et.al. 2502.15197 null
2025-02-21 LEDD: Large Language Model-Empowered Data Discovery in Data Lakes Qi An et.al. 2502.15182 null
2025-02-21 Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders Weiqiao Shan et.al. 2502.15178 null
2025-02-21 Methods and Trends in Detecting Generated Images: A Comprehensive Review Arpan Mahara et.al. 2502.15176 null
2025-02-21 M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment Chuan Cui et.al. 2502.15167 null
2025-02-21 Extreme Speech Classification in the Era of LLMs: Exploring Open-Source and Proprietary Models Sarthak Mahajan et.al. 2502.15155 null
2025-02-21 Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems Tianjie Ju et.al. 2502.15153 null
2025-02-21 Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns Naiming Liu et.al. 2502.15140 null
2025-02-21 Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device Juntae Lee et.al. 2502.15134 null
2025-02-21 TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba Xiuwei Chen et.al. 2502.15130 null
2025-02-20 LUME: LLM Unlearning with Multitask Evaluations Anil Ramakrishna et.al. 2502.15097 null
2025-02-20 Detecting Student Intent for Chat-Based Intelligent Tutoring Systems Ella Cutler et.al. 2502.15096 null
2025-02-20 Judging It, Washing It: Scoring and Greenwashing Corporate Climate Disclosures using Large Language Models Marianne Chuang et.al. 2502.15094 null
2025-02-20 Optimizing Singular Spectrum for Large Language Model Compression Dengjie Li et.al. 2502.15092 null
2025-02-20 Analyze the Neurons, not the Embeddings: Understanding When and Where LLM Representations Align with Humans Masha Fedzechkina et.al. 2502.15090 null
2025-02-20 Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models Yeonjun In et.al. 2502.15086 null
2025-02-20 LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention Shang Yang et.al. 2502.14866 link
2025-02-20 Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning Shuyue Stella Li et.al. 2502.14860 link
2025-02-20 FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling Weilin Zhao et.al. 2502.14856 null
2025-02-20 Prompt-to-Leaderboard Evan Frick et.al. 2502.14855 link
2025-02-20 GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks Jianwen Luo et.al. 2502.14848 null
2025-02-20 Red-Teaming LLM Multi-Agent Systems via Communication Attacks Pengfei He et.al. 2502.14847 null
2025-02-20 Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Yue Yang et.al. 2502.14846 null
2025-02-20 Revealing and Mitigating Over-Attention in Knowledge Editing Pinzheng Wang et.al. 2502.14838 link
2025-02-20 Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs Danni Liu et.al. 2502.14830 link
2025-02-20 A Survey of Model Architectures in Information Retrieval Zhichao Xu et.al. 2502.14822 null
2025-02-20 eC-Tab2Text: Aspect-Based Text Generation from e-Commerce Product Tables Luis Antonio Gutiérrez Guanilo et.al. 2502.14820 null
2025-02-20 Dynamic Low-Rank Sparse Adaptation for Large Language Models Weizhong Huang et.al. 2502.14816 null
2025-02-20 FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis Fadillah Maani et.al. 2502.14807 link
2025-02-20 From RAG to Memory: Non-Parametric Continual Learning for Large Language Models Bernal Jiménez Gutiérrez et.al. 2502.14802 link
2025-02-20 A Multi-Agent Perspective on Modern Information Retrieval Haya Nachimovsky et.al. 2502.14796 null
2025-02-20 Rapid Word Learning Through Meta In-Context Learning Wentao Wang et.al. 2502.14791 null
2025-02-20 DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models Hongji Yang et.al. 2502.14779 null
2025-02-20 SurveyX: Academic Survey Automation via Large Language Models Xun Liang et.al. 2502.14776 null
2025-02-20 Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective Weizhong Huang et.al. 2502.14770 null
2025-02-20 Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis Priyanka Kargupta et.al. 2502.14767 link
2025-02-20 EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations Haotian Zhai et.al. 2502.14760 link
2025-02-20 On the Influence of Context Size and Model Choice in Retrieval-Augmented Generation Systems Juraj Vladika et.al. 2502.14759 null
2025-02-20 TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators Jianling Li et.al. 2502.14752 link
2025-02-20 Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs Zongxia Li et.al. 2502.14748 null
2025-02-20 Multi-Agent Coordination across Diverse Applications: A Survey Lijun Sun et.al. 2502.14743 null
2025-02-20 SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines M-A-P Team et.al. 2502.14739 null
2025-02-20 EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration Minjie Hong et.al. 2502.14735 null
2025-02-20 WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models Yifu Chen et.al. 2502.14727 null
2025-02-20 I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search Zujie Liang et.al. 2502.14693 null
2025-02-20 Bridging the Gap: Transforming Natural Language Questions into SQL Queries via Abstract Query Pattern and Contextual Schema Markup Yonghui Kong et.al. 2502.14682 null
2025-02-20 How to Get Your LLM to Generate Challenging Problems for Evaluation Arkil Patel et.al. 2502.14678 link
2025-02-20 Data-Constrained Synthesis of Training Data for De-Identification Thomas Vakili et.al. 2502.14677 null
2025-02-20 Explanations of Deep Language Models Explain Language Representations in the Brain Maryam Rahimi et.al. 2502.14671 null
2025-02-20 AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO Alan Dao et.al. 2502.14669 null
2025-02-20 Beyond the Surface: Uncovering Implicit Locations with LLMs for Personalized Local News Gali Katz et.al. 2502.14660 null
2025-02-20 Edit Once, Update Everywhere: A Simple Framework for Cross-Lingual Knowledge Synchronization in LLMs Yuchen Wu et.al. 2502.14645 null
2025-02-20 LIFT: Improving Long Context Understanding of Large Language Models through Long Input Fine-Tuning Yansheng Mao et.al. 2502.14644 null
2025-02-20 Length-Controlled Margin-Based Preference Optimization without Reference Model Gengxu Li et.al. 2502.14643 link
2025-02-20 ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation Angxiao Yue et.al. 2502.14637 link
2025-02-20 CER: Confidence Enhanced Reasoning in LLMs Ali Razghandi et.al. 2502.14634 link
2025-02-20 Augmenting Coaching with GenAI: Insights into Use, Effectiveness, and Future Potential Jennifer Haase et.al. 2502.14632 null
2025-02-20 Synergistic Fusion of Multi-Source Knowledge via Evidence Theory for High-Entropy Alloy Discovery Minh-Quyet Ha et.al. 2502.14631 null
2025-02-20 PEARL: Towards Permutation-Resilient LLMs Liang Chen et.al. 2502.14628 link
2025-02-20 Reward Models Identify Consistency, Not Causality Yuhui Xu et.al. 2502.14619 null
2025-02-20 Serving Models, Fast and Slow: Optimizing Heterogeneous LLM Inferencing Workloads at Scale Shashwat Jaiswal et.al. 2502.14617 null
2025-02-20 FIND: Fine-grained Information Density Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis Mingyi Jia et.al. 2502.14614 null
2025-02-20 Behavioral Analysis of Information Salience in Large Language Models Jan Trienes et.al. 2502.14613 link
2025-02-20 "Don't Forget the Teachers": Towards an Educator-Centered Understanding of Harms from Large Language Models in Education Emma Harvey et.al. 2502.14592 null
2025-02-20 Vision Foundation Models in Medical Image Analysis: Advances and Challenges Pengchen Liang et.al. 2502.14584 null
2025-02-20 A Theory for Conditional Generative Modeling on Multiple Data Sources Rongzhen Wang et.al. 2502.14583 link
2025-02-20 ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification Hyunseok Lee et.al. 2502.14565 null
2025-02-20 Plan-over-Graph: Towards Parallelable LLM Agent Schedule Shiqi Zhang et.al. 2502.14563 link
2025-02-20 Can LLMs Predict Citation Intent? An Experimental Analysis of In-context Learning and Fine-tuning on Open LLMs Paris Koloveas et.al. 2502.14561 link
2025-02-20 Less is More: Improving LLM Alignment via Preference Data Selection Xun Deng et.al. 2502.14560 null
2025-02-20 Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling Eric Egli et.al. 2502.14553 link
2025-02-20 Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks Maya Bechler-Speicher et.al. 2502.14546 null
2025-02-20 LLM-based User Profile Management for Recommender System Seunghwan Bang et.al. 2502.14541 null
2025-02-20 LoRA-GGPO: Mitigating Double Descent in LoRA Fine-Tuning via Gradient-Guided Perturbation Optimization Yupeng Chang et.al. 2502.14538 link
2025-02-20 CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models Zhenhong Zhou et.al. 2502.14529 link
2025-02-20 Generative adversarial networks vs large language models: a comparative study on synthetic tabular data generation Austin A. Barr et.al. 2502.14523 link
2025-02-20 Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases Rena Gao et.al. 2502.14507 link
2025-02-20 How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Sergey Pletenev et.al. 2502.14502 link
2025-02-20 MLGym: A New Framework and Benchmark for Advancing AI Research Agents Deepak Nathani et.al. 2502.14499 null
2025-02-20 StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following Jinnan Li et.al. 2502.14494 link
2025-02-20 How Jailbreak Defenses Work and Ensemble? A Mechanistic Investigation Zhuohang Long et.al. 2502.14486 null
2025-02-20 NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models Chenlu Guo et.al. 2502.14482 link
2025-02-20 Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression Haoyu Wang et.al. 2502.14477 null
2025-02-20 Argument-Based Comparative Question Answering Evaluation Benchmark Irina Nikishina et.al. 2502.14476 null
2025-02-20 Enhancing Smart Environments with Context-Aware Chatbots using Large Language Models Aurora Polo-Rodríguez et.al. 2502.14469 null
2025-02-20 Narrative-Driven Travel Planning: Geoculturally-Grounded Script Generation with Evolutionary Itinerary Optimization Ran Ding et.al. 2502.14456 link
2025-02-20 Optimal word order for non-causal text generation with Large Language Models: the Spanish case Andrea Busto-Castiñeira et.al. 2502.14451 null
2025-02-20 LLM4FaaS: No-Code Application Development using LLMs and FaaS Minghe Wang et.al. 2502.14450 null
2025-02-20 PredictaBoard: Benchmarking LLM Score Predictability Lorenzo Pacchiardi et.al. 2502.14445 link
2025-02-20 Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models Artem Vazhentsev et.al. 2502.14427 link
2025-02-20 A Survey on Data Contamination for Large Language Models Yuxing Cheng et.al. 2502.14425 link
2025-02-20 ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model Zhongyi Zhou et.al. 2502.14420 null
2025-02-20 Towards Efficient Automatic Self-Pruning of Large Language Models Weizhong Huang et.al. 2502.14413 null
2025-02-20 Evaluating Precise Geolocation Inference Capabilities of Vision Language Models Neel Jay et.al. 2502.14412 link
2025-02-20 Unstructured Evidence Attribution for Long Context Query Focused Summarization Dustin Wright et.al. 2502.14409 null
2025-02-20 HPS: Hard Preference Sampling for Human Preference Alignment Xiandong Zou et.al. 2502.14400 null
2025-02-20 Enhancing Portuguese Variety Identification with Cross-Domain Approaches Hugo Sousa et.al. 2502.14394 null
2025-02-20 Leveraging Small LLMs for Argument Mining in Education: Argument Component Identification, Classification, and Assessment Lucile Favero et.al. 2502.14389 null
2025-02-20 S*: Test Time Scaling for Code Generation Dacheng Li et.al. 2502.14382 link
2025-02-20 PPO-MI: Efficient Black-Box Model Inversion via Proximal Policy Optimization Xinpeng Shou et.al. 2502.14370 null
2025-02-20 Entropy-UID: A Method for Optimizing Information Density Xinpeng Shou et.al. 2502.14366 null
2025-02-20 Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning Jiachen Zhu et.al. 2502.14361 null
2025-02-20 SR-LLM: Rethinking the Structured Representation in Large Language Model Jiahuan Zhang et.al. 2502.14352 null
2025-02-20 SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images Yichi Zhang et.al. 2502.14351 null
2025-02-20 FlowAgent: Achieving Compliance and Flexibility for Workflow Agents Yuchen Shi et.al. 2502.14345 link
2025-02-20 Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective Ruichen Shao et.al. 2502.14340 null
2025-02-20 A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics Ting-Ruen Wei et.al. 2502.14333 null
2025-02-20 SolSearch: An LLM-Driven Framework for Efficient SAT-Solving Code Generation Junjie Sheng et.al. 2502.14328 null
2025-02-20 ChemHTS: Hierarchical Tool Stacking for Enhancing Chemical Agents Zhucong Li et.al. 2502.14327 link
2025-02-20 Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems Bingyu Yan et.al. 2502.14321 null
2025-02-20 Line Goes Up? Inherent Limitations of Benchmarks for Evaluating Large Language Models James Fodor et.al. 2502.14318 null
2025-02-20 ParallelComp: Parallel Long-Context Compressor for Length Extrapolation Jing Xiong et.al. 2502.14317 null
2025-02-20 Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension Amir Hossein Yari et.al. 2502.14315 null
2025-02-20 Efficient AI in Practice: Training and Deployment of Efficient LLMs for Industry Applications Kayhan Behdin et.al. 2502.14305 null
2025-02-20 MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models Shrey Pandit et.al. 2502.14302 null
2025-02-20 SEA-HELM: Southeast Asian Holistic Evaluation of Language Models Yosephine Susanto et.al. 2502.14301 null
2025-02-19 Where's the Bug? Attention Probing for Scalable Fault Localization Adam Stein et.al. 2502.13966 null
2025-02-19 Autellix: An Efficient Serving Engine for LLM Agents as General Programs Michael Luo et.al. 2502.13965 null
2025-02-19 MuDAF: Long-Context Multi-Document Attention Focusing through Contrastive Learning on Attention Heads Weihao Liu et.al. 2502.13963 link
2025-02-19 Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering William Jurayj et.al. 2502.13962 null
2025-02-19 LIDDIA: Language-based Intelligent Drug Discovery Agent Reza Averly et.al. 2502.13959 null
2025-02-19 Neurosymbolic artificial intelligence via large language models and coherence-driven inference Steve Huntsman et.al. 2502.13953 null
2025-02-19 Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region Chak Tou Leong et.al. 2502.13946 null
2025-02-19 Image compositing is all you need for data augmentation Ang Jia Ning Shermaine et.al. 2502.13936 null
2025-02-19 LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Guanzheng Chen et.al. 2502.13922 link
2025-02-19 Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis Jiahao Gai et.al. 2502.13921 null
2025-02-19 Exploring Personalized Health Support through Data-Driven, Theory-Guided LLMs: A Case Study in Sleep Health Xingbo Wang et.al. 2502.13920 null
2025-02-19 How Do LLMs Perform Two-Hop Reasoning in Context? Tianyu Guo et.al. 2502.13913 null
2025-02-19 Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? Sein Kim et.al. 2502.13909 link
2025-02-19 Judging the Judges: A Collection of LLM-Generated Relevance Judgements Hossein A. Rahmani et.al. 2502.13908 link
2025-02-19 DataSciBench: An LLM Agent Benchmark for Data Science Dan Zhang et.al. 2502.13897 link
2025-02-19 NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants Yiran Qin et.al. 2502.13894 null
2025-02-19 Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models Matthew P. Wilson et.al. 2502.13886 link
2025-02-19 SPEX: Scaling Feature Interaction Explanations for LLMs Justin Singh Kang et.al. 2502.13870 link
2025-02-19 MagicGeo: Training-Free Text-Guided Geometric Diagram Generation Junxiao Wang et.al. 2502.13855 null
2025-02-19 Enhancing LLM-Based Recommendations Through Personalized Reasoning Jiahao Liu et.al. 2502.13845 null
2025-02-19 Enhancing Cross-Domain Recommendations with Memory-Optimized LLM-Based User Agents Jiahao Liu et.al. 2502.13843 null
2025-02-19 Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking Yilong Chen et.al. 2502.13842 null
2025-02-19 Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models Peter Carragher et.al. 2502.13836 null
2025-02-19 Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning Zenan Li et.al. 2502.13834 null
2025-02-19 ArtMentor: AI-Assisted Evaluation of Artworks to Explore Multimodal Large Language Models Capabilities Chanjin Zheng et.al. 2502.13832 link
2025-02-19 LESA: Learnable LLM Layer Scaling-Up Yifei Yang et.al. 2502.13794 link
2025-02-19 From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions Nathanaël Carraz Rakotonirina et.al. 2502.13791 link
2025-02-19 From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education Yi-Fan Zhang et.al. 2502.13789 null
2025-02-19 Helix-mRNA: A Hybrid Foundation Model For Full Sequence mRNA Therapeutics Matthew Wood et.al. 2502.13785 link
2025-02-19 Generative Large Recommendation Models: Emerging Trends in LLMs for Recommendation Hao Wang et.al. 2502.13783 null
2025-02-19 Translation in the Hands of Many: Centering Lay Users in Machine Translation Interactions Beatrice Savoldi et.al. 2502.13780 null
2025-02-19 VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare Anudeex Shetty et.al. 2502.13775 null
2025-02-19 AI Software Engineer: Programming with Trust Abhik Roychoudhury et.al. 2502.13767 null
2025-02-19 SCALAR: Scientific Citation-based Live Assessment of Long-context Academic Reasoning Renxi Wang et.al. 2502.13753 null
2025-02-19 Reverse Markov Learning: Multi-Step Generative Models for Complex Distributions Xinwei Shen et.al. 2502.13747 null
2025-02-19 Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding Keqin Peng et.al. 2502.13738 null
2025-02-19 CARE: Confidence-Aware Regression Estimation of building density fine-tuning EO Foundation Models Nikolaos Dionelis et.al. 2502.13734 null
2025-02-19 Adapting Large Language Models for Time Series Modeling via a Novel Parameter-efficient Adaptation Method Juyuan Zhang et.al. 2502.13725 null
2025-02-19 Direct Value Optimization: Improving Chain-of-Thought Reasoning in LLMs with Refined Values Hongbo Zhang et.al. 2502.13723 null
2025-02-19 TALKPLAY: Multimodal Music Recommendation with Large Language Models Seungheon Doh et.al. 2502.13713 null
2025-02-19 Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora Tristan Karch et.al. 2502.13691 null
2025-02-19 An LLM-based Agent for Reliable Docker Environment Configuration Ruida Hu et.al. 2502.13681 null
2025-02-19 SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation Song Duong et.al. 2502.13674 null
2025-02-19 Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models Liyang He et.al. 2502.13656 link
2025-02-19 C2T: A Classifier-Based Tree Construction Method in Speculative Decoding Feiye Huo et.al. 2502.13652 null
2025-02-19 Reliability Across Parametric and External Knowledge: Understanding Knowledge Handling in LLMs Youna Kim et.al. 2502.13648 null
2025-02-19 D.Va: Validate Your Demonstration First Before You Use It Qi Zhang et.al. 2502.13646 null
2025-02-19 Qorgau: Evaluating LLM Safety in Kazakh-Russian Bilingual Contexts Maiya Goloburda et.al. 2502.13640 null
2025-02-19 Concept Layers: Enhancing Interpretability and Intervenability via LLM Conceptualization Or Raphael Bidusa et.al. 2502.13632 null
2025-02-19 AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models Yuanyuan Xu et.al. 2502.13626 null
2025-02-19 REFIND: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models DongGeon Lee et.al. 2502.13622 null
2025-02-19 Complex Ontology Matching with Large Language Model Embeddings Guilherme Sousa et.al. 2502.13619 null
2025-02-19 LaVCa: LLM-assisted Visual Cortex Captioning Takuya Matsuyama et.al. 2502.13606 null
2025-02-19 BeamLoRA: Beam-Constraint Low-Rank Adaptation Naibin Gu et.al. 2502.13604 null
2025-02-19 MMTEB: Massive Multilingual Text Embedding Benchmark Kenneth Enevoldsen et.al. 2502.13595 null
2025-02-19 Don't Stop the Multi-Party! On Generating Synthetic Multi-Party Conversations with Constraints Nicolò Penzo et.al. 2502.13592 null
2025-02-19 Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts Xin Li et.al. 2502.13577 null
2025-02-19 LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation Xin Li et.al. 2502.13568 null
2025-02-19 Extracting Social Connections from Finnish Karelian Refugee Interviews Using LLMs Joonatan Laato et.al. 2502.13566 null
2025-02-19 PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models Guangwei Li et.al. 2502.13564 link
2025-02-19 Are Large Language Models In-Context Graph Learners? Jintang Li et.al. 2502.13562 null
2025-02-19 Democratizing Large Language Model-Based Graph Data Augmentation via Latent Knowledge Graphs Yushi Feng et.al. 2502.13555 link
2025-02-19 STaR-SQL: Self-Taught Reasoner for Text-to-SQL Mingqian He et.al. 2502.13550 null
2025-02-19 Detecting Linguistic Bias in Government Documents Using Large language Models Milena de Swart et.al. 2502.13548 null
2025-02-19 From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN Peiwen Yuan et.al. 2502.13544 null
2025-02-19 Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference Qingfa Xiao et.al. 2502.13542 null
2025-02-19 Bursting Filter Bubble: Enhancing Serendipity Recommendations with Aligned Large Language Models Yunjia Xi et.al. 2502.13539 null
2025-02-19 Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models Jun Zhang et.al. 2502.13533 link
2025-02-19 Exploiting Prefix-Tree in Structured Output Interfaces for Enhancing Jailbreak Attacking Yanzeng Li et.al. 2502.13527 link
2025-02-19 SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin Hao Yi et.al. 2502.13516 null
2025-02-19 Unlocking Multimodal Integration in EHRs: A Prompt Learning Framework for Language and Time Series Fusion Shuai Niu et.al. 2502.13509 null
2025-02-19 Reproducing NevIR: Negation in Neural Information Retrieval Coen van Elsen et.al. 2502.13506 link
2025-02-19 PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference Burc Gokden et.al. 2502.13502 link
2025-02-19 Towards Geo-Culturally Grounded LLM Generations Piyawat Lertvittayakumjorn et.al. 2502.13497 null
2025-02-19 What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis Peiran Wang et.al. 2502.13490 null
2025-02-19 LLM4Tag: Automatic Tagging System for Information Retrieval via Large Language Models Ruiming Tang et.al. 2502.13481 null
2025-02-19 Integration of Agentic AI with 6G Networks for Mission-Critical Applications: Use-case and Challenges Sunder Ali Khowaja et.al. 2502.13476 null
2025-02-19 LLM should think and action as a human Haun Leung et.al. 2502.13475 null
2025-02-19 Towards Lightweight, Adaptive and Attribute-Aware Multi-Aspect Controllable Text Generation with Large Language Models Chenyu Zhu et.al. 2502.13474 null
2025-02-19 ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails Xiaofei Wen et.al. 2502.13458 link
2025-02-19 Interleaved Gibbs Diffusion for Constrained Generation Gautham Govind Anil et.al. 2502.13450 null
2025-02-19 Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning Yang Yan et.al. 2502.13447 null
2025-02-19 TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation Jialin Ouyang et.al. 2502.13442 null
2025-02-19 The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? Yutao Sun et.al. 2502.13441 null
2025-02-19 MATS: An Audio Language Model under Text-only Supervision Wen Wang et.al. 2502.13433 null
2025-02-19 Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning Hao Ma et.al. 2502.13430 null
2025-02-19 MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering Guanming Xiong et.al. 2502.13428 null
2025-02-19 TabSD: Large Free-Form Table Question Answering with SQL-Based Table Decomposition Yuxiang Wang et.al. 2502.13422 null
2025-02-19 RLTHF: Targeted Human Feedback for LLM Alignment Yifei Xu et.al. 2502.13417 null
2025-02-19 Detecting LLM Fact-conflicting Hallucinations Enhanced by Temporal-logic-based Reasoning Ningke Li et.al. 2502.13416 null
2025-02-19 Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction Yanbang Sun et.al. 2502.13412 null
2025-02-19 Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks Vince Kurtz et.al. 2502.13406 null
2025-02-19 $\mathtt{GeLLM^3O}$: Generalizing Large Language Models for Multi-property Molecule Optimization Vishal Dey et.al. 2502.13398 null
2025-02-19 Prompting a Weighting Mechanism into LLM-as-a-Judge in Two-Step: A Case Study Wenwen Xie et.al. 2502.13396 null
2025-02-19 Flow-based generative models as iterative algorithms in probability space Yao Xie et.al. 2502.13394 null
2025-02-19 Reasoning with Reinforced Functional Token Tuning Kongcheng Zhang et.al. 2502.13389 link
2025-02-19 Reflection of Episodes: Learning to Play Game from Expert and Self Experiences Xiaojie Xu et.al. 2502.13388 null
2025-02-19 MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification Linzhuang Sun et.al. 2502.13383 link
2025-02-19 AutoTEE: Automated Migration and Protection of Programs in Trusted Execution Environments Ruidong Han et.al. 2502.13379 null
2025-02-19 Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor Barys Liskavets et.al. 2502.13374 null
2025-02-18 Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization Shuo Xing et.al. 2502.13146 link
2025-02-18 Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Bencheng Liao et.al. 2502.13145 link
2025-02-18 Pre-training Auto-regressive Robotic Models with 4D Representations Dantong Niu et.al. 2502.13142 null
2025-02-18 UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models Huawei Lin et.al. 2502.13141 link
2025-02-18 AIDE: AI-Driven Exploration in the Space of Code Zhengyao Jiang et.al. 2502.13138 link
2025-02-18 Theorem Prover as a Judge for Synthetic Data Generation Joshua Ong Jun Leang et.al. 2502.13137 null
2025-02-18 AV-Flow: Transforming Text to Audio-Visual Human-like Interactions Aggelina Chatziagapi et.al. 2502.13133 null
2025-02-18 Learning to Defer for Causal Discovery with Imperfect Experts Oscar Clivio et.al. 2502.13132 null
2025-02-18 Rethinking Diverse Human Preference Learning through Principal Component Analysis Feng Luo et.al. 2502.13131 null
2025-02-18 Magma: A Foundation Model for Multimodal AI Agents Jianwei Yang et.al. 2502.13130 link
2025-02-18 Is Noise Conditioning Necessary for Denoising Generative Models? Qiao Sun et.al. 2502.13129 null
2025-02-18 Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning Jingyang Lin et.al. 2502.13127 null
2025-02-18 RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises Zenan Zhai et.al. 2502.13125 null
2025-02-18 Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context Marion Bartl et.al. 2502.13120 null
2025-02-18 STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models Narun Raman et.al. 2502.13119 null
2025-02-18 Performance Evaluation of Large Language Models in Statistical Programming Xinyi Song et.al. 2502.13117 link
2025-02-18 MatterChat: A Multi-Modal LLM for Material Science Yingheng Tang et.al. 2502.13107 null
2025-02-18 Text2World: Benchmarking Large Language Models for Symbolic World Model Generation Mengkang Hu et.al. 2502.13092 null
2025-02-18 A Neural Difference-of-Entropies Estimator for Mutual Information Haoran Ni et.al. 2502.13085 null
2025-02-18 Personalized Image Generation with Deep Generative Models: A Decade Survey Yuxiang Wei et.al. 2502.13081 link
2025-02-18 SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models Xianfu Cheng et.al. 2502.13059 null
2025-02-18 LAMD: Context-driven Android Malware Detection and Classification with LLMs Xingzhi Qian et.al. 2502.13055 null
2025-02-18 Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction Nils Constantin Hellwig et.al. 2502.13044 null
2025-02-18 HPSS: Heuristic Prompting Strategy Search for LLM Evaluators Bosi Wen et.al. 2502.13031 null
2025-02-18 A deep learning framework for efficient pathology image analysis Peter Neidlinger et.al. 2502.13027 null
2025-02-18 Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks Markus J. Buehler et.al. 2502.13025 null
2025-02-18 Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation Sha Li et.al. 2502.13019 null
2025-02-18 Towards a Design Guideline for RPA Evaluation: A Survey of Large Language Model-Based Role-Playing Agents Chaoran Chen et.al. 2502.13012 null
2025-02-18 Adaptive Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge Mohammad Reza Rezaei et.al. 2502.13010 null
2025-02-18 You need to MIMIC to get FAME: Solving Meeting Transcript Scarcity with a Multi-Agent Conversations Frederic Kirstein et.al. 2502.13001 null
2025-02-18 Personalized Top-k Set Queries Over Predicted Scores Sohrab Namazi Nia et.al. 2502.12998 null
2025-02-18 Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs Zixiao Wang et.al. 2502.12988 null
2025-02-18 Towards Variational Flow Matching on General Geometries Olga Zaghen et.al. 2502.12981 null
2025-02-18 Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search Yifan Ji et.al. 2502.12974 link
2025-02-18 Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking Junda Zhu et.al. 2502.12970 link
2025-02-18 Trust Me, I'm Wrong: High-Certainty Hallucinations in LLMs Adi Simhi et.al. 2502.12964 null
2025-02-18 Infinite Retrieval: Attention Enhanced LLMs in Long-Context Processing Xiaoju Ye et.al. 2502.12962 null
2025-02-18 Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger Wenjun Li et.al. 2502.12961 null
2025-02-18 Guaranteed Conditional Diffusion: 3D Block-based Models for Scientific Data Compression Jaemoon Lee et.al. 2502.12951 null
2025-02-18 Fake It Till You Make It: Using Synthetic Data and Domain Knowledge for Improved Text-Based Learning for LGE Detection Athira J Jacob et.al. 2502.12948 null
2025-02-18 Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models Gyeongman Kim et.al. 2502.12947 null
2025-02-18 LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation Junchen Fu et.al. 2502.12945 null
2025-02-18 Performance of Zero-Shot Time Series Foundation Models on Cloud Data William Toner et.al. 2502.12944 null
2025-02-18 Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options Lakshmi Nair et.al. 2502.12929 link
2025-02-18 Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts Leiyu Pan et.al. 2502.12928 null
2025-02-18 SEFL: Harnessing Large Language Model Agents to Improve Educational Feedback Systems Mike Zhang et.al. 2502.12927 null
2025-02-18 Towards more Contextual Agents: An extractor-Generator Optimization Framework Mourad Aouini et.al. 2502.12926 null
2025-02-18 Keep what you need: extracting efficient subnetworks from large audio representation models David Genova et.al. 2502.12925 link
2025-02-18 Conditioning LLMs to Generate Code-Switched Text: A Methodology Grounded in Naturally Occurring Data Maite Heredia et.al. 2502.12924 link
2025-02-18 On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation Rune Birkmose et.al. 2502.12923 link
2025-02-18 Q-STRUM Debate: Query-Driven Contrastive Summarization for Recommendation Comparison George-Kirollos Saad et.al. 2502.12921 null
2025-02-18 Lightweight Online Adaption for Time Series Foundation Model Forecasts Thomas L. Lee et.al. 2502.12920 null
2025-02-18 GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning Sifan Zhou et.al. 2502.12913 null
2025-02-18 Probabilistic neural operators for functional uncertainty quantification Christopher Bülte et.al. 2502.12902 link
2025-02-18 Soundwave: Less is More for Speech-Text Alignment in LLMs Yuhao Zhang et.al. 2502.12900 link
2025-02-18 Multilingual European Language Models: Benchmarking Approaches and Challenges Fabio Barth et.al. 2502.12895 null
2025-02-18 CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image Kaixin Yao et.al. 2502.12894 null
2025-02-18 Are Multilingual Language Models an Off-ramp for Under-resourced Languages? Will we arrive at Digital Language Equality in Europe in 2030? Georg Rehm et.al. 2502.12886 null
2025-02-18 How desirable is alignment between LLMs and linguistically diverse human users? Pia Knoeferle et.al. 2502.12884 null
2025-02-18 Continuous Learning Conversational AI: A Personalized Agent Framework via A2C Reinforcement Learning Nandakishor M et.al. 2502.12876 null
2025-02-18 RobotIQ: Empowering Mobile Robots with Human-Level Planning for Real-World Execution Emmanuel K. Raptis et.al. 2502.12862 link
2025-02-18 PAFT: Prompt-Agnostic Fine-Tuning Chenxing Wei et.al. 2502.12859 null
2025-02-18 Rejected Dialects: Biases Against African American Language in Reward Models Joel Mire et.al. 2502.12858 null
2025-02-18 MeMo: Towards Language Models with Associative Memory Mechanisms Fabio Massimo Zanzotto et.al. 2502.12851 null
2025-02-18 MOLLM: Multi-Objective Large Language Model for Molecular Design -- Optimizing with Experts Nian Ran et.al. 2502.12845 null
2025-02-18 Towards Adaptive Feedback with AI: Comparing the Feedback Quality of LLMs and Teachers on Experimentation Protocols Kathrin Seßler et.al. 2502.12842 null
2025-02-18 Towards Equitable AI: Detecting Bias in Using Large Language Models for Marketing Berk Yilmaz et.al. 2502.12838 null
2025-02-18 An LLM-Powered Agent for Physiological Data Analysis: A Case Study on PPG-based Heart Rate Estimation Mohammad Feli et.al. 2502.12836 null
2025-02-18 KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan Mukhammed Togmanov et.al. 2502.12829 null
2025-02-18 Reasoning and the Trusting Behavior of DeepSeek and GPT: An Experiment Revealing Hidden Fault Lines in Large Language Models Rubing Lu et.al. 2502.12825 null
2025-02-18 Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models Elena Stringli et.al. 2502.12821 null
2025-02-18 Simulating User Diversity in Task-Oriented Dialogue Systems using Large Language Models Adnan Ahmad et.al. 2502.12813 null
2025-02-18 Towards Text-Image Interleaved Retrieval Xin Zhang et.al. 2502.12799 link
2025-02-18 RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models Tanqiu Jiang et.al. 2502.12794 link
2025-02-18 Commonsense Reasoning in Arab Culture Abdelrahman Sadallah et.al. 2502.12788 null
2025-02-18 Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models Daiki Chijiwa et.al. 2502.12776 null
2025-02-18 How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild Saad Obaid ul Islam et.al. 2502.12769 link
2025-02-18 R2-KG: General-Purpose Dual-Agent Framework for Reliable Reasoning on Knowledge Graphs Sumin Jo et.al. 2502.12767 null
2025-02-18 One-bit Compressed Sensing using Generative Models Swatantra Kafle et.al. 2502.12762 null
2025-02-18 Efficient Machine Translation Corpus Generation: Integrating Human-in-the-Loop Post-Editing with Large Language Models Kamer Ali Yuksel et.al. 2502.12755 link
2025-02-18 Architect of the Bits World: Masked Autoregressive Modeling for Circuit Generation Guided by Truth Table Haoyuan Wu et.al. 2502.12751 null
2025-02-18 Self-Enhanced Reasoning Training: Activating Latent Reasoning in Small Models for Enhanced Reasoning Distillation Yong Zhang et.al. 2502.12744 null
2025-02-18 "I know myself better, but not really greatly": Using LLMs to Detect and Explain LLM-Generated Texts Jiazhou Ji et.al. 2502.12743 null
2025-02-18 Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment Haoyuan Wu et.al. 2502.12732 null
2025-02-18 TREND: A Whitespace Replacement Information Hiding Method Malte Hellmeier et.al. 2502.12710 null
2025-02-18 Multi-Novelty: Improve the Diversity and Novelty of Contents Generated by Large Language Models via inference-time Multi-Views Brainstorming Arash Lagzian et.al. 2502.12700 null
2025-02-18 Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees Yongtao Wu et.al. 2502.12678 null
2025-02-18 Baichuan-M1: Pushing the Medical Capability of Large Language Models Bingning Wang et.al. 2502.12671 null
2025-02-18 Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research Xiang Liu et.al. 2502.12669 null
2025-02-18 Evaluation of Best-of-N Sampling Strategies for Language Model Alignment Yuki Ichihara et.al. 2502.12668 null
2025-02-18 A$^2$ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization Junhui He et.al. 2502.12665 null
2025-02-18 Demystifying Multilingual Chain-of-Thought in Process Reward Modeling Weixuan Wang et.al. 2502.12663 null
2025-02-18 The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1 Kaiwen Zhou et.al. 2502.12659 null
2025-02-18 R.R.: Unveiling LLM Training Privacy through Recollection and Ranking Wenlong Meng et.al. 2502.12658 link
2025-02-18 NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation Zhiyuan Liu et.al. 2502.12638 link
2025-02-18 Corrupted but Not Broken: Rethinking the Impact of Corrupted Data in Visual Instruction Tuning Yunhao Gou et.al. 2502.12635 null
2025-02-18 One Size doesn't Fit All: A Personalized Conversational Tutoring Agent for Mathematics Instruction Ben Liu et.al. 2502.12633 null
2025-02-18 Automating Prompt Leakage Attacks on Large Language Models Using Agentic Approach Tvrtko Sternak et.al. 2502.12630 link
2025-02-18 DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning Zhuoyuan Mao et.al. 2502.12623 null
2025-02-18 Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions Leonardo Ranaldi et.al. 2502.12616 null
2025-02-17 Idiosyncrasies in Large Language Models Mingjie Sun et.al. 2502.12150 link
2025-02-17 HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation Ling Yang et.al. 2502.12148 link
2025-02-17 Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control Jinyan Su et.al. 2502.12145 link
2025-02-17 Small Models Struggle to Learn from Strong Reasoners Yuetai Li et.al. 2502.12143 null
2025-02-17 SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs Yige Xu et.al. 2502.12134 null
2025-02-17 Transformer Dynamics: A neuroscientific approach to interpretability of large language models Jesseba Fernando et.al. 2502.12131 null
2025-02-17 Scaling Autonomous Agents via Automatic Reward Modeling And Planning Zhenfang Chen et.al. 2502.12130 null
2025-02-17 LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities Florian Sestak et.al. 2502.12128 link
2025-02-17 Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA Patryk Marszałek et.al. 2502.12122 link
2025-02-17 LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws Prasanna Mayilvahanan et.al. 2502.12120 null
2025-02-17 PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection Jinhe Bi et.al. 2502.12119 null
2025-02-17 A-MEM: Agentic Memory for LLM Agents Wujiang Xu et.al. 2502.12110 link
2025-02-17 Personality Structured Interview for Large Language Model Simulation in Personality Research Pengda Wang et.al. 2502.12109 null
2025-02-17 Relational Norms for Human-AI Cooperation Brian D. Earp et.al. 2502.12102 null
2025-02-17 Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications Li Qiao et.al. 2502.12096 null
2025-02-17 How compositional generalization and creativity improve as diffusion models are trained Alessandro Favero et.al. 2502.12089 null
2025-02-17 Meta-Statistical Learning: Supervised Learning of Statistical Inference Maxime Peyrard et.al. 2502.12088 null
2025-02-17 APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs Yuxiang Huang et.al. 2502.12085 link
2025-02-17 Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation Zhongyi Qiu et.al. 2502.12073 null
2025-02-17 TokenSkip: Controllable Chain-of-Thought Compression in LLMs Heming Xia et.al. 2502.12067 link
2025-02-17 CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models Yifan Zhang et.al. 2502.12066 null
2025-02-17 AI-generated Text Detection with a GLTR-based Approach Lucía Yan Wu et.al. 2502.12064 null
2025-02-17 Designing Role Vectors to Improve LLM Inference Behaviour Daniele Potertì et.al. 2502.12055 null
2025-02-17 PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning Xinyu Zhang et.al. 2502.12054 null
2025-02-17 A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond Shreya Shukla et.al. 2502.12048 null
2025-02-17 KnowPath: Knowledge-enhanced Reasoning via LLM-generated Inference Paths over Knowledge Graphs Qi Zhao et.al. 2502.12029 null
2025-02-17 SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities Fengqing Jiang et.al. 2502.12025 null
2025-02-17 Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving Xin Xu et.al. 2502.12022 null
2025-02-17 Atom of Thoughts for Markov LLM Test-Time Scaling Fengwei Teng et.al. 2502.12018 null
2025-02-17 Unsupervised Structural-Counterfactual Generation under Domain Shift Krishn Vishwas Kher et.al. 2502.12013 null
2025-02-17 Design Considerations Based on Stability for a Class of TCP Algorithms Sreekanth Prabhakar et.al. 2502.11983 null
2025-02-17 Image Inversion: A Survey from GANs to Diffusion and Beyond Yinan Chen et.al. 2502.11974 link
2025-02-17 Generating Text from Uniform Meaning Representation Emma Markle et.al. 2502.11973 link
2025-02-17 A MIMO Wireless Channel Foundation Model via CIR-CSI Consistency Jun Jiang et.al. 2502.11965 null
2025-02-17 Navigating the Helpfulness-Truthfulness Trade-Off with Uncertainty-Aware Instruction Fine-Tuning Tianyi Wu et.al. 2502.11962 null
2025-02-17 On Representational Dissociation of Language and Arithmetic in Large Language Models Riku Kisako et.al. 2502.11932 null
2025-02-17 GRAPHGPT-O: Synergistic Multimodal Comprehension and Generation on Graphs Yi Fang et.al. 2502.11925 null
2025-02-17 From Text to Trust: Empowering AI-assisted Decision Making with Adaptive LLM-powered Analysis Zhuoyan Li et.al. 2502.11919 null
2025-02-17 EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models Jiamin Su et.al. 2502.11916 null
2025-02-17 Adversarial Alignment for LLMs Requires Simpler, Reproducible, and More Measurable Objectives Leo Schwinn et.al. 2502.11910 null
2025-02-17 MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation Haochen Xue et.al. 2502.11903 null
2025-02-17 DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation Zhihang Yuan et.al. 2502.11897 link
2025-02-17 CAMEL: Continuous Action Masking Enabled by Large Language Models for Reinforcement Learning Yanxiao Zhao et.al. 2502.11896 null
2025-02-17 Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? Jacob Nielsen et.al. 2502.11895 null
2025-02-17 Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration Shao Zhang et.al. 2502.11882 link
2025-02-17 Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models Hyunwoo Kim et.al. 2502.11881 null
2025-02-17 Bitnet.cpp: Efficient Edge Inference for Ternary LLMs Jinheng Wang et.al. 2502.11880 link
2025-02-17 JoLT: Joint Probabilistic Predictions on Tabular Data Using LLMs Aliaksandra Shysheya et.al. 2502.11877 link
2025-02-17 FedEAT: A Robustness Optimization Framework for Federated LLMs Yahao Pang et.al. 2502.11863 null
2025-02-17 Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu Renhao Pei et.al. 2502.11862 null
2025-02-17 Exploring Large Language Models in Healthcare: Insights into Corpora Sources, Customization Strategies, and Evaluation Metrics Shuqi Yang et.al. 2502.11861 null
2025-02-17 StructTransform: A Scalable Attack Surface for Safety-Aligned Large Language Models Shehel Yoosuf et.al. 2502.11853 link
2025-02-17 BaxBench: Can LLMs Generate Correct and Secure Backends? Mark Vero et.al. 2502.11844 null
2025-02-17 Can LLM Agents Maintain a Persona in Discourse? Pranav Bhandari et.al. 2502.11843 null
2025-02-17 Model Generalization on Text Attribute Graphs: Principles with Large Language Models Haoyu Wang et.al. 2502.11836 link
2025-02-17 HAAN: A Holistic Approach for Accelerating Normalization Operations in Large Language Models Tianfan Peng et.al. 2502.11832 null
2025-02-17 Intuitive physics understanding emerges from self-supervised pretraining on natural videos Quentin Garrido et.al. 2502.11831 link
2025-02-17 Text Classification in the LLM Era - Where do we stand? Sowmya Vajjala et.al. 2502.11830 null
2025-02-17 Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities Hanbin Wang et.al. 2502.11829 link
2025-02-17 M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis Chengyan Wu et.al. 2502.11824 link
2025-02-17 Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis Xu Wang et.al. 2502.11812 null
2025-02-17 FineFilter: A Fine-grained Noise Filtering Mechanism for Retrieval-Augmented Large Language Models Qianchi Zhang et.al. 2502.11811 null
2025-02-17 Exploring Translation Mechanism of Large Language Models Hongbin Zhang et.al. 2502.11806 null
2025-02-17 Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning Peiying Yu et.al. 2502.11799 null
2025-02-17 Personality Editing for Language Models through Relevant Knowledge Editing Seojin Hwang et.al. 2502.11789 null
2025-02-17 Efficient Response Generation Method Selection for Fine-Tuning Large Language Models Xuan Ren et.al. 2502.11779 null
2025-02-17 video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model Guangzhi Sun et.al. 2502.11775 null
2025-02-17 The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It Leonardo Bertolazzi et.al. 2502.11771 link
2025-02-17 Cognitive-Aligned Document Selection for Retrieval-augmented Generation Bingyu Wan et.al. 2502.11770 null
2025-02-17 From Selection to Generation: A Survey of LLM-based Active Learning Yu Xia et.al. 2502.11767 null
2025-02-17 Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation Zengkui Sun et.al. 2502.11766 link
2025-02-17 HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims Michiel van der Meer et.al. 2502.11753 null
2025-02-17 Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning Yuqi Pang et.al. 2502.11751 link
2025-02-17 ILIAS: Instance-Level Image retrieval At Scale Giorgos Kordopatis-Zilos et.al. 2502.11748 null
2025-02-17 SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL Shuai Lyu et.al. 2502.11741 link
2025-02-17 ReviewEval: An Evaluation Framework for AI-Generated Reviews Chavvi Kirtani et.al. 2502.11736 null
2025-02-17 Plant in Cupboard, Orange on Table, Book on Shelf. Benchmarking Practical Reasoning and Situation Modelling in a Text-Simulated Situated Environment Jonathan Jordan et.al. 2502.11733 null
2025-02-17 Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption Alireza Nik et.al. 2502.11723 null
2025-02-17 Enhancing Recommendation Explanations through User-Centric Refinement Jingsen Zhang et.al. 2502.11721 null
2025-02-17 Can you pass that tool?: Implications of Indirect Speech in Physical Human-Robot Collaboration Yan Zhang et.al. 2502.11720 null
2025-02-17 Component-aware Unsupervised Logical Anomaly Generation for Industrial Anomaly Detection Xuan Tong et.al. 2502.11712 null
2025-02-17 Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models Sherzod Hakimov et.al. 2502.11707 null
2025-02-17 LLM Agents Making Agent Tools Georg Wölflein et.al. 2502.11705 null
2025-02-17 CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation Guangya Yu et.al. 2502.11703 null
2025-02-17 MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow Hanzhuo Huang et.al. 2502.11697 null
2025-02-17 Improve LLM-as-a-Judge Ability as a General Ability Jiachen Yu et.al. 2502.11689 null
2025-02-17 MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task Yuchen Yan et.al. 2502.11684 null
2025-02-17 RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars Yuncheng Hua et.al. 2502.11681 link
2025-02-17 Exploring LLM-based Student Simulation for Metacognitive Cultivation Haoxuan Li et.al. 2502.11678 null
2025-02-17 Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception Shiyu Ni et.al. 2502.11677 null
2025-02-17 Diversity-Oriented Data Augmentation with Large Language Models Zaitian Wang et.al. 2502.11671 null
2025-02-17 VRoPE: Rotary Position Embedding for Video Large Language Models Zikang Liu et.al. 2502.11664 link
2025-02-17 An Innovative Brain-Computer Interface Interaction System Based on the Large Language Model Jing Jina et.al. 2502.11659 null
2025-02-17 Competing LLM Agents in a Non-Cooperative Game of Opinion Polarisation Amin Qasmi et.al. 2502.11649 null
2025-02-17 DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing Yi Wang et.al. 2502.11647 null
2025-02-17 Hyperspherical Energy Transformer with Recurrent Depth Yunzhe Hu et.al. 2502.11646 null
2025-02-17 Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI Yuxia Wang et.al. 2502.11614 null
2025-02-17 Maximum Entropy Reinforcement Learning with Diffusion Policy Xiaoyi Dong et.al. 2502.11612 link
2025-02-17 Accuracy Assessment of OpenAlex and Clarivate Scholar ID with an LLM-Assisted Benchmark Renyu Zhao et.al. 2502.11610 null
2025-02-17 GraphThought: Graph Combinatorial Optimization with Thought Generation Zixiao Huang et.al. 2502.11607 null
2025-02-14 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment Yi-Fan Zhang et.al. 2502.10391 null
2025-02-14 Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction WonJin Yoon et.al. 2502.10388 null
2025-02-14 Robustness tests for biomedical foundation models should tailor to specification R. Patrick Xian et.al. 2502.10374 link
2025-02-14 AffinityFlow: Guided Flows for Antibody Affinity Maturation Can Chen et.al. 2502.10365 null
2025-02-14 Enhancing Multilingual LLM Pretraining with Model-Based Data Selection Bettina Messmer et.al. 2502.10361 null
2025-02-14 Dimension-free Score Matching and Time Bootstrapping for Diffusion Models Syamantak Kumar et.al. 2502.10354 null
2025-02-14 Organize the Web: Constructing Domains Enhances Pre-Training Data Curation Alexander Wettig et.al. 2502.10341 null
2025-02-14 Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering Nick Ferguson et.al. 2502.10338 null
2025-02-14 Generalised Parallel Tempering: Flexible Replica Exchange via Flows and Diffusions Leo Zhang et.al. 2502.10328 null
2025-02-14 LLM-Powered Preference Elicitation in Combinatorial Assignment Ermis Soumalias et.al. 2502.10308 null
2025-02-14 SPIRIT: Short-term Prediction of solar IRradIance for zero-shot Transfer learning using Foundation Models Aditya Mishra et.al. 2502.10307 null
2025-02-14 Open-Source AI-Powered Optimization in Scalene: Advancing Python Performance Profiling with DeepSeek-R1 and LLaMA 3.2 Saem Hasan et.al. 2502.10299 null
2025-02-14 Probabilistic Super-Resolution for High-Fidelity Physical System Simulations with Uncertainty Quantification Pengyu Zhang et.al. 2502.10280 null
2025-02-14 Are Large Language Models the future crowd workers of Linguistics? Iris Ferrazzo et.al. 2502.10266 null
2025-02-14 Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers Aivin V. Solatorio et.al. 2502.10263 null
2025-02-14 VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models Gokul Karthik Kumar et.al. 2502.10250 null
2025-02-14 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Guoqing Ma et.al. 2502.10248 link
2025-02-14 Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices Mohamed Aboelenien Ahmed et.al. 2502.10239 null
2025-02-14 Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control Thomas Jiralerspong et.al. 2502.10236 null
2025-02-14 AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting Abdelhakim Benechehab et.al. 2502.10235 link
2025-02-14 Do Large Language Models Reason Causally Like Us? Even Better? Hanna M. Dettki et.al. 2502.10215 null
2025-02-14 Can Post-Training Quantization Benefit from an Additional QLoRA Integration? Xiliang Zhu et.al. 2502.10202 null
2025-02-14 Prediction hubs are context-informed frequent tokens in LLMs Beatrix M. G. Nielsen et.al. 2502.10201 null
2025-02-14 MathConstruct: Challenging LLM Reasoning with Constructive Proofs Mislav Balunović et.al. 2502.10197 null
2025-02-14 Translating Common Security Assertions Across Processor Designs: A RISC-V Case Study Sharjeel Imtiaz et.al. 2502.10194 null
2025-02-14 VideoDiff: Human-AI Video Co-Creation with Alternatives Mina Huh et.al. 2502.10190 null
2025-02-14 Modeling biases in binary decision-making within the generalized nonlinear q-voter model Maciej Doniec et.al. 2502.10172 null
2025-02-14 Video Soundtrack Generation by Aligning Emotions and Temporal Boundaries Serkan Sulun et.al. 2502.10154 null
2025-02-14 Semantica: Decentralized Search using a LLM-Guided Semantic Tree Overlay Petru Neague et.al. 2502.10151 link
2025-02-14 Cooperative Multi-Agent Planning with Adaptive Skill Synthesis Zhiyuan Li et.al. 2502.10148 null
2025-02-14 Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages Daniil Gurgurov et.al. 2502.10140 null
2025-02-14 Physics-Informed Generative Modeling of Wireless Channels Benedikt Böck et.al. 2502.10137 null
2025-02-14 ScamFerret: Detecting Scam Websites Autonomously with Large Language Models Hiroki Nakano et.al. 2502.10110 link
2025-02-14 NeuroXVocal: Detection and Explanation of Alzheimer's Disease through Non-invasive Analysis of Picture-prompted Speech Nikolaos Ntampakis et.al. 2502.10108 null
2025-02-14 A novel approach to data generation in generative model JaeHong Kim et.al. 2502.10092 null
2025-02-14 Enhancing Patient Acceptance of Robotic Ultrasound through Conversational Virtual Agent and Immersive Visualizations Tianyu Song et.al. 2502.10088 null
2025-02-14 DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery Utkarsh Mall et.al. 2502.10060 null
2025-02-14 A Generalized Modeling Approach to Liquid-driven Ballooning Membranes Mirroyal Ismayilov et.al. 2502.10057 null
2025-02-14 ORI: O Routing Intelligence Ahmad Shadid et.al. 2502.10051 null
2025-02-14 A Survey on LLM-powered Agents for Recommender Systems Qiyao Peng et.al. 2502.10050 null
2025-02-14 ViRAC: A Vision-Reasoning Agent Head Movement Control Framework in Arbitrary Virtual Environments Juyeong Hwang et.al. 2502.10046 null
2025-02-14 POI-Enhancer: An LLM-based Semantic Enhancement Framework for POI Representation Learning Jiawei Cheng et.al. 2502.10038 null
2025-02-14 Probabilistic Lexical Manifold Construction in Large Language Models via Hierarchical Vector Field Interpolation Clive Pendleton et.al. 2502.10013 null
2025-02-14 ChatGPT and Deepseek: Can They Predict the Stock Market and Macroeconomy? Jian Chen et.al. 2502.10008 null
2025-02-14 EmbBERT-Q: Breaking Memory Barriers in Embedded NLP Riccardo Bravin et.al. 2502.10001 null
2025-02-14 Decision Information Meets Large Language Models: The Future of Explainable Operations Research Yansen Zhang et.al. 2502.09994 null
2025-02-14 Large Language Diffusion Models Shen Nie et.al. 2502.09992 null
2025-02-14 V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models Hsu-kuang Chiu et.al. 2502.09980 null
2025-02-14 LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing Kuan Li et.al. 2502.09977 null
2025-02-14 Has My System Prompt Been Used? Large Language Model Prompt Membership Inference Roman Levin et.al. 2502.09974 null
2025-02-14 KGGen: Extracting Knowledge Graphs from Plain Text with Language Models Belinda Mo et.al. 2502.09956 null
2025-02-14 A Preliminary Exploration with GPT-4o Voice Mode Yu-Xiang Lin et.al. 2502.09940 null
2025-02-14 Precise Parameter Localization for Textual Generation in Diffusion Models Łukasz Staniszewski et.al. 2502.09935 null
2025-02-14 MIR-Bench: Benchmarking LLM's Long-Context Intelligence via Many-Shot In-Context Inductive Reasoning Kai Yan et.al. 2502.09933 null
2025-02-14 Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence Granite Vision Team et.al. 2502.09927 null
2025-02-14 λScale: Enabling Fast Scaling for Serverless Large Language Model Inference Minchen Yu et.al. 2502.09922 null
2025-02-14 INF^2: High-Throughput Generative Inference of Large Language Models using Near-Storage Processing Hongsun Jang et.al. 2502.09921 null
2025-02-14 AutoS$^2$earch: Unlocking the Reasoning Potential of Large Models for Web-based Source Search Zhengqiu Zhu et.al. 2502.09913 null
2025-02-14 Insect-Foundation: A Foundation Model and Large Multimodal Dataset for Vision-Language Insect Understanding Thanh-Dat Truong et.al. 2502.09906 null
2025-02-14 The Ann Arbor Architecture for Agent-Oriented Programming Wei Dong et.al. 2502.09903 null
2025-02-14 Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction to Generation and Beyond Kehan Guo et.al. 2502.09897 null
2025-02-14 ChatIoT: Large Language Model-based Security Assistant for Internet of Things with Retrieval-Augmented Generation Ye Dong et.al. 2502.09896 null
2025-02-14 ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation Shu Wang et.al. 2502.09891 null
2025-02-14 Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos Weirui Ye et.al. 2502.09886 null
2025-02-14 Solvable Dynamics of Self-Supervised Word Embeddings and the Emergence of Analogical Reasoning Dhruva Karkada et.al. 2502.09863 null
2025-02-14 Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge Naoyuki Kamo et.al. 2502.09859 null
2025-02-14 Automated Hypothesis Validation with Agentic Sequential Falsifications Kexin Huang et.al. 2502.09858 link
2025-02-14 Port-LLM: A Port Prediction Method for Fluid Antenna based on Large Language Models Yali Zhang et.al. 2502.09857 null
2025-02-14 Efficient Multitask Learning in Small Language Models Through Upside-Down Reinforcement Learning Yu-Chen Lin et.al. 2502.09854 null
2025-02-14 HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation Tianwei Lin et.al. 2502.09838 link
2025-02-13 A Solver-Aided Hierarchical Language for LLM-Driven CAD Design Benjamin T. Jones et.al. 2502.09819 null
2025-02-13 Statistical Coherence Alignment for Large Language Model Representation Learning Through Tensor Field Convergence Jonathan Gale et.al. 2502.09815 null
2025-02-13 INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages Hao Yu et.al. 2502.09814 null
2025-02-13 AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration Jizhou Chen et.al. 2502.09809 null
2025-02-13 Unit Testing Past vs. Present: Examining LLMs' Impact on Defect Detection and Efficiency Rudolf Ramler et.al. 2502.09801 null
2025-02-13 Co-designing Large Language Model Tools for Project-Based Learning with K12 Educators Prerna Ravi et.al. 2502.09799 null
2025-02-13 A Survey on LLM-based News Recommender Systems Rongyao Wang et.al. 2502.09797 null
2025-02-13 TableTalk: Scaffolding Spreadsheet Development with a Language Agent Jenny T. Liang et.al. 2502.09787 null
2025-02-13 Improving Acoustic Side-Channel Attacks on Keyboards Using Transformers and Large Language Models Jin Hyun Park et.al. 2502.09782 null
2025-02-13 CellFlow: Simulating Cellular Morphology Changes via Flow Matching Yuhui Zhang et.al. 2502.09775 null
2025-02-13 Non-Markovian Discrete Diffusion with Causal Language Models Yangtian Zhang et.al. 2502.09767 null
2025-02-13 LLM-Generated Microservice Implementations from RESTful API Definitions Saurabh Chauhan et.al. 2502.09766 link
2025-02-13 Enhancing Jailbreak Attacks via Compliance-Refusal-Based Initialization Amit Levi et.al. 2502.09755 null
2025-02-13 Vote-Tree-Planner: Optimizing Execution Order in LLM-based Task Planning Pipeline via Voting Chaoyuan Zhang et.al. 2502.09749 null
2025-02-13 The Widespread Adoption of Large Language Model-Assisted Writing Across Society Weixin Liang et.al. 2502.09747 null
2025-02-13 Fine-Tuning Foundation Models with Federated Learning for Privacy Preserving Medical Time Series Forecasting Mahad Ali et.al. 2502.09744 null
2025-02-13 FoNE: Precise Single-Token Number Embeddings via Fourier Features Tianyi Zhou et.al. 2502.09741 null
2025-02-13 Making Them a Malicious Database: Exploiting Query Code to Jailbreak Aligned Large Language Models Qingsong Zou et.al. 2502.09723 link
2025-02-13 NestQuant: Nested Lattice Quantization for Matrix Products and LLMs Semyon Savkin et.al. 2502.09720 null
2025-02-13 Genetic Data Governance in Crisis: Policy Recommendations for Safeguarding Privacy and Preventing Discrimination Vivek Ramanan et.al. 2502.09716 null
2025-02-13 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Dongzhi Jiang et.al. 2502.09621 null
2025-02-13 Exploring the Potential of Encoder-free Architectures in 3D LMMs Yiwen Tang et.al. 2502.09620 link
2025-02-13 Designing a Conditional Prior Distribution for Flow-Based Generative Models Noam Issachar et.al. 2502.09611 null
2025-02-14 Score-of-Mixture Training: Training One-Step Generative Models Made Simple via Score Estimation of Mixture Distributions Tejas Jayashankar et.al. 2502.09609 null
2025-02-13 Human-LLM Coevolution: Evidence from Academic Writing Mingmeng Geng et.al. 2502.09606 null
2025-02-13 SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Yung-Sung Chuang et.al. 2502.09604 link
2025-02-13 Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs Siyan Zhao et.al. 2502.09597 link
2025-02-13 KIMAs: A Configurable Knowledge Integrated Multi-Agent System Zitao Li et.al. 2502.09596 null
2025-02-13 Logical forms complement probability in understanding language model (and human) performance Yixuan Wang et.al. 2502.09589 null
2025-02-13 Rolling Ahead Diffusion for Traffic Scene Simulation Yunpeng Liu et.al. 2502.09587 null
2025-02-13 Polymind: Parallel Visual Diagramming with Large Language Models to Support Prewriting Through Microtasks Qian Wan et.al. 2502.09577 null
2025-02-13 Zero-shot generation of synthetic neurosurgical data with large language models Austin A. Barr et.al. 2502.09566 link
2025-02-13 MDCrow: Automating Molecular Dynamics Workflows with Large Language Models Quintina Campbell et.al. 2502.09565 link
2025-02-13 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents Rui Yang et.al. 2502.09560 null
2025-02-13 Explainable AI-assisted Optimization for Feynman Integral Reduction Zhuo-Yang Song et.al. 2502.09544 null
2025-02-13 Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages Shreyan Biswas et.al. 2502.09532 null
2025-02-13 SQ-GAN: Semantic Image Communications Using Masked Vector Quantization Francesco Pezone et.al. 2502.09520 link
2025-02-13 Diffusion Models for Molecules: A Survey of Methods and Tasks Liang Wang et.al. 2502.09511 link
2025-02-13 EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling Theodoros Kouzelis et.al. 2502.09509 null
2025-02-13 Improve LLM-based Automatic Essay Scoring with Linguistic Features Zhaoyi Joey Hou et.al. 2502.09497 null
2025-02-13 Foundation Neural-Network Quantum States Riccardo Rende et.al. 2502.09488 null
2025-02-13 Objective quantification of mood states using large language models Jakub Onysk et.al. 2502.09487 null
2025-02-13 DiffRenderGAN: Addressing Training Data Scarcity in Deep Segmentation Networks for Quantitative Nanomaterial Analysis through Differentiable Rendering and Generative Modelling Dennis Possart et.al. 2502.09477 null
2025-02-13 Transformer-Enhanced Variational Autoencoder for Crystal Structure Prediction Ziyi Chen et.al. 2502.09423 null
2025-02-13 ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation Rotem Shalev-Arkushin et.al. 2502.09411 null
2025-02-13 SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Daniel Fleischer et.al. 2502.09390 link
2025-02-13 Truth Knows No Language: Evaluating Truthfulness Beyond English Blanca Calvo Figueras et.al. 2502.09387 null
2025-02-13 APT-LLM: Embedding-Based Anomaly Detection of Cyber Advanced Persistent Threats Using Large Language Models Sidahmed Benabderrahmane et.al. 2502.09385 null
2025-02-13 LoRA Training Provably Converges to a Low-Rank Global Minimum or It Fails Loudly (But it Probably Won't Fail) Junsu Kim et.al. 2502.09376 null
2025-02-13 Inverse problems with experiment-guided AlphaFold Advaith Maddipatla et.al. 2502.09372 null
2025-02-13 Language Agents as Digital Representatives in Collective Decision-Making Daniel Jarrett et.al. 2502.09369 null
2025-02-13 Machine learning for modelling unstructured grid data in computational physics: a review Sibo Cheng et.al. 2502.09346 null
2025-02-13 ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments Youhe Jiang et.al. 2502.09334 null
2025-02-13 Beyond English: The Impact of Prompt Translation Strategies across Languages and Tasks in Multilingual LLMs Itai Mondshine et.al. 2502.09331 null
2025-02-13 Copilot Arena: A Platform for Code LLM Evaluation in the Wild Wayne Chi et.al. 2502.09328 null
2025-02-13 A Benchmark for Crime Surveillance Video Analysis with Large Models Haoran Chen et.al. 2502.09325 null
2025-02-13 A Judge-free LLM Open-ended Generation Benchmark Based on the Distributional Hypothesis Kentaro Imajo et.al. 2502.09316 link
2025-02-13 When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models Samuel Joseph Amouyal et.al. 2502.09307 null
2025-02-13 Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling Paula Cordero-Encinar et.al. 2502.09306 null
2025-02-13 KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG Yiqian Huang et.al. 2502.09304 link
2025-02-13 When do neural networks learn world models? Tianren Zhang et.al. 2502.09297 null
2025-02-13 SparQLe: Speech Queries to Text Translation Through LLMs Amirbek Djanibekov et.al. 2502.09284 null
2025-02-13 GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation Hongyin Zhang et.al. 2502.09268 null
2025-02-13 AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection Hezhe Qiao et.al. 2502.09254 null
2025-02-13 From large language models to multimodal AI: A scoping review on the potential of generative AI in medicine Lukas Buess et.al. 2502.09242 null
2025-02-13 OpenBench: A New Benchmark and Baseline for Semantic Navigation in Smart Logistics Junhui Wang et.al. 2502.09238 null
2025-02-13 Reliable Conversational Agents under ASP Control that Understand Natural Language Yankai Zeng et.al. 2502.09237 null
2025-02-13 Data2Concept2Text: An Explainable Multilingual Framework for Data Analysis Narration Flavio Bertini et.al. 2502.09218 null
2025-02-13 LP-LM: No Hallucinations in Question Answering with Logic Programming Katherine Wu et.al. 2502.09212 link
2025-02-13 Visual Graph Question Answering with ASP and LLMs for Language Parsing Jakob Johannes Bauer et.al. 2502.09211 null
2025-02-13 On LLM-generated Logic Programs and their Inference Execution Methods Paul Tarau et.al. 2502.09209 null
2025-02-13 Logical Lease Litigation: Prolog and LLMs for Rental Law Compliance in New York Sanskar Sehgal et.al. 2502.09204 null
2025-02-13 XAInomaly: Explainable and Interpretable Deep Contractive Autoencoder for O-RAN Traffic Anomaly Detection Osman Tugay Basaran et.al. 2502.09194 null
2025-02-13 Thinking beyond the anthropomorphic paradigm benefits LLM research Lujain Ibrahim et.al. 2502.09192 null
2025-02-13 Matina: A Large-Scale 73B Token Persian Text Corpus Sara Bourbour Hosseinbeigi et.al. 2502.09188 null
2025-02-13 RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation Changzhi Zhou et.al. 2502.09183 null
2025-02-13 FLAME: Flexible LLM-Assisted Moderation Engine Ivan Bakulin et.al. 2502.09175 null
2025-02-13 Two-Stage Representation Learning for Analyzing Movement Behavior Dynamics in People Living with Dementia Jin Cui et.al. 2502.09173 null
2025-02-13 Improving TCM Question Answering through Tree-Organized Self-Reflective Retrieval with LLMs Chang Liu et.al. 2502.09156 null
2025-02-13 Finite-Time Analysis of Discrete-Time Stochastic Interpolants Yuhao Liu et.al. 2502.09130 null
2025-02-13 One-shot Federated Learning Methods: A Practical Guide Xiang Liu et.al. 2502.09104 null
2025-02-13 Bridging the Gap Between LLMs and Human Intentions: Progresses and Challenges in Instruction Understanding, Intention Reasoning, and Reliable Generation Zongyu Chang et.al. 2502.09101 null
2025-02-13 Logical Reasoning in Large Language Models: A Survey Hanmeng Liu et.al. 2502.09100 null
2025-02-13 Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking Greta Warren et.al. 2502.09083 null
2025-02-13 CoSER: Coordinating LLM-Based Persona Simulation of Established Roles Xintao Wang et.al. 2502.09082 link
2025-02-13 Enhancing RAG with Active Learning on Conversation Records: Reject Incapables and Answer Capables Xuzhao Geng et.al. 2502.09073 null
2025-02-13 Unleashing the Power of Large Language Model for Denoising Recommendation Shuyao Wang et.al. 2502.09058 null
2025-02-13 An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging Kunat Pipatanakul et.al. 2502.09056 null
2025-02-13 Game Theory Meets Large Language Models: A Systematic Survey Haoran Sun et.al. 2502.09053 null
2025-02-13 Typhoon T1: An Open Thai Reasoning Model Pittawat Taveekitworachai et.al. 2502.09042 null
2025-02-13 Implementation of a Fuzzy Relational Database. Case Study: Chilean Cardboard Industry in the Maule Region Leoncio Jimenez et.al. 2502.09035 null
2025-02-13 MTDP: Modulated Transformer Diffusion Policy Model Qianhao Wang et.al. 2502.09029 null
2025-02-13 EventSTR: A Benchmark Dataset and Baselines for Event Stream based Scene Text Recognition Xiao Wang et.al. 2502.09020 link
2025-02-13 Diversity Enhances an LLM's Performance in RAG and Long-context Task Zhchao Wang et.al. 2502.09017 null
2025-02-13 Hope vs. Hate: Understanding User Interactions with LGBTQ+ News Content in Mainstream US News Media through the Lens of Hope Speech Jonathan Pofcher et.al. 2502.09004 null
2025-02-13 RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models Quan Wei et.al. 2502.09003 null
2025-02-13 End-to-End triplet loss based fine-tuning for network embedding in effective PII detection Rishika Kohli et.al. 2502.09002 null
2025-02-13 Task Generalization With AutoRegressive Compositional Structure: Can Learning From $\d$ Tasks Generalize to $\d^{T}$ Tasks? Amirhesam Abedsoltan et.al. 2502.08991 null
2025-02-13 Prophet Inequalities for Bandits, Cabinets, and DAGs Robin Bowers et.al. 2502.08976 null
2025-02-13 Medicine on the Edge: Comparative Performance Analysis of On-Device LLMs for Clinical Reasoning Leon Nissen et.al. 2502.08954 link
2025-02-13 Structured Convergence in Large Language Model Representations via Hierarchical Latent Space Folding Fenella Harcourt et.al. 2502.08947 null
2025-02-13 Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis Wenbo Zhang et.al. 2502.08943 null
2025-02-13 Escaping Collapse: The Strength of Weak Data for Large Language Model Training Kareem Amin et.al. 2502.08924 null
2025-02-13 Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models Xin Zhou et.al. 2502.08922 null
2025-02-13 Detecting Malicious Concepts Without Image Generation in AIGC Kun Xu et.al. 2502.08921 null
2025-02-13 InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Heejun Lee et.al. 2502.08910 null
2025-02-13 Towards Automated Fact-Checking of Real-World Claims: Exploring Task Formulation and Assessment with LLMs Premtim Sahitaj et.al. 2502.08909 null
2025-02-13 Reinforced Large Language Model is a formal theorem prover Zhiling Luo et.al. 2502.08908 link
2025-02-13 DiffoRA: Enabling Parameter-Efficient LLM Fine-Tuning via Differential Low-Rank Matrix Adaptation Tangyu Jiang et.al. 2502.08905 null
2025-02-13 MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training Xinxin You et.al. 2502.08904 null
2025-02-13 3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning Guoqin Tang et.al. 2502.08903 null
2025-02-13 Communication is All You Need: Persuasion Dataset Construction via Multi-LLM Communication Weicheng Ma et.al. 2502.08896 null
2025-02-13 ShapeLib: designing a library of procedural 3D shape abstractions with Large Language Models R. Kenny Jones et.al. 2502.08884 null
2025-02-13 Utilizing Pre-trained and Large Language Models for 10-K Items Segmentation Hsin-Min Lu et.al. 2502.08875 null
2025-02-13 Harnessing Vision Models for Time Series Analysis: A Survey Jingchao Ni et.al. 2502.08869 link
2025-02-13 A Systematic Evaluation of Generative Models on Tabular Transportation Data Chengen Wang et.al. 2502.08856 link
2025-02-12 Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation Mohammad Mahdi Abootorabi et.al. 2502.08826 link
2025-02-12 DejAIvu: Identifying and Explaining AI Art on the Web in Real-Time with Saliency Maps Jocelyn Dzuong et.al. 2502.08821 link
2025-02-12 Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model Emre Can Acikgoz et.al. 2502.08820 null
2025-02-12 Lexical Manifold Reconfiguration in Large Language Models: A Novel Architectural Approach for Contextual Modulation Koinis Vassilis et.al. 2502.08818 null
2025-02-12 Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples Andrianos Michail et.al. 2502.08638 null
2025-02-12 Ensemble based approach to quantifying uncertainty of LLM based classifications Srijith Rajamohan et.al. 2502.08631 null
2025-02-12 Continuous Cardiac Arrest Prediction in ICU using PPG Foundation Model Saurabh Kataria et.al. 2502.08612 null
2025-02-12 Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors Vishwanath Pratap Singh et.al. 2502.08587 null
2025-02-12 Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks Ang Li et.al. 2502.08586 null
2025-02-12 Statistically validated projection of bipartite signed networks Anna Gallo et.al. 2502.08567 null
2025-02-12 QA-Expand: Multi-Question Answer Generation for Enhanced Query Expansion in Information Retrieval Wonduk Seo et.al. 2502.08557 null
2025-02-12 Human-Centric Foundation Models: Perception, Generation and Agentic Modeling Shixiang Tang et.al. 2502.08556 link
2025-02-12 Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies Sunnie S. Y. Kim et.al. 2502.08554 null
2025-02-12 LLMs can implicitly learn from mistakes in-context Lisa Alazraki et.al. 2502.08550 null
2025-02-12 LLM Pretraining with Continuous Concepts Jihoon Tack et.al. 2502.08524 null
2025-02-12 FedMHO: Heterogeneous One-Shot Federated Learning Towards Resource-Constrained Edge Devices Dezhong Yao et.al. 2502.08518 link
2025-02-12 The Paradox of Stochasticity: Limited Creativity and Computational Decoupling in Temperature-Varied LLM Outputs of Structured Fictional Data Evgenii Evstafev et.al. 2502.08515 null
2025-02-12 Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation Mahnaz Koupaee et.al. 2502.08514 link
2025-02-12 Measuring Diversity in Synthetic Datasets Yuchang Zhu et.al. 2502.08512 link
2025-02-12 Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction Wei Li et.al. 2502.08507 link
2025-02-12 Salamandra Technical Report Aitor Gonzalez-Agirre et.al. 2502.08489 link
2025-02-12 One-Shot Federated Learning with Classifier-Free Diffusion Models Obaidullah Zaland et.al. 2502.08488 null
2025-02-12 Computed fingertip touch for the instrumental control of musical sound with an excursion on the computed retinal afterimage Staas de Jong et.al. 2502.08471 null
2025-02-12 mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data Haonan Chen et.al. 2502.08468 link
2025-02-12 From Haystack to Needle: Label Space Reduction for Zero-shot Classification Nathan Vandemoortele et.al. 2502.08436 null
2025-02-12 IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance Paul Röttger et.al. 2502.08395 null
2025-02-12 ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification Jiangbo Shi et.al. 2502.08391 link
2025-02-12 Top-Theta Attention: Sparsifying Transformers by Compensated Thresholding Konstantin Berestizshevsky et.al. 2502.08363 link
2025-02-12 Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG Kushagra Bhushan et.al. 2502.08356 null
2025-02-12 Trustworthy GNNs with LLMs: A Systematic Review and Taxonomy Ruizhan Xue et.al. 2502.08353 null
2025-02-12 Graph Foundation Models for Recommendation: A Comprehensive Survey Bin Wu et.al. 2502.08346 null
2025-02-12 Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact Mohsin Bilal et.al. 2502.08333 null
2025-02-12 Modification and Generated-Text Detection: Achieving Dual Detection Capabilities for the Outputs of LLM by Watermark Yuhang Cai et.al. 2502.08332 null
2025-02-12 Contextual Compression Encoding for Large Language Models: A Novel Framework for Multi-Layered Parameter Space Pruning Barnaby Schmitt et.al. 2502.08323 null
2025-02-12 MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection Lubna Al-Henaki et.al. 2502.08319 null
2025-02-12 Word Synchronization Challenge: A Benchmark for Word Association Responses for LLMs Tanguy Cazalets et.al. 2502.08312 null
2025-02-12 Unlocking Scaling Law in Industrial Recommendation Systems with a Three-step Paradigm based Large User Model Bencheng Yan et.al. 2502.08309 null
2025-02-12 HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting Shibo Feng et.al. 2502.08302 link
2025-02-12 Compromising Honesty and Harmlessness in Language Models via Deception Attacks Laurène Vaugrante et.al. 2502.08301 null
2025-02-12 Improving Existing Optimization Algorithms with LLMs Camilo Chacón Sartori et.al. 2502.08298 null
2025-02-12 Redefining Simplicity: Benchmarking Large Language Models from Lexical to Document Simplification Jipeng Qiang et.al. 2502.08281 null
2025-02-12 MoLoRec: A Generalizable and Efficient Framework for LLM-Based Recommendation Min Hou et.al. 2502.08271 null
2025-02-12 Exploring the Potential of Large Language Models to Simulate Personality Maria Molchanova et.al. 2502.08265 link
2025-02-12 GenIAS: Generator for Instantiating Anomalies in time Series Zahra Zamanzadeh Darban et.al. 2502.08262 null
2025-02-12 FixDrive: Automatically Repairing Autonomous Vehicle Driving Behaviour for $0.08 per Violation Yang Sun et.al. 2502.08260 link
2025-02-12 Learning Human Skill Generators at Key-Step Levels Yilu Wu et.al. 2502.08234 null
2025-02-12 Flow-of-Action: SOP Enhanced LLM-Based Multi-Agent System for Root Cause Analysis Changhua Pei et.al. 2502.08224 null
2025-02-12 Memory Offloading for Large Language Model Inference with Latency SLO Guarantees Chenxiang Ma et.al. 2502.08182 null
2025-02-12 Enhancing LLM Character-Level Manipulation via Divide and Conquer Zhen Xiong et.al. 2502.08180 null
2025-02-12 ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation Ruobing Yao et.al. 2502.08178 null
2025-02-12 SycEval: Evaluating LLM Sycophancy Aaron Fanous et.al. 2502.08177 null
2025-02-12 Intention is All You Need: Refining Your Code from Your Intention Qi Guo et.al. 2502.08172 null
2025-02-12 Force Matching with Relativistic Constraints: A Physics-Inspired Approach to Stable and Efficient Generative Modeling Yang Cao et.al. 2502.08150 null
2025-02-12 ACCESS: A Benchmark for Abstract Causal Event Discovery and Reasoning Vy Vo et.al. 2502.08148 null
2025-02-12 Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers Siddharth Singh et.al. 2502.08145 null
2025-02-12 Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences Shanshan Han et.al. 2502.08142 null
2025-02-12 LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits Zikai Zhou et.al. 2502.08141 null
2025-02-12 Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models Sonam Gupta et.al. 2502.08130 null
2025-02-12 Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance Lingfei Qian et.al. 2502.08127 link
2025-02-12 HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses Sujeong Lee et.al. 2502.08109 null
2025-02-12 Large language models perpetuate bias in palliative care: development and analysis of the Palliative Care Adversarial Dataset (PCAD) Naomi Akhras et.al. 2502.08073 null
2025-02-12 On Mechanistic Circuits for Extractive Question-Answering Samyadeep Basu et.al. 2502.08059 null
2025-02-12 Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs Mohsinul Kabir et.al. 2502.08045 null
2025-02-12 Franken-Adapter: Cross-Lingual Adaptation of LLMs by Embedding Surgery Fan Jiang et.al. 2502.08037 null
2025-02-12 Stochastic Kinetics of Transcription: Analysis and Computation Yuntao Lu et.al. 2502.08028 null
2025-02-12 Contextual Subspace Manifold Projection for Structural Refinement of Large Language Model Representations Alistair Wren et.al. 2502.08026 null
2025-02-11 Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding Ziyao Wang et.al. 2502.08020 null
2025-02-11 The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models Artem Kirsanov et.al. 2502.08009 null
2025-02-11 An Interactive Framework for Implementing Privacy-Preserving Federated Learning: Experiments on Large Language Models Kasra Ahmadi et.al. 2502.08008 link
2025-02-11 Towards Training One-Step Diffusion Models Without Distillation Mingtian Zhang et.al. 2502.08005 null
2025-02-11 Universal Adversarial Attack on Aligned Multimodal LLMs Temurbek Rahmatullaev et.al. 2502.07987 null
2025-02-11 Deep Semantic Graph Learning via LLM based Node Enhancement Chuanqi Shi et.al. 2502.07982 null
2025-02-11 CIRCUIT: A Benchmark for Circuit Interpretation and Reasoning Capabilities of LLMs Lejla Skelic et.al. 2502.07980 null
2025-02-11 From Hazard Identification to Controller Design: Proactive and LLM-Supported Safety Engineering for ML-Powered Systems Yining Hong et.al. 2502.07974 null
2025-02-11 Caught in the Web of Words: Do LLMs Fall for Spin in Medical Literature? Hye Sun Yun et.al. 2502.07963 null
2025-02-11 Accelerating Scientific Research Through a Multi-LLM Framework Joaquin Ramirez-Medina et.al. 2502.07960 null
2025-02-11 Bridging HCI and AI Research for the Evaluation of Conversational SE Assistants Jonan Richards et.al. 2502.07956 null
2025-02-11 Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs Ruichen Zhang et.al. 2502.07942 null
2025-02-11 Discrete Markov Probabilistic Models Le-Tuyet-Nhi Pham et.al. 2502.07939 null
2025-02-11 Distributed Approach to Haskell Based Applications Refactoring with LLMs Based Multi-Agent Systems Shahbaz Siddeeq et.al. 2502.07928 null
2025-02-11 Sign Operator for Coping with Heavy-Tailed Noise: High Probability Convergence Bounds with Extensions to Distributed Optimization and Comparison Oracle Nikita Kornilov et.al. 2502.07923 null
2025-02-11 Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning Rujing Yao et.al. 2502.07912 link
2025-02-11 DeepSeek on a Trip: Inducing Targeted Visual Hallucinations via Representation Vulnerabilities Chashi Mahiul Islam et.al. 2502.07905 null
2025-02-11 Intelligent Legal Assistant: An Interactive Clarification System for Legal Question Answering Rujing Yao et.al. 2502.07904 null
2025-02-11 HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment Youhe Jiang et.al. 2502.07903 null
2025-02-11 TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Alex Jinpeng Wang et.al. 2502.07870 link
2025-02-11 TransMLA: Multi-head Latent Attention Is All You Need Fanxu Meng et.al. 2502.07864 link
2025-02-11 BalanceKV: KV Cache Compression through Discrepancy Theory Insu Han et.al. 2502.07861 null
2025-02-11 Pippo: High-Resolution Multi-View Humans from a Single Image Yash Kant et.al. 2502.07785 null
2025-02-11 DarwinLM: Evolutionary Structured Pruning of Large Language Models Shengkun Tang et.al. 2502.07780 null
2025-02-11 Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection Anirudh Sundara Rajan et.al. 2502.07778 null
2025-02-11 Auditing Prompt Caching in Language Model APIs Chenchen Gu et.al. 2502.07776 link
2025-02-11 Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming Azizjon Kobilov et.al. 2502.07772 null
2025-02-11 Great Power Brings Great Responsibility: Personalizing Conversational AI for Diverse Problem-Solvers Italo Santos et.al. 2502.07763 null
2025-02-11 Scalable Fingerprinting of Large Language Models Anshul Nasery et.al. 2502.07760 null
2025-02-11 Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension Wenbo Gong et.al. 2502.07752 null
2025-02-11 WHODUNIT: Evaluation benchmark for culprit detection in mystery stories Kshitij Gupta et.al. 2502.07747 link
2025-02-11 The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing Dirk Bergemann et.al. 2502.07736 null
2025-02-11 Revisiting Non-Acyclic GFlowNets in Discrete Environments Nikita Morozov et.al. 2502.07735 link
2025-02-11 Economics of Sourcing Human Data Sebastin Santy et.al. 2502.07732 null
2025-02-11 Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK Marcos Cramer et.al. 2502.07728 null
2025-02-11 Near-Optimal Sample Complexity in Reward-Free Kernel-Based Reinforcement Learning Aya Kayal et.al. 2502.07715 null
2025-02-11 Magic 1-For-1: Generating One Minute Video Clips within One Minute Hongwei Yi et.al. 2502.07701 link
2025-02-11 A Framework for LLM-powered Design Assistants Swaroop Panda et.al. 2502.07698 null
2025-02-11 Large Language Models as Proxies for Theories of Human Linguistic Cognition Imry Ziv et.al. 2502.07687 null
2025-02-11 Steering Protein Family Design through Profile Bayesian Flow Jingjing Gong et.al. 2502.07671 null
2025-02-11 Guiding Time-Varying Generative Models with Natural Gradients on Exponential Family Manifold Song Liu et.al. 2502.07650 null
2025-02-11 SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models Shihao Xia et.al. 2502.07644 null
2025-02-11 FoQA: A Faroese Question-Answering Dataset Annika Simonsen et.al. 2502.07642 null
2025-02-11 Distributional Instrumental Variable Method Anastasiia Holovchak et.al. 2502.07641 link
2025-02-11 Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving Yong Lin et.al. 2502.07640 link
2025-02-11 Consistency Training with Physical Constraints Che-Chia Chang et.al. 2502.07636 null
2025-02-11 Exploring Mobile Touch Interaction with Large Language Models Tim Zindulka et.al. 2502.07629 null
2025-02-11 Tractable Transformers for Flexible Conditional Generation Anji Liu et.al. 2502.07616 null
2025-02-11 Beyond Prompting: Time2Lang -- Bridging Time-Series Foundation Models and Large Language Models for Health Sensing Arvind Pillai et.al. 2502.07608 null
2025-02-11 Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models Jiacong Xu et.al. 2502.07601 null
2025-02-11 Towards spatial computing: recent advances in multimodal natural interaction for XR headsets Zhimin Wang et.al. 2502.07598 null
2025-02-11 SEMU: Singular Value Decomposition for Efficient Machine Unlearning Marcin Sendera et.al. 2502.07587 null
2025-02-11 Generative Modeling with Bayesian Sample Inference Marten Lienen et.al. 2502.07580 link
2025-02-11 PIM Is All You Need: A CXL-Enabled GPU-Free System for Large Language Model Inference Yufeng Gu et.al. 2502.07578 link
2025-02-11 Automated Capability Discovery via Model Self-Exploration Cong Lu et.al. 2502.07577 link
2025-02-11 JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation Shenyi Zhang et.al. 2502.07557 link
2025-02-11 O1 Embedder: Let Retrievers Think Before Action Ruin Yan et.al. 2502.07555 null
2025-02-11 Grammar Control in Dialogue Response Generation for Language Learning Chatbots Dominik Glandorf et.al. 2502.07544 link
2025-02-11 NatureLM: Deciphering the Language of Nature for Scientific Discovery Yingce Xia et.al. 2502.07527 null
2025-02-11 The Devil is in the Prompts: De-Identification Traces Enhance Memorization Risks in Synthetic Chest X-Ray Generation Raman Dutt et.al. 2502.07516 link
2025-02-11 Enhance-A-Video: Better Generated Video for Free Yang Luo et.al. 2502.07508 link
2025-02-11 Towards THz-based Obstacle Sensing: A Generative Radio Environment Awareness Framework Tianyu Hu et.al. 2502.07504 null
2025-02-11 Unified Graph Networks (UGN): A Deep Neural Framework for Solving Graph Problems Rudrajit Dawn et.al. 2502.07500 null
2025-02-11 LLM-Sketch: Enhancing Network Sketches with LLM Yuanpeng Li et.al. 2502.07495 link
2025-02-11 Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More Xialie Zhuang et.al. 2502.07490 link
2025-02-11 Improving Adaptive Moment Optimization via Preconditioner Diagonalization Son Nguyen et.al. 2502.07488 null
2025-02-11 ETimeline: An Extensive Timeline Generation Dataset based on Large Language Model Xiaochen Liu et.al. 2502.07474 null
2025-02-11 JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata Abhinaba Roy et.al. 2502.07461 link
2025-02-11 Logarithmic Regret for Online KL-Regularized Reinforcement Learning Heyang Zhao et.al. 2502.07460 null
2025-02-11 PerCul: A Story-Driven Cultural Evaluation of LLMs in Persian Erfan Moosavi Monazzah et.al. 2502.07459 null
2025-02-11 RusCode: Russian Cultural Code Benchmark for Text-to-Image Generation Viacheslav Vasilev et.al. 2502.07455 link
2025-02-11 Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon Nurit Cohen-Inger et.al. 2502.07445 link
2025-02-11 Towards a Foundation Model for Physics-Informed Neural Networks: Multi-PDE Learning with Active Sampling Keon Vin Park et.al. 2502.07425 null
2025-02-11 RomanLens: Latent Romanization and its role in Multilinguality in LLMs Alan Saji et.al. 2502.07424 null
2025-02-11 Entity Linking using LLMs for Automated Product Carbon Footprint Estimation Steffen Castle et.al. 2502.07418 null
2025-02-11 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering Sheng Zhou et.al. 2502.07411 link
2025-02-11 MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification Anh-Tien Nguyen et.al. 2502.07409 link
2025-02-11 On Iterative Evaluation and Enhancement of Code Quality Using GPT-4o Rundong Liu et.al. 2502.07399 link
2025-02-11 FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents Mostapha Benhenda et.al. 2502.07393 link
2025-02-11 LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters! Dacheng Li et.al. 2502.07374 link
2025-02-11 EvoFlow: Evolving Diverse Agentic Workflows On The Fly Guibin Zhang et.al. 2502.07373 null
2025-02-11 LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation Zican Dong et.al. 2502.07365 null
2025-02-11 Bridging the Evaluation Gap: Leveraging Large Language Models for Topic Model Evaluation Zhiyin Tan et.al. 2502.07352 link
2025-02-11 KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems Jusheng Zhang et.al. 2502.07350 null
2025-02-11 BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models Xu Huang et.al. 2502.07346 link
2025-02-11 Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering Shuzheng Si et.al. 2502.07340 link
2025-02-11 Music for All: Exploring Multicultural Representations in Music Generation Models (Camera Ready) Atharva Mehta et.al. 2502.07328 link
2025-02-11 Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos Haowen Gao et.al. 2502.07327 null
2025-02-11 MEMIT-Merge: Addressing MEMIT's Key-Value Conflicts in Same-Subject Batch Editing for LLMs Zilu Dong et.al. 2502.07322 null
2025-02-11 CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Junlong Li et.al. 2502.07316 link
2025-02-11 Prompt-Based Document Modifications In Ranking Competitions Niv Bardas et.al. 2502.07315 null
2025-02-11 CreAgent: Towards Long-Term Evaluation of Recommender System under Platform-Creator Information Asymmetry Xiaopeng Ye et.al. 2502.07307 link
2025-02-11 TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation Navid Rajabi et.al. 2502.07306 null
2025-02-11 Flow Matching for Collaborative Filtering Chengkai Liu et.al. 2502.07303 link
2025-02-11 Generation of Drug-Induced Cardiac Reactions towards Virtual Clinical Trials Qian Shao et.al. 2502.07297 null
2025-02-11 Small Language Model Makes an Effective Long Text Extractor Yelin Chen et.al. 2502.07286 link
2025-02-11 Articulate That Object Part (ATOP): 3D Part Articulation from Text and Motion Personalization Aditya Vora et.al. 2502.07278 null
2025-02-11 Cost-Efficient Continual Learning with Sufficient Exemplar Memory Dongkyu Cho et.al. 2502.07274 null
2025-02-11 GENERator: A Long-Context Generative Genomic Foundation Model Wei Wu et.al. 2502.07272 null
2025-02-11 When More is Less: Understanding Chain-of-Thought Length in LLMs Yuyang Wu et.al. 2502.07266 null
2025-02-11 DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization Xuefeng Liu et.al. 2502.07237 null
2025-02-11 A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models Yiming Chen et.al. 2502.07222 null
2025-02-11 MLLM4PUE: Toward Universal Embeddings in Computational Pathology through Multimodal LLMs Qifeng Zhou et.al. 2502.07221 null
2025-02-11 LUNAR: LLM Unlearning via Neural Activation Redirection William F. Shen et.al. 2502.07218 null
2025-02-11 Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion Xingpei Ma et.al. 2502.07203 null
2025-02-11 Provably Efficient RLHF Pipeline: A Unified View from Contextual Bandits Long-Fei Li et.al. 2502.07193 link
2025-02-11 Bag of Tricks for Inference-time Computation of LLM Reasoning Fan Liu et.al. 2502.07191 null
2025-02-11 A Large-Scale Benchmark for Vietnamese Sentence Paraphrases Sang Quang Nguyen et.al. 2502.07188 link
2025-02-11 Refine Knowledge of Large Language Models via Adaptive Contrastive Learning Yinghui Li et.al. 2502.07184 null
2025-02-11 Does Training on Synthetic Data Make Models Less Robust? Lingze Zhang et.al. 2502.07164 null
2025-02-11 Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning Feng Chen et.al. 2502.07154 link
2025-02-11 Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning Jiayuan Zhu et.al. 2502.07143 null
2025-02-11 Language-TPP: Integrating Temporal Point Processes with Language Models for Event Analysis Quyu Kong et.al. 2502.07139 null
2025-02-10 Cardiverse: Harnessing LLMs for Novel Card Game Prototyping Danrui Li et.al. 2502.07128 null
2025-02-10 Structural Reformation of Large Language Model Neuron Encapsulation for Divergent Information Aggregation Denis Bakushev et.al. 2502.07124 null
2025-02-10 Online Scheduling for LLM Inference with KV Cache Constraints Patrick Jaillet et.al. 2502.07115 null
2025-02-10 Generative Distribution Prediction: A Unified Approach to Multimodal Learning Xinyu Tian et.al. 2502.07090 null
2025-02-10 Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring Alex Heyman et.al. 2502.07087 link
2025-02-10 MPFBench: A Large Scale Dataset for SciML of Multi-Phase-Flows: Droplet and Bubble Dynamics Mehdi Shadkhah et.al. 2502.07080 null
2025-02-10 Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models Lujain Ibrahim et.al. 2502.07077 null
2025-02-10 IRepair: An Intent-Aware Approach to Repair Data-Driven Errors in Large Language Models Sayem Mohammad Imtiaz et.al. 2502.07072 null
2025-02-10 Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations Yong Cao et.al. 2502.07068 link
2025-02-10 Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT Dongyang Liu et.al. 2502.06782 null
2025-02-10 Enhancing Performance of Explainable AI Models with Constrained Concept Refinement Geyu Liang et.al. 2502.06775 null
2025-02-10 Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions Jaeyeon Kim et.al. 2502.06768 null
2025-02-10 Rationalization Models for Text-to-SQL Gaetano Rossiello et.al. 2502.06759 null
2025-02-10 Accelerating Data Processing and Benchmarking of AI Models for Pathology Andrew Zhang et.al. 2502.06750 link
2025-02-10 Gradient Multi-Normalization for Stateless and Scalable LLM Training Meyer Scetbon et.al. 2502.06742 null
2025-02-10 VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data Thomas Zeng et.al. 2502.06737 null
2025-02-10 Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists Bojia Zi et.al. 2502.06734 null
2025-02-10 Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining Daouda Sow et.al. 2502.06733 null
2025-02-10 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Runze Liu et.al. 2502.06703 link
2025-02-10 No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers Jiajun He et.al. 2502.06685 null
2025-02-10 EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks Michael Arbel et.al. 2502.06684 null
2025-02-10 Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations Rui Chen et.al. 2502.06669 null
2025-02-10 Automatic Evaluation of Healthcare LLMs Beyond Question-Answering Anna Arias-Duart et.al. 2502.06666 null
2025-02-10 Evaluation of Deep Audio Representations for Hearables Fabian Gröger et.al. 2502.06664 null
2025-02-10 EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models Xingrun Xing et.al. 2502.06663 null
2025-02-10 Unbiased Evaluation of Large Language Models from a Causal Perspective Meilin Chen et.al. 2502.06655 null
2025-02-10 In-Context Learning (and Unlearning) of Length Biases Stephanie Schoch et.al. 2502.06653 null
2025-02-10 Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A Anna Leschanowsky et.al. 2502.06652 null
2025-02-10 Automatic Annotation Augmentation Boosts Translation between Molecules and Natural Language Zhiqiang Zhong et.al. 2502.06634 null
2025-02-10 Combining Large Language Models with Static Analyzers for Code Review Generation Imen Jaoua et.al. 2502.06633 null
2025-02-10 Multi-Scale Feature Fusion with Image-Driven Spatial Integration for Left Atrium Segmentation from Cardiac MRI Images Bipasha Kundu et.al. 2502.06615 null
2025-02-10 A Large-scale AI-generated Image Inpainting Benchmark Paschalis Giakoumoglou et.al. 2502.06593 null
2025-02-10 Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training Yuchen Zhuang et.al. 2502.06589 null
2025-02-10 A Survey on Video Analytics in Cloud-Edge-Terminal Collaborative Systems Linxiao Gong et.al. 2502.06581 null
2025-02-10 LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM Zhi Zhou et.al. 2502.06572 link
2025-02-10 Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation Chengwen Qi et.al. 2502.06563 null
2025-02-10 Is API Access to LLMs Useful for Generating Private Synthetic Tabular Data? Marika Swanberg et.al. 2502.06555 null
2025-02-10 Efficient Scientific Full Text Classification: The Case of EICAT Impact Assessments Marc Felix Brinner et.al. 2502.06551 null
2025-02-10 Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning Jean Vassoyan et.al. 2502.06533 null
2025-02-10 Properties of Wasserstein Gradient Flows for the Sliced-Wasserstein Distance Christophe Vauthier et.al. 2502.06525 null
2025-02-10 GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing Jinhao Duan et.al. 2502.06494 null
2025-02-10 Recent Advances in Discrete Speech Tokens: A Review Yiwei Guo et.al. 2502.06490 null
2025-02-10 Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection Maximilian Spliethöver et.al. 2502.06487 null
2025-02-10 WyckoffDiff - A Generative Diffusion Model for Crystal Symmetry Filip Ekström Kelvinius et.al. 2502.06485 null
2025-02-10 UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths Weijia Mao et.al. 2502.06474 null
2025-02-10 KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment Yuxing Lu et.al. 2502.06472 link
2025-02-10 A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks Hieu Minh "Jord" Nguyen et.al. 2502.06470 null
2025-02-10 MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations Kaixuan Huang et.al. 2502.06453 null
2025-02-10 FEMBA: Efficient and Scalable EEG Analysis with a Bidirectional Mamba Foundation Model Anna Tegon et.al. 2502.06438 null
2025-02-10 Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single-Image Denoising Huaqiu Li et.al. 2502.06432 null
2025-02-10 CoS: Chain-of-Shot Prompting for Long Video Understanding Jian Hu et.al. 2502.06428 null
2025-02-10 Generating Privacy-Preserving Personalized Advice with Zero-Knowledge Proofs and LLMs Hiroki Watanabe et.al. 2502.06425 null
2025-02-10 Occ-LLM: Enhancing Autonomous Driving with Occupancy-Based Large Language Models Tianshuo Xu et.al. 2502.06419 null
2025-02-10 Systematic Outliers in Large Language Models Yongqi An et.al. 2502.06415 null
2025-02-10 AppVLM: A Lightweight Vision Language Model for Online App Control Georgios Papoudakis et.al. 2502.06395 null
2025-02-10 How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators Shang Liu et.al. 2502.06387 null
2025-02-10 Simulation as Reality? The Effectiveness of LLM-Generated Data in Open-ended Question Assessment Long Zhang et.al. 2502.06371 null
2025-02-10 Calibrating LLMs with Information-Theoretic Evidential Deep Learning Yawei Li et.al. 2502.06351 link
2025-02-10 Can AI Examine Novelty of Patents?: Novelty Evaluation Based on the Correspondence between Patent Claim and Prior Art Hayato Ikoma et.al. 2502.06316 null
2025-02-10 Latent Convergence Modulation in Large Language Models: A Novel Approach to Iterative Contextual Realignment Patricia Porretta et.al. 2502.06302 null
2025-02-10 SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia Chaoqun Liu et.al. 2502.06298 null
2025-02-10 Is an Ultra Large Natural Image-Based Foundation Model Superior to a Retina-Specific Model for Detecting Ocular and Systemic Diseases? Qingshan Hou et.al. 2502.06289 null
2025-02-10 Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE Haiduo Huang et.al. 2502.06282 link
2025-02-10 DebateBench: A Challenging Long Context Reasoning Benchmark For Large Language Models Utkarsh Tiwari et.al. 2502.06279 null
2025-02-10 Emergent Response Planning in LLM Zhichen Dong et.al. 2502.06258 null
2025-02-10 K-ON: Stacking Knowledge On the Head Layer of Large Language Model Lingbing Guo et.al. 2502.06257 null
2025-02-10 Find Central Dogma Again Wang Liang et.al. 2502.06253 null
2025-02-10 Amplifying Minority Voices: AI-Mediated Devil's Advocate System for Inclusive Group Decision-Making Soohwan Lee et.al. 2502.06251 null
2025-02-10 PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts Zeman Li et.al. 2502.06244 null
2025-02-10 Fully Exploiting Vision Foundation Model's Profound Prior Knowledge for Generalizable RGB-Depth Driving Scene Parsing Sicen Guo et.al. 2502.06219 null
2025-02-10 LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks Xin Zhou et.al. 2502.06215 null
2025-02-10 Unveiling the Capabilities of Large Language Models in Detecting Offensive Language with Annotation Disagreement Junyu Lu et.al. 2502.06207 null
2025-02-10 C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation Guoxin Chen et.al. 2502.06205 null
2025-02-10 Non-literal Understanding of Number Words by Language Models Polina Tsvilodub et.al. 2502.06204 null
2025-02-10 Timing Matters: How Using LLMs at Different Timings Influences Writers' Perceptions and Ideation Outcomes in AI-Assisted Ideation Peinuan Qin et.al. 2502.06197 null
2025-02-10 Can LLMs Replace Human Evaluators? An Empirical Study of LLM-as-a-Judge in Software Engineering Ruiqi Wang et.al. 2502.06193 null
2025-02-10 Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis Sanket Jantre et.al. 2502.06173 null
2025-02-10 A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation Wenhui Lei et.al. 2502.06171 null
2025-02-10 Universal Approximation of Visual Autoregressive Transformers Yifang Chen et.al. 2502.06167 null
2025-02-10 Scaling Public Health Text Annotation: Zero-Shot Learning vs. Crowdsourcing for Improved Efficiency and Labeling Accuracy Kamyar Kazari et.al. 2502.06150 null
2025-02-10 Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection Yan Weng et.al. 2502.06148 null
2025-02-10 LegalViz: Legal Text Visualization by Text To Diagram Generation Eri Onami et.al. 2502.06147 null
2025-02-10 LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs Sumin An et.al. 2502.06139 null
2025-02-10 Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models Ce Zhang et.al. 2502.06130 null
2025-02-10 Foundation Model of Electronic Medical Records for Adaptive Risk Estimation Pawel Renc et.al. 2502.06124 null
2025-02-10 Task-driven Layerwise Additive Activation Intervention Hieu Trung Nguyen et.al. 2502.06115 null
2025-02-10 CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories Yijia Xiao et.al. 2502.06111 null
2025-02-10 RALLRec: Improving Retrieval Augmented Large Language Model Recommendation with Representation Learning Jian Xu et.al. 2502.06101 link
2025-02-10 ConMeC: A Dataset for Metonymy Resolution with Common Nouns Saptarshi Ghosh et.al. 2502.06087 link
2025-02-10 Physics-Guided Foundation Model for Scientific Discovery: An Application to Aquatic Science Runlong Yu et.al. 2502.06084 link
2025-02-10 Debiasing Guidance for Discrete Diffusion with Sequential Monte Carlo Cheuk Kit Lee et.al. 2502.06079 null
2025-02-09 Deconstructing Depression Stigma: Integrating AI-driven Data Collection and Analysis with Causal Knowledge Graphs Han Meng et.al. 2502.06075 null
2025-02-09 Allegro-FM: Towards Equivariant Foundation Model for Exascale Molecular Dynamics Simulations Ken-ichi Nomura et.al. 2502.06073 null
2025-02-09 Benchmarking Prompt Sensitivity in Large Language Models Amirhossein Razavi et.al. 2502.06065 null
2025-02-09 Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization Jiajun Fan et.al. 2502.06061 null
2025-02-09 Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models Marc Bruni et.al. 2502.06039 null
2025-02-09 Investigating Compositional Reasoning in Time Series Foundation Models Willa Potosnak et.al. 2502.06037 link
2025-02-09 A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions Elisa Negrini et.al. 2502.06026 link
2025-02-09 Dual Caption Preference Optimization for Diffusion Models Amir Saeidi et.al. 2502.06023 null
2025-02-09 Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding Xingjian Diao et.al. 2502.06020 link
2025-02-09 Media Bias Detector: Designing and Implementing a Tool for Real-Time Selection and Framing Bias Analysis in News Coverage Jenny S Wang et.al. 2502.06009 null
2025-02-09 Analysis of LLM as a grammatical feature tagger for African American English Rahul Porwal et.al. 2502.06004 null
2025-02-09 HamRaz: A Culture-Based Persian Conversation Dataset for Person-Centered Therapy Using LLM Agents Mohammad Amin Abbasi et.al. 2502.05982 null
2025-02-09 $μ$nit Scaling: Simple and Scalable FP8 LLM Training Saaketh Narayan et.al. 2502.05967 null
2025-02-09 Redefining Robot Generalization Through Interactive Intelligence Sharmita Dey et.al. 2502.05963 null
2025-02-09 MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents Jiabin Tang et.al. 2502.05957 null
2025-02-09 Cyri: A Conversational AI-based Assistant for Supporting the Human User in Detecting and Responding to Phishing Attacks Antonio La Torre et.al. 2502.05951 null
2025-02-09 Acceleration Multiple Heads Decoding for LLM via Dynamic Tree Attention Zhendong Zhang et.al. 2502.05947 null
2025-02-09 "Let the AI conspiracy begin..." Language Model coordination is just one inference-intervention away Paul Darm et.al. 2502.05945 null
2025-02-07 Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy Yunhang Shen et.al. 2502.05177 link
2025-02-07 Fillerbuster: Multi-View Scene Completion for Casual Captures Ethan Weber et.al. 2502.05175 null
2025-02-07 NoLiMa: Long-Context Evaluation Beyond Literal Matching Ali Modarressi et.al. 2502.05167 null
2025-02-07 Multitwine: Multi-Object Compositing with Text and Layout Control Gemma Canet Tarrés et.al. 2502.05165 null
2025-02-07 DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails Yihe Deng et.al. 2502.05163 link
2025-02-07 A Lightweight Method to Disrupt Memorized Sequences in LLM Parjanya Prajakta Prashant et.al. 2502.05159 null
2025-02-07 Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation Steffen Eger et.al. 2502.05151 null
2025-02-07 CodeSCM: Causal Analysis for Multi-Modal Code Generation Mukur Gupta et.al. 2502.05150 link
2025-02-07 An Annotated Reading of 'The Singer of Tales' in the LLM Era Kush R. Varshney et.al. 2502.05148 null
2025-02-07 Chest X-ray Foundation Model with Global and Local Representations Integration Zefan Yang et.al. 2502.05142 link
2025-02-07 Latent Swap Joint Diffusion for Long-Form Audio Generation Yusheng Dai et.al. 2502.05130 null
2025-02-07 Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning Matt von Hippel et.al. 2502.05121 null
2025-02-07 Flexible and Efficient Grammar-Constrained Decoding Kanghee Park et.al. 2502.05111 null
2025-02-07 Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs Rohit Saxena et.al. 2502.05092 null
2025-02-07 Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs Thierry Bossy et.al. 2502.05087 link
2025-02-07 Causality can systematically address the monsters under the bench(marks) Felix Leeb et.al. 2502.05085 null
2025-02-07 ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework Xiaoyu Deng et.al. 2502.05084 null
2025-02-07 Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures Tushar Pandey et.al. 2502.05078 link
2025-02-07 Beautiful Images, Toxic Words: Understanding and Addressing Offensive Text in Generated Images Aditya Kumar et.al. 2502.05066 link
2025-02-07 nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow Geliang Ouyang et.al. 2502.05036 link
2025-02-07 Prospects for detecting generic fast-time features in the neutrino lightcurve of nearby supernovae in neutrino telescopes Jakob Beise et.al. 2502.05024 null
2025-02-07 QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Andrei Panferov et.al. 2502.05003 link
2025-02-07 Aligning Black-box Language Models with Human Judgments Gerrit J. J. van den Burg et.al. 2502.04997 null
2025-02-07 C2GM: Cascading Conditional Generation of Multi-scale Maps from Remote Sensing Images Constrained by Geographic Features Chenxing Sun et.al. 2502.04991 null
2025-02-07 MoGraphGPT: Creating Interactive Scenes Using Modular LLM and Graphical Control Hui Ye et.al. 2502.04983 null
2025-02-07 Enhancing Pre-Trained Decision Transformers with Prompt-Tuning Bandits Finn Rietz et.al. 2502.04979 null
2025-02-07 Towards Multimodal Empathetic Response Generation: A Rich Text-Speech-Vision Avatar-based Benchmark Han Zhang et.al. 2502.04976 null
2025-02-07 CoCoA: A Generalized Approach to Uncertainty Quantification by Integrating Confidence and Consistency of LLM Outputs Roman Vashurin et.al. 2502.04964 null
2025-02-07 The Rising Threat to Emerging AI-Powered Search Engines Zeren Luo et.al. 2502.04951 null
2025-02-07 Mobile Network-specialized Large Language Models for 6G: Architectures, Innovations, Challenges, and Future Trends Abdelaali Chaoub et.al. 2502.04933 null
2025-02-07 Generative-enhanced optimization for knapsack problems: an industry-relevant study Yelyzaveta Vodovozova et.al. 2502.04928 null
2025-02-07 Classification or Prompting: A Case Study on Legal Requirements Traceability Romina Etezadi et.al. 2502.04916 null
2025-02-07 Goku: Flow Based Video Generative Foundation Models Shoufa Chen et.al. 2502.04896 null
2025-02-07 A Foundational Brain Dynamics Model via Stochastic Optimal Control Joonhyeong Park et.al. 2502.04892 null
2025-02-07 Training-free Task-oriented Grasp Generation Jiaming Wang et.al. 2502.04873 null
2025-02-07 Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration Yifeng Yu et.al. 2502.04849 null
2025-02-07 Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition Masato Mita et.al. 2502.04795 null
2025-02-07 S$^2$-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency Yuting Zeng et.al. 2502.04790 null
2025-02-07 Probing Internal Representations of Multi-Word Verbs in Large Language Models Hassane Kissane et.al. 2502.04789 null
2025-02-07 Enhancing SQL Injection Detection and Prevention Using Generative Models Naga Sai Dasari et.al. 2502.04786 null
2025-02-07 SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning Wanjia Zhao et.al. 2502.04780 link
2025-02-07 SeDi-Instruct: Enhancing Alignment of Language Models through Self-Directed Instruction Generation Jungwoo Kim et.al. 2502.04774 null
2025-02-07 Enhancing Phishing Email Identification with Large Language Models Catherine Lee et.al. 2502.04759 null
2025-02-07 Concept Navigation and Classification via Open Source Large Language Model Processing Maël Kubli et.al. 2502.04756 null
2025-02-07 Every Software as an Agent: Blueprint and Case Study Mengwei Xu et.al. 2502.04747 null
2025-02-07 PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders Tianyu Xie et.al. 2502.04730 link
2025-02-07 Generating Symbolic World Models via Test-time Scaling of Large Language Models Zhouliang Yu et.al. 2502.04728 link
2025-02-07 Evaluating Text Style Transfer Evaluation: Are There Any Reliable Metrics? Sourabrata Mukherjee et.al. 2502.04718 null
2025-02-07 Enhancing Impression Change Prediction in Speed Dating Simulations Based on Speakers' Personalities Kazuya Matsuo et.al. 2502.04706 null
2025-02-07 STRIDE: Automating Reward Design, Deep Reinforcement Learning Training and Feedback Optimization in Humanoid Robotics Locomotion Zhenwei Wu et.al. 2502.04692 null
2025-02-07 ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning Yuwei Yin et.al. 2502.04689 link
2025-02-07 M-IFEval: Multilingual Instruction-Following Evaluation Antoine Dussolle et.al. 2502.04688 link
2025-02-07 Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization Zelai Xu et.al. 2502.04686 null
2025-02-07 G2PDiffusion: Genotype-to-Phenotype Prediction with Diffusion Models Mengdi Liu et.al. 2502.04684 null
2025-02-07 CALF-SBM: A Covariate-Assisted Latent Factor Stochastic Block Model Sydney Louit et.al. 2502.04681 null
2025-02-07 LLM Query Scheduling with Prefix Reuse and Latency Constraints Gregory Dexter et.al. 2502.04677 null
2025-02-07 AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts Soichiro Murakami et.al. 2502.04674 link
2025-02-07 Unveiling the Mechanisms of Explicit CoT Training: How Chain-of-Thought Enhances Reasoning Generalization Xinhao Yao et.al. 2502.04667 link
2025-02-07 Enhancing Health Information Retrieval with RAG by Prioritizing Topical Relevance and Factual Accuracy Rishabh Upadhyay et.al. 2502.04666 null
2025-02-07 Importance Sampling via Score-based Generative Models Heasung Kim et.al. 2502.04646 null
2025-02-07 Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research Junde Wu et.al. 2502.04644 link
2025-02-07 Confidence Elicitation: A New Attack Vector for Large Language Models Brian Formento et.al. 2502.04643 null
2025-02-07 Contrastive Learning-Enhanced Large Language Models for Monolith-to-Microservice Decomposition Khaled Sellami et.al. 2502.04604 null
2025-02-07 Extracting and Understanding the Superficial Knowledge in Alignment Runjin Chen et.al. 2502.04602 link
2025-02-07 The $α$-Alternator: Dynamic Adaptation To Varying Noise Levels In Sequences Using The Vendi Score For Improved Robustness and Performance Mohammad Reza Rezaei et.al. 2502.04593 null
2025-02-07 Position-aware Automatic Circuit Discovery Tal Haklay et.al. 2502.04577 link
2025-02-06 My LLM might Mimic AAE -- But When Should it? Sandra C. Sandoval et.al. 2502.04564 link
2025-02-06 Speeding up Speculative Decoding via Approximate Verification Meiyu Zhong et.al. 2502.04557 null
2025-02-06 TruthFlow: Truthful LLM Generation via Representation Flow Correction Hanyu Wang et.al. 2502.04556 null
2025-02-06 Contextual Gradient Flow Modeling for Large Language Model Generalization in Multi-Scale Feature Spaces Daphne Quillington et.al. 2502.04548 null
2025-02-06 Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection Minseok Jung et.al. 2502.04528 null
2025-02-06 Safety is Essential for Responsible Open-Ended Systems Ivaxi Sheth et.al. 2502.04512 null
2025-02-06 ULPT: Prompt Tuning with Ultra-Low-Dimensional Optimization Zijun Wu et.al. 2502.04501 null
2025-02-06 Verifiable Format Control for Large Language Model Generations Zhaoyang Wang et.al. 2502.04498 null
2025-02-06 Multi-Agent Reinforcement Learning with Focal Diversity Optimization Selim Furkan Tekin et.al. 2502.04492 link
2025-02-06 Building A Unified AI-centric Language System: analysis, framework and future work Edward Hong Wang et.al. 2502.04488 null
2025-02-06 Active Task Disambiguation with LLMs Katarzyna Kobalczyk et.al. 2502.04485 link
2025-02-06 The ML Supply Chain in the Era of Software 2.0: Lessons Learned from Hugging Face Trevor Stalnaker et.al. 2502.04484 null
2025-02-06 Near-Optimal Sample Complexity for MDPs via Anchoring Jongmin Lee et.al. 2502.04477 null
2025-02-06 ADIFF: Explaining audio difference using natural language Soham Deshmukh et.al. 2502.04476 link
2025-02-06 Augmented Conditioning Is Enough For Effective Training Image Generation Jiahui Chen et.al. 2502.04475 null
2025-02-06 Iterative Importance Fine-tuning of Diffusion Models Alexander Denker et.al. 2502.04468 null
2025-02-06 FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks Luca Della Libera et.al. 2502.04465 null
2025-02-06 Training Language Models to Reason Efficiently Daman Arora et.al. 2502.04463 link
2025-02-06 Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization Yu-Neng Chuang et.al. 2502.04428 null
2025-02-06 Decoding AI Judgment: How LLMs Assess News Credibility and Bias Edoardo Loru et.al. 2502.04426 null
2025-02-06 EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models He Hu et.al. 2502.04424 null
2025-02-06 Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment Zuyan Liu et.al. 2502.04328 link
2025-02-06 Can Grammarly and ChatGPT accelerate language change? AI-powered technologies and their impact on the English language: wordiness vs. conciseness Karolina Rudnicka et.al. 2502.04324 null
2025-02-06 Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions Yik Siu Chan et.al. 2502.04322 link
2025-02-06 ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Alec Helbling et.al. 2502.04320 link
2025-02-06 sshELF: Single-Shot Hierarchical Extrapolation of Latent Features for 3D Reconstruction from Sparse-Views Eyvaz Najafli et.al. 2502.04318 null
2025-02-06 ChamaleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters Kamer Ali Yuksel et.al. 2502.04315 link
2025-02-06 ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization Yinjie Wang et.al. 2502.04306 link
2025-02-06 MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Jinbo Xing et.al. 2502.04299 null
2025-02-06 Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression Lirui Wang et.al. 2502.04296 null
2025-02-06 Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization Yuanye Liu et.al. 2502.04295 link
2025-02-06 PILAF: Optimal Human Preference Sampling for Reward Modeling Yunzhen Feng et.al. 2502.04270 null
2025-02-06 Efficient Randomized Experiments Using Foundation Models Piersilvio De Bartolomeis et.al. 2502.04262 link
2025-02-06 Realistic Image-to-Image Machine Unlearning via Decoupling and Knowledge Retention Ayush K. Varshney et.al. 2502.04260 null
2025-02-06 MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion Xintong Hao et.al. 2502.04235 null
2025-02-06 Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks Andreas Happe et.al. 2502.04227 null
2025-02-06 Keep It Light! Simplifying Image Clustering Via Text-Free Adapters Yicen Li et.al. 2502.04226 null
2025-02-06 Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents Ilia Karmanov et.al. 2502.04223 null
2025-02-06 Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data Laura Biester et.al. 2502.04218 null
2025-02-06 Algorithmic causal structure emerging through compression Liang Wendong et.al. 2502.04210 null
2025-02-06 "Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence Shaopeng Fu et.al. 2502.04204 link
2025-02-06 The Best Instruction-Tuning Data are Those That Fit Dylan Zhang et.al. 2502.04194 null
2025-02-06 PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? Mennatullah Siam et.al. 2502.04192 link
2025-02-06 Automated Microservice Pattern Instance Detection Using Infrastructure-as-Code Artifacts and Large Language Models Carlos Eduardo Duarte et.al. 2502.04188 null
2025-02-06 Multi-agent Architecture Search via Agentic Supernet Guibin Zhang et.al. 2502.04180 null
2025-02-06 MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation Qinhan Yu et.al. 2502.04176 null
2025-02-06 Diffusion-based mass map reconstruction from weak lensing data Supranta S. Boruah et.al. 2502.04158 null
2025-02-06 UltraIF: Advancing Instruction Following from the Wild Kaikai An et.al. 2502.04153 null
2025-02-06 The Order Effect: Investigating Prompt Sensitivity in Closed-Source LLMs Bryan Guan et.al. 2502.04134 null
2025-02-06 Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis Zhen Ye et.al. 2502.04128 null
2025-02-06 Generative Adversarial Networks Bridging Art and Machine Intelligence Junhao Song et.al. 2502.04116 null
2025-02-06 VTutor: An Open-Source SDK for Generative AI-Powered Animated Pedagogical Agents with Multi-Media Output Eason Chen et.al. 2502.04103 null
2025-02-06 LLMs to Support a Domain Specific Knowledge Assistant Maria-Flavia Lovin et.al. 2502.04095 null
2025-02-06 AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference Qingyue Yang et.al. 2502.04077 null
2025-02-06 Content-Rich AIGC Video Quality Assessment via Intricate Text Alignment and Motion-Aware Consistency Shangkun Sun et.al. 2502.04076 link
2025-02-06 Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training Changhao Jiang et.al. 2502.04066 null
2025-02-06 TQ-DiT: Efficient Time-Aware Quantization for Diffusion Transformers Younghye Hwang et.al. 2502.04056 null
2025-02-06 Exploring Imbalanced Annotations for Effective In-Context Learning Hongfu Gao et.al. 2502.04037 null
2025-02-06 Fine, I'll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging Guinan Su et.al. 2502.04030 null
2025-02-06 Echo-Teddy: Preliminary Design and Development of Large Language Model-based Social Robot for Autistic Students Unggi Lee et.al. 2502.04029 null
2025-02-06 Quantification of Biodiversity from Historical Survey Text with LLM-based Best-Worst Scaling Thomas Haider et.al. 2502.04022 null
2025-02-06 Automating a Complete Software Test Process Using LLMs: An Automotive Case Study Shuai Wang et.al. 2502.04008 null
2025-02-06 CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing Yu Yuan et.al. 2502.03997 null
2025-02-06 Ontology-Guided, Hybrid Prompt Learning for Generalization in Knowledge Graph Question Answering Longquan Jiang et.al. 2502.03992 link
2025-02-06 Tight Bounds on Jensen's Gap: Novel Approach with Applications in Generative Modeling Marcin Mazur et.al. 2502.03988 null
2025-02-06 MultiFloodSynth: Multi-Annotated Flood Synthetic Dataset Generation YoonJe Kang et.al. 2502.03966 null
2025-02-06 MAQInstruct: Instruction-based Unified Event Relation Extraction Jun Xu et.al. 2502.03954 null
2025-02-06 LR0.FM: Low-Resolution Zero-shot Classification Benchmark For Foundation Models Priyank Pathak et.al. 2502.03950 link
2025-02-06 Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond Mardhiyah Sanni et.al. 2502.03945 null
2025-02-06 Unravelling Causal Genetic Biomarkers of Alzheimer's Disease via Neuron to Gene-token Backtracking in Neural Architecture: A Groundbreaking Reverse-Gene-Finder Approach Victor OK Li et.al. 2502.03938 null
2025-02-06 Quantifying Correlations of Machine Learning Models Yuanyuan Li et.al. 2502.03937 link
2025-02-06 HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture Jai Bardhan et.al. 2502.03933 null
2025-02-06 Experiments with Large Language Models on Retrieval-Augmented Generation for Closed-Source Simulation Software Andreas Baumann et.al. 2502.03916 null
2025-02-06 No Free Lunch in Annotation either: An objective evaluation of foundation models for streamlining annotation in animal tracking Emil Mededovic et.al. 2502.03907 link
2025-02-06 LeAP: Consistent multi-domain 3D labeling using Foundation Models Simon Gebraad et.al. 2502.03901 null
2025-02-06 InfinitePOD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers Chenchen Shou et.al. 2502.03885 null
2025-02-06 Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning Peizhuang Cong et.al. 2502.03884 null
2025-02-06 BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation Bo Pang et.al. 2502.03860 null
2025-02-06 PAGNet: Pluggable Adaptive Generative Networks for Information Completion in Multi-Agent Communication Zhuohui Zhang et.al. 2502.03845 null
2025-02-06 Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis Lin Yuan et.al. 2502.03843 null
2025-02-06 FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing Jinya Sakurai et.al. 2502.03826 null
2025-02-06 Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation Tianhao Li et.al. 2502.03825 null
2025-02-06 PsyPlay: Personality-Infused Role-Playing Conversational Agents Tao Yang et.al. 2502.03821 null
2025-02-06 Large Language Models for Multi-Robot Systems: A Survey Peihan Li et.al. 2502.03814 null
2025-02-06 Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective Yuan Feng et.al. 2502.03805 link
2025-02-06 Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions Yusuke Miura et.al. 2502.03804 null
2025-02-06 Enhancing Hallucination Detection through Noise Injection Litian Liu et.al. 2502.03799 null
2025-02-06 Distribution learning via neural differential equations: minimal energy regularization and approximation theory Youssef Marzouk et.al. 2502.03795 null
2025-02-06 It's All in The [MASK]: Simple Instruction-Tuning Enables BERT-like Masked Language Models As Generative Classifiers Benjamin Clavié et.al. 2502.03793 null
2025-02-06 Iterate to Accelerate: A Unified Framework for Iterative Reasoning and Feedback Convergence Jacob Fein-Ashley et.al. 2502.03787 null
2025-02-06 GistVis: Automatic Generation of Word-scale Visualizations from Data-rich Documents Ruishi Zou et.al. 2502.03784 link
2025-02-06 Adaptive Semantic Prompt Caching with VectorQ Luis Gaspar Schroeder et.al. 2502.03771 null
2025-02-06 Hierarchical Contextual Manifold Alignment for Structuring Latent Representations in Large Language Models Meiquan Dong et.al. 2502.03766 null
2025-02-06 Rethinking the Residual Distribution of Locate-then-Editing Methods in Model Editing Xiaopeng Li et.al. 2502.03748 null
2025-02-06 Speaking the Language of Teamwork: LLM-Guided Credit Assignment in Multi-Agent Reinforcement Learning Muhan Lin et.al. 2502.03723 null
2025-02-06 Boosting Knowledge Graph-based Recommendations through Confidence-Aware Augmentation with Large Language Models Rui Cai et.al. 2502.03715 null
2025-02-06 MultiQ&A: An Analysis in Measuring Robustness via Automated Crowdsourcing of Question Perturbations and Answers Nicole Cho et.al. 2502.03711 null
2025-02-06 Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers Daniel Beaglehole et.al. 2502.03708 null
2025-02-06 LLM Alignment as Retriever Optimization: An Information Retrieval Perspective Bowen Jin et.al. 2502.03699 null
2025-02-06 A Comparison of DeepSeek and Other LLMs Tianchen Gao et.al. 2502.03688 null
2025-02-06 Conditional Diffusion Models are Medical Image Classifiers that Provide Explainability and Uncertainty for Free Gian Mario Favero et.al. 2502.03687 null
2025-02-06 Controlled LLM Decoding via Discrete Auto-regressive Biasing Patrick Pynadath et.al. 2502.03685 null
2025-02-05 Reflection-Window Decoding: Text Generation with Selective Refinement Zeyu Tang et.al. 2502.03678 null
2025-02-05 Advancing Reasoning in Large Language Models: Promising Methods and Approaches Avinash Patil et.al. 2502.03671 null
2025-02-05 Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set Yikai Wu et.al. 2502.03669 null
2025-02-05 Privacy-Preserving Generative Models: A Comprehensive Survey Debalina Padariya et.al. 2502.03668 null
2025-02-05 Context-Preserving Gradient Modulation for Large Language Models: A Novel Approach to Semantic Consistency in Long-Form Text Generation Nirola Kobanov et.al. 2502.03643 null
2025-02-05 SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models Daniel Levy et.al. 2502.03638 link
2025-02-05 AdaPhish: AI-Powered Adaptive Defense and Education Resource Against Deceptive Emails Rei Meguro et.al. 2502.03622 null
2025-02-05 Bilevel ZOFO: Bridging Parameter-Efficient and Zeroth-Order Techniques for Efficient LLM Fine-Tuning and Meta-Training Reza Shirkavand et.al. 2502.03604 null
2025-02-05 HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference Zeyu Zhang et.al. 2502.03589 null
2025-02-05 A Mixed-Methods Evaluation of LLM-Based Chatbots for Menopause Roshini Deva et.al. 2502.03579 null
2025-02-05 Code Simulation as a Proxy for High-order Tasks in Large Language Models Emanuele La Malfa et.al. 2502.03568 null
2025-02-05 Kronecker Mask and Interpretive Prompts are Language-Action Video Learners Jingyi Yang et.al. 2502.03549 link
2025-02-05 YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment Amitava Das et.al. 2502.03512 null
2025-02-05 Do Large Language Model Benchmarks Test Reliability? Joshua Vendrow et.al. 2502.03461 link
2025-02-05 Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training Boyao Wang et.al. 2502.03460 null
2025-02-05 A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) Yiye Chen et.al. 2502.03450 null
2025-02-05 Dress-1-to-3: Single Image to Simulation-Ready 3D Outfit with Diffusion Prior and Differentiable Physics Xuan Li et.al. 2502.03449 null
2025-02-05 BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving Ran Xin et.al. 2502.03438 null
2025-02-05 Taking a Big Step: Large Learning Rates in Denoising Score Matching Prevent Memorization Yu-Han Wu et.al. 2502.03435 null
2025-02-05 On Fairness of Unified Multimodal Large Language Model for Image Generation Ming Liu et.al. 2502.03429 null
2025-02-05 Harnessing Large Language Models for Curated Code Reviews Oussama Ben Sghaier et.al. 2502.03425 link
2025-02-05 Can Text-to-Image Generative Models Accurately Depict Age? A Comparative Study on Synthetic Portrait Generation and Age Estimation Alexey A. Novikov et.al. 2502.03420 null
2025-02-05 Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts Nikta Gohari Sadr et.al. 2502.03418 null
2025-02-05 SPRI: Aligning Large Language Models with Context-Situated Principles Hongli Zhan et.al. 2502.03397 null
2025-02-05 Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications Issar Arab et.al. 2502.03395 null
2025-02-05 LIMO: Less is More for Reasoning Yixin Ye et.al. 2502.03387 link
2025-02-05 Transformers and Their Roles as Time Series Foundation Models Dennis Wu et.al. 2502.03383 null
2025-02-05 Demystifying Long Chain-of-Thought Reasoning in LLMs Edward Yeo et.al. 2502.03373 link
2025-02-05 PalimpChat: Declarative and Interactive AI analytics Chunwei Liu et.al. 2502.03368 null
2025-02-05 RadVLM: A Multitask Conversational Vision-Language Model for Radiology Nicolas Deperrois et.al. 2502.03333 null
2025-02-05 ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model Qiguang Chen et.al. 2502.03325 null
2025-02-05 Out-of-Distribution Detection using Synthetic Data Generation Momin Abbas et.al. 2502.03323 null
2025-02-05 Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques Sangjun Han et.al. 2502.03321 null
2025-02-05 Intent Representation Learning with Large Language Model for Recommendation Yu Wang et.al. 2502.03307 link
2025-02-05 Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning Qitao Tan et.al. 2502.03304 null
2025-02-05 MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters Amin Dada et.al. 2502.03298 null
2025-02-05 SymAgent: A Neural-Symbolic Self-Learning Agent Framework for Complex Reasoning over Knowledge Graphs Ben Liu et.al. 2502.03283 null
2025-02-05 Posterior SBC: Simulation-Based Calibration Checking Conditional on Data Teemu Säilynoja et.al. 2502.03279 link
2025-02-05 Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning DiJia Su et.al. 2502.03275 null
2025-02-05 ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models Ying Zhang et.al. 2502.03266 link
2025-02-05 General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data Cheng He et.al. 2502.03264 null
2025-02-05 CARROT: A Cost Aware Rate Optimal Router Seamus Somerstep et.al. 2502.03261 null
2025-02-05 RiemannGFM: Learning a Graph Foundation Model from Riemannian Geometry Li Sun et.al. 2502.03251 null
2025-02-05 Exploring the Security Threats of Knowledge Base Poisoning in Retrieval-Augmented Code Generation Bo Lin et.al. 2502.03233 null
2025-02-05 Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models Jialiang Wu et.al. 2502.03199 null
2025-02-05 MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding Pengyi Li et.al. 2502.03183 null
2025-02-05 PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design Yuchao Wu et.al. 2502.03159 null
2025-02-05 Strategizing with AI: Insights from a Beauty Contest Experiment Iuliia Alekseenko et.al. 2502.03158 null
2025-02-05 Scalable In-Context Learning on Tabular Data via Retrieval-Augmented Large Language Models Xumeng Wen et.al. 2502.03147 null
2025-02-05 Symmetry-Aware Bayesian Flow Networks for Crystal Generation Laura Ruple et.al. 2502.03146 null
2025-02-05 Teaching Large Language Models Number-Focused Headline Generation With Key Element Rationales Zhen Qian et.al. 2502.03129 null
2025-02-05 Metis: A Foundation Speech Generation Model with Masked Generative Pre-training Yuancheng Wang et.al. 2502.03128 link
2025-02-05 Structured Token Retention and Computational Memory Paths in Large Language Models Jonathan Delena et.al. 2502.03102 null
2025-02-05 Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms Xuerui Su et.al. 2502.03095 null
2025-02-05 Implementing Large Quantum Boltzmann Machines as Generative AI Models for Dataset Balancing Salvatore Sinno et.al. 2502.03086 null
2025-02-05 IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates Aissatou Diallo et.al. 2502.03080 null
2025-02-05 Poisson Flow Joint Model for Multiphase contrast-enhanced CT Rongjun Ge et.al. 2502.03079 null
2025-02-05 Automatic Prompt Optimization Techniques: Exploring the Potential for Synthetic Data Generation Nina Freise et.al. 2502.03078 null
2025-02-05 Optimizing Electric Vehicles Charging using Large Language Models and Graph Neural Networks Stavros Orfanoudakis et.al. 2502.03067 null
2025-02-05 Understanding and Enhancing the Transferability of Jailbreaking Attacks Runqi Lin et.al. 2502.03052 link
2025-02-05 RepLoRA: Reparameterizing Low-Rank Adaptation via the Perspective of Mixture of Experts Tuan Truong et.al. 2502.03044 null
2025-02-05 Large Language Models Are Universal Recommendation Learners Junguang Jiang et.al. 2502.03041 null
2025-02-05 FuXi-$α$: Scaling Recommendation Model with Feature Interaction Enhanced Transformer Yufei Ye et.al. 2502.03036 null
2025-02-05 Knowledge Distillation from Large Language Models for Household Energy Modeling Mohannad Takrouri et.al. 2502.03034 null
2025-02-05 Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Daniil Laptev et.al. 2502.03032 null
2025-02-05 Scaling Laws for Upcycling Mixture-of-Experts Language Models Seng Pei Liew et.al. 2502.03009 null
2025-02-05 MedBioLM: Optimizing Medical and Biological QA with Fine-Tuned Large Language Models and Retrieval-Augmented Generation Seonok Kim et.al. 2502.03004 null
2025-02-05 Training an LLM-as-a-Judge Model: Pipeline, Insights, and Practical Lessons Renjun Hu et.al. 2502.02988 null
2025-02-05 Membership Inference Attack Should Move On to Distributional Statistics for Distilled Generative Models Muxing Li et.al. 2502.02970 null
2025-02-05 The Labeled Coupon Collector Problem with Random Sample Sizes and Partial Recovery Shoham Shimon Berrebi et.al. 2502.02968 null
2025-02-05 Large Language Model Adversarial Landscape Through the Lens of Attack Objectives Nan Wang et.al. 2502.02960 null
2025-02-05 Position: Editing Large Language Models Poses Serious Safety Risks Paul Youssef et.al. 2502.02958 null
2025-02-05 Control Search Rankings, Control the World: What is a Good Search Engine? Simon Coghlan et.al. 2502.02957 null
2025-02-05 LLM-KT: Aligning Large Language Models with Knowledge Tracing using a Plug-and-Play Instruction Ziwei Wang et.al. 2502.02945 null
2025-02-05 Large Language Model Guided Self-Debugging Code Generation Muntasir Adnan et.al. 2502.02928 null
2025-02-05 SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs Dinithi Jayasuriya et.al. 2502.02909 null
2025-02-05 AI-driven materials design: a mini-review Mouyang Cheng et.al. 2502.02905 null
2025-02-05 A Benchmark for the Detection of Metalinguistic Disagreements between LLMs and Knowledge Graphs Bradley P. Allen et.al. 2502.02896 null
2025-02-05 Lowering the Barrier of Machine Learning: Achieving Zero Manual Labeling in Review Classification Using LLMs Yejian Zhang et.al. 2502.02893 null
2025-02-05 Expertized Caption Auto-Enhancement for Video-Text Retrieval Junxiang Chen et.al. 2502.02885 null
2025-02-05 SensorChat: Answering Qualitative and Quantitative Questions during Long-Term Multimodal Sensor Interactions Xiaofan Yu et.al. 2502.02883 null
2025-02-05 Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning Yibo Yan et.al. 2502.02871 null
2025-02-05 A Systematic Approach for Assessing Large Language Models' Test Case Generation Capability Hung-Fu Chang et.al. 2502.02866 null
2025-02-05 OceanChat: The Effect of Virtual Conversational AI Agents on Sustainable Attitude and Behavior Change Pat Pataranutaporn et.al. 2502.02863 null
2025-02-05 A Survey of Sample-Efficient Deep Learning for Change Detection in Remote Sensing: Tasks, Strategies, and Challenges Lei Ding et.al. 2502.02835 null
2025-02-05 COFFE: A Code Efficiency Benchmark for Code Generation Yun Peng et.al. 2502.02827 link
2025-02-05 Accessible and Portable LLM Inference by Compiling Computational Graphs into SQL Wenbo Sun et.al. 2502.02818 null
2025-02-05 Mol-LLM: Generalist Molecular LLM with Improved Graph Utilization Chanhui Lee et.al. 2502.02810 null
2025-02-05 CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration Yizhe Yang et.al. 2502.02807 null
2025-02-05 Leveraging the true depth of LLMs Ramón Calvo González et.al. 2502.02790 null
2025-02-05 Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation Jingyu Liu et.al. 2502.02789 link
2025-02-05 SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models Amirhossein Dabiriaghdam et.al. 2502.02787 link
2025-02-04 Classroom Simulacra: Building Contextual Student Generative Agents in Online Education for Learning Behavioral Simulation Songlin Xu et.al. 2502.02780 link
2025-02-04 3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography Weicheng Zhu et.al. 2502.02779 null
2025-02-04 Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning Chaofan Lin et.al. 2502.02770 null
2025-02-04 LLM-USO: Large Language Model-based Universal Sizing Optimizer Karthik Somayaji N. S et.al. 2502.02764 null
2025-02-04 Rethinking Vision Transformer for Object Centric Foundation Models Manuel Traub et.al. 2502.02763 null
2025-02-04 Too Noisy To Learn: Enhancing Data Quality for Code Review C Chunhua Liu et.al. 2502.02757 null
2025-02-04 PatchPilot: A Stable and Cost-Efficient Agentic Patching Framework Hongwei Li et.al. 2502.02747 null
2025-02-04 LLM Bandit: Cost-Efficient LLM Generation via Preference-Conditioned Dynamic Routing Yang Li et.al. 2502.02743 null
2025-02-04 RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2 Bin Xie et.al. 2502.02741 null
2025-02-04 SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Loubna Ben Allal et.al. 2502.02737 null
2025-02-04 Peri-LN: Revisiting Layer Normalization in the Transformer Architecture Jeonghoon Kim et.al. 2502.02732 null
2025-02-04 Cross-Lingual Transfer for Low-Resource Natural Language Processing Iker García-Ferrero et.al. 2502.02722 null
2025-02-04 Astromer 2 Cristobal Donoso-Oliva et.al. 2502.02717 null
2025-02-04 A Unified Understanding and Evaluation of Steering Methods Shawn Im et.al. 2502.02716 null
2025-02-04 An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification Riddhi More et.al. 2502.02715 null
2025-02-04 Exploring LLMs Impact on Student-Created User Stories and Acceptance Testing in Software Development Allan Brockenbrough et.al. 2502.02675 null
2025-02-04 MedRAX: Medical Reasoning Agent for Chest X-ray Adibvafa Fallahpour et.al. 2502.02673 link
2025-02-04 Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes Mayuka Jayawardhana et.al. 2502.02672 null
2025-02-04 Machine-learning approaches to accelerating lattice simulations Scott Lawrence et.al. 2502.02670 null
2025-02-04 A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation (GALI) Yan Li et.al. 2502.02659 link
2025-02-04 Introducing the Rhea simulations of Milky-Way-like galaxies I: Effect of gravitational potential on morphology and star formation Junia Göller et.al. 2502.02646 null
2025-02-04 COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation Xueqing Deng et.al. 2502.02589 null
2025-02-04 Open Materials Generation with Stochastic Interpolants Philipp Hoellmer et.al. 2502.02582 null
2025-02-04 A comparison of translation performance between DeepL and Supertext Alex Flückiger et.al. 2502.02577 link
2025-02-04 Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement Soheil Abbasloo et.al. 2502.02573 null
2025-02-04 Learning the RoPEs: Better 2D and 3D Position Encodings with STRING Connor Schenck et.al. 2502.02562 null
2025-02-04 Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation Junha Lee et.al. 2502.02548 null
2025-02-04 LLMs for Generation of Architectural Components: An Exploratory Empirical Study in the Serverless World Shrikara Arun et.al. 2502.02539 null
2025-02-04 Adaptive Self-improvement LLM Agentic System for ML Library Development Genghan Zhang et.al. 2502.02534 link
2025-02-04 Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies Han Zhou et.al. 2502.02533 null
2025-02-04 Generative Modeling on Lie Groups via Euclidean Generalized Score Matching Marco Bertolini et.al. 2502.02513 null
2025-02-04 Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search Maohao Shen et.al. 2502.02508 null
2025-02-04 Learning to generate physical ocean states: Towards hybrid climate modeling Etienne Meunier et.al. 2502.02499 null
2025-02-04 EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization Yize Wu et.al. 2502.02493 null
2025-02-04 Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study Menglong Cui et.al. 2502.02481 null
2025-02-04 Style transfer as data augmentation: evaluating unpaired image-to-image translation models in mammography Emir Ahmed et.al. 2502.02475 null
2025-02-04 Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification Valentina Vadori et.al. 2502.02471 link
2025-02-04 SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency Qianhao Yuan et.al. 2502.02458 null
2025-02-04 Personalization Toolkit: Training Free Personalization of Large Vision Language Models Soroush Seifi et.al. 2502.02452 null
2025-02-04 Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study Calvin Yixiang Cheng et.al. 2502.02451 link
2025-02-04 Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models Haoran Ye et.al. 2502.02444 null
2025-02-04 LLMER: Crafting Interactive Extended Reality Worlds with JSON Data Generated by Large Language Models Jiangong Chen et.al. 2502.02441 link
2025-02-04 Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment Yaling Shen et.al. 2502.02438 null
2025-02-04 TransformDAS: Mapping Φ-OTDR Signals to Riemannian Manifold for Robust Classification Jiaju Kang et.al. 2502.02428 null
2025-02-04 Activation-Informed Merging of Large Language Models Amin Heyrani Nobari et.al. 2502.02421 link
2025-02-04 Towards Fast Graph Generation via Autoregressive Noisy Filtration Modeling Markus Krimmel et.al. 2502.02415 link
2025-02-04 AI-Powered, But Power-Hungry? Energy Efficiency of LLM-Generated Code Lola Solovyeva et.al. 2502.02412 null
2025-02-04 Avoiding spurious sharpness minimization broadens applicability of SAM Sidak Pal Singh et.al. 2502.02407 null
2025-02-04 LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models Tzu-Tao Chang et.al. 2502.02406 null
2025-02-04 CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning Jianfeng Pan et.al. 2502.02390 null
2025-02-04 Hypergraph Link Prediction via Hyperedge Copying Xie He et.al. 2502.02386 null
2025-02-04 STAIR: Improving Safety Alignment with Introspective Reasoning Yichi Zhang et.al. 2502.02384 link
2025-02-04 Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects Henrique Nunes et.al. 2502.02368 null
2025-02-04 Field Matching: an Electrostatic Paradigm to Generate and Transfer Data Alexander Kolesov et.al. 2502.02367 null
2025-02-04 Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs Sagnik Mukherjee et.al. 2502.02362 null
2025-02-04 SHIELD: APT Detection and Intelligent Explanation Using LLM Parth Atulbhai Gandhi et.al. 2502.02342 null
2025-02-04 Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking Jinyang Wu et.al. 2502.02339 null
2025-02-04 ReSpark: Leveraging Previous Data Reports as References to Generate New Reports with LLMs Yuan Tian et.al. 2502.02329 null
2025-02-04 Information-Theoretic Proofs for Diffusion Sampling Galen Reeves et.al. 2502.02305 null
2025-02-04 Density Ratio Estimation with Conditional Probability Paths Hanlin Yu et.al. 2502.02300 null
2025-02-04 Evalita-LLM: Benchmarking Large Language Models on Italian Bernardo Magnini et.al. 2502.02289 null
2025-02-04 Adaptive Resource Allocation Optimization Using Large Language Models in Dynamic Wireless Environments Hyeonho Noh et.al. 2502.02287 null
2025-02-04 Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation Atharva Mangeshkumar Agrawal et.al. 2502.02249 null
2025-02-04 Flatten Graphs as Sequences: Transformers are Scalable Graph Generators Dexiong Chen et.al. 2502.02216 null
2025-02-04 When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks Felix Drinkall et.al. 2502.02199 link
2025-02-04 Large language models in climate and sustainability policy: limits and opportunities Francesca Larosa et.al. 2502.02191 null
2025-02-04 ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion Nissim Maruani et.al. 2502.02187 null
2025-02-04 Generative Kernel Spectral Clustering David Winant et.al. 2502.02185 null
2025-02-04 Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge Daniel Tamayo et.al. 2502.02173 link
2025-02-04 EditIQ: Automated Cinematic Editing of Static Wide-Angle Videos via Dialogue Interpretation and Saliency Cues Rohit Girmaji et.al. 2502.02172 null
2025-02-04 Risk-Aware Driving Scenario Analysis with Large Language Models Yuan Gao et.al. 2502.02145 link
2025-02-04 IPO: Iterative Preference Optimization for Text-to-Video Generation Xiaomeng Yang et.al. 2502.02088 null
2025-02-04 Position Paper: Building Trust in Synthetic Data for Clinical AI Krishan Agyakari Raja Babu et.al. 2502.02076 null
2025-02-04 Rethinking stance detection: A theoretically-informed research agenda for user-level inference using language models Prasanta Bhattacharya et.al. 2502.02074 null
2025-02-04 ASCenD-BDS: Adaptable, Stochastic and Context-aware framework for Detection of Bias, Discrimination and Stereotyping Rajiv Bahl et.al. 2502.02072 null
2025-02-04 Robust and Secure Code Watermarking for Large Language Models via ML/Crypto Codesign Ruisi Zhang et.al. 2502.02068 null
2025-02-04 AdaptBot: Combining LLM with Knowledge Graphs and Human Input for Generic-to-Specific Task Decomposition and Knowledge Refinement Shivam Singh et.al. 2502.02067 link
2025-02-04 Anticipate & Act : Integrating LLMs and Classical Planning for Efficient Task Execution in Household Environments Raghav Arora et.al. 2502.02066 null
2025-02-04 CASIM: Composite Aware Semantic Injection for Text to Motion Generation Che-Jui Chang et.al. 2502.02063 null
2025-02-04 Large Language Models for Recommendation with Deliberative User Preference Alignment Yi Fang et.al. 2502.02061 null
2025-02-04 Efficient Domain Adaptation of Multimodal Embeddings using Constrastive Learning Georgios Margaritis et.al. 2502.02048 null
2025-02-04 Contextual Memory Reweaving in Large Language Models Using Layered Latent State Reconstruction Frederick Dillon et.al. 2502.02046 null
2025-02-04 M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference Nikhil Bhendawade et.al. 2502.02040 null
2025-02-04 ContinuouSP: Generative Model for Crystal Structure Prediction with Invariance and Continuity Yuji Tone et.al. 2502.02026 null
2025-02-04 From Accidents to Insights: Leveraging Multimodal Data for Scenario-Driven ADS Testing Siwei Luo et.al. 2502.02025 null
2025-02-04 ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling Yi-Chiao Wu et.al. 2502.02019 null
2025-02-04 Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment Shuo Wang et.al. 2502.02017 null
2025-02-04 A Periodic Bayesian Flow for Material Generation Hanlin Wu et.al. 2502.02016 link
2025-02-04 Layer by Layer: Uncovering Hidden Representations in Language Models Oscar Skean et.al. 2502.02013 null
2025-02-04 LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations Ziyang Ye et.al. 2502.02009 null
2025-02-04 Reasoning Bias of Next Token Prediction Training Pengxiao Lin et.al. 2502.02007 null
2025-02-04 FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024 Arnav Grover et.al. 2502.01992 null
2025-02-04 Can LLMs Assist Annotators in Identifying Morality Frames? -- Case Study on Vaccination Debate on Social Media Tunazzina Islam et.al. 2502.01991 null
2025-02-04 Generative Data Mining with Longtail-Guided Diffusion David S. Hayden et.al. 2502.01980 null
2025-02-04 Gradient-Regularized Latent Space Modulation in Large Language Models for Structured Contextual Synthesis Derek Yotheringhay et.al. 2502.01979 null
2025-02-04 AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs Hongxin Li et.al. 2502.01977 null
2025-02-04 CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing Wenhao Zheng et.al. 2502.01976 null
2025-02-04 Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning Jinlong Pang et.al. 2502.01968 null
2025-02-04 MPIC: Position-Independent Multimodal Context Caching System for Efficient MLLM Serving Shiju Zhao et.al. 2502.01960 null
2025-02-04 Local minima of the empirical risk in high dimension: General theorems and convex examples Kiana Asgari et.al. 2502.01953 null
2025-02-04 DAMO: Data- and Model-aware Alignment of Multi-modal LLMs Jinda Lu et.al. 2502.01943 null
2025-02-04 Can LLMs Maintain Fundamental Abilities under KV Cache Compression? Xiang Liu et.al. 2502.01941 null
2025-02-04 Toward a Low-Cost Perception System in Autonomous Vehicles: A Spectrum Learning Approach Mohammed Alsakabi et.al. 2502.01940 null
2025-02-04 Distributionally Robust Direct Preference Optimization Zaiyan Xu et.al. 2502.01930 null
2025-02-04 PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling Avery Ma et.al. 2502.01925 null
2025-02-04 LAST SToP For Modeling Asynchronous Time Series Shubham Gupta et.al. 2502.01922 null
2025-02-04 Anomaly Detection via Autoencoder Composite Features and NCE Yalin Liao et.al. 2502.01920 null
2025-02-04 Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales Arian Eamaz et.al. 2502.01908 null
2025-02-04 Rethinking Homogeneity of Vision and Text Tokens in Large Vision-and-Language Models Chia-Wen Kuo et.al. 2502.01906 null
2025-02-04 Conceptual Metaphor Theory as a Prompting Paradigm for Large Language Models Oliver Kramer et.al. 2502.01901 null
2025-02-03 Latent Lexical Projection in Large Language Models: A Novel Approach to Implicit Representation Refinement Ziad Shaker et.al. 2502.01882 null
2025-02-03 SE Arena: Benchmarking Software Engineering Chatbots with Iterative Interactions Zhimin Zhao et.al. 2502.01860 null
2025-02-03 Security and Quality in LLM-Generated Code: A Multi-Language, Multi-Model Analysis Mohammed Kharma et.al. 2502.01853 null
2025-02-03 Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting Keyi Zhu et.al. 2502.01850 link
2025-02-03 Relatively-Secure LLM-Based Steganography via Constrained Markov Decision Processes Yu-Shin Huang et.al. 2502.01827 link
2025-02-03 Agentic Bug Reproduction for Effective Automated Program Repair at Google Runxiang Cheng et.al. 2502.01821 null
2025-02-03 Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning Hanyang Zhao et.al. 2502.01819 null
2025-02-03 SelfCheckAgent: Zero-Resource Hallucination Detection in Generative Large Language Models Diyana Muhammed et.al. 2502.01812 null
2025-02-03 Toward Neurosymbolic Program Comprehension Alejandro Velasco et.al. 2502.01806 null
2025-02-03 Discovering Chunks in Neural Embeddings for Interpretability Shuchen Wu et.al. 2502.01803 null
2025-02-03 Harmful Terms and Where to Find Them: Measuring and Modeling Unfavorable Financial Terms and Conditions in Shopping Websites at Scale Elisa Tsai et.al. 2502.01798 link
2025-01-31 Vintix: Action Model via In-Context Reinforcement Learning Andrey Polubarov et.al. 2501.19400 link
2025-01-31 Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game Mustafa O. Karabag et.al. 2501.19398 link
2025-01-31 Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models Alina Shutova et.al. 2501.19392 link
2025-01-31 Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models Wenzhi Fang et.al. 2501.19389 link
2025-02-03 SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions Dominik Wagner et.al. 2501.19377 null
2025-01-31 Beyond Fixed Horizons: A Theoretical Framework for Adaptive Denoising Diffusions Sören Christensen et.al. 2501.19373 null
2025-01-31 We're Different, We're the Same: Creative Homogeneity Across LLMs Emily Wenger et.al. 2501.19361 null
2025-01-31 Mechanical Properties of the Meninges: Large Language Model Assisted Systematic Review of over 25,000 Studies Brandon P. Chelstrom et.al. 2501.19359 null
2025-01-31 The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking Yuchun Miao et.al. 2501.19358 null
2025-01-31 Addressing the correlation of Stokes-shifted photons emitted from two quantum emitters Adrián Juan-Delgado et.al. 2501.19356 null
2025-01-31 Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023 Ting-Yao E. Hsu et.al. 2501.19353 null
2025-01-31 Towards Adaptive Self-Improvement for Smarter Energy Systems Alexander Sommer et.al. 2501.19340 null
2025-01-31 PixelWorld: Towards Perceiving Everything as Pixels Zhiheng Lyu et.al. 2501.19339 null
2025-01-31 Homogeneity Bias as Differential Sampling Uncertainty in Language Models Messi H. J. Lee et.al. 2501.19337 null
2025-01-31 Reward-Guided Speculative Decoding for Efficient LLM Reasoning Baohao Liao et.al. 2501.19324 null
2025-01-31 MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems Anirudh Chari et.al. 2501.19318 null
2025-01-31 LLM-based Affective Text Generation Quality Based on Different Quantization Values Yarik Menchaca Resendiz et.al. 2501.19317 null
2025-01-31 Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment Gregor Bachmann et.al. 2501.19309 null
2025-02-03 SETS: Leveraging Self-Verification and Self-Correction for Improved Test-Time Scaling Jiefeng Chen et.al. 2501.19306 null
2025-01-31 Beyond checkmate: exploring the creative chokepoints in AI text Nafis Irtiza Tripto et.al. 2501.19301 link
2025-01-31 Offline Learning for Combinatorial Multi-armed Bandits Xutong Liu et.al. 2501.19300 null
2025-01-31 Synthetic User Behavior Sequence Generation with Large Language Models for Smart Homes Zhiyao Xu et.al. 2501.19298 null
2025-01-31 Analysis of LLMs vs Human Experts in Requirements Engineering Cory Hymel et.al. 2501.19297 null
2025-01-31 Low-Cost and Comprehensive Non-textual Input Fuzzing with LLM-Synthesized Input Generators Kunpeng Zhang et.al. 2501.19282 null
2025-01-31 Pheromone-based Learning of Optimal Reasoning Paths Anirudh Chari et.al. 2501.19278 null
2025-01-31 From Assistance to Autonomy -- A Researcher Study on the Potential of AI Support for Qualitative Data Analysis Elisabeth Kirsten et.al. 2501.19275 null
2025-01-31 Jackpot! Alignment as a Maximal Lottery Roberto-Rafael Maura-Rivero et.al. 2501.19266 null
2025-01-31 Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge Amogh Joshi et.al. 2501.19259 null
2025-01-31 A Zero-Shot Generalization Framework for LLM-Driven Cross-Domain Sequential Recommendation Yunzhe Li et.al. 2501.19232 null
2025-01-31 Autonomous Legacy Web Application Upgrades Using a Multi-Agent System Valtteri Ala-Salmi et.al. 2501.19204 link
2025-02-03 Improving the Robustness of Representation Misdirection for Large Language Model Unlearning Dang Huu-Tien et.al. 2501.19202 link
2025-01-31 Efficient Reasoning with Hidden Thinking Xuan Shen et.al. 2501.19201 link
2025-01-31 Enhancing Model Defense Against Jailbreaks with Proactive Safety Reasoning Xianglin Yang et.al. 2501.19180 null
2025-01-31 No Foundations without Foundations -- Why semi-mechanistic models are essential for regulatory biology Luka Kovačević et.al. 2501.19178 null
2025-01-31 Position: Contextual Integrity Washing for Language Models Yan Shvartzshnaider et.al. 2501.19173 null
2025-01-31 Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs Kejia Zhang et.al. 2501.19164 null
2025-01-31 A theoretical framework for overfitting in energy-based modeling Giovanni Catania et.al. 2501.19158 null
2025-01-31 A Tensor-Train Decomposition based Compression of LLMs on Group Vector Systolic Accelerator Sixiao Huang et.al. 2501.19135 null
2025-01-31 Unraveling Zeroth-Order Optimization through the Lens of Low-Dimensional Structured Perturbations Sihwan Park et.al. 2501.19099 null
2025-01-31 Ambient Denoising Diffusion Generative Adversarial Networks for Establishing Stochastic Object Models from Noisy Image Data Xichen Xu et.al. 2501.19094 null
2025-01-31 Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models Jialin Zhao et.al. 2501.19090 null
2025-01-31 Fairness Analysis of CLIP-Based Foundation Models for X-Ray Image Classification Xiangyu Sun et.al. 2501.19086 null
2025-01-31 Enhancing Code Generation for Low-Resource Languages: No Silver Bullet Alessandro Giagnorio et.al. 2501.19085 null
2025-01-31 Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations Dahye Kim et.al. 2501.19066 link
2025-01-31 TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs Yan Sun et.al. 2501.19057 null
2025-01-31 Enabling Autonomic Microservice Management through Self-Learning Agents Fenglin Yu et.al. 2501.19056 null
2025-01-31 Text-to-CAD Generation Through Infusing Visual Feedback in Large Language Models Ruiyu Wang et.al. 2501.19054 null
2025-01-31 Swarm-Gen: Fast Generation of Diverse Feasible Swarm Behaviors Simon Idoko et.al. 2501.19042 link
2025-01-31 Towards the Worst-case Robustness of Large Language Models Huanran Chen et.al. 2501.19040 null
2025-01-31 Beyond Token Compression: A Training-Free Reduction Framework for Efficient Visual Processing in MLLMs Hongliang Li et.al. 2501.19036 null
2025-01-31 XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses Bo Lan et.al. 2501.19034 link
2025-01-31 Multilayer Networks in Neuroimaging Vesna Vuksanovic et.al. 2501.19024 null
2025-01-31 Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation Bin Zhu et.al. 2501.19017 null
2025-01-31 Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities Arjun Krishna et.al. 2501.19012 null
2025-01-31 Visual Autoregressive Modeling for Image Super-Resolution Yunpeng Qu et.al. 2501.18993 null
2025-01-31 Symmetric Pruning of Large Language Models Kai Yi et.al. 2501.18980 null
2025-01-31 BCAT: A Block Causal Transformer for PDE Foundation Models for Fluid Dynamics Yuxuan Liu et.al. 2501.18972 null
2025-01-31 Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping Pu Yang et.al. 2501.18962 link
2025-01-31 Intrinsic Tensor Field Propagation in Large Language Models: A Novel Approach to Contextual Information Flow Alfred Bexley et.al. 2501.18957 null
2025-01-31 LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models Shenghao Fu et.al. 2501.18954 link
2025-01-31 TabFSBench: Tabular Benchmark for Feature Shifts in Open Environment Zi-Jian Cheng et.al. 2501.18935 link
2025-01-31 Language Games as the Pathway to Artificial Superhuman Intelligence Ying Wen et.al. 2501.18924 null
2025-01-31 KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search Haoran Luo et.al. 2501.18922 link
2025-01-31 LLM Program Optimization via Retrieval Augmented Search Sagnik Anupam et.al. 2501.18916 null
2025-01-31 Scaling Laws for Differentially Private Language Models Ryan McKenna et.al. 2501.18914 null
2025-01-31 Streamlining Security Vulnerability Triage with Large Language Models Mohammad Jalili Torkamani et.al. 2501.18908 null
2025-01-31 Trustworthy Evaluation of Generative AI Models Zijun Gao et.al. 2501.18897 null
2025-01-31 Can We Predict the Effect of Prompts? Jae Yong Lee et.al. 2501.18883 null
2025-01-31 Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models Jiaqi Tang et.al. 2501.18863 null
2025-01-31 BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning Han Zhong et.al. 2501.18858 null
2025-01-31 Equivariant Hypergraph Diffusion for Crystal Structure Prediction Yang Liu et.al. 2501.18850 null
2025-01-31 Text Data Augmentation for Large Language Models: A Comprehensive Survey of Methods, Challenges, and Opportunities Yaping Chai et.al. 2501.18845 null
2025-01-31 Trading Inference-Time Compute for Adversarial Robustness Wojciech Zaremba et.al. 2501.18841 null
2025-01-31 Partially Rewriting a Transformer in Natural Language Gonçalo Paulo et.al. 2501.18838 link
2025-01-31 Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Mrinank Sharma et.al. 2501.18837 null
2025-01-31 Pitfalls of defacing whole-head MRI: re-identification risk with diffusion models and compromised research potential Chenyu Gao et.al. 2501.18834 null
2025-01-31 Structural Embedding Projection for Contextual Large Language Model Inference Vincent Enoasmo et.al. 2501.18826 null
2025-01-31 Bridging the Reasoning Gap: Small LLMs Can Plan with Generalised Strategies Andrey Borro et.al. 2501.18817 link
2025-01-31 Large Language Models as Common-Sense Heuristics Andrey Borro et.al. 2501.18816 null
2025-01-30 Compositional Generalization Requires More Than Disentangled Representations Qiyao Liang et.al. 2501.18797 null
2025-01-30 Rope to Nope and Back Again: A New Hybrid Attention Strategy Bowen Yang et.al. 2501.18795 null
2025-01-30 Survey and Improvement Strategies for Gene Prioritization with Large Language Models Matthew Neeley et.al. 2501.18794 null
2025-01-30 LLM-Generated Heuristics for AI Planning: Do We Even Need Domain-Independence Anymore? Alexander Tuisov et.al. 2501.18784 null
2025-01-30 Navigating the Fragrance space Via Graph Generative Models And Predicting Odors Mrityunjay Sharma et.al. 2501.18777 link
2025-01-30 Probabilistic Joint Recovery Method for CO$_2$ Plume Monitoring Zijun Deng et.al. 2501.18761 null
2025-01-30 Synthetic Data Generation for Augmenting Small Samples Dan Liu et.al. 2501.18741 null
2025-01-30 Examining the Robustness of Large Language Models across Language Complexity Jiayi Zhang et.al. 2501.18738 null
2025-01-30 Exploring Audio Editing Features as User-Centric Privacy Defenses Against Emotion Inference Attacks Mohd. Farhan Israk Soumik et.al. 2501.18727 null
2025-01-30 Strong and Controllable 3D Motion Generation Canxuan Gang et.al. 2501.18726 null
2025-01-30 Zero-shot Large Language Models for Long Clinical Text Summarization with Temporal Reasoning Maya Kruse et.al. 2501.18724 null
2025-02-03 Invisible Traces: Using Hybrid Fingerprinting to identify underlying LLMs in GenAI Apps Devansh Bhardwaj et.al. 2501.18712 null
2025-01-30 Regularized second-order optimization of tensor-network Born machines Matan Ben-Dov et.al. 2501.18691 null
2025-01-30 Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting Yansong Qu et.al. 2501.18672 null
2025-01-30 Foundational Models for 3D Point Clouds: A Survey and Outlook Vishal Thengane et.al. 2501.18594 null
2025-01-30 Diffusion Autoencoders are Scalable Image Tokenizers Yinbo Chen et.al. 2501.18593 null
2025-02-03 Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models Hao Dong et.al. 2501.18592 link
2025-01-30 Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Yue Wang et.al. 2501.18585 null
2025-01-30 Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH Evgenii Evstafev et.al. 2501.18576 null
2025-01-30 BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos Lehao Lin et.al. 2501.18565 null
2025-01-30 SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation Haoquan Fang et.al. 2501.18564 link
2025-01-30 Semantic Web and Creative AI -- A Technical Report from ISWS 2023 Raia Abu Ahmad et.al. 2501.18542 null
2025-01-30 Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges Manveer Singh Tamber et.al. 2501.18536 link
2025-01-30 Differentially Private Steering for Large Language Model Alignment Anmol Goel et.al. 2501.18532 link
2025-01-30 Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models Guanqun Cao et.al. 2501.18516 null
2025-01-30 Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Arthur Douillard et.al. 2501.18512 null
2025-01-30 WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Benjamin Feuer et.al. 2501.18511 link
2025-01-30 CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction Peter J. Bentley et.al. 2501.18504 null
2025-01-30 Examining the Expanding Role of Synthetic Data Throughout the AI Development Pipeline Shivani Kapania et.al. 2501.18493 null
2025-01-30 A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models Changshu Liu et.al. 2501.18482 null
2025-01-30 CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization Yanxia Deng et.al. 2501.18475 null
2025-01-30 Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations Chengxi Zeng et.al. 2501.18474 null
2025-01-30 ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation Minghua He et.al. 2501.18460 null
2025-01-30 CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering Yumeng Wang et.al. 2501.18457 null
2025-01-30 GENIE: Generative Note Information Extraction model for structuring EHR data Huaiyuan Ying et.al. 2501.18435 null
2025-01-30 Exploring Potential Prompt Injection Attacks in Federated Military LLMs and Their Mitigation Youngjoon Lee et.al. 2501.18416 null
2025-01-30 RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects Yiteng Tu et.al. 2501.18365 link
2025-01-30 A Video-grounded Dialogue Dataset and Metric for Event-driven Activities Wiradee Imrattanatrai et.al. 2501.18324 link
2025-01-30 Leveraging LLM Agents for Automated Optimization Modeling for SASP Problems: A Graph-RAG based Approach Tianpeng Pan et.al. 2501.18320 null
2025-01-30 Mining for Species, Locations, Habitats, and Ecosystems from Scientific Papers in Invasion Biology: A Large-Scale Exploratory Study with Large Language Models Jennifer D'Souza et.al. 2501.18287 null
2025-01-30 Jailbreaking LLMs' Safeguard with Universal Magic Words for Text Embedding Models Haoyu Liang et.al. 2501.18280 null
2025-01-30 Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence Kevin Roitero et.al. 2501.18265 null
2025-01-30 How to Select Datapoints for Efficient Human Evaluation of NLG Models? Vilém Zouhar et.al. 2501.18251 link
2025-01-30 Statistical multi-metric evaluation and visualization of LLM system predictive performance Samuel Ackerman et.al. 2501.18243 null
2025-01-30 Contextually Structured Token Dependency Encoding for Large Language Models James Blades et.al. 2501.18205 null
2025-01-30 Economic Rationality under Specialization: Evidence of Decision Bias in AI Agents ShuiDe Wen et.al. 2501.18190 null
2025-01-30 Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation Teddy Lazebnik et.al. 2501.18177 null
2025-01-30 Continually Evolved Multimodal Foundation Models for Cancer Prognosis Jie Peng et.al. 2501.18170 null
2025-01-30 RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing Jinyao Guo et.al. 2501.18160 null
2025-01-30 Large Language Models for Cryptocurrency Transaction Analysis: A Bitcoin Case Study Yuchen Lei et.al. 2501.18158 null
2025-01-30 Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models Wanlong Liu et.al. 2501.18154 null
2025-01-30 Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models Qika Lin et.al. 2501.18119 null
2025-01-30 Scaling Inference-Efficient Language Models Song Bian et.al. 2501.18107 null
2025-01-30 Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation Yibo Wang et.al. 2501.18100 link
2025-01-30 AlphaAdam: Asynchronous Masked Optimization with Dynamic Alpha for Selective Updates Da Chang et.al. 2501.18094 null
2025-01-30 Normative Evaluation of Large Language Models with Everyday Moral Dilemmas Pratik S. Sachdeva et.al. 2501.18081 null
2025-01-30 FinanceQA: A Benchmark for Evaluating Financial Analysis Capabilities of Large Language Models Spencer Mateega et.al. 2501.18062 null
2025-01-29 RL-based Query Rewriting with Distilled LLM for online E-Commerce Systems Duy A. Nguyen et.al. 2501.18056 null
2025-01-29 Current Pathology Foundation Models are unrobust to Medical Center Differences Edwin D. de Jong et.al. 2501.18055 null
2025-01-29 A Proximal Operator for Inducing 2:4-Sparsity Jonas M Kübler et.al. 2501.18015 null
2025-01-29 Large Language Models Think Too Fast To Explore Effectively Lan Pan et.al. 2501.18009 null
2025-01-29 Fault Localization via Fine-tuning Large Language Models with Mutation Generated Stack Traces Neetha Jambigi et.al. 2501.18005 null
2025-01-29 InnerThoughts: Disentangling Representations and Predictions in Large Language Models Didier Chételat et.al. 2501.17994 null
2025-01-29 Can Generative LLMs Create Query Variants for Test Collections? An Exploratory Study Marwah Alaofi et.al. 2501.17981 link
2025-01-29 Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization Zishun Yu et.al. 2501.17974 null
2025-01-29 "I Would Never Trust Anything Western": Kumu (Educator) Perspectives on Use of LLMs for Culturally Revitalizing CS Education in Hawaiian Schools Manas Mhasakar et.al. 2501.17942 null
2025-01-29 DReSS: Data-driven Regularized Structured Streamlining for Large Language Models Mingkuan Feng et.al. 2501.17905 null
2025-01-29 Learning Beyond the Surface: How Far Can Continual Pre-Training with LoRA Enhance LLMs' Domain-Specific Insight Learning? Pouya Pezeshkpour et.al. 2501.17840 link
2025-01-29 Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology Sobhan Hemati et.al. 2501.17822 null
2025-01-30 Leveraging Multimodal LLM for Inspirational User Interface Search Seokhyeon Park et.al. 2501.17799 link
2025-01-29 BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights Chan-Jan Hsu et.al. 2501.17790 null
2025-01-29 AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing Peter Pak et.al. 2501.17784 null
2025-01-29 2SSP: A Two-Stage Framework for Structured Pruning of LLMs Fabrizio Sandri et.al. 2501.17771 link
2025-01-29 Generative Unordered Flow for Set-Structured Data Generation Yangming Li et.al. 2501.17770 null
2025-01-29 Hybrid Graphs for Table-and-Text based Question Answering using LLMs Ankush Agarwal et.al. 2501.17767 null
2025-01-29 On the Partitioning of GPU Power among Multi-Instances Tirth Vamja et.al. 2501.17752 null
2025-01-29 Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation Aitor Arrieta et.al. 2501.17749 null
2025-01-29 A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches Ana R. Baião et.al. 2501.17729 null
2025-01-29 Using Code Generation to Solve Open Instances of Combinatorial Design Problems Christopher D. Rosin et.al. 2501.17725 link
2025-01-29 RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts Eujeong Choi et.al. 2501.17715 link
2025-01-29 Source-Channel Separation Theorems for Distortion Perception Coding Chao Tian et.al. 2501.17706 null
2025-01-29 Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching Xuzhe Dang et.al. 2501.17665 null
2025-01-30 In-Context Meta LoRA Generation Yihua Shao et.al. 2501.17635 null
2025-01-29 Uncertainty Quantification and Decomposition for LLM-based Recommendation Wonbin Kweon et.al. 2501.17630 link
2025-01-29 The Imitation Game According To Turing Sharon Temtsin et.al. 2501.17629 null
2025-01-29 Structured Context Recomposition for Large Language Models Using Probabilistic Layer Realignment Jonathan Teel et.al. 2501.17617 null
2025-01-29 Semantic Consistency Regularization with Large Language Models for Semi-supervised Sentiment Analysis Kunrong Li et.al. 2501.17598 null
2025-01-30 Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models Behraj Khan et.al. 2501.17595 null
2025-01-29 GLLM: Self-Corrective G-Code Generation using Large Language Models with User Feedback Mohamed Abdelaal et.al. 2501.17584 null
2025-01-29 CSEval: Towards Automated, Multi-Dimensional, and Reference-Free Counterspeech Evaluation using Auto-Calibrated LLMs Amey Hengle et.al. 2501.17581 null
2025-01-29 Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding Marco Pasini et.al. 2501.17578 null
2025-01-29 Query-Aware Learnable Graph Pooling Tokens as Prompt for Large Language Models Wooyoung Kim et.al. 2501.17549 null
2025-01-29 Towards Training-Free Open-World Classification with 3D Generative Models Xinzhe Xia et.al. 2501.17547 null
2025-01-29 Is Conversational XAI All You Need? Human-AI Decision Making With a Conversational XAI Assistant Gaole He et.al. 2501.17546 link
2025-01-29 Towards Supporting Penetration Testing Education with Large Language Models: an Evaluation and Comparison Martin Nizon-Deladoeuille et.al. 2501.17539 null
2025-01-29 Neural Spelling: A Spell-Based BCI System for Language Neural Decoding Xiaowei Jiang et.al. 2501.17489 null
2025-01-29 DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance Seffi Cohen et.al. 2501.17479 link
2025-01-29 AugmenTest: Enhancing Tests with LLM-Driven Oracles Shaker Mahmud Khandaker et.al. 2501.17461 null
2025-01-29 Large Language Models for Single-Step and Multi-Step Flight Trajectory Prediction Kaiwei Luo et.al. 2501.17459 null
2025-01-29 Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation Tiansheng Huang et.al. 2501.17433 link
2025-01-29 Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models Yuxuan Li et.al. 2501.17420 null
2025-01-29 MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs Ved Sirdeshmukh et.al. 2501.17399 link
2025-01-29 Learning Free Token Reduction for Multi-Modal LLM Zihui Zhao et.al. 2501.17391 null
2025-01-29 Context-Aware Semantic Recomposition Mechanism for Large Language Models Richard Katrix et.al. 2501.17386 null
2025-01-28 Deep-and-Wide Learning: Enhancing Data-Driven Inference via Synergistic Learning of Inter- and Intra-Data Representations Md Tauhidul Islam et.al. 2501.17347 null
2025-01-28 Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction Mingyu Derek Ma et.al. 2501.17326 null
2025-01-28 CardiCat: a Variational Autoencoder for High-Cardinality Tabular Data Lee Carlin et.al. 2501.17324 null
2025-01-30 Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding Yun-Shiuan Chuang et.al. 2501.17310 null
2025-01-28 "Ownership, Not Just Happy Talk": Co-Designing a Participatory Large Language Model for Journalism Emily Tseng et.al. 2501.17299 null
2025-01-28 Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization Zilu Tang et.al. 2501.17295 null
2025-01-28 Fine-Tuning Open-Source Large Language Models to Improve Their Performance on Radiation Oncology Tasks: A Feasibility Study to Investigate Their Potential Clinical Applications in Radiation Oncology Peilong Wang et.al. 2501.17286 null
2025-01-30 From Natural Language to Extensive-Form Game Representations Shilong Deng et.al. 2501.17282 link
2025-01-28 Engineering Point Defects in MoS2 for Tailored Material Properties using Large Language Models Abdalaziz Al-Maeeni et.al. 2501.17279 null
2025-01-28 Tailored Truths: Optimizing LLM Persuasion with Personalization and Fabricated Statistics Jasper Timm et.al. 2501.17273 link
2025-01-28 Integrating Reinforcement Learning and AI Agents for Adaptive Robotic Interaction and Assistance in Dementia Care Fengpei Yuan et.al. 2501.17206 null
2025-01-28 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Tianzhe Chu et.al. 2501.17161 null
2025-01-28 FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data Deren Lei et.al. 2501.17144 link
2025-01-28 ASTRAL: Automated Safety Testing of Large Language Models Miriam Ugarte et.al. 2501.17132 null
2025-01-28 Optimizing Large Language Model Training Using FP4 Quantization Ruizhe Wang et.al. 2501.17116 null
2025-01-28 Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction Carl-Leander Henneking et.al. 2501.17112 null
2025-01-28 Goodness of Fit for Bayesian Generative Models with Applications in Population Genetics Guillaume Le Mailloux et.al. 2501.17107 link
2025-01-28 Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving Evgenii Evstafev et.al. 2501.17084 null
2025-01-28 Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding Akash Kumar et.al. 2501.17053 null
2025-01-28 Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models Minghan Li et.al. 2501.17039 null
2025-01-28 Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies Manojkumar Parmar et.al. 2501.17030 null
2025-01-28 Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs Alessandro Midolo et.al. 2501.17024 link
2025-01-28 Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement Kei Katsumata et.al. 2501.17022 link
2025-01-28 MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition Philippe Pasquier et.al. 2501.17011 null
2025-01-28 Large Language Models for Code Generation: The Practitioners Perspective Zeeshan Rasheed et.al. 2501.16998 link
2025-01-28 Artificial Intelligence Clones Annie Liang et.al. 2501.16996 null
2025-01-28 FedEFM: Federated Endovascular Foundation Model with Unseen Data Tuong Do et.al. 2501.16992 null
2025-01-28 Generative quantum combinatorial optimization by means of a novel conditional generative quantum eigensolver Shunya Minami et.al. 2501.16986 null
2025-01-28 Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling Hongzhi Huang et.al. 2501.16975 null
2025-01-28 Instantiation-based Formalization of Logical Reasoning Tasks using Language Models and Logical Solvers Mohammad Raza et.al. 2501.16961 null
2025-01-28 Multiple Abstraction Level Retrieve Augment Generation Zheng Zheng et.al. 2501.16952 null
2025-01-29 TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models Makoto Shing et.al. 2501.16937 null
2025-01-28 Detecting harassment and defamation in cyberbullying with emotion-adaptive training Peiling Yi et.al. 2501.16925 link
2025-01-28 RDMM: Fine-Tuned LLM Models for On-Device Robotic Decision Making with Enhanced Contextual Awareness in Specific Domains Shady Nasrat et.al. 2501.16899 link
2025-01-28 Machine-learning semi-local exchange-correlation functionals for Kohn-Sham density functional theory of the Hubbard model Eoghan Cronin et.al. 2501.16893 link
2025-01-28 Irony Detection, Reasoning and Understanding in Zero-shot Learning Peiling Yi et.al. 2501.16884 null
2025-01-28 Comparing Human and LLM Generated Code: The Jury is Still Out! Sherlock A. Licorish et.al. 2501.16857 null
2025-01-28 Adapting Network Information to Semantics for Generalizable and Plug-and-Play Multi-Scenario Network Diagnosis Tiao Tan et.al. 2501.16842 null
2025-01-28 Misspellings in Natural Language Processing: A survey Gianluca Sperduti et.al. 2501.16836 null
2025-01-28 DIRIGENt: End-To-End Robotic Imitation of Human Demonstrations Based on a Diffusion Model Josua Spisak et.al. 2501.16800 null
2025-01-28 Algorithm for Automatic Legislative Text Consolidation Matias Etcheverry et.al. 2501.16794 null
2025-01-28 Exponential Family Attention Kevin Christian Wibisono et.al. 2501.16790 link
2025-01-28 Exploring the Role of Explicit Temporal Modeling in Multimodal Large Language Models for Video Understanding Yun Li et.al. 2501.16786 null
2025-01-28 TORCHLIGHT: Shedding LIGHT on Real-World Attacks on Cloudless IoT Devices Concealed within the Tor Network Yumingzhi Pan et.al. 2501.16784 null
2025-01-28 A Stochastic Dynamical Theory of LLM Self-Adversariality: Modeling Severity Drift as a Critical Process Jack David Carson et.al. 2501.16783 null
2025-01-29 Beyond-Labels: Advancing Open-Vocabulary Segmentation With Vision-Language Models Muhammad Atta ur Rahman et.al. 2501.16769 null
2025-01-28 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Chenguo Lin et.al. 2501.16764 null
2025-01-28 HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns Xinyue Shen et.al. 2501.16750 link
2025-01-28 Through the Prism of Culture: Evaluating LLMs' Understanding of Indian Subcultures and Traditions Garima Chhikara et.al. 2501.16748 null
2025-01-28 LLM Assisted Anomaly Detection Service for Site Reliability Engineers: Enhancing Cloud Infrastructure Resilience Nimesh Jha et.al. 2501.16744 null
2025-01-28 Distilling Large Language Models for Network Active Queue Management Deol Satish et.al. 2501.16734 null
2025-01-28 xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking Sunbowen Lee et.al. 2501.16727 link
2025-01-28 One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning Chunpeng Zhou et.al. 2501.16720 null
2025-01-28 Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection Hengzhuang Li et.al. 2501.16718 link
2025-01-28 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow Yueen Ma et.al. 2501.16698 null
2025-01-28 MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark Dongyi Yi et.al. 2501.16688 null
2025-01-28 Auto-Differentiating Any LLM Workflow: A Farewell to Manual Prompting Li Yin et.al. 2501.16673 link
2025-01-28 VeriFact: Verifying Facts in LLM-Generated Clinical Text with Electronic Health Records Philip Chung et.al. 2501.16672 link
2025-01-28 Contextual Reinforcement in Multimodal Token Compression for Large Language Models Naderdel Piero et.al. 2501.16658 null
2025-01-28 Large Language Model Critics for Execution-Free Evaluation of Code Changes Aashish Yadavally et.al. 2501.16655 link
2025-01-28 Molecular-driven Foundation Model for Oncologic Pathology Anurag Vaidya et.al. 2501.16652 link
2025-01-28 DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models Zeping Min et.al. 2501.16650 null
2025-01-28 An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue Koji Inoue et.al. 2501.16643 null
2025-01-28 CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs Jinlan Fu et.al. 2501.16629 link
2025-01-28 Few-Shot Optimized Framework for Hallucination Detection in Resource-Limited NLP Systems Baraa Hikal et.al. 2501.16616 null
2025-01-28 Sparse Autoencoders Trained on the Same Data Learn Different Features Gonçalo Paulo et.al. 2501.16615 null
2025-01-28 Fine-Tuned Language Models as Space Systems Controllers Enrico M. Zucchelli et.al. 2501.16588 null
2025-01-27 AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models Zheng Lian et.al. 2501.16566 null
2025-01-27 LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation Farzad Farhadzadeh et.al. 2501.16559 null
2025-01-27 Distributional Information Embedding: A Framework for Multi-bit Watermarking Haiyun He et.al. 2501.16558 null
2025-01-27 PackDiT: Joint Human Motion and Text Generation via Mutual Prompting Zhongyu Jiang et.al. 2501.16551 null
2025-01-27 PhysAnimator: Physics-Guided Generative Cartoon Animation Tianyi Xie et.al. 2501.16550 null
2025-01-27 Sample-Efficient Behavior Cloning Using General Domain Knowledge Feiyu Zhu et.al. 2501.16546 null
2025-01-27 Generalized Mission Planning for Heterogeneous Multi-Robot Teams via LLM-constructed Hierarchical Trees Piyush Gupta et.al. 2501.16539 null
2025-01-27 Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs Jean-Charles Noirot Ferrand et.al. 2501.16534 null
2025-01-27 A comparison of data filtering techniques for English-Polish LLM-based machine translation in the biomedical domain Jorge del Pozo Lérida et.al. 2501.16533 null
2025-01-27 Programming by Examples Meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction Atharva Naik et.al. 2501.16524 null
2025-01-27 How well can LLMs Grade Essays in Arabic? Rayed Ghazawi et.al. 2501.16516 null
2025-01-27 Deception in LLMs: Self-Preservation and Autonomous Goals in Large Language Models Sudarshan Kamath Barkur et.al. 2501.16513 null
2025-01-27 Smoothed Embeddings for Robust Language Models Ryo Hase et.al. 2501.16497 null
2025-01-27 Explaining GitHub Actions Failures with Large Language Models: Challenges, Insights, and Limitations Pablo Valenzuela-Toledo et.al. 2501.16495 null
2025-01-27 Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM Payal Kamboj et.al. 2501.16481 link
2025-01-27 Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation Philip Hughes et.al. 2501.16467 null
2025-01-27 CoCoNUT: Structural Code Understanding does not fall out of a tree Claas Beger et.al. 2501.16456 link
2025-01-27 Detecting Zero-Day Attacks in Digital Substations via In-Context Learning Faizan Manzoor et.al. 2501.16453 null
2025-01-27 360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation Hamed Firooz et.al. 2501.16450 null
2025-01-27 DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation Han Sun et.al. 2501.16410 null
2025-01-27 Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology Meiyun Cao et.al. 2501.16309 null
2025-01-27 RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval Long Nguyen et.al. 2501.16303 null
2025-01-27 Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width Zheng Liu et.al. 2501.16302 null
2025-01-27 Large Models in Dialogue for Active Perception and Anomaly Detection Tzoulio Chamiti et.al. 2501.16300 link
2025-01-27 FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers Renshan Zhang et.al. 2501.16297 null
2025-01-27 Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models Jing Zhang et.al. 2501.16282 null
2025-01-27 Do LLMs Have Visualization Literacy? An Evaluation on Modified Visualizations to Test Generalization in Data Interpretation Jiayi Hong et.al. 2501.16277 link
2025-01-27 URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT Long Nguyen et.al. 2501.16276 null
2025-01-27 A foundation model for human-AI collaboration in medical literature mining Zifeng Wang et.al. 2501.16255 null
2025-01-27 Multi-Agent Geospatial Copilots for Remote Sensing Workflows Chaehong Lee et.al. 2501.16254 null
2025-01-27 Zero-Shot Decision Tree Construction via Large Language Models Lucas Carrasco et.al. 2501.16247 null
2025-01-27 CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation Xiaochuan Ma et.al. 2501.16246 null
2025-01-27 Phase Transitions in Large Language Models and the $O(N)$ Model Youran Sun et.al. 2501.16241 null
2025-01-27 AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses Runze Cai et.al. 2501.16240 null
2025-01-28 Distilling foundation models for robust and efficient models in digital pathology Alexandre Filiot et.al. 2501.16239 null
2025-01-27 Language-Based Bayesian Optimization Research Assistant (BORA) Abdoulatif Cissé et.al. 2501.16224 null
2025-01-27 Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models Huayu Li et.al. 2501.16215 link
2025-01-27 Provence: efficient and robust context pruning for retrieval-augmented generation Nadezhda Chirkova et.al. 2501.16214 null
2025-01-27 Raiders of the Lost Dependency: Fixing Dependency Conflicts in Python using LLMs Antony Bartlett et.al. 2501.16191 null
2025-01-27 SWIFT: Mapping Sub-series with Wavelet Decomposition Improves Time Series Forecasting Wenxuan Xie et.al. 2501.16178 link
2025-01-27 BAG: Body-Aligned 3D Wearable Asset Generation Zhongjin Luo et.al. 2501.16177 null
2025-01-27 Will Systems of LLM Agents Cooperate: An Investigation into a Social Dilemma Richard Willis et.al. 2501.16173 link
2025-01-27 MetaDecorator: Generating Immersive Virtual Tours through Multimodality Shuang Xie et.al. 2501.16164 null
2025-01-27 CITYWALK: Enhancing LLM-Based C++ Unit Test Generation via Project-Dependency Awareness and Language-Specific Knowledge Yuwei Zhang et.al. 2501.16155 null
2025-01-27 AdaCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Chain-of-Thought Xin Huang et.al. 2501.16154 null
2025-01-27 AI Agents for Computer Use: A Review of Instruction-based Computer Control, GUI Automation, and Operator Assistants Pascal J. Sager et.al. 2501.16150 null
2025-01-27 PATCH: Empowering Large Language Model with Programmer-Intent Guidance and Collaborative-Behavior Simulation for Automatic Bug Fixing Yuwei Zhang et.al. 2501.16149 null
2025-01-27 SampleLLM: Optimizing Tabular Data Synthesis in Recommendations Jingtong Gao et.al. 2501.16125 null
2025-01-27 Using Generative Models to Produce Realistic Populations of UK Windstorms Yee Chun Tsoi et.al. 2501.16110 null
2025-01-27 Integration of LLM Quality Assurance into an NLG System Ching-Yi Chen et.al. 2501.16078 null
2025-01-27 PISCO: Pretty Simple Compression for Retrieval-Augmented Generation Maxime Louis et.al. 2501.16075 null
2025-01-27 A generative material transformer using Wyckoff representation Pierre-Paul De Breuck et.al. 2501.16051 null
2025-01-27 Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation Xing Zhang et.al. 2501.16050 null
2025-01-27 PRISMe: A Novel LLM-Powered Tool for Interactive Privacy Policy Assessment Vincent Freiberger et.al. 2501.16033 null
2025-01-27 FDLLM: A Text Fingerprint Detection Method for LLMs in Multi-Language, Multi-Domain Black-Box Environments Zhiyuan Fu et.al. 2501.16029 null
2025-01-27 Transformability reveals the interplay of dynamics across different network orders Ming Xie et.al. 2501.16016 null
2025-01-27 TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference Jack Min Ong et.al. 2501.16007 null
2025-01-27 EDSep: An Effective Diffusion-Based Method for Speech Source Separation Jinwei Dong et.al. 2501.15965 null
2025-01-27 Rethinking the Bias of Foundation Model under Long-tailed Distribution Jiahao Chen et.al. 2501.15955 null
2025-01-27 Understanding Long Videos via LLM-Powered Entity Relation Graphs Meng Chu et.al. 2501.15953 null
2025-01-27 TimeHF: Billion-Scale Time Series Models Guided by Human Feedback Yongzhi Qi et.al. 2501.15942 null
2025-01-27 SkillScope: A Tool to Predict Fine-Grained Skills Needed to Solve Issues on GitHub Benjamin C. Carter et.al. 2501.15922 null
2025-01-27 Parametric Retrieval Augmented Generation Weihang Su et.al. 2501.15915 link
2025-01-27 Robust Mobile Robot Path Planning via LLM-Based Dynamic Waypoint Generation Muhammad Taha Tariq et.al. 2501.15901 null
2025-01-27 Investigating the Sensitivity of Pre-trained Audio Embeddings to Common Effects Victor Deng et.al. 2501.15900 null
2025-01-27 Adaptive Width Neural Networks Federico Errica et.al. 2501.15889 null
2025-01-27 LCTG Bench: LLM Controlled Text Generation Benchmark Kentaro Kurihara et.al. 2501.15875 link
2025-01-27 LLM-attacker: Enhancing Closed-loop Adversarial Scenario Generation for Autonomous Driving with Large Language Models Yuewen Mei et.al. 2501.15850 null
2025-01-27 SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model Delin Qu et.al. 2501.15830 null
2025-01-27 Aging-aware CPU Core Management for Embodied Carbon Amortization in Cloud LLM Inference Tharindu B. Hewage et.al. 2501.15829 link
2025-01-27 MADP: Multi-Agent Deductive Planning for Enhanced Cognitive-Behavioral Mental Health Question Answer Qi Chen et.al. 2501.15826 null
2025-01-27 LemmaHead: RAG Assisted Proof Generation Using Large Language Models Tianbo Yang et.al. 2501.15797 null
2025-01-27 Can Multimodal Large Language Models be Guided to Improve Industrial Anomaly Detection? Zhiling Chen et.al. 2501.15795 null
2025-01-27 Harnessing Diverse Perspectives: A Multi-Agent Framework for Enhanced Error Detection in Knowledge Graphs Yu Li et.al. 2501.15791 link
2025-01-27 Memorization and Regularization in Generative Diffusion Models Ricardo Baptista et.al. 2501.15785 link
2025-01-27 Large Language Models to Diffusion Finetuning Edoardo Cetin et.al. 2501.15781 null
2025-01-27 Is It Navajo? Accurate Language Detection in Endangered Athabaskan Languages Ivory Yang et.al. 2501.15773 link
2025-01-27 GraphICL: Unlocking Graph Learning Potential in LLMs through Structured Prompt Design Yuanfu Sun et.al. 2501.15755 null
2025-01-27 IndicMMLU-Pro: Benchmarking the Indic Large Language Models Sankalp KJ et.al. 2501.15747 null
2025-01-27 Gensors: Authoring Personalized Visual Sensors with Multimodal Foundation Models and Reasoning Michael Xieyang Liu et.al. 2501.15727 null
2025-01-27 A Survey on Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluation Tasks Dong Li et.al. 2501.15724 null
2025-01-27 On Parallelism in Music and Language: A Perspective from Symbol Emergence Systems based on Probabilistic Generative Models Tadahiro Taniguchi et.al. 2501.15721 null
2025-01-26 Adapting Biomedical Abstracts into Plain language using Large Language Models Haritha Gangavarapu et.al. 2501.15700 null
2025-01-26 TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs Yuxuan Gu et.al. 2501.15674 null
2025-01-26 Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting Yuxin Zhang et.al. 2501.15641 null
2025-01-26 BoKDiff: Best-of-K Diffusion Alignment for Target-Specific 3D Molecule Generation Ali Khodabandeh Yalabadi et.al. 2501.15631 link
2025-01-26 Improving Estonian Text Simplification through Pretrained Language Models and Custom Datasets Eduard Barbu et.al. 2501.15624 null
2025-01-26 Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning Zeyu Gan et.al. 2501.15602 link
2025-01-26 Evaluating an LLM-Powered Chatbot for Cognitive Restructuring: Insights from Mental Health Professionals Yinzhou Wang et.al. 2501.15599 null
2025-01-26 Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images Sichen Zhu et.al. 2501.15598 link
2025-01-26 SedarEval: Automated Evaluation using Self-Adaptive Rubrics Zhiyuan Fan et.al. 2501.15595 link
2025-01-26 SCP-116K: A High-Quality Problem-Solution Dataset and a Generalized Pipeline for Automated Extraction in the Higher Education Science Domain Dakuan Lu et.al. 2501.15587 link
2025-01-26 Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework Yuhong Sun et.al. 2501.15581 null
2025-01-26 Instruction Tuning for Story Understanding and Generation with Weak Supervision Yangshu Yuan et.al. 2501.15574 null
2025-01-26 Cross-Cultural Fashion Design via Interactive Large Language Models and Diffusion Models Spencer Ramsey et.al. 2501.15571 null
2025-01-26 ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Lin Yueyu et.al. 2501.15570 link
2025-01-26 Ocean-OCR: Towards General OCR Application via a Vision-Language Model Song Chen et.al. 2501.15558 link
2025-01-26 Advancing Generative Artificial Intelligence and Large Language Models for Demand Side Management with Electric Vehicles Hanwen Zhang et.al. 2501.15544 null
2025-01-26 Estimating Committor Functions via Deep Adaptive Sampling on Rare Transition Paths Yueyang Wang et.al. 2501.15522 null
2025-01-26 Domain Adaptation from Generated Multi-Weather Images for Unsupervised Maritime Object Classification Dan Song et.al. 2501.15503 null
2025-01-26 Unveiling the Potential of Multimodal Retrieval Augmented Generation with Planning Xiaohan Yu et.al. 2501.15470 null
2025-01-26 Data-adaptive Safety Rules for Training Reward Models Xiaomin Li et.al. 2501.15453 null
2025-01-26 OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas Xiaoyang Wang et.al. 2501.15427 null
2025-01-26 Visual Generation Without Guidance Huayu Chen et.al. 2501.15420 link
2025-01-26 AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement Junan Zhang et.al. 2501.15417 null
2025-01-26 The Potential of Large Language Models in Supply Chain Management: Advancing Decision-Making, Efficiency, and Innovation Raha Aghaei et.al. 2501.15411 null
2025-01-26 Semantic Layered Embedding Diffusion in Large Language Models for Multi-Contextual Consistency Irin Kabakum et.al. 2501.15405 null
2025-01-26 How Green are Neural Language Models? Analyzing Energy Consumption in Text Summarization Fine-tuning Tohida Rehman et.al. 2501.15398 null
2025-01-26 Zero-Shot Interactive Text-to-Image Retrieval via Diffusion-Augmented Representations Zijun Long et.al. 2501.15379 null
2025-01-26 How to Mitigate Information Loss in Knowledge Graphs for GraphRAG: Leveraging Triple Context Restoration and Query-Driven Feedback Manzong Huang et.al. 2501.15378 null
2025-01-26 Evaluating the Effectiveness of XAI Techniques for Encoder-Based Language Models Melkamu Abay Mersha et.al. 2501.15374 null
2025-01-26 Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis Robinson Umeike et.al. 2501.15370 null
2025-01-26 Decentralized Low-Rank Fine-Tuning of Large Language Models Sajjad Ghiasvand et.al. 2501.15361 null
2025-01-26 Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection Bo Yang et.al. 2501.15355 null
2025-01-25 Fairness in LLM-Generated Surveys Andrés Abeliuk et.al. 2501.15351 null
2025-01-25 Between Puppet and Actor: Reframing Authorship in this Age of AI Agents Yuqian Sun et.al. 2501.15346 null
2025-01-25 Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data Jiajie Li et.al. 2501.15326 null
2025-01-25 ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning Shangqian Gao et.al. 2501.15316 null
2025-01-25 The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders? Ayo Adedeji et.al. 2501.15310 null
2025-01-25 You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning Ayan Sengupta et.al. 2501.15296 null
2025-01-24 HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation Xin Zhou et.al. 2501.14729 link
2025-01-24 Do LLMs Provide Consistent Answers to Health-Related Questions across Languages? Ipek Baris Schlicht et.al. 2501.14719 null
2025-01-24 Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models Naihao Deng et.al. 2501.14717 null
2025-01-24 FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing James Seale Smith et.al. 2501.14713 null
2025-01-24 The Karp Dataset Mason DiCicco et.al. 2501.14705 null
2025-01-24 Rethinking Table Instruction Tuning Naihao Deng et.al. 2501.14693 null
2025-01-24 Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST Fuping Wu et.al. 2501.14685 null
2025-01-24 An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations Shabnam Hassani et.al. 2501.14683 null
2025-01-24 Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning Jisi Zhang et.al. 2501.14680 null
2025-01-24 MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications Yixing Jiang et.al. 2501.14654 link
2025-01-24 Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion Ziyao Xu et.al. 2501.14649 link
2025-01-24 Towards Scalable Topological Regularizers Hiu-Tung Wong et.al. 2501.14641 null
2025-01-24 Recommending Actionable Strategies: A Semantic Approach to Integrating Analytical Frameworks with Decision Heuristics Renato Ghisellini et.al. 2501.14634 null
2025-01-24 Extracting Problem Structure with LLMs for Optimized SAT Local Search André Schilder et.al. 2501.14630 null
2025-01-24 Single-neuron deep generative model uncovers underlying physics of neuronal activity in Ca imaging data Jordi Abante et.al. 2501.14615 null
2025-01-24 ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations Tianming Liang et.al. 2501.14607 null
2025-01-24 Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research Hamid Sarmadi et.al. 2501.14546 null
2025-01-24 VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning Benjamin Callewaert et.al. 2501.14540 null
2025-01-24 Design and Implementation of a Psychiatry Resident Training System Based on Large Language Models Zhenguang Zhong et.al. 2501.14530 link
2025-01-24 Scene Understanding Enabled Semantic Communication with Open Channel Coding Zhe Xiang et.al. 2501.14520 null
2025-01-24 Real-world Edge Neural Network Implementations Leak Private Interactions Through Physical Side Channel Zhuoran Liu et.al. 2501.14512 null
2025-01-24 Automated Assignment Grading with Large Language Models: Insights From a Bioinformatics Course Pavlin G. Poličar et.al. 2501.14499 null
2025-01-24 Evaluating and Improving Graph to Text Generation with Large Language Models Jie He et.al. 2501.14497 link
2025-01-24 RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques Zhengyang Tang et.al. 2501.14492 link
2025-01-24 Pesti-Gen: Unleashing a Generative Molecule Approach for Toxicity Aware Pesticide Design Taehan Kim et.al. 2501.14469 null
2025-01-24 Boundary Value Test Input Generation Using Prompt Engineering with LLMs: Fault Detection and Coverage Analysis Xiujing Guo et.al. 2501.14465 null
2025-01-24 Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing Zeping Yu et.al. 2501.14457 null
2025-01-24 Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains Xu Chu et.al. 2501.14431 null
2025-01-24 GraphBC: Improving LLMs for Better Graph Data Processing Xu Chu et.al. 2501.14427 null
2025-01-24 CENTS: Generating synthetic electricity consumption time series for rare and unseen scenarios Michael Fuest et.al. 2501.14426 null
2025-01-24 DeepFlow: Serverless Large Language Model Serving at Scale Junhao Hu et.al. 2501.14417 null
2025-01-24 SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation Shengjie Wang et.al. 2501.14400 null
2025-01-24 ECTIL: Label-efficient Computational Tumour Infiltrating Lymphocyte (TIL) assessment in breast cancer: Multicentre validation in 2,340 patients with breast cancer Yoni Schirris et.al. 2501.14379 link
2025-01-24 DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing Xinyu Ma et.al. 2501.14371 link
2025-01-24 Uncovering the bias in the evidence for dynamical dark energy through minimal and generalized modeling approaches Ziad Sakr et.al. 2501.14366 null
2025-01-24 FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration Kai-Tuo Xu et.al. 2501.14350 link
2025-01-24 Chain-of-Retrieval Augmented Generation Liang Wang et.al. 2501.14342 null
2025-01-24 Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts Clément Desroches et.al. 2501.14334 null
2025-01-24 Assessing Large Language Models in Comprehending and Verifying Concurrent Programs across Memory Models Ridhi Jain et.al. 2501.14326 null
2025-01-24 PAID: A Framework of Product-Centric Advertising Image Design Hongyu Chen et.al. 2501.14316 null
2025-01-24 Locality-aware Fair Scheduling in LLM Serving Shiyi Cao et.al. 2501.14312 null
2025-01-24 A Zero-Shot LLM Framework for Automatic Assignment Grading in Higher Education Calvin Yeung et.al. 2501.14305 link
2025-01-24 MASTER: A Multi-Agent System with LLM Specialized MCTS Bingzheng Gan et.al. 2501.14304 null
2025-01-24 Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph Xujian Liang et.al. 2501.14300 link
2025-01-24 Multi-stage Large Language Model Pipelines Can Outperform GPT-4o in Relevance Assessment Julian A. Schnabel et.al. 2501.14296 null
2025-01-24 Examining Alignment of Large Language Models through Representative Heuristics: The Case of Political Stereotypes Sullam Jeoung et.al. 2501.14294 link
2025-01-24 Advances in Temporal Point Processes: Bayesian, Deep, and LLM Approaches Feng Zhou et.al. 2501.14291 null
2025-01-24 Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation Sadegh Mahdavi et.al. 2501.14275 link
2025-01-24 Siren: A Learning-Based Multi-Turn Attack Framework for Simulating Real-World Human Jailbreak Behaviors Yi Zhao et.al. 2501.14250 link
2025-01-24 Humanity's Last Exam Long Phan et.al. 2501.14249 null
2025-01-24 Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game Rong Ye et.al. 2501.14225 null
2025-01-24 Top Ten Challenges Towards Agentic Neural Graph Databases Jiaxin Bai et.al. 2501.14224 null
2025-01-24 TFG-Flow: Training-free Guidance in Multimodal Generative Flow Haowei Lin et.al. 2501.14216 null
2025-01-24 Serving Long-Context LLMs at the Mobile Edge: Test-Time Reinforcement Learning-based Model Caching and Inference Offloading Minrui Xu et.al. 2501.14205 null
2025-01-24 VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking Runyi Hu et.al. 2501.14195 link
2025-01-24 Distributed Multi-Agent Coordination Using Multi-Modal Foundation Models Saaduddin Mahmud et.al. 2501.14189 null
2025-01-24 GeoSim.AI: AI assistants for numerical simulations in geomechanics Yared W. Bekele et.al. 2501.14186 null
2025-01-24 AI Chatbots as Professional Service Agents: Developing a Professional Identity Wenwen Li et.al. 2501.14179 null
2025-01-24 Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models Yile Gu et.al. 2501.14170 null
2025-01-24 Test-Time Code-Switching for Cross-lingual Aspect Sentiment Triplet Extraction Dongming Sheng et.al. 2501.14144 null
2025-01-23 Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation Derek Yotheringhay et.al. 2501.14119 null
2025-01-23 Domain-Factored Untrained Deep Prior for Spectrum Cartography Subash Timilsina et.al. 2501.14116 null
2025-01-23 MedSlice: Fine-Tuned Large Language Models for Secure Clinical Note Sectioning Joshua Davis et.al. 2501.14105 link
2025-01-23 StreamingRAG: Real-time Contextual Retrieval and Generation Framework Murugan Sankaradas et.al. 2501.14101 null
2025-01-23 Enhancing Biomedical Relation Extraction with Directionality Po-Ting Lai et.al. 2501.14079 link
2025-01-23 LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language Yubin Ge et.al. 2501.14073 null
2025-01-23 Efficient 2D CT Foundation Model for Contrast Phase Classification Benjamin Hou et.al. 2501.14066 null
2025-01-23 Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models Jakob Krogh Petersen et.al. 2501.14051 link
2025-01-23 LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps Andrey Palaev et.al. 2501.14046 link
2025-01-23 Leveraging Large Language Models to Analyze Emotional and Contextual Drivers of Teen Substance Use in Online Discussions Jianfeng Zhu et.al. 2501.14037 null
2025-01-23 CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation Guofeng Cui et.al. 2501.13927 null
2025-01-23 Improving Video Generation with Human Feedback Jie Liu et.al. 2501.13918 null
2025-01-23 Binary Diffusion Probabilistic Model Vitaliy Kinakh et.al. 2501.13915 null
2025-01-23 Analysis of Indic Language Capabilities in LLMs Aatman Vaidya et.al. 2501.13912 null
2025-01-23 Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models Linh Tran et.al. 2501.13904 null
2025-01-23 Exploring Finetuned Audio-LLM on Heart Murmur Features Adrian Florea et.al. 2501.13884 null
2025-01-23 The machine learning platform for developers of large systems Alexey Naikov et.al. 2501.13881 null
2025-01-23 A RAG-Based Institutional Assistant Gustavo Kuratomi et.al. 2501.13880 null
2025-01-23 On the Reasoning Capacity of AI Models and How to Quantify It Santosh Kumar Radha et.al. 2501.13833 null
2025-01-23 Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing Hao Zhang et.al. 2501.13831 null
2025-01-23 Hallucinations Can Improve Large Language Models in Drug Discovery Shuzhou Yuan et.al. 2501.13824 null
2025-01-23 Large Language Model driven Policy Exploration for Recommender Systems Jie Wang et.al. 2501.13816 null
2025-01-23 Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change Mowafak Allaham et.al. 2501.13802 null
2025-01-23 Parameter-Efficient Fine-Tuning for Foundation Models Dan Zhang et.al. 2501.13787 link
2025-01-23 Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling Tanya Rodchenko et.al. 2501.13779 null
2025-01-23 Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework Yoonsang Kim et.al. 2501.13778 link
2025-01-23 Do Large Language Models Truly Understand Geometric Structures? Xiaofeng Wang et.al. 2501.13773 link
2025-01-23 Tune In, Act Up: Exploring the Impact of Audio Modality-Specific Edits on Large Audio Language Models in Jailbreak Erjia Xiao et.al. 2501.13772 null
2025-01-23 UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models Xin Xu et.al. 2501.13766 null
2025-01-23 EICopilot: Search and Explore Enterprise Information over Large-scale Knowledge Graphs with LLM-driven Agents Yuhui Yun et.al. 2501.13746 null
2025-01-23 GPT-HTree: A Decision Tree Framework Integrating Hierarchical Clustering and Large Language Models for Explainable Classification Te Pei et.al. 2501.13743 null
2025-01-23 An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities Zezhou Yang et.al. 2501.13742 link
2025-01-23 Pseudocode-Injection Magic: Enabling LLMs to Tackle Graph Computational Tasks Chang Gong et.al. 2501.13731 null
2025-01-23 RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation Shi-Qi Yan et.al. 2501.13726 null
2025-01-23 Musical ethnocentrism in Large Language Models Anna Kruspe et.al. 2501.13720 null
2025-01-23 A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation Dario Serez et.al. 2501.13718 null
2025-01-23 EventVL: Understand Event Streams via Multimodal Large Language Model Pengteng Li et.al. 2501.13707 null
2025-01-23 DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale Linghao Zhang et.al. 2501.13699 null
2025-01-23 Question Answering on Patient Medical Records with Private Fine-Tuned LLMs Sara Kothari et.al. 2501.13687 null
2025-01-23 HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor Zihui Wu et.al. 2501.13677 link
2025-01-23 How to Complete Domain Tuning while Keeping General Ability in LLM: Adaptive Layer-wise and Element-wise Regularization Shezheng Song et.al. 2501.13669 null
2025-01-23 LVPruning: An Effective yet Simple Language-Guided Vision Token Pruning Approach for Multi-modal Large Language Models Yizheng Sun et.al. 2501.13652 null
2025-01-23 Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Zhenghao Lin et.al. 2501.13629 null
2025-01-23 Text-to-SQL based on Large Language Models and Database Keyword Search Eduardo R. Nascimento et.al. 2501.13594 null
2025-01-23 Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization Lei Huang et.al. 2501.13573 null
2025-01-23 One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt Tao Liu et.al. 2501.13554 link
2025-01-23 LLMs Can Plan Only If We Tell Them Bilgehan Sel et.al. 2501.13545 null
2025-01-23 ReasVQA: Advancing VideoQA with Imperfect Reasoning Process Jianxin Liang et.al. 2501.13536 null
2025-01-23 RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles Munachiso Nwadike et.al. 2501.13491 link
2025-01-23 Adaptive Testing for LLM-Based Applications: A Diversity-based Approach Juyeon Yoon et.al. 2501.13480 null
2025-01-23 LDR-Net: A Novel Framework for AI-generated Image Detection via Localized Discrepancy Representation JiaXin Chen et.al. 2501.13475 null
2025-01-23 Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge Haomiao Xiong et.al. 2501.13468 link
2025-01-23 Spurious Forgetting in Continual Learning of Language Models Junhao Zheng et.al. 2501.13453 link
2025-01-23 Softplus Attention with Re-weighting Boosts Length Extrapolation in Large Language Models Bo Gao et.al. 2501.13428 null
2025-01-23 Predicting Turbulence Structure In Street-Canyon Flows using Deep Generative Modeling Tomek Jaroslawski et.al. 2501.13415 null
2025-01-23 VulnBot: Autonomous Penetration Testing for A Multi-Agent Collaborative Framework He Kong et.al. 2501.13411 link
2025-01-23 Towards Intelligent Design: A Self-driven Framework for Collocated Clothing Synthesis Leveraging Fashion Styles and Textures Minglong Dong et.al. 2501.13396 null
2025-01-23 Can Large Language Models Understand Preferences in Personalized Recommendation? Zhaoxuan Tan et.al. 2501.13391 link
2025-01-23 Do as We Do, Not as You Think: the Conformity of Large Language Models Zhiyuan Weng et.al. 2501.13381 link
2025-01-23 Scalable Evaluation Framework for Foundation Models in Musculoskeletal MRI Bridging Computational Innovation with Clinical Utility Gabrielle Hoyer et.al. 2501.13376 null
2025-01-23 Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement Jae-Sung Bae et.al. 2501.13372 null
2025-01-23 Meta-Feature Adapter: Integrating Environmental Metadata for Enhanced Animal Re-identification Yuzhuo Li et.al. 2501.13368 null
2025-01-23 50 Shades of Deceptive Patterns: A Unified Taxonomy, Multimodal Detection, and Security Implications Zewei Shi et.al. 2501.13351 link
2025-01-23 MSF: Efficient Diffusion Model Via Multi-Scale Latent Factorize Haohang Xu et.al. 2501.13349 null
2025-01-23 Full-Stack Optimized Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation Rong Shan et.al. 2501.13344 null
2025-01-23 Multi-aspect Knowledge Distillation with Large Language Model Taegyeong Lee et.al. 2501.13341 link
2025-01-23 Generative Multi-Form Bayesian Optimization Zhendong Guo et.al. 2501.13337 null
2025-01-23 SplitLLM: Hierarchical Split Learning for Large Language Model over Wireless Network Songge Zhang et.al. 2501.13318 null
2025-01-23 Representing Visualization Insights as a Dense Insight Network Jane Hoffswell et.al. 2501.13309 null
2025-01-23 OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia Xuelong Geng et.al. 2501.13306 link
2025-01-23 Watching the AI Watchdogs: A Fairness and Robustness Analysis of AI Safety Moderation Classifiers Akshit Achara et.al. 2501.13302 link
2025-01-23 Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents Shrinidhi Kumbhar et.al. 2501.13299 null
2025-01-23 RAMQA: A Unified Framework for Retrieval-Augmented Multi-Modal Question Answering Yang Bai et.al. 2501.13297 link
2025-01-23 Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols John Joon Young Chung et.al. 2501.13284 null
2025-01-22 MEDFORM: A Foundation Model for Contrastive Learning of CT Imaging and Clinical Numeric Data in Multi-Cancer Analysis Daeun Jung et.al. 2501.13277 link
2025-01-22 RAG-Reward: Optimizing RAG with Reward Modeling and RLHF Hanning Zhang et.al. 2501.13264 null
2025-01-22 Exploring GPT's Ability as a Judge in Music Understanding Kun Fang et.al. 2501.13261 link
2025-01-22 Bypassing Array Canaries via Autonomous Function Call Resolution Nathaniel Oh et.al. 2501.13256 link
2025-01-22 S-LoRA: Scalable Low-Rank Adaptation for Class Incremental Learning Yichen Wu et.al. 2501.13198 null
2025-01-22 Computational modelling of biological systems now and then: revisiting tools and visions from the beginning of the century Axel Loewe et.al. 2501.13142 null
2025-01-23 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Boqiang Zhang et.al. 2501.13106 link
2025-01-22 Robust Representation Consistency Model via Contrastive Denoising Jiachen Lei et.al. 2501.13094 link
2025-01-22 Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment Melissa Kazemi Rad et.al. 2501.13080 null
2025-01-22 Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning Bohao Yang et.al. 2501.13042 link
2025-01-22 Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament Yantao Liu et.al. 2501.13007 link
2025-01-22 Neural network enhanced cross entropy benchmark for monitored circuits Yangrui Hu et.al. 2501.13005 null
2025-01-22 Large Language Model-Based Semantic Communication System for Image Transmission Soheyb Ribouh et.al. 2501.12988 null
2025-01-22 LLM4WM: Adapting LLM for Wireless Multi-Tasking Xuanyu Liu et.al. 2501.12983 null
2025-01-22 Low-dimensional adaptation of diffusion models: Convergence in total variation Jiadong Liang et.al. 2501.12982 null
2025-01-22 OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models Chongren Sun et.al. 2501.12975 link
2025-01-22 Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs Jan Corazza et.al. 2501.12972 null
2025-01-22 It's complicated. The relationship of algorithmic fairness and non-discrimination regulations in the EU AI Act Kristof Meding et.al. 2501.12962 null
2025-01-22 Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference Weizhi Fei et.al. 2501.12959 null
2025-01-22 GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models Pengxiang Zhao et.al. 2501.12956 null
2025-01-22 3D Object Manipulation in a Single Image using Generative Models Ruisi Zhao et.al. 2501.12935 null
2025-01-22 Correctness Assessment of Code Generated by Large Language Models Using Internal Representations Tuan-Dung Bui et.al. 2501.12934 link
2025-01-22 DynamicEarth: How Far are We from Open-Vocabulary Change Detection? Kaiyu Li et.al. 2501.12931 null
2025-01-22 A Functional Software Reference Architecture for LLM-Integrated Systems Alessio Bucaioni et.al. 2501.12904 null
2025-01-22 Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration Offa Kingsleigh et.al. 2501.12901 null
2025-01-22 Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Yafu Li et.al. 2501.12895 link
2025-01-23 Generative AI Misuse Potential in Cyber Security Education: A Case Study of a UK Degree Program Carlton Shepherd et.al. 2501.12883 null
2025-01-22 WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge Jingyuan Chen et.al. 2501.12877 null
2025-01-22 ACEBench: Who Wins the Match Point in Tool Learning? Chen Chen et.al. 2501.12851 null
2025-01-22 AMM-Diff: Adaptive Multi-Modality Diffusion Network for Missing Modality Imputation Aghiles Kebaili et.al. 2501.12840 null
2025-01-22 Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home Viktor Moskvoretskii et.al. 2501.12835 null
2025-01-22 Open or Closed LLM for Lesser-Resourced Languages? Lessons from Greek John Pavlopoulos et.al. 2501.12826 link
2025-01-22 Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks Alessio Quercia et.al. 2501.12824 null
2025-01-22 Certified Guidance for Planning with Deep Generative Models Francesco Giacomarra et.al. 2501.12815 null
2025-01-22 Revisit Self-Debugging with Self-Generated Tests for Code Generation Xiancai Chen et.al. 2501.12793 null
2025-01-22 LLMs as Repositories of Factual Knowledge: Limitations and Solutions Seyed Mahed Mousavi et.al. 2501.12774 null
2025-01-22 NExtLong: Toward Effective Long-Context Training without Long Documents Chaochen Gao et.al. 2501.12766 link
2025-01-22 Online Preference Alignment for Language Models via Count-based Exploration Chenjia Bai et.al. 2501.12735 link
2025-01-22 Paradigm-Based Automatic HDL Code Generation Using LLMs Wenhao Sun et.al. 2501.12702 null
2025-01-22 Training Dialogue Systems by AI Feedback for Improving Overall Dialogue Impression Kai Yoshida et.al. 2501.12698 null
2025-01-22 Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering Qian Tao et.al. 2501.12697 null
2025-01-22 SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling Shengshi Yao et.al. 2501.12696 null
2025-01-22 EchoLM: Accelerating LLM Serving with Real-time Knowledge Distillation Yifan Yu et.al. 2501.12689 null
2025-01-22 Distillation Quantification for Large Language Models Sunbowen Lee et.al. 2501.12619 link
2025-01-22 Deep Learning-Based Identification of Inconsistent Method Names: How Far Are We? Taiming Wang et.al. 2501.12617 null
2025-01-22 Kimi k1.5: Scaling Reinforcement Learning with LLMs Kimi Team et.al. 2501.12599 null
2025-01-22 Leveraging LLMs to Create a Haptic Devices' Recommendation System Yang Liu et.al. 2501.12573 null
2025-01-22 Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review Rock Yuren Pang et.al. 2501.12557 link
2025-01-21 Human-like conceptual representations emerge from language prediction Ningyu Xu et.al. 2501.12547 null
2025-01-21 How Does the Spatial Distribution of Pre-training Data Affect Geospatial Foundation Models? Mirali Purohit et.al. 2501.12535 null
2025-01-21 An Empirically-grounded tool for Automatic Prompt Linting and Repair: A Case Study on Bias, Vulnerability, and Optimization in Developer Prompts Dhia Elhaq Rzig et.al. 2501.12521 null
2025-01-21 A Domain Adaptation Framework for Speech Recognition Systems with Only Synthetic data Minh Tran et.al. 2501.12501 null
2025-01-21 The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws Tian Jin et.al. 2501.12486 null
2025-01-21 An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models Xiaoyu Chu et.al. 2501.12469 link
2025-01-21 Adaptive PII Mitigation Framework for Large Language Models Shubhi Asthana et.al. 2501.12465 null
2025-01-21 Empowering AIOps: Leveraging Large Language Models for IT Operations Management Arthur Vitui et.al. 2501.12461 link
2025-01-21 Deploying Privacy Guardrails for LLMs: A Comparative Analysis of Real-World Applications Shubhi Asthana et.al. 2501.12456 null
2025-01-21 Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation Dongsheng Zhu et.al. 2501.12432 null
2025-01-21 FREYR: A Framework for Recognizing and Executing Your Requests Roberto Gallotta et.al. 2501.12423 link
2025-01-21 CroMe: Multimodal Fake News Detection using Cross-Modal Tri-Transformer and Metric Learning Eunjee Choi et.al. 2501.12422 null
2025-01-22 InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling Yi Wang et.al. 2501.12386 link
2025-01-21 Accelerating Pulsar Parameter Estimation Using Convolutional Neural Networks Greg Olmschenk et.al. 2501.12383 null
2025-01-21 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Yilun Zhao et.al. 2501.12380 link
2025-01-22 Video Depth Anything: Consistent Depth Estimation for Super-Long Videos Sili Chen et.al. 2501.12375 null
2025-01-21 Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists Thomas F. Eisenmann et.al. 2501.12374 link
2025-01-21 Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL Yeounoh Chung et.al. 2501.12372 null
2025-01-21 Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration Thomas Walshe et.al. 2501.12332 null
2025-01-21 Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops Mohamed Harmanani et.al. 2501.12331 link
2025-01-21 VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model Xianwei Zhuang et.al. 2501.12327 link
2025-01-21 LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations Hasan Abu-Rasheed et.al. 2501.12300 null
2025-01-21 MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks Qishen Zhou et.al. 2501.12281 link
2025-01-21 Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement Maosong Cao et.al. 2501.12273 link
2025-01-21 FOCUS: First Order Concentrated Updating Scheme Yizhou Liu et.al. 2501.12243 null
2025-01-21 InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models Pha Nguyen et.al. 2501.12231 null
2025-01-21 CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning Yuanheng Fang et.al. 2501.12226 null
2025-01-21 Leveraging Large Language Models for Realizing Truly Intelligent User Interfaces Allard Oelen et.al. 2501.12221 null
2025-01-21 You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense Wuyuao Mai et.al. 2501.12210 null
2025-01-21 Explainability for Vision Foundation Models: A Survey Rémi Kazmierczak et.al. 2501.12203 null
2025-01-22 Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation Zibo Zhao et.al. 2501.12202 link
2025-01-21 BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks Zhuang Li et.al. 2501.12174 null
2025-01-21 Contextualizing Recommendation Explanations with LLMs: A User Study Yuanjun Feng et.al. 2501.12152 null
2025-01-21 Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities Qirun Dai et.al. 2501.12147 null
2025-01-21 Do LLMs Provide Links to Code Similar to what they Generate? A Study with Gemini and Bing CoPilot Daniele Bifolco et.al. 2501.12134 null
2025-01-21 Evaluating Efficiency and Engagement in Scripted and LLM-Enhanced Human-Robot Interactions Tim Schreiter et.al. 2501.12128 null
2025-01-21 Can open source large language models be used for tumor documentation in Germany? -- An evaluation on urological doctors' notes Stefan Lenz et.al. 2501.12106 link
2025-01-21 Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis Weile Luo et.al. 2501.12084 null
2025-01-21 Phishing Awareness via Game-Based Learning Argianto Rahartomo et.al. 2501.12077 link
2025-01-21 PINNsAgent: Automated PDE Surrogation with Large Language Models Qingpo Wuwu et.al. 2501.12053 null
2025-01-21 Harnessing Generative Pre-Trained Transformer for Datacenter Packet Trace Generation Chen Griner et.al. 2501.12033 null
2025-01-21 Comparative Analysis of Pre-trained Deep Learning Models and DINOv2 for Cushing's Syndrome Diagnosis in Facial Analysis Hongjun Liu et.al. 2501.12023 null
2025-01-21 Are Traditional Deep Learning Model Approaches as Effective as a Retinal-Specific Foundation Model for Ocular and Systemic Disease Detection? Samantha Min Er Yew et.al. 2501.12016 null
2025-01-21 Rate-Aware Learned Speech Compression Jun Xu et.al. 2501.11999 null
2025-01-21 Linear Feedback Control Systems for Iterative Prompt Optimization in Large Language Models Rupesh Raj Karn et.al. 2501.11979 null
2025-01-21 Leveraging Graph Structures and Large Language Models for End-to-End Synthetic Task-Oriented Dialogues Maya Medjad et.al. 2501.11977 link
2025-01-21 Bridging Visualization and Optimization: Multimodal Large Language Models on Graph-Structured Combinatorial Optimization Jie Zhao et.al. 2501.11968 null
2025-01-21 A Hybrid Attention Framework for Fake News Detection with Large Language Models Xiaochuan Xu et.al. 2501.11967 null
2025-01-21 **TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anom
