AI News Briefing - 2026-02-20
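Today's highlights: the latest developments in humanoid robots, large language models, AI agents, medical AI, image generation, and related areas
📰 Industry News (Jiqizhixin)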
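1. Unitree's Four and a Half Minutes at the Spring Festival Gala: The Kung Fu Dream of the World's Leading Humanoid Robot Maker
Source: Jiqizhixin
Time: 2026-02-19 21:02
Link: https://www.jiqizhixin.com/articles/2026-02-19-5
Summary: At the 2026 CCTV Spring Festival Gala, 24 G1 humanoid robots and H2 robots from Unitree Robotics performed the martial arts routine "武 BOT", described as the world's first highly dynamic, highly coordinated, fully autonomous swarm control. The robots use 3D LiDAR for scanning and localization, execute the martial arts motion sequences through motion control algorithms, and monitor themselves and recover in real time. Unitree's actual humanoid robot shipments exceeded 5,500 units in 2025, far ahead of its competitors. The price of the R1 humanoid robot has dropped to 29,900 yuan, making it a consumer electronics product.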
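2. Letting AI Agents "Remember" Failed Attempts: Microsoft Proposes the Re-TRAC Framework, with SOTA Performance at 4B and a 30B Model Surpassing a 358B One
Source: Jiqizhixin
Time: 2026-02-19 20:57
Link: https://www.jiqizhixin.com/articles/2026-02-19-4
Summary: Southeast University, Microsoft Research Asia, and other institutions propose the Re-TRAC (REcursive TRAjectory Compression) framework, which lets an AI agent "remember" the experience from each exploration and pass that experience across multiple exploration trajectories. RE-TRAC-4B reaches 30.0% accuracy on BrowseComp, 36.1% on BrowseComp-ZH, 70.4% on GAIA, 76.6% on XBench, and 22.2% on HLE. RE-TRAC-30B reaches 53% accuracy on BrowseComp, exceeding the 52% of GLM-4.7-358B.
Paper: https://arxiv.org/abs/2602.02486
Code: https://github.com/microsoft/InfoAgent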
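3. ICLR 2026 | Can RL Still Stably Elicit Reasoning When Data Lacks Labels? Co-rewarding Offers a Self-Supervised RL Approach
Source: Jiqizhixin
Time: 2026-02-19 20:48
Link: https://www.jiqizhixin.com/articles/2026-02-19-3
Summary: Hong Kong Baptist University and Shanghai Jiao Tong University propose Co-rewarding, a self-supervised RL framework. By introducing self-supervised signals from complementary views on either the data side or the model side, it stabilizes reward acquisition and makes reward hacking harder for the model during RL. Co-rewarding-I introduces complementary supervision at the data level: it constructs rephrased versions of the original questions, and the original and rephrased questions supervise each other. Co-rewarding-II works at the model level: it decouples the supervision signal from the current policy model by using a teacher reference model to produce pseudo-labels.
Paper: https://openreview.net/forum?id=fDk95XPsCU
Code: https://github.com/bigai-ai/LIFT-humanoid
Hugging Face: https://huggingface.co/collections/TMLR-Group-HF/co-rewarding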
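To make the data-side idea in Co-rewarding-I more concrete, here is a minimal sketch of mutual supervision between an original question and its rephrased version. It assumes that the pseudo-label for each view is the majority answer sampled on the other view and that the reward is simple exact-match agreement; the summary above does not specify either detail, and the function names are hypothetical.

```python
from collections import Counter


def majority_answer(answers):
    """Return the most frequent answer (ties go to the first one seen)."""
    return Counter(answers).most_common(1)[0][0]


def cross_rewards(original_answers, rephrased_answers):
    """Score each sampled answer against the pseudo-label from the OTHER view.

    Assumption: pseudo-label = majority vote, reward = 1.0 on exact match.
    """
    label_from_rephrased = majority_answer(rephrased_answers)
    label_from_original = majority_answer(original_answers)
    rewards_original = [
        1.0 if a == label_from_rephrased else 0.0 for a in original_answers
    ]
    rewards_rephrased = [
        1.0 if a == label_from_original else 0.0 for a in rephrased_answers
    ]
    return rewards_original, rewards_rephrased


# Hypothetical sampled answers for one question and its rephrasing.
r_orig, r_reph = cross_rewards(["12", "12", "15", "12"], ["12", "15", "15", "15"])
print(r_orig, r_reph)
```

Because each view is scored against a label it did not produce itself, a policy cannot raise its reward simply by becoming self-consistent on one phrasing, which is one plausible reading of why the summary says reward hacking becomes harder.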
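4. OpenAI Quietly Rewrites Its Mission: No More "Benefiting Humanity", and Safety Is Gone
Source: Jiqizhixin
Time: 2026-02-19 20:30
Link: https://www.jiqizhixin.com/articles/2026-02-19-2
Summary: In its latest tax filing, submitted at the end of 2025, OpenAI substantially revised its mission statement, removing the key wording on "safety" and on being "unconstrained by the need to generate profit", and keeping only "ensure that artificial general intelligence benefits all of humanity". This has raised concerns that OpenAI is departing from the intent behind its founding in 2015. Former OpenAI researcher Peter Girnus publicly criticized this series of changes, pointing out that the safety team was disbanded, the mission alignment team was reorganized, and the company is seeking massive funding and planning an IPO.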
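5. Major Nature Paper: How Does the "Dream Team" of Shanghai Jiao Tong University's School of Artificial Intelligence and Xinhua Hospital Use AI Agents to End the "Hundred Years of Solitude" of Rare Disease Diagnosis?
Source: Jiqizhixin
Time: 2026-02-19 20:26
Link: https://www.jiqizhixin.com/articles/2026-02-19
Summary: A joint team from the School of Artificial Intelligence at Shanghai Jiao Tong University and Xinhua Hospital, affiliated with the university's School of Medicine, published its research in Nature. The proposed DeepRare system mimics the "System 2 slow thinking" logic of human experts and surpasses senior specialist physicians across the board in diagnostic accuracy. DeepRare is an agent system that does not rely on parametric memory. Instead, it has mastered tool use: it can actively call the PubMed search engine and bioinformatics tools, and can even ask physicians questions in return. The team has founded 观壹智能 (OneX Intelligence) to commercialize the results.
Paper: https://www.nature.com/articles/s41586-025-10097-9
DeepRare website: https://deeprare.cn/#/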
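6. A True Star! Magic Atom's "National Treasure Panda Robot" from the Spring Festival Gala Sells at Auction for 57,527 Yuan per Unit
Source: News
Time: 2026-02-18 23:17
Link: https://www.jiqizhixin.com/articles/2026-02-18-7
Summary: On February 17, 2026, the "National Treasure Panda Robot" that charmed viewers nationwide on the CCTV Spring Festival Gala stage was sold on the JD Auction (京东拍卖) platform at a final price of 57,527 yuan. The robot is the "MagicPanda" quadruped from Magic Atom (魔法原子). It appeared as a swarm of around a hundred units at the Yibin sub-venue of the 2026 Gala, the first robot swarm of that scale in the history of the Gala stage.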
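7. Magic Atom Pours Liquor on the Spring Festival Gala Stage, Puncturing the Notion That Robots "Can Only Perform"
Source: Jiqizhixin
Time: 2026-02-18 22:02
Link: https://www.jiqizhixin.com/articles/2026-02-18-6
Summary: At the Yibin sub-venue of the 2026 CCTV Spring Festival Gala, Magic Atom's general-purpose humanoid robot MagicBot Gen1, stationed at the 501 liquor culture landmark, steadily scooped up a bowl of ranmian (Yibin "burning noodles") for Wei Xiang (魏翔) and then poured Wuliangye precisely into a cup. This seemingly simple action touches the core of putting embodied intelligence into practice: in a complex set that closely recreates the lively atmosphere of a real gathering, the robot carried out fine manipulation of deformable objects and fluids in front of onlookers. It marks robots moving from "performance props" toward "productivity tools" with real working capability.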
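8. At the Milan Winter Olympic Village, Why Are All These Foreigners Gathered Around Alibaba Cloud's AI?
Source: Jiqizhixin
Time: 2026-02-18 21:47
Link: https://www.jiqizhixin.com/articles/2026-02-18-5
Summary: At Alibaba Cloud's smart pin trading station in the Milan-Cortina 2026 Winter Olympic Village, athletes from many countries are trying out pin trading driven by large AI models. The system offers three modes: smart rock-paper-scissors, a gesture mode for "grabbing objects from a distance", and voice ordering. It is built on a core intelligence hub based on the Qwen (千问) large model, which processes vision, speech, and motion signals in a unified way, reaches a decision within a very short time, and translates that decision into precise motions a robotic arm can execute. This is the first time in Olympic history that AI has taken part in pin trading.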
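9. Claude Sonnet 4.6, the Most Capable Sonnet Model, Arrives with a Million-Token Context
Source: Jiqizhixin
Time: 2026-02-18 21:36
Link: https://www.jiqizhixin.com/articles/2026-02-18-4
Summary: Anthropic released Claude Sonnet 4.6, calling it "the most capable Sonnet model to date". The new model brings across-the-board upgrades to coding, computer use, long-context reasoning, agent planning, knowledge work, and design. A 1-million-token context window is available in beta. On the GDPval-AA test, Claude Sonnet 4.6 is even slightly ahead of Opus 4.6, which Anthropic released only recently. Pricing is unchanged from Sonnet 4.5: 3 US dollars per million input tokens and 15 US dollars per million output tokens.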
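As a quick worked example of the pricing quoted above, the sketch below estimates the cost of a single request. It assumes the two listed rates apply uniformly to every token, which the summary does not confirm for requests that use the beta long-context window; the function name and token counts are illustrative only.

```python
# Rates quoted in the summary above (US dollars per million tokens).
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0


def request_cost_usd(input_tokens, output_tokens):
    """Estimated cost in USD, assuming flat per-token rates (an assumption)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: a full 1M-token prompt and a 2,000-token reply.
# 1,000,000 * 3 / 1e6 = 3.00 USD for input, 2,000 * 15 / 1e6 = 0.03 USD for output.
print(f"{request_cost_usd(1_000_000, 2_000):.2f} USD")
```

Under these assumptions the prompt dominates the bill for long-context requests, while output tokens cost five times as much per token as input tokens.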
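10. Topping the SOTA Charts: Ant Group Open-Sources UI-Venus-1.5, Hastening the Era of GUI Agents That Get Things Done
Source: Jiqizhixin
Time: 2026-02-18 21:33
Link: https://www.jiqizhixin.com/articles/2026-02-18-3
Summary: Ant Group open-sourced UI-Venus-1.5, an end-to-end GUI agent built around a "high performance, practical" design philosophy. A single model handles three scenarios (grounding, mobile, and web) and fully supports 40+ mainstream Chinese apps. UI-Venus-1.5 follows a clear training path: mid-training systematically fills the base model's knowledge gaps in the GUI domain; online reinforcement learning (online RL) bridges the gap between offline training and online execution; and model merging finally combines the capabilities of several domain-expert models.
Technical report: https://arxiv.org/abs/2602.09082
Code: https://github.com/inclusionAI/UI-Venus
Models: https://huggingface.co/collections/inclusionAI/ui-venus
Homepage: https://ui-venus.github.io/UI-Venus-1.5/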
🎓 Academic Research (arXiv cs.AI + cs.LG)
15 recent AI papers collected
One Hand to Rule Them All: Canonical Representations for Unified Dexterous Manipulation
Source: arXiv cs.RO
Time: 2026-02-18 18:59
Link: 2602.16712v1
Abstract: Dexterous manipulation policies today largely assume fixed hand designs, severely restricting their generalization to new embodiments with varied kinematic and structural layouts. To overcome this limitation, we introduce a parameterized canonical representation that unifies a broad spectrum of dexterous hand architectures. It comprises a unified parameter space and a canonical URDF format, offering three key advantages. 1) The parameter space captures essential morphological and kinematic varia...
Authors: Zhenyu Wei, Yunchao Yao, Mingyu Ding
Categories: cs.RO
EgoScale: Scaling Dexterous Manipulation with Diverse Egocentric Human Data
Source: arXiv cs.RO
Time: 2026-02-18 18:59
Link: 2602.16710v1
Abstract: Human behavior is among the most scalable sources of data for learning physical intelligence, yet how to effectively leverage it for dexterous manipulation remains unclear. While prior work demonstrates human to robot transfer in constrained settings, it is unclear whether large scale human data can support fine grained, high degree of freedom dexterous manipulation. We present EgoScale, a human to dexterous manipulation transfer framework built on large scale egocentric human data. We train a V...
Authors: Ruijie Zheng, Dantong Niu, Yuqi Xie, Jing Wang, Mengda Xu
Categories: cs.RO
Knowledge-Embedded Latent Projection for Robust Representation Learning
Source: arXiv cs.LG
Time: 2026-02-18 18:58
Link: 2602.16709v1
Abstract: Latent space models are widely used for analyzing high-dimensional discrete data matrices, such as patient-feature matrices in electronic health records (EHRs), by capturing complex dependence structures through low-dimensional embeddings. However, estimation becomes challenging in the imbalanced regime, where one matrix dimension is much larger than the other. In EHR applications, cohort sizes are often limited by disease prevalence or data availability, whereas the feature space remains extrem...
Authors: Weijing Tang, Ming Yuan, Zongqi Xia, Tianxi Cai
Categories: cs.LG, math.ST, stat.ME
Policy Compiler for Secure Agentic Systems
Source: arXiv cs.CR
Time: 2026-02-18 18:57
Link: 2602.16708v1
Abstract: LLM-based agents are increasingly being deployed in contexts requiring complex authorization policies: customer service protocols, approval workflows, data access restrictions, and regulatory compliance. Embedding these policies in prompts provides no enforcement guarantees. We present PCAS, a Policy Compiler for Agentic Systems that provides deterministic policy enforcement. Enforcing such policies requires tracking information flow across agents, which linear message histories cannot capture. ...
Authors: Nils Palumbo, Sarthak Choudhary, Jihye Choi, Prasad Chalasani, Mihai Christodorescu
Categories: cs.CR, cs.AI, cs.MA
Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation
Source: arXiv cs.RO
Time: 2026-02-18 18:55
Link: 2602.16705v1
Abstract: Visual loco-manipulation of arbitrary objects in the wild with humanoid robots requires accurate end-effector (EE) control and a generalizable understanding of the scene via visual inputs (e.g., RGB-D images). Existing approaches are based on real-world imitation learning and exhibit limited generalization due to the difficulty in collecting large-scale training datasets. This paper presents a new paradigm, HERO, for object loco-manipulation with humanoid robots that combines the strong generali...
Authors: Runpei Dong, Ziyan Li, Xialin He, Saurabh Gupta
Categories: cs.RO, cs.CV
Reinforced Fast Weights with Next-Sequence Prediction
Source: arXiv cs.CL
Time: 2026-02-18 18:53
Link: 2602.16704v1
Abstract: Fast weight architectures offer a promising alternative to attention-based transformers for long-context modeling by maintaining constant memory overhead regardless of context length. However, their potential is limited by the next-token prediction (NTP) training paradigm. NTP optimizes single-token predictions and ignores semantic coherence across multiple tokens following a prefix. Consequently, fast weight models, which dynamically update their parameters to store contextual information, lear...
Authors: Hee Seung Hwang, Xindi Wu, Sanghyuk Chun, Olga Russakovsky
Categories: cs.CL
Measuring Mid-2025 LLM-Assistance on Novice Performance in Biology
Source: arXiv cs.CY
Time: 2026-02-18 18:51
Link: 2602.16703v1
Abstract: Large language models (LLMs) perform strongly on biological benchmarks, raising concerns that they may help novice actors acquire dual-use laboratory skills. Yet, whether this translates to improved human performance in the physical laboratory remains unclear. To address this, we conducted a pre-registered, investigator-blinded, randomized controlled trial (June-August 2025; n = 153) evaluating whether LLMs improve novice performance in tasks that collectively model a viral reverse genetics work...
Authors: Shen Zhou Hong, Alex Kleinman, Alyssa Mathiowetz, Adam Howes, Julian Cohen
Categories: cs.CY, cs.AI
Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents
Source: arXiv cs.CL
Time: 2026-02-18 18:46
Link: 2602.16699v1
Abstract: LLMs are increasingly being used for complex problems which are not necessarily resolved in a single response, but require interacting with an environment to acquire information. In these scenarios, LLMs must reason about inherent cost-uncertainty tradeoffs in when to stop exploring and commit to an answer. For instance, on a programming task, an LLM should test a generated code snippet if it is uncertain about the correctness of that code; the cost of writing a test is nonzero, but typically lo...
Authors: Wenxuan Ding, Nicholas Tomlin, Greg Durrett
Categories: cs.CL, cs.AI
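The cost-uncertainty tradeoff in the abstract's programming example can be illustrated with a simple expected-cost rule. This is not the paper's method (the abstract is truncated before it is described), only a minimal sketch that assumes a test always catches a mistake and that the confidence estimate is calibrated; all names and numbers are hypothetical.

```python
def should_test(p_correct, cost_test, cost_mistake):
    """Return True when testing is cheaper in expectation than committing now.

    Assumptions (not from the paper): a test always catches a mistake, and
    p_correct is a calibrated estimate that the generated code is correct.
    """
    expected_loss_without_test = (1.0 - p_correct) * cost_mistake
    return expected_loss_without_test > cost_test


# Uncertain answer: 40% chance of a mistake costing 10 units vs. a 1-unit test.
print(should_test(p_correct=0.60, cost_test=1.0, cost_mistake=10.0))
# Near-certain answer: the same test is no longer worth its cost.
print(should_test(p_correct=0.99, cost_test=1.0, cost_mistake=10.0))
```

The rule only behaves sensibly if `p_correct` is well calibrated, which is one way to read the "calibrate, then act" ordering in the title.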
Causality is Key for Interpretability Claims to Generalise
Source: arXiv cs.LG
Time: 2026-02-18 18:45
Link: 2602.16698v1
Abstract: Interpretability research on large language models (LLMs) has yielded important insights into model behaviour, yet recurring pitfalls persist: findings that do not generalise, and causal interpretations that outrun the evidence. Our position is that causal inference specifies what constitutes a valid mapping from model activations to invariant high-level structures, the data or assumptions needed to achieve it, and the inferences it can support. Specifically, Pearl's causal hierarchy clarifies w...
Authors: Shruti Joshi, Aaron Mueller, David Klindt, Wieland Brendel, Patrik Reizinger
Categories: cs.LG
Protecting the Undeleted in Machine Unlearning
Source: arXiv cs.LG
Time: 2026-02-18 18:44
Link: 2602.16697v1
Abstract: Machine unlearning aims to remove specific data points from a trained model, often striving to emulate "perfect retraining", i.e., producing the model that would have been obtained had the deleted data never been included. We demonstrate that this approach, and security definitions that enable it, carry significant privacy risks for the remaining (undeleted) data points. We present a reconstruction attack showing that for certain tasks, which can be computed securely without deletions, a mechani...
Authors: Aloni Cohen, Refael Kohen, Kobbi Nissim, Uri Stemmer
Categories: cs.LG, cs.DS
Parameter-free representations outperform single-cell foundation models on downstream benchmarks
Source: arXiv q-bio.GN
Time: 2026-02-18 18:42
Link: 2602.16696v1
Abstract: Single-cell RNA sequencing (scRNA-seq) data exhibit strong and reproducible statistical structure. This has motivated the development of large-scale foundation models, such as TranscriptFormer, that use transformer-based architectures to learn a generative model for gene expression by embedding genes into a latent vector space. These embeddings have been used to obtain state-of-the-art (SOTA) performance on downstream tasks such as cell-type classification, disease-state prediction, and cross-sp...
Authors: Huan Souza, Pankaj Mehta
Categories: q-bio.GN, cs.LG, q-bio.QM
Synthetic-Powered Multiple Testing with FDR Control
Source: arXiv stat.ME
Time: 2026-02-18 18:36
Link: 2602.16690v1
Abstract: Multiple hypothesis testing with false discovery rate (FDR) control is a fundamental problem in statistical inference, with broad applications in genomics, drug screening, and outlier detection. In many such settings, researchers may have access not only to real experimental observations but also to auxiliary or synthetic data -- from past, related experiments or generated by generative models -- that can provide additional evidence about the hypotheses of interest. We introduce SynthBH, a synth...
Authors: Yonghoon Lee, Meshi Bashari, Edgar Dobriban, Yaniv Romano
Categories: stat.ME, cs.LG, stat.ML
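For readers unfamiliar with FDR control, the sketch below implements the classical Benjamini-Hochberg (BH) step-up procedure on real-data p-values only. It is background for the setting, not SynthBH itself: the abstract is truncated before the method is described, and nothing here uses synthetic data. The function name and example p-values are illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Classical BH step-up procedure; returns sorted indices of rejected hypotheses."""
    m = len(p_values)
    # Indices of hypotheses ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # Largest rank whose p-value is below its BH threshold.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    # Reject the k smallest p-values, even those that missed their own threshold.
    return sorted(order[:k])


# Hypothetical p-values: the third one passes its threshold (0.05 * 3 / 4),
# so the two smaller ones are rejected along with it.
print(benjamini_hochberg([0.02, 0.03, 0.035, 0.9], alpha=0.05))
```

The step-up behavior is the part worth noticing: a hypothesis can be rejected because a larger p-value further down the ordering cleared its threshold.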
Are Object-Centric Representations Better At Compositional Generalization?
Source: arXiv cs.CV
Time: 2026-02-18 18:34
Link: 2602.16689v1
Abstract: Compositional generalization, the ability to reason about novel combinations of familiar concepts, is fundamental to human cognition and a critical challenge for machine learning. Object-centric (OC) representations, which encode a scene as a set of objects, are often argued to support such generalization, but systematic evidence in visually rich settings is limited. We introduce a Visual Question Answering benchmark across three controlled visual worlds (CLEVRTex, Super-CLEVR, and MOVi-C) to me...
Authors: Ferdinand Kapl, Amir Mohammad Karimi Mamaghan, Maximilian Seitzer, Karl Henrik Johansson, Carsten Marr
Categories: cs.CV, cs.LG
On the Hardness of Approximation of the Fair k-Center Problem
Source: arXiv cs.CC
Time: 2026-02-18 18:33
Link: 2602.16688v1
Abstract: In this work, we study the hardness of approximation of the fair $k$-center problem. Here the data points are partitioned into groups and the task is to choose a prescribed number of data points from each group, called centers, while minimizing the maximum distance from any point to its closest center. Although a polynomial-time $3$-approximation is known for this problem in general metrics, it has remained open whether this approximation guarantee is tight or could be further improved, especial...
Authors: Suhas Thejaswi
Categories: cs.CC, cs.DS, cs.LG
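The problem statement in the abstract can be made concrete with a small function that checks the per-group quotas and evaluates the objective for a candidate set of centers. This sketch only evaluates a solution and does not approximate the optimum; it assumes Euclidean distance (the abstract speaks of general metrics) and distinct center indices, and the names and data are hypothetical.

```python
import math
from collections import Counter


def fair_k_center_cost(points, groups, centers, quota):
    """Max distance from any point to its nearest center, after a quota check.

    points:  list of coordinate tuples
    groups:  group label of each point (same order as points)
    centers: indices of the chosen points (assumed distinct)
    quota:   dict mapping group label -> required number of centers
    """
    chosen = Counter(groups[c] for c in centers)
    required = {g: k for g, k in quota.items() if k > 0}
    if dict(chosen) != required:
        raise ValueError("centers do not match the per-group quota")
    # Euclidean distance is an assumption; any metric could be substituted.
    return max(min(math.dist(p, points[c]) for c in centers) for p in points)


# Two groups on a line, one center required from each group.
pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
grp = ["a", "a", "b", "b"]
print(fair_k_center_cost(pts, grp, centers=[0, 2], quota={"a": 1, "b": 1}))
```

Picking both centers from group "a" would give a feasible plain k-center solution but violates the quota here, which is exactly the constraint that distinguishes the fair variant.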
Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens
Source: arXiv cs.SD
Time: 2026-02-18 18:32
Link: 2602.16687v1
Abstract: Current audio language models are predominantly text-first, either extending pre-trained text LLM backbones or relying on semantic-only audio tokens, limiting general audio modeling. This paper presents a systematic empirical study of native audio foundation models that apply next-token prediction to audio at scale, jointly modeling semantic content, acoustic details, and text to support both general audio generation and cross-modal capabilities. We provide comprehensive empirical insights for b...
Authors: Potsawee Manakul, Woody Haosheng Gan, Martijn Bartelds, Guangzhi Sun, William Held
Categories: cs.SD, cs.CL, eess.AS
Updated: 2026-02-20 02:00:05
Data source: arXiv.org
Generated automatically by: 贾维斯 (JARVIS) 🤖
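Statistics
- Total items: 25 (10 industry news items + 15 academic papers)
- Sources: Jiqizhixin + arXiv cs.AI + cs.LG
- Updated: 2026-02-20 13:02:43
- Keywords: humanoid robots, large language models, AI agents, medical AI, reinforcement learning, dexterous manipulation, fast weights, knowledge embedding
This report was generated by an automated AI news update system (content restored and merged)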