Mellen Y. Pu · Oct 27, 2025 · 2:18 AM UTC

Mellen Y. Pu

Mellen Y. Pu @CassielYM

Oct 27

github.com/amair-lab/PiFlow We have open-sourced PiFlow, welcome use cases and stars!

GitHub - amair-lab/PiFlow: [preprint] PiFlow: Principle-aware Scientific Discovery with Multi-Agent...

[preprint] PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration - amair-lab/PiFlow

github.com

Mellen Y. Pu · Oct 27, 2025 · 2:17 AM UTC

Mellen Y. Pu @CassielYM

Oct 27

We find that PiFlow is exceptionally robust in advancing scientific discovery by optimizing scientific principles—it is intuitive to use, efficient to learn, and highly effective. Moreover, principles discovered later exhibit a better balance between exploration and exploitation.

Rohan Paul · Oct 12, 2025 · 1:58 AM UTC

Mellen Y. Pu retweeted

Rohan Paul

@rohanpaul_ai

Oct 12

This survey shows multimodal models can self-improve with less human work. The loop has 3 parts, collection, organization, and optimization. Collection uses random sampling, guided prompts, and hard negatives. Organization verifies outputs with rules, peer judges, or environment feedback. Optimization trains by supervised fine-tuning, reinforcement learning, or preference optimization. A seed model starts the loop, and each round strengthens it. The survey defines 6 autonomy levels from human guided to self-run. Verifiable rewards tend to lift reasoning on tasks. Preference or AI feedback tends to cut hallucinations and errors. ---- Paper – arxiv. org/abs/2510.02665 Paper Title: "Self-Improvement in Multimodal LLMs: A Survey"

100

Mellen Y. Pu · Oct 3, 2025 · 7:49 AM UTC

Mellen Y. Pu @CassielYM

Oct 3

github.com/amair-lab/PiFlow Welcome to use our #PiFlow for any domain specific tasks supported!

GitHub - amair-lab/PiFlow: [preprint] PiFlow: Principle-aware Scientific Discovery with Multi-Agent...

[preprint] PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration - amair-lab/PiFlow

github.com

Mellen Y. Pu · Oct 3, 2025 · 7:48 AM UTC

Mellen Y. Pu @CassielYM

Oct 3

A new #paradigm for #LLM agents for scientific discovery - an information theoretical framework! 📝 arxiv.org/abs/2505.15047 💻Open-sourced 👑Plug-and-Play (free-integrate with your agents) 🔮Unified for searching tasks (bio, chem, physics, etc)

PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration

Large Language Model (LLM)-based multi-agent systems (MAS) demonstrate remarkable potential for scientific discovery. Existing approaches, however, often automate scientific discovery using...

arxiv.org

Jason Weston · Sep 9, 2025 · 2:01 AM UTC

Mellen Y. Pu retweeted

Jason Weston

@jaseweston

Sep 9

🌀New Test-time scaling method 🌀 📝: arxiv.org/abs/2509.06870 - Use RL to train an LLM solution aggregator – Reasons, reviews, reconciles, and synthesizes a final solution -> Much better than existing techniques! - Simple new method. Strong results across 4 math benchmarks. 🧵1/5

117

707

Tz · Aug 30, 2025 · 2:31 PM UTC

Mellen Y. Pu retweeted

@Tz_2022

Aug 30

这是是到目前为止我看到过的最喜欢，最有生命力的 AIGC 生成视频~

1,183

5,213

Tom Huang · Aug 24, 2025 · 9:19 AM UTC

Mellen Y. Pu retweeted

Tom Huang

@tuturetom

Aug 24

这个 Youtube 博主的内容真的好的令人难以置信... 😂

286

1,989

Sebastian Raschka · Aug 24, 2025 · 3:44 PM UTC

Mellen Y. Pu retweeted

Sebastian Raschka

@rasbt

Aug 24

Pretty cool that they open sourced the actual full-sized production model. Here’s the Grok 2.5 architecture overview next to a roughly similarly sized Qwen3 model. The MoE residual is quite interesting. Kind of like a shared expert. I don't think I've seen this setup before.

Elon Musk

@elonmusk

Aug 23

The @xAI Grok 2.5 model, which was our best model last year, is now open source. Grok 3 will be made open source in about 6 months. huggingface.co/xai-org/grok-…

203

1,669

Mellen Y. Pu · Aug 26, 2025 · 1:21 AM UTC

Mellen Y. Pu @CassielYM

Aug 26

arxiv.org/abs/2505.15047 #Principle, the scientific insights, make the scientific discovery rubost, generalizable and promising.

PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration

Large Language Model (LLM)-based multi-agent systems (MAS) demonstrate remarkable potential for scientific discovery. Existing approaches, however, often automate scientific discovery using...

arxiv.org

Rohan Paul · Aug 23, 2025 · 3:50 AM UTC

Mellen Y. Pu retweeted

Rohan Paul

@rohanpaul_ai

Aug 23

Hunyuan 3D-2.1 turns any flat image into studio-quality 3D models. And you can do it on this @huggingface space for free.

208

1,665

Kilian Lieret · Aug 20, 2025 · 2:59 PM UTC

Mellen Y. Pu retweeted

Kilian Lieret @KLieret

Aug 20

What if your agent uses a different LM at every turn? We let mini-SWE-agent randomly switch between GPT-5 and Sonnet 4 and it scored higher on SWE-bench than with either model separately. Read more in the SWE-bench blog 🧵

272

Cameron R. Wolfe, Ph.D. · Aug 14, 2025 · 11:15 PM UTC

Mellen Y. Pu retweeted

Cameron R. Wolfe, Ph.D.

@cwolferesearch

Aug 14

The gpt-oss models from OpenAI are a synthesis of ideas from prior research. Here are 10 interesting papers that were directly used in gpt-oss… (1) Longformer: Introduces sliding window attention, a form of sparse attention that is utilized in alternating layers of both gpt-oss models. (2) StreamingLLM: Describes the concept of attention sinks in large language models (LLMs)—these are tokens within a sequence that the model assigns high attention or weight to, simply because the softmax operation prevents the model from assigning attention to no tokens at all. (3) Off-by-one attention: Proposes a solution to attention sinks by allowing the attention mechanism to assign no attention to any token. This is achieved by adding a bias term of 1 to the denominator of the softmax operation within attention. In gpt-oss models, a similar approach is used, but the bias term is learned rather than fixed at 1. (4) Switch Transformer: Presents several ideas foundational to modern mixture-of-experts (MoE) based LLMs. It’s important to note that many other papers, in addition to Switch Transformer, have contributed to this field. (5) RMSNorm: A streamlined variant of layer normalization that is both more efficient and has fewer trainable parameters. Both gpt-oss models employ RMSNorm. (6) RoPE: Stands for Rotary Positional Encoding, a hybrid absolute/relative positional encoding method used by gpt-oss models. RoPE encodes absolute position using a rotation matrix and incorporates relative position information directly into the self-attention mechanism. (7) YaRN: A method for extending the context window in LLMs, which is adopted by gpt-oss models. YaRN works by adjusting the frequency basis used within RoPE and further training the LLM to handle longer contexts. (8) Flash Attention: Utilized by gpt-oss models, flash attention leverages system-level optimizations to significantly improve the computational and memory efficiency of the attention operation. (9) DeepSeek-R1: While the specific reasoning or reinforcement learning (RL) training strategies used by gpt-oss models are not fully detailed, the DeepSeek-R1 technical report offers a comprehensive overview of how RL training with verifiable rewards is implemented at scale. (10) Deliberative alignment: This is the safety training approach used by gpt-oss models, designed to teach the models how to reason through safety specifications and determine when it is appropriate to refuse a request.

420

Mizuho Aoki · Aug 15, 2025 · 10:36 AM UTC

Mellen Y. Pu retweeted

Mizuho Aoki @mizuhoaoki1998

Aug 15

🚗Nullspace MPC🚙 A novel multi-objective control framework explicitly handling task priorities — demonstrated here on a swerve drive robot navigating autonomously through tight spaces. Also includes MPPI as a baseline in the open source project. github.com/MizuhoAOKI/nullsp…

366

Higgsfield AI 🧩 · Aug 15, 2025 · 6:00 PM UTC

Mellen Y. Pu retweeted

Higgsfield AI 🧩

@higgsfield_ai

Aug 15

Meet Higgsfield Product-to-Video! Drop your product straight into the pic. Or start blank & build your MOST SELLING frame from 0. This is POWERFUL: Perfect Product Placement with 0 prompts & ALL our models. Retweet = full P2V Playbook in your DM.

190

721

345

2,694

AIGCLINK · Aug 12, 2025 · 12:33 AM UTC

Mellen Y. Pu retweeted

AIGCLINK

@aigclink

Aug 12

苹果开源了一款用于大规模嵌入向量的交互式可视化工具：Embedding Atlas，可对嵌入向量及其元数据可视化、交叉筛选及搜索可以缩放、旋转、拖动视图等等，从不同的角度观察数据，支持实时搜索与最近邻查找可以自动把相似的数据分组，并给每组数据贴上标签，自动分类可以显示密度，显示哪些区域的点比较密集，哪些区域比较稀疏具备高性能渲染能力，能处理大规模数据，支持多视图联动 #EmbeddingAtlas #向量数据

109

455

𝙩𝙮≃𝙛{𝕩}^A𝕀²·ℙarad𝕚g𝕞 · Aug 8, 2025 · 1:29 PM UTC

Mellen Y. Pu retweeted

𝙩𝙮≃𝙛{𝕩}^A𝕀²·ℙarad𝕚g𝕞

@TaNGSoFT

Aug 8

这些发现表明，LLM不是有原则的推理者，而是类似推理文本的复杂模拟器。

The AI Conference · Jul 10, 2025 · 5:22 PM UTC

Mellen Y. Pu retweeted

The AI Conference @AIconference

Jul 10

This is a roadmap to the future of AI. Hear directly from engineers, researchers, and product leads at Google, NVIDIA, GitHub, Uber, Riot Games, Pfizer, and more. This is where the right conversations happen.

The Conference That Moves You Forward

aiconference.com

164

歸藏(guizang.ai) · Aug 9, 2025 · 2:30 AM UTC

Mellen Y. Pu retweeted

歸藏(guizang.ai)

@op7418

Aug 9

阿里这次难得一见的大方啊直接给 Qwen Code 提供每天两千次的免费请求，开始学谷歌了

Qwen

@Alibaba_Qwen

Aug 8

💡 You get 2,000 free Qwen Code runs every day! Run this one simple command: npx @qwen-code/qwen-code@latest Hit Enter, and that’s it! 🚀 Now with Qwen OAuth support — super easy to use. Try it now and supercharge your vibe code! 💻⚡ Github：github.com/QwenLM/qwen-code

242

howie.serious · Aug 8, 2025 · 7:20 AM UTC

Mellen Y. Pu retweeted

howie.serious

@howie_serious

Aug 8

截至今天，只有o3和gpt-5 thinking 可以答对这道理：女儿考了38分。 gemini-2.5 pro、claude 4 opus都不行。

106

719