AI research lab at New York University led by Mengye Ren @mengyer Follow us on @agentic-ai-lab.bsky.social

Joined November 2024
agentic learning ai lab retweeted
Excited to present my work at CoLLAs 2025 @CoLLAs_Conf! In our paper arxiv.org/abs/2501.12254, we tackle the challenge of self-supervised learning from scratch on continuous, unlabeled egocentric video streams, proposing temporal segmentation and a two-tier memory.
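A minimal, hypothetical sketch of what a two-tier memory over a segmented video stream might look like. The class name, capacities, and consolidation policy are illustrative assumptions, not the paper's implementation:

```python
from collections import deque

class TwoTierMemory:
    """Sketch of a two-tier replay memory: a small short-term buffer of
    recent frames plus a long-term store of consolidated segments."""

    def __init__(self, short_capacity=8, long_capacity=64):
        self.short_term = deque(maxlen=short_capacity)  # recent frames
        self.long_term = []                             # consolidated segments
        self.long_capacity = long_capacity

    def observe(self, frame, segment_boundary=False):
        self.short_term.append(frame)
        if segment_boundary:
            # On a temporal-segmentation boundary, consolidate the
            # short-term buffer into a single long-term entry.
            self.long_term.append(list(self.short_term))
            if len(self.long_term) > self.long_capacity:
                self.long_term.pop(0)  # drop the oldest segment
            self.short_term.clear()

mem = TwoTierMemory(short_capacity=4)
for t in range(10):
    # Pretend a segmentation model fires a boundary every 5 frames.
    mem.observe(f"frame_{t}", segment_boundary=(t % 5 == 4))
```

The short-term tier keeps raw recency; the long-term tier keeps a bounded history of whole segments for replay.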
agentic learning ai lab retweeted
CDS Asst. Prof. Mengye Ren (@mengyer), Courant PhD students @alexandernwang and Christopher Hoang, and @ylecun introduce PooDLe: a self-supervised learning method enhancing AI vision in real-world videos by improving small object detection. nyudatascience.medium.com/le…
Check out our latest paper on representation learning from naturalistic videos →
How can we leverage naturalistic videos for visual SSL? Naturalistic, i.e. uncurated, videos are abundant and can emulate the egocentric perspective. Our paper at ICLR 2025, PooDLe🐩, proposes a new SSL method to address the challenges of learning from naturalistic videos. 🧵
agentic learning ai lab retweeted
🚀 Why Lifelong Learning Matters 🚀 Modern ML systems struggle in non-stationary environments, while humans adapt seamlessly. How do we bridge this gap? 📖 Read our latest blog on the vision behind #CoLLAs2025 and the future of lifelong learning research: 🔗 lifelong-ml.cc/blogs/1 #MachineLearning #ContinualLearning #AI #LifelongLearning
agentic learning ai lab retweeted
New research by CDS MS student Amelia (Hui) Dai, PhD student Ryan Teehan (@rteehas), and Asst. Prof. Mengye Ren (@mengyer) shows that models’ accuracy on current events drops 20% over time—even when given the source articles. Presented at #NeurIPS2024. nyudatascience.medium.com/la…
agentic learning ai lab retweeted
Just finished my first in-person NeurIPS journey. It was great to meet so many friends, old and new. Happy to see my work so well received in the poster session!
agentic learning ai lab retweeted
Thrilled to be back at NYU CDS and continuing my research journey! The MSDS program provides incredible research opportunities that shaped my path. If you’re passionate about data science, this is the place to be!
Alumni Spotlight: CDS Master's grad Ying Wang (@yingwww_) ('23) turned her 3 research projects into publications, then returned to CDS as a PhD student studying multimodal learning with Profs. @andrewgwils and @mengyer. "Follow your curiosity!" she tells aspiring data scientists.
There are still many interesting open questions: Is there a limit to in-context learning? Do we need continual pretraining to keep the model up to date?
The same holds even when we feed the models the gold article (i.e., the actual news article the QA is drawn from).
Can RAG save us from the decline? Only partially. Retrieving relevant news articles sometimes helps, but it does not stop the downward trend.
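The retrieve-then-answer loop described here can be sketched minimally. The word-overlap retriever and the `llm` placeholder callable are illustrative assumptions, not the benchmark's actual pipeline:

```python
def retrieve(question, articles, k=1):
    """Toy retriever: rank articles by word overlap with the question.
    A real system would use dense embeddings; this is only a sketch."""
    q = set(question.lower().split())
    scored = sorted(articles, key=lambda a: -len(q & set(a.lower().split())))
    return scored[:k]

def answer_with_rag(question, articles, llm):
    """Prepend retrieved context to the prompt before querying the
    model. `llm` is a stand-in callable, not a specific API."""
    context = "\n".join(retrieve(question, articles))
    return llm(f"Context:\n{context}\n\nQuestion: {question}")

articles = [
    "The election was held on Tuesday and turnout was high.",
    "A new telescope was launched into orbit last week.",
]
echo = lambda prompt: prompt  # placeholder model that echoes its input
out = answer_with_rag("When was the election held?", articles, echo)
```

Even with perfect retrieval, the model must still integrate the context correctly, which is one reason retrieval alone does not erase the temporal trend.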
We find a smooth temporal trend: LLMs' performance degrades over time, with a sharp decline beyond their "knowledge cutoff" period.
Will LLMs ever become outdated? Can LLMs predict the future? Today, we release Daily Oracle, a daily news QA benchmark testing LLMs' temporal generalization and forecasting capabilities. 🧵
agentic learning ai lab retweeted
Humans and animals learn visual knowledge through continuous streams of experiences. How do we perform unsupervised continual learning (UCL) in the wild? Yipeng's latest paper reveals three essential components for UCL success in real-world scenarios: Plasticity, Stability, and Cross-Task Consolidation. Our Osiris algorithm is currently the state-of-the-art on UCL benchmarks. Excitingly, on new structured environment benchmarks, Osiris surpasses even IID training. Surprising? Perhaps, but an optimal continual learner should exploit temporal structures instead of falling behind in random sequences. Thrilled to share that our paper will be presented at #CoLLAs2024! arxiv.org/abs/2404.19132
What should you care about when continually improving your model’s representations with self-supervised learning? Check out our paper titled *Integrating Present and Past in Unsupervised Continual Learning* to appear at #CLVision2024 and #CoLLAs2024! arxiv.org/abs/2404.19132 1/🧵
agentic learning ai lab retweeted
🔍 New LLM Research 🔍 Conventional wisdom says that deep neural networks suffer from catastrophic forgetting as we train them on a sequence of data points with distribution shifts. But conventions are meant to be challenged! In our recent paper led by @YanlaiYang, we discovered a curious behavior in overparameterized networks, especially LLMs: as we train the network on a cyclic sequence of documents, it starts to anticipate the next document and reverses the forgetting trend! ⤴️ ▶️ After 3-4 cycles, the network reverses over 90% of the forgetting right before seeing the original document again. ▶️ The anticipation effect emerges with network scale; LLMs at or below 160M parameters show no anticipation. ▶️ We showed that you can reproduce this effect in a toy network! Check out more details in our arXiv preprint on anticipatory recovery: Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training. 🚀 arxiv.org/abs/2403.09613 🚀 #LLM #AI #Research
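The cyclic training protocol can be sketched on a toy problem. The linear-regression "documents", learning rate, and loss probe below are illustrative stand-ins for the paper's LLM experiments; the sketch shows only the protocol, not the recovery effect itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for "documents": K regression tasks with different targets.
K, d = 4, 16
tasks = [(rng.normal(size=(32, d)), rng.normal(size=(32,))) for _ in range(K)]

w = np.zeros(d)

def loss(w, task):
    X, y = task
    return float(np.mean((X @ w - y) ** 2))

def train_step(w, task, lr=0.01, steps=50):
    # Plain gradient descent on one task's squared error.
    X, y = task
    for _ in range(steps):
        w = w - lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

# Cycle through the documents several times; record the loss on
# document 0 right before it is revisited (the "anticipation" probe).
probe = []
for cycle in range(4):
    for k in range(K):
        if k == 0:
            probe.append(loss(w, tasks[0]))
        w = train_step(w, tasks[k])
```

In the paper, the analogous probe on large models shows the loss dropping just before the document recurs, i.e., anticipatory recovery.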
agentic learning ai lab retweeted
Wondering how to train deep neural networks without backprop? Check out our ICLR 2023 paper: arxiv.org/abs/2210.03310 Forward gradient computes gradient information from the forward pass. But it is slow and noisy: it computes the directional gradient along a random weight perturbation, and the gradient variance explodes with deeper and wider networks. Our key insight is to utilize local activity perturbation, and we introduce a new architecture with many local losses throughout the network so that each loss is associated with a small number of weights. Both supervised and self-supervised (contrastive) learning work. It performs much better than many backprop-free methods on large-scale problems. Joint work with @geoffreyhinton @skornblith @lrjconan Our ICLR poster will be presented on Tuesday May 2, in Kigali, Rwanda!
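The baseline weight-perturbation forward gradient described here can be sketched in a few lines. The quadratic objective is an illustrative assumption, and the paper's contribution (local activity perturbation with local losses) is not shown; in practice the directional derivative would come from forward-mode autodiff (a JVP):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_gradient(directional_grad, w):
    """One forward-gradient sample: draw a random perturbation v,
    compute the directional derivative (∇f · v) in a single forward
    pass, and return (∇f · v) v, an unbiased estimate of ∇f."""
    v = rng.standard_normal(w.shape)
    return directional_grad(w, v) * v

# Example: f(w) = 0.5 * ||w||^2, whose true gradient is w itself,
# so the directional derivative along v is simply w · v.
w = np.array([1.0, -2.0, 3.0])
est = np.mean(
    [forward_gradient(lambda w, v: w @ v, w) for _ in range(20000)],
    axis=0,
)
print(est)  # ≈ [1, -2, 3], but each single sample is very noisy
```

The estimator is unbiased, but its per-sample variance grows with the number of perturbed parameters, which is exactly the scaling problem the local-loss architecture targets.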
agentic learning ai lab retweeted
Introducing LifelongMemory, an LLM-based personalized AI for egocentric video natural language query (NLQ). This amazing work is led by Ying Wang @yingwww_
agentic learning ai lab retweeted
🚨 New Research Alert! People have found that safety training of LLMs can be easily undone through finetuning. How can we ensure safety in customized LLM finetuning while keeping finetuning useful? Check out our latest work led by Jiachen Zhao! @jcz12856876 🔍 Our study reveals: 1️⃣ Traditional sequential safety finetuning? It leads to forgetting not only of unsafe data but also of useful downstream data. 2️⃣ Surprising find: LLMs are much more likely to forget unsafe examples than other downstream examples after safety finetuning. This selective forgetting behavior only emerges in large-scale models. 3️⃣ Enter ForgetFilter! Inspired by the selective forgetting behavior, we can filter out both explicit and implicit unsafe content based on forgetting, ensuring LLMs' safety without compromising task performance. 4️⃣ Bonus results: When interleaving safe and unsafe data, LLMs can recall "forgotten" unsafe knowledge despite safety finetuning. Long-term safety requires data filtering. 🔗 arxiv.org/abs/2312.12736 #AI #MachineLearning #DataSafety #LLMs #ForgetFilter #LargeLanguageModels
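The forgetting-based filtering idea can be sketched as follows. The function name, threshold, and toy loss values are hypothetical illustrations, not the paper's actual procedure; real scores would come from evaluating each example's loss before and after a round of safety finetuning:

```python
def forget_filter(examples, loss_before, loss_after, threshold=1.0):
    """Keep only examples whose loss did not rise much after safety
    finetuning; a large rise signals the example was 'forgotten',
    which correlates with unsafe content."""
    kept = []
    for ex, lb, la in zip(examples, loss_before, loss_after):
        forgetting = la - lb  # large increase => likely unsafe content
        if forgetting < threshold:
            kept.append(ex)
    return kept

examples    = ["benign qa pair", "unsafe instruction", "benign summary"]
loss_before = [1.2, 1.0, 1.5]
loss_after  = [1.3, 3.4, 1.6]  # the unsafe example's loss jumps most
print(forget_filter(examples, loss_before, loss_after))
# -> ['benign qa pair', 'benign summary']
```

Because the signal is the model's own forgetting rather than a keyword list, the same filter can catch implicitly unsafe content too.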