I’m sure safety protocols were excellent (and MSRs are supposed to be inherently safer)… and yet… these pictures have that sci-fi “in case something goes wrong, we put it in a remote desert” energy…
The 2-MW Thorium Molten Salt Reactor (TMSR) in northwest China's Gansu Province has achieved the first-ever thorium-to-uranium nuclear fuel conversion and obtained valid experimental data following thorium fuel loading, making it currently the world's only operational molten-salt reactor loaded with thorium fuel. This achievement marks a milestone in TMSR development and confirms the technical feasibility of thorium utilization in a molten-salt reactor nuclear energy system. The project's ultimate goal is to construct a 100-MW demonstration plant and realize its demonstration application by 2035.
Janek Mann retweeted
Released torch-diffsim: a minimal, parallelizable physics simulator supporting differentiation entirely in torch. Put it in your training loop and it just works out of the box, letting you use torch.autograd. github.com/Rishit-dagli/torc…
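For intuition, here's a minimal sketch (in plain torch, not the torch-diffsim API) of what "differentiable simulation in the training loop" means: when every step of the rollout is built from torch ops, torch.autograd can push gradients through the whole trajectory. The ballistic `step` function below is a made-up toy, not the library's simulator.

```python
# Toy differentiable simulation: NOT the torch-diffsim API, just the idea.
# Because each step uses only torch ops, gradients flow through the rollout.
import torch

def step(pos, vel, dt=0.01):
    # Simple ballistic Euler step: gravity only, fully differentiable.
    g = torch.tensor([0.0, -9.81])
    return pos + vel * dt, vel + g * dt

target = torch.tensor([1.0, 0.0])          # where the particle should land
vel0 = torch.zeros(2, requires_grad=True)  # initial velocity, to be learned
opt = torch.optim.Adam([vel0], lr=0.1)

for it in range(200):
    pos, vel = torch.zeros(2), vel0
    for _ in range(100):                   # unrolled 1-second rollout
        pos, vel = step(pos, vel)
    loss = (pos - target).pow(2).sum()     # distance to target at t = 1 s
    opt.zero_grad()
    loss.backward()                        # gradients through the whole rollout
    opt.step()

print(vel0.detach(), loss.item())
```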
Janek Mann retweeted
⚡️Your video → a 3D scene. Just in seconds.⚡️ Only *one* model. No more steps❌. Just results🔝. #WorldMirror reconstructs everything (3DGS, depth, cameras) from any inputs (image, video, 3D prior) *all-at-once*! code: github.com/Tencent-Hunyuan/H… arxiv: arxiv.org/abs/2510.10726
This is really great…
Au Lait Cru | Short film. Humans ferment the future in cheese-centres, trying to achieve AGI. If we're midwives to transformation, then who are the rats? Or do you ask yourself if the baby is a god that forgets its parents? PS: I do remember how Ben Affleck said not too long ago that AI has no taste. Well, AI doesn't need to have it, but a human using it does. Here is a short film that is not sci-fi, not about VFX, not about remaking any existing movie. No UGC. It is pure cinematography and visual prose. Show it to anyone who is still struggling to grasp that AI is a tool that will be used by humans. And it will stay with us.
Janek Mann retweeted
Introducing InSubject 0.5, a QwenEdit LoRA trained for creating highly consistent characters and objects w/ just a single image ref. Together w/ InStyle, it significantly outperforms Nano Banana, etc. on style + character tasks. Output attached, 🤗 link + dataset below!
Janek Mann retweeted
🚀 New preprint! We present NP-Edit, a framework for training an image editing diffusion model without paired supervision. We use differentiable feedback from Vision-Language Models (VLMs) combined with a distribution-matching loss (DMD) to learn editing directly. webpage: nupurkmr9.github.io/npedit/ w/ @ShengYuWang6, Cherry (N.X.) Zhao, @YotamNitzan, Yuheng Li, Krishna Kumar Singh, @rzhang88, @elishechtman, @junyanz89, @xxunhuang
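For intuition, a toy sketch of the kind of unpaired objective the tweet describes: a differentiable VLM score rewards instruction-following while a distribution-matching (DMD-style) term keeps edits realistic. Every function below is a dummy placeholder, not the NP-Edit implementation.

```python
# Toy sketch of an unpaired editing objective: a differentiable "VLM" score
# plus a distribution-matching penalty. All functions are dummy stand-ins.
import torch

def vlm_score(edited, instruction):
    # Placeholder for a differentiable VLM "does the edit match?" score.
    return -(edited.mean() - len(instruction) * 0.01) ** 2

def dmd_loss(edited):
    # Placeholder for distribution matching against a teacher diffusion model.
    return edited.pow(2).mean()

edited = torch.randn(3, 64, 64, requires_grad=True)
loss = -vlm_score(edited, "make the sky pink") + 0.5 * dmd_loss(edited)
loss.backward()  # gradients flow back through the edited image
                 # (and, in real training, into the editing model)
print(edited.grad.shape)
```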
Some Grok video. It’s really not a bad video model! Though it does want to make everyone kiss 😅
Big `signs of AGI` moment with Sora2... How did it figure out that this is a plausible way for a cockapoo to ride a unicycle? "a cockapoo riding a unicycle wearing a pink tutu while holding 5 different colored balloons"
Literally the first prompt I try with Sonnet4.5... Nah, not re-subscribing to this clown show of a company. And they have the gall to brag about improved alignment 🤣
Another Seedream4 test... this time remixing some of my old photography. First image is my original photo, second is the Seedream remix! (Likeness is still a weakness.)
Some more Seedream Fashion/Editorial tests
This is a very helpful comparison of Nano Banana's and Seedream's strengths and capabilities, and it touches on the difference in Chinese prompt comprehension... It's worth trying to translate prompts into Chinese for Seedream; sometimes it works better!
Jimeng 4.0 (Seedream 4.0) vs. Nano Banana, an in-depth review: I dug out the core technical differences that 99% of tutorials ignore. Lately the AI scene has been flooded with Nano Banana and Jimeng 4.0. But I noticed an awkward pattern: everything out there is the same outdated "photo-to-figurine" trick; nobody is talking substance. What people lack isn't another playbook of gimmicks but an in-depth review that sees through to the essence. So I wrote this report. 👇 No flashy tricks, straight to the core: How do the two differ technically? What are their product strategies? Which one should you actually use? All the answers are in this head-to-head. Space is limited here; the full 10,000-word review is at: mp.weixin.qq.com/s/5g1YjTxNg… Here is the overall verdict:

1⃣ Photoshop's "brute force" vs. a designer's "finesse". After a dozen-plus rounds of close combat above, you should by now have a direct feel for the temperaments of Jimeng 4.0 and Nano Banana. Let's step back from the dazzling test results for a proper debrief and talk about the technical essence behind them.

2⃣ The pattern in the results: an "obedient generalist" vs. a "lopsided specialist". First, the patterns we can observe directly from the tests. On complex instructions, subject consistency, and subtle contextual understanding, Nano Banana wins almost across the board. Whether it was strict adherence to the layout instructions in the "nine-grid" task, or high fidelity to source details in the "architectural floor plan" and "garment spec sheet" tasks, Nano Banana showed the strong instruction-following a top model should have. Jimeng 4.0, by contrast, is clearly lopsided: it is stunning at Chinese text generation and specific commercial styles (Xiaohongshu posts, e-commerce posters), almost at "design draft" level, but it struggles on even slightly more complex general tasks, frequently ignoring instructions, losing subject details, or outright improvising.

3⃣ Why? Is the model behind Jimeng 4.0 simply weaker than Google's Gemini? Partly, yes. But what I want to explore is the two very different technical routes behind them: general-purpose foundation model vs. vertical-domain fine-tuning. Nano Banana is the typical product of a general foundation model. Think of it as a generalist raised on the world's libraries, museum collections, and billions of internet images. Broad knowledge, balanced ability: having seen enough diverse data, it has a wider, more fundamental understanding of the world, which explains why it handles wild, cross-domain, complex instructions with ease. Like Photoshop, it has no preset "style" or "purpose"; it is simply an extremely powerful tool that faithfully executes every pixel-level instruction. "What you say is what you get": it tends to trust the user's input completely. The more precise and complex your prompt, the closer the result comes to what you imagined; it won't guess that you really wanted something else. That high controllability is critical for professionals who need precise creation, very much in the spirit of Stable Diffusion and Midjourney. Jimeng 4.0's behavior points to the other route: vertical-domain fine-tuning. Think of it as a specialist. It may have learned the same mass of general knowledge as Nano Banana, but afterwards its boss (ByteDance) hired it a "professional tutor" and drilled it on thousands upon thousands of vertical samples from the Chinese market: e-commerce posters, social media posts, ad designs. (Aside: a bold guess is that ByteDance leadership felt the pressure of Nano Banana's explosive launch and had an internal team rush specialized training on Jimeng 3.0, shipping Jimeng 4.0 in a hurry.) This fine-tuning has two direct consequences, which explain everything we saw in the tests. Superpowers in its specialty: in Chinese typography and marketing-mood design, Jimeng 4.0 beats Nano Banana precisely because the fine-tuning data gave it a deep grasp of commercial aesthetics in a Chinese context; it is no longer a cold image generator but a solution infused with a designer's soul. And a dulling of general ability: fine-tuning is a double-edged sword. Over-train a model for one domain and its generality and flexibility elsewhere can degrade, like a top ad designer asked to draw rigorous construction drawings who instinctively adds beautification and mood while neglecting precision. That's why Jimeng 4.0 went off-topic on the floor-plan and garment-spec tasks: they fall outside the knowledge it was fine-tuned on.

4⃣ Does Jimeng 4.0 really rewrite the user's prompt? My answer: yes. Technically this is called "automatic prompt rewriting". To lower the barrier to entry, some AI products put a language model in the backend that first analyzes the user's raw instruction, then "optimizes" it into a standardized instruction the model understands better and renders better, and only then generates the image (a minimal sketch of this pipeline follows after this post). This explains many of Jimeng 4.0's baffling behaviors: when you stress "output everything on ONE image" in the nine-grid task, its optimizer may decide that "nine high-quality separate images would serve the user better" and unilaterally rewrite your core instruction. On simple, routine tasks this optimization can be icing on the cake; on complex tasks demanding precise control it becomes overreach, even a disaster, because it destroys the expert user's precise control over the generation process. So this again reflects the fundamental difference in product strategy: Nano Banana treats you as a creator, Jimeng 4.0 treats you as a customer. The former provides a tool; the latter provides a service. Which should you pick? There is no universally best choice, only the best fit. By now the conclusion is clear: neither side wins or loses; each won on the battlefield it chose. To make this clearer, I made the summary table below. (image)

5⃣ Practical takeaways. If you're unhappy with an AI-generated image and re-rolling a few times doesn't help, try lowering the task difficulty (prompt complexity) or the complexity of the source image. For person generation, pick an image with few subjects and a simple background; if you must use a busy-background image, work in steps: have the AI cut out the subject first, or swap in a solid background, and only then run your final task. @AIExplorerTim shared a post that matches my experience exactly: x.com/AIExplorerTim/status/1…. (Aside: isn't this just context engineering?) 📷 Collect widely and build your own AI toolbox. Remember the amusing episode in the review? On the official-account avatar design that neither Nano Banana nor Jimeng 4.0 could handle, ChatGPT surprised us instead. My biggest takeaway: top players never worship one "universal tool"; they keep a rich toolbox. As in one workflow from my article: when having an AI write a literature review, first use Gemini Deep Research to produce a research plan, then use ChatGPT's deep research to produce the review, combining each AI's strengths in a single workflow. That's everything for this share. I hope it helps you "rare students" build some intuition. If it helped, a like and a follow are appreciated.
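Since point 4⃣ is the review's most actionable claim, here is a minimal sketch of the "automatic prompt rewriting" pipeline it describes. All names (`llm`, `image_model`, `rewrite_prompt`) are hypothetical stand-ins showing the pattern's shape, not any vendor's backend.

```python
# Minimal sketch of "automatic prompt rewriting": a backend LM rewrites the
# user's prompt before the image model sees it. Dummy stand-ins throughout.
def rewrite_prompt(llm, user_prompt: str) -> str:
    instruction = ("Rewrite this image prompt so the generator produces "
                   "a polished result. Keep the user's intent:\n" + user_prompt)
    return llm(instruction)  # the rewrite the user never sees

def generate(llm, image_model, user_prompt: str, raw: bool = False):
    # `raw=True` is the escape hatch expert users want: skip the rewriter.
    prompt = user_prompt if raw else rewrite_prompt(llm, user_prompt)
    return image_model(prompt)

if __name__ == "__main__":
    llm = lambda p: p.splitlines()[-1].lower() + ", studio lighting, 8k"
    image_model = lambda p: f"<image for: {p}>"
    print(generate(llm, image_model, "nine panels in ONE image"))
    print(generate(llm, image_model, "nine panels in ONE image", raw=True))
```

The mandatory rewriter is exactly where "output all nine panels in ONE image" can silently become nine separate images, which is why a raw-prompt escape hatch matters for expert users.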
Some Seedream 4.0 portraits
Some Seedream 4.0 experiments
I enjoyed this mini-story a lot… a great example of how we've reached the stage where the tools can just disappear into the background and it's simply a creative little movie now, not an AI video.
Last Monday I decided to give @Runway Gen:48 Aleph Edition a try at the very end, just for fun. With only a few hours free, I raced to finish an entry. Here is the result. #runway #gen48 #aleph
This looks like an interesting new model (I'll give it a try), and I love this way of introducing it by highlighting prompts from some great creators!
Introducing: Cinematic Mode, our new mode for ultra cinematic visuals. Now available for everyone. Only on LetzAI. Here's what it can do 👇
Nano Banana with an upscaler (here Magnific Precision via Freepik) produces great outputs. A tad over-produced looking (like there was a full photography crew with reflectors and scrims). Tempted to start a dogfluencer Instagram account 😅 @javilopen
Crazy promising results. Looks solid at first glance.
Introducing DeepConf: Deep Think with Confidence 🚀 First method to achieve 99.9% on AIME 2025 with open-source models! Using GPT-OSS-120B, even without tools, we reached this almost-perfect accuracy while saving up to 85% of generated tokens. It also delivers strong advantages for parallel thinking:
🔥 Performance boost: ~10% accuracy across models & datasets
⚡ Ultra-efficient: up to 85% fewer tokens generated
🔧 Plug & play: works with ANY existing model, zero training needed (no hyperparameter tuning either!)
⭐ Easy to deploy: just ~50 lines of code in vLLM (see PR below)
📚 Paper: arxiv.org/pdf/2508.15260 🌐 Project: jiaweizzhao.github.io/deepco… joint work with: @FuYichao123, xuewei_wang, @tydsh (see details in the comments below)
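For a feel of the core idea, a hedged toy sketch of confidence-filtered parallel thinking: sample many traces, keep only the most confident, then majority-vote. The actual DeepConf method scores confidence from token log-probs inside vLLM and can stop weak traces early; the helper below only shows the filter-then-vote shape.

```python
# Toy confidence-filtered voting in the spirit of DeepConf (not the paper's
# exact scoring): rank parallel traces by confidence, keep the top slice,
# then take a confidence-weighted majority vote over the survivors.
from collections import Counter

def deepconf_vote(traces, keep_frac=0.1):
    """traces: list of (answer, confidence) pairs from parallel samples."""
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_frac))]
    votes = Counter()
    for answer, conf in kept:
        votes[answer] += conf          # weight each vote by its confidence
    return votes.most_common(1)[0][0]

# Example: 8 sampled traces; the confident answer wins after filtering.
traces = [("42", 0.9), ("42", 0.85), ("41", 0.2), ("41", 0.3),
          ("42", 0.8), ("13", 0.1), ("42", 0.7), ("41", 0.25)]
print(deepconf_vote(traces, keep_frac=0.5))  # -> "42"
```

Filtering before voting is also where the token savings come from: low-confidence traces can be cut off early instead of being generated to completion.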
Very interesting results from this replication. Really glad to see they ran some ablation studies as well to get to the bottom of it.
We were able to reproduce the strong findings of the HRM paper on ARC-AGI-1. Further, we ran a series of ablation experiments to get to the bottom of what's behind it. Key findings:
1. The HRM model architecture itself (the centerpiece of the paper) is not an important factor.
2. The outer refinement loop (barely mentioned in the paper) is the main driver of performance.
3. Cross-task transfer learning is not very helpful. What matters is training on the tasks you will test on.
4. You can use far fewer data augmentations, especially at inference time.
Findings 2 & 3 mean that this approach is a case of *zero-pretraining test-time training*, similar to the recently published "ARC-AGI without pretraining" paper by Liao et al.
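To make finding 2 concrete, here is a minimal sketch of what an outer refinement loop looks like: instead of a single forward pass, the model repeatedly feeds its own prediction back and refines it. Module names and sizes are placeholders, not the HRM code.

```python
# Toy outer refinement loop: the model iteratively improves its own guess,
# so depth comes from repeated refinement rather than the architecture.
import torch
import torch.nn as nn

class Refiner(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, x, n_steps=8):
        y = torch.zeros_like(x)                          # initial guess
        for _ in range(n_steps):                         # the outer loop
            y = y + self.net(torch.cat([x, y], dim=-1))  # refine the guess
        return y

model = Refiner()
x = torch.randn(4, 64)
print(model(x).shape)  # torch.Size([4, 64])
```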
What a fantastic idea! These could also make a great set of benchmark prompts in the future...
Introducing Spielwerk – the TikTok for vibe-coded mini games! Scroll through an endless feed of mini games, all created inside the app by other people. You can like, comment on, and remix any game, and beat your friends' high scores. 1/6 More below 👇