Major in ML, DL, and NLP/CV/AIGC 😋.

Beijing
Joined April 2019
We have released our first Chinese-English bilingual anime model, AniMemory-alpha. The tech report is in progress. Give it a try.
Main features:
-- Good bilingual prompt following
-- Mainly nijigen (二次元, anime) style
-- Competitive image quality
-- Impressive creative ability
Melison retweeted
Hunyuan 3D-2.1 turns any flat image into studio-quality 3D models. And you can do it on this @huggingface space for free.
Melison retweeted
Super
Melison retweeted
Image to 3D - 1536 pro is a revolutionary AI model for 3D creation.
- Upload 1 image
- Select resolution and texture option
- Hit generate
Melison retweeted
3DCG Modeler @choco_ikarashi has unveiled what has to be the most convincing 2D-like 3D model, depicting The Apothecary Diaries' Maomao. Approved by @DillonGoo and @andrewpprice: 80.lv/articles/it-s-almost-i…
Melison retweeted
Launched the initial version of 3D Model Viewer for MacOS. Download here: 3dviewer.xyz/ (<15mb) & share your feedback. QuickLook and open 15+ 3D model formats, like gltf, fbx, obj, 3dm, stl, usdz, and more, without any complicated software #Apple #threejs #3D #3DModel
Melison retweeted
random model find; might be the best 3D multi-person pose estimation I have ever seen; it's Apache-2.0 Multi-HMR: Regressing Whole-Body Human Meshes for Multiple Persons in a Single Shot; ECCV2024
Melison retweeted
Digital Artist @blid17889363 has unveiled a jaw-dropping 3D character model that looks like a 2D hand-painted sketch, made with 3ds Max, ZBrush, and Cinema 4D. More renders here: 80.lv/articles/jaw-dropping-…
Melison retweeted
discover Nomad Sculpt — my favorite 3D app (mithaaf_)
Melison retweeted
Two weeks ago I fixed one of my teeth with algorithms I wrote a couple of years ago!

I got hooked by 3D scanning when I started to work for a software shop in Zurich that was programming 3D computational geometry algorithms for denture scanning to produce crowns (and more). Back then, a typical reconstruction pipeline was like: scan the patient’s teeth using an intraoral scanner, reconstruct the surface mesh, design the restoration digitally, and finally mill the crown out of ceramic. We were working mostly with point clouds and meshes, but it wasn’t just math, it was craftsmanship translated into a digital process. Every micron mattered. You could literally see how a good algorithm meant a better fit in someone’s mouth.

Gaussian Splatting isn’t about surface reconstruction, it’s about appearance reconstruction. It doesn’t care about explicit topology, it captures how light interacts with the scene. In a sense, it’s the opposite philosophy of the dental world: instead of modeling what the object is, it models how the object looks.

3D Gaussian Splatting enables applications like training self driving cars, teaching robots to understand their environment, creating virtual worlds, or monitoring real sites. It represents scenes as millions of small Gaussians rendered in real time without the need for meshes or textures. Coming from a world where precision geometry was everything, this shift felt natural. It’s still about reconstruction, but with a different goal: not manufacturing a perfect object, but reproducing how the world actually looks.

Two weeks ago I got my first dental crown, made with the same software, reconstruction algorithms, and Swiss precision I once helped develop. I haven’t worked there in two years, but sitting in that chair and seeing the process from the other side was a proud moment. It reminded me why I love this field.
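To make the "appearance, not surface" point in the post above concrete, here is a minimal sketch of how a single pixel might be shaded from a handful of Gaussians using front-to-back alpha compositing. It is a simplified illustration, not the renderer of any particular tool: the `Splat` class, the isotropic 2D footprint, and the function names are all assumptions made for this example (real 3D Gaussian Splatting projects anisotropic 3D Gaussians and uses view-dependent color).

```python
import math
from dataclasses import dataclass


@dataclass
class Splat:
    """Hypothetical, already-projected splat: an isotropic 2D Gaussian."""
    x: float        # screen-space center
    y: float
    depth: float    # distance from the camera (smaller = closer)
    sigma: float    # standard deviation of the footprint, in pixels
    opacity: float  # peak opacity in [0, 1]
    color: tuple    # (r, g, b), each in [0, 1]


def splat_alpha(s: Splat, px: float, py: float) -> float:
    """Opacity contributed by one splat at pixel (px, py)."""
    d2 = (px - s.x) ** 2 + (py - s.y) ** 2
    return s.opacity * math.exp(-0.5 * d2 / (s.sigma ** 2))


def render_pixel(splats, px: float, py: float):
    """Composite splats front to back; return (rgb, remaining transmittance)."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by closer splats
    for s in sorted(splats, key=lambda s: s.depth):
        a = splat_alpha(s, px, py)
        for c in range(3):
            color[c] += transmittance * a * s.color[c]
        transmittance *= 1.0 - a
    return tuple(color), transmittance


if __name__ == "__main__":
    scene = [
        Splat(0.0, 0.0, depth=2.0, sigma=1.0, opacity=1.0, color=(0.0, 0.0, 1.0)),
        Splat(0.0, 0.0, depth=1.0, sigma=1.0, opacity=1.0, color=(1.0, 0.0, 0.0)),
    ]
    print(render_pixel(scene, 0.0, 0.0))
```

No mesh or texture appears anywhere in the sketch: the scene is just a list of blobs with position, size, opacity, and color, which is the sense in which the representation models how the object looks rather than what it is.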
Introducing Generative View Stitching (GVS), a non-autoregressive sampling method for length extrapolation of video diffusion models. GVS enables collision-free camera-guided video generation for predefined trajectories, including Oscar Reutersvärd's Impossible Staircase (1/9).
Melison retweeted
This gaussian splat distills a 6gb Unreal Engine scene w/ 100+ real-time lights & millions of tris into a 40mb asset that runs anywhere 🫠 My MacBook would choke on this via Unreal. Synthetically splatted via @sparkjsdev? Max FPS in *iOS Safari* and runs alongside standard WebGL
Melison retweeted
Somehow a 4.5 million polygon model fits into 1mb and renders like 🫰💥 Process: 221mb .glb model ➡️ 180 shots from Blender spherical photo rig ➡️ 50k splats in PostShot training ➡️ 1mb .sog file with splat-transform
Melison retweeted
Gemini 2.5 Flash Nano Banana on Google AI Studio
{
  "Objective": "Generate a cinematic 3-frame collage using the facial features of the attached photo as reference, portraying a woman in a lush green meadow with a contemplative, natural, and emotional tone.",
  "Visual_Concept": {
    "Theme": "Connection between human emotion and nature",
    "Tone": "Cinematic realism blending raw authenticity with poetic serenity",
    "Lighting": "Soft, diffused natural daylight under overcast conditions",
    "Color_Style": {
      "Overall_Grade": "Moody, natural color grading",
      "Contrast": "Soft highlights and diffused shadows",
      "Color_Mix": "Rich, balanced color across all frames for emotional continuity",
      "Profile": "Slightly desaturated greens, warm midtones, soft contrast curve"
    },
    "Texture_and_Finish": {
      "Focus_Transitions": "Soft transitions to emphasize tactile details (skin, grass, light)",
      "Grain": "Subtle film grain for nostalgic realism",
      "Tone_Curve": "Filmic curve to maintain cinematic aesthetic"
    }
  },
  "Frame_Sequence": {
    "Top_Frame": {
      "Description": "The woman stands in an open meadow, arching her back and lifting her arms gracefully toward the tree canopy above. Soft light filters through the leaves as her auburn hair glows in the natural sky light.",
      "Mood": "Liberation and connection with nature",
      "Composition": {
        "Framing": "Wide environmental portrait",
        "Depth": "Emphasis on the subject’s movement and natural surroundings"
      }
    },
    "Middle_Frame": {
      "Description": "A close-up shot of the woman’s face in warm, natural color tones. She smiles softly, her expression conveying quiet joy and self-awareness. Her expressive eyes and freckles are illuminated by diffused light, while a loose strand of auburn hair drifts across her cheek, adding warmth and intimacy to the frame.",
      "Mood": "Serene happiness and emotional openness",
      "Composition": {
        "Framing": "Close-up portrait",
        "Color_Palette": "Warm midtones with soft greens and natural skin tones",
        "Lighting": "Natural overcast light emphasizing gentle smile and facial texture"
      },
      "Emotion": {
        "Expression": "Soft smile with relaxed eyes",
        "Feeling": "Contentment and peaceful reflection"
      }
    },
    "Bottom_Frame": {
      "Description": "The woman reclines in the grass, extending her hand gently toward the camera with a tender, introspective expression. Tall grass and trees sway behind her, enhancing the dreamy, cinematic mood.",
      "Mood": "Vulnerability and quiet connection",
      "Composition": {
        "Framing": "Mid-shot with environmental depth",
        "Focus": "Selective sharpness on hand and face"
      }
    }
  },
  "Camera_Settings": {
    "Lens": "50mm f/1.4",
    "Aperture": "f/2.0",
    "Shutter_Speed": "1/320 sec",
    "ISO": 200,
    "White_Balance": "6000K",
    "Lighting": "100% natural overcast daylight",
    "Focus_Mode": "Manual (for selective sharpness on eyes and hand details)",
    "Color_Profile": {
      "Greens": "Slightly desaturated",
      "Midtones": "Warm",
      "Contrast": "Soft curve"
    }
  },
  "Collage_Layout": {
    "Frames": 3,
    "Orientation": "Vertical",
    "Layout_Type": "Cinematic 3×1 sequence",
    "Aspect_Ratio_Per_Frame": "3:4"
  },
  "Artistic_Guidelines": {
    "Facial_Integration": "Use the facial features from the attached reference photo to ensure likeness and emotional continuity across all frames.",
    "Balance": "Combine realism with poetic emotion through body language, texture, and light.",
    "Emotional_Arc": "Transition from expressive movement (freedom) → warm introspection (serenity) → gentle connection (resolution)."
  },
  "Output_Format": {
    "Type": "Cinematic collage (image composition)",
    "Resolution": "8K",
    "Purpose": "High-quality visual narrative for editorial or artistic showcase"
  }
}
SoftMimic: Learning Compliant Whole-body Control from Examples
Project: gmargo11.github.io/softmimic…
Paper: arxiv.org/abs/2510.17792
This new work from MIT enables robots to respond compliantly to external forces while maintaining balance and posture. It uses an inverse kinematics (IK) solver to generate an augmented dataset of feasible compliant motions, then trains an RL policy to track reference motions on that dataset, which benefits the generalization and safety of the generated motions.
- Key idea for learning compliant behavior: (1) Use an IK solver to build a dataset of feasible compliant trajectories. (2) Train an RL policy that observes the robot state and the original reference motion but is rewarded for tracking the pre-computed compliant trajectory from the augmented dataset. This formulation forces the policy to infer external forces while reacting with the demonstrated compliant behavior.
- Each training episode is specified by a motion clip, a desired robot stiffness, and an external force profile that pulls a selected link of the robot towards a moving setpoint. At inference time, the user can input a different stiffness to invoke different robot behavior for the same example motion.
- In real-world deployment, this approach lets robots generalize skills, absorb collisions, interact gently, avoid damage, tolerate disturbances, accommodate payloads, etc.
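The observe-one-thing, get-rewarded-for-another setup described above can be illustrated with a one-dimensional toy. This is only a sketch under strong assumptions, not the SoftMimic implementation: a Hooke's-law deflection (`force / stiffness`) stands in for the IK solver, the reward shape and its `scale` are invented for illustration, and all function names are hypothetical.

```python
import math


def compliant_target(reference: float, force: float, stiffness: float) -> float:
    """Toy stand-in for the IK solver: deflect the reference by force / stiffness."""
    return reference + force / stiffness


def tracking_reward(position: float, target: float, scale: float = 0.1) -> float:
    """Reward is 1.0 on the target and decays smoothly with squared error."""
    return math.exp(-((position - target) ** 2) / scale)


def make_training_sample(reference, force, stiffness, position):
    """Build (observation, reward) for one step of a toy episode.

    The observation holds the robot state, the ORIGINAL reference, and the
    commanded stiffness. The external force is deliberately left out, so a
    policy would have to infer it from how the state deviates.
    """
    target = compliant_target(reference, force, stiffness)
    observation = {
        "position": position,
        "reference": reference,
        "stiffness": stiffness,
    }
    return observation, tracking_reward(position, target)


if __name__ == "__main__":
    # Same reference and force, two stiffness settings: the softer setting
    # yields a larger deflection of the rewarded target.
    for k in (10.0, 100.0):
        print(k, compliant_target(reference=0.5, force=2.0, stiffness=k))
```

Lowering the stiffness moves the rewarded target further from the reference under the same force, which mirrors the stiffness knob the summary says is exposed at inference time.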
Melison retweeted
Flutter UI library inspired by shadcn/ui
Melison retweeted
Synthetic 3d gaussian splatting still underused and underrated. get super compact, hyper realistic assets that render like butter with @sparkjsdev
Melison retweeted
Ollama now supports all Qwen3-VL models locally! Give it a try! 🚀
ollama run qwen3-vl
Ollama's engine now supports all the Qwen 3 VL models locally. 2B to 235B parameter sizes. The smaller models work exceptionally well for their size. The latest version of Ollama v0.12.7 is needed! Give it a try! 👇👇👇
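Besides the CLI command above, a locally running Ollama server can also be queried over its HTTP API. The following is a minimal sketch, assuming the server listens on the default `localhost:11434` and that the `qwen3-vl` model tag from the post has already been pulled; the helper names `build_request` and `describe_image` are made up for this example.

```python
import base64
import json
import urllib.request

# Assumption: default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, image_bytes: bytes, model: str = "qwen3-vl") -> dict:
    """Build the JSON body for /api/generate with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe_image(prompt: str, image_path: str) -> str:
    """Send an image plus prompt to the local server and return the reply text."""
    with open(image_path, "rb") as f:
        payload = build_request(prompt, f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image("What is in this picture?", "photo.png")` requires the server to be running; `build_request` on its own has no network dependency and can be inspected offline.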
Melison retweeted
Manifold Muon stabilizes large model training, but it's expensive 💰 -- requiring an inner loop solve for each update. 💡 But you can significantly accelerate it, leading to 2.3x speedup on @thinkymachines's experiment with no performance loss! Blog and 🧵below…
Melison retweeted
Can someone explain the billion update rules here? What are the desiderata and what are the tradeoffs?
The Kimi Linear tech report has dropped! 🚀 huggingface.co/moonshotai/Ki…
Kimi Linear: A novel architecture that outperforms full attention in both speed and quality, ready to serve as a drop-in replacement for full attention, featuring our open-sourced KDA kernels! Kimi Linear offers up to a 75% reduction in KV cache usage and up to 6x decoding throughput at a 1M context length.
Key highlights:
🔹 Kimi Delta Attention: A hardware-efficient linear attention mechanism that refines the gated delta rule.
🔹 Kimi Linear Architecture: The first hybrid linear architecture to surpass pure full attention quality across the board.
🔹 Empirical Validation: Scaled, fair comparisons + open-sourced KDA kernels, vLLM integration, and checkpoints.
The future of agentic-oriented attention is here! 💡
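As a partial answer to the question about update rules, here is a minimal sketch of one step of the generic gated delta rule that the announcement says Kimi Delta Attention refines. It follows the form commonly written in the linear-attention literature, S_t = alpha_t * S_{t-1} (I - beta_t k_t k_t^T) + beta_t v_t k_t^T, with a single scalar gate. It is not the KDA kernel, and the function names and the scalar `alpha`/`beta` parameters are assumptions for illustration.

```python
def gated_delta_step(S, k, v, alpha, beta):
    """One recurrent update of a d_v x d_k memory matrix S (list of lists).

    alpha in [0, 1]: forget gate that decays the old memory.
    beta  in [0, 1]: write strength of the delta-rule correction.
    """
    d_v, d_k = len(S), len(k)
    # 1) Decay: uniformly forget part of the old state.
    S = [[alpha * S[i][j] for j in range(d_k)] for i in range(d_v)]
    # 2) Predict: what value does the decayed memory return for key k?
    v_hat = [sum(S[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
    # 3) Correct: move the stored value for k toward v by the prediction error.
    return [
        [S[i][j] + beta * (v[i] - v_hat[i]) * k[j] for j in range(d_k)]
        for i in range(d_v)
    ]


def read(S, q):
    """Query the memory: o = S q."""
    return [sum(row[j] * q[j] for j in range(len(q))) for row in S]


if __name__ == "__main__":
    S = [[0.0, 0.0], [0.0, 0.0]]
    S = gated_delta_step(S, k=[1.0, 0.0], v=[2.0, 3.0], alpha=1.0, beta=1.0)
    S = gated_delta_step(S, k=[1.0, 0.0], v=[5.0, 7.0], alpha=1.0, beta=1.0)
    print(read(S, [1.0, 0.0]))
```

The tradeoff this makes visible: a purely additive rule (S += v k^T) would accumulate both writes to the same key, whereas the delta correction overwrites the old value, and the gate `alpha` controls how quickly unrelated old content fades. The state stays a fixed-size matrix regardless of sequence length, which is where the reduced KV cache comes from.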