Can anyone actually explain, with hard facts, how it's possible AMD still doesn't meaningfully compete in the deep learning space? I understand they lack certain pieces of software. I do not understand how they could reasonably still lack it, though.

Nov 6, 2025 · 8:01 PM UTC

Replying to @Sentdex
Software and community
No, I want more than just general words. What software? Why would this software take 5+ years to build once it was clear NVIDIA would become the biggest public company of all time? The community would show up immediately if the software were there. I'd buy 10 AMD GPUs right now.
Replying to @Sentdex
They don't have the alien tech... 🤷‍♂️
Are we watching the same YouTubers lmaooo, I think I just heard someone propose this for the first time today.
Replying to @Sentdex
They do, just not at the consumer level
I hear this, to some slight extent, but the only evidence of it is money just sort of moving around via future deals. The entire tech sector is doing this money-moving thing; I'm curious whether any real, actual processing, besides payments, has occurred lol
Replying to @Sentdex
By waiting, AMD lets NVIDIA shoulder the huge R&D and mature the space. When standards solidify and open source grows, AMD can enter with competitive hardware at lower barriers. A smart late-mover play, though the software gap comes from underinvestment.
I mean, I could sorta see that, but tbh what are we talking about here? When is AMD gonna enter and sweep? Because right now NVIDIA is basically running the US economy. What's AMD waiting for?
Replying to @Sentdex
Building another CUDA is actually harder than it seems, because a large portion of the code has to be rewritten for different hardware in order to keep the API stable. Chris Lattner had a blog series on Mojo trying to demystify the CUDA effect.
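
To make that concrete, here's a minimal sketch of my own (not from the thread, and simplified). AMD's HIP intentionally mirrors the CUDA API, so source like this can often build with hipcc almost unchanged; the catch is that hardware assumptions are baked in everywhere. The WARP_SIZE of 32 below becomes a wavefront of 64 on AMD's CDNA parts, so every kernel like this has to be re-audited and re-tuned per architecture, and multiplying that by thousands of kernels is a big part of the "rewrite" cost:

    // Warp-level sum reduction written against the CUDA API.
    // HIP mirrors this API closely, but the warp-size assumption (32 on
    // NVIDIA, 64 on AMD CDNA) is baked into the shuffle loop below and
    // must be re-derived per architecture.
    #include <cstdio>
    #include <cuda_runtime.h>

    #define WARP_SIZE 32  // 64 on AMD CDNA: a silent perf/correctness hazard

    __global__ void warp_sum(const float* in, float* out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        float v = (i < n) ? in[i] : 0.0f;
        // Tree reduction within one warp via register shuffles.
        for (int offset = WARP_SIZE / 2; offset > 0; offset /= 2)
            v += __shfl_down_sync(0xffffffff, v, offset);
        // Lane 0 of each warp holds that warp's partial sum.
        if (threadIdx.x % WARP_SIZE == 0) atomicAdd(out, v);
    }

    int main() {
        const int n = 1 << 20;
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = 1.0f;
        *out = 0.0f;
        warp_sum<<<(n + 255) / 256, 256>>>(in, out, n);
        cudaDeviceSynchronize();
        printf("sum = %.0f (expect %d)\n", *out, n);
        cudaFree(in); cudaFree(out);
        return 0;
    }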
I do not doubt it's very difficult. But we're literally talking about becoming the most valuable public company here. Nothing is too hard when those are the stakes.
Replying to @Sentdex
We've been working on it for 3 years. The software is super hard. Give it 2 more.
Replying to @Sentdex
Besides the software, as you mention, it's a scale-up issue; they've yet to release a single rack-scale solution. Once they ship one next year, we'll see whether the software factor is just lagging too far behind or whether adoption explodes.
Replying to @Sentdex
There's too much money at play for someone not to rise up and take some of that market share. It's also possible the AMD CEO has conflicts of interest (just my opinion). I'm sure Thanksgiving dinners would be pretty rough if she competed with her family.
Replying to @Sentdex
AMD recently hired Sharon Zhou as a VP of AI. I think she is working right now to make AMD bigger on the training side.
Open source isn’t just a philosophy, it’s a force multiplier for AI progress. @realSharonZhou, VP of AI at AMD, shares why open ecosystems are critical to unlocking the full potential of generative AI. From enabling community-driven innovation to fueling a virtuous cycle of data and model improvement, open source is shaping a more inclusive, more capable AI future. #AdvancingAI
Replying to @Sentdex
It's Temu NVIDIA bro, always has been.
Replying to @Sentdex
I could spend hours on this topic alone.
Replying to @Sentdex
It goes far beyond missing software. It's an ecosystem advantage that NVIDIA has fortified over more than 15 years, fueled by its massive scale and AMD's split priorities. They are just really far behind. @__tinygrad__ are doing some interesting work with AMD cards though!
Replying to @Sentdex
Agree, it's incomprehensible. Just reproducing CUDA should not take so long, and they could do it in the open to benefit from other people helping. Or is the number of engineers capable of doing this quickly really so small?
Replying to @Sentdex
AMD had better hardware than Nvidia around 2013. Then AMD released the horrible Bulldozer processor and banked everything on Vega, which took forever. Nvidia dropped Pascal, leaving AMD in the dust. Lisa Su finally took over, had no choice but to let Vega happen, and it was awful. They released Ryzen and focused on CPUs. They overtook Intel in market share, and suddenly they needed to play catch-up on GPUs. Had they not dropped the ball with Vega back then, they probably would've had some alternative to the H100 in 2022 and we'd have a duopoly.
Replying to @Sentdex
Perhaps they could write their own CUDA implementation, like what was done for Java: keep the API, but with AMD code under the hood, and replicate the aspects that make CUDA so successful.
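
For illustration, a heavily simplified sketch of that idea (mine, hypothetical): export the CUDA runtime's entry points but dispatch to AMD's HIP underneath. This is roughly what AMD's HIP does at the source level and what ZLUDA attempts at the binary level. The hip* functions below are real HIP API calls; the hard part in practice isn't these thin wrappers but matching CUDA's exact semantics and performance, plus the thousands of library entry points (cuBLAS, cuDNN, NCCL, ...) that sit on top:

    // Hypothetical CUDA-runtime-compatible shim backed by AMD's HIP.
    #include <hip/hip_runtime.h>

    extern "C" {

    // Pretend CUDA and HIP error codes map 1:1; a real shim needs a
    // full translation table.
    typedef int cudaError_t;

    cudaError_t cudaMalloc(void** ptr, size_t size) {
        return (cudaError_t)hipMalloc(ptr, size);
    }

    cudaError_t cudaMemcpy(void* dst, const void* src, size_t n, int kind) {
        // cudaMemcpyKind enum values line up with hipMemcpyKind.
        return (cudaError_t)hipMemcpy(dst, src, n, (hipMemcpyKind)kind);
    }

    cudaError_t cudaFree(void* ptr) {
        return (cudaError_t)hipFree(ptr);
    }

    cudaError_t cudaDeviceSynchronize(void) {
        return (cudaError_t)hipDeviceSynchronize();
    }

    }  // extern "C"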
Replying to @Sentdex
The MI300X has had some more support in terms of tutorials and hackathons recently.
Replying to @Sentdex
OpenAI recently committed to buying a vast amount of AMD hardware; I think that counts! If you mean training specifically, it's about software and networking, both of which are advancing rapidly.
Replying to @Sentdex
"inference. $MSFT has built some toolkits to help convert CUDA models to $AMD's ROCm so that you could use it on an $AMD 300x, and they are getting a lot of inquiries about $AMD's path and the 400x and 450X: »We're actually working with AMD on that to see what we can do to maximize that.«
Replying to @Sentdex
Someone quoted this and I agree. Probably because the cost is low (a good thing) but the usability is also low. It took me 3 days to get my dual R9700s working properly.
Replying to @Sentdex
Indeed, the network effect. NVIDIA comes out with CAGRA search, so I, with an NVIDIA card, decide to use cuVS for it. Then I think, hmm, I can make it even faster by fusing some of my existing CUDA embedding model with cuVS, so I rewrite my own CUDA for that (me and Claude made lee101/gobed, a CUDA-enabled search engine: github.com/lee101/gobed). I don't have an AMD GPU, and not many other deep learning researchers do, so I don't care about AMD support; it's only really for me and my CUDA machines. This replays thousands of times over, with open-source researchers building on top of other CUDA libraries on their CUDA machines, e.g. all of these:

AI dump of fun CUDA words:
- cuBLAS: GPU-accelerated Basic Linear Algebra Subprograms (matrix/vector ops).
- cuSPARSE: sparse matrix operations (CSR/COO formats).
- cuSOLVER: dense and sparse linear solvers (LU, QR, Cholesky, eigenvalues).
- cuFFT: Fast Fourier Transforms on GPU.
- cuRAND: random number generation on GPU.
- cuTENSOR: high-performance tensor algebra (Einstein summation, contractions).
- cuDNN: deep neural network primitives (convolutions, RNNs, activations, etc.).
- cuBLASLt: "lightweight" version of cuBLAS with advanced heuristics and mixed precision.
- cuSPARSELt: sparse matrix acceleration for deep learning inference.

⚙️ Systems & runtime libraries:
- CUDA Runtime: high-level host/device management layer (kernel launches, memory copies).
- CUDA Driver API: lower-level control of GPU contexts and execution.
- NCCL (NVIDIA Collective Communication Library): multi-GPU / multi-node collective ops (all-reduce, broadcast, etc.).
- NVTX (NVIDIA Tools Extension): instrumentation markers for profiling with Nsight tools.
- NVML (NVIDIA Management Library): GPU monitoring, thermals, utilization, power management.
- NVRTC: runtime compilation of CUDA kernels (JIT).
- NPP (NVIDIA Performance Primitives): image, video, and signal processing primitives.
- Thrust: C++ STL-like parallel algorithms library (map/reduce/sort/etc.).

🧩 Domain-specific CUDA SDKs:
- AI / ML: cuDNN, cuTENSOR, TensorRT, cuSPARSELt, cuBLASLt.
- Data analytics: RAPIDS (cuDF, cuML, cuGraph, cuSpatial), NVTabular, DALI.
- Computer vision: NPP, VPI (Vision Programming Interface), CV-CUDA.
- 3D / simulation / physics: CUDA PhysX, Flex, Omniverse Kit SDK.
- Video / imaging: NVENC, NVDEC, NPP, DeepStream SDK.
- Rendering / graphics: OptiX (ray tracing), IndeX (volume visualization), RTXGI.
- Networking / HPC: NCCL, Magnum IO, GPUDirect RDMA, UCX, NVSHMEM.
- Autonomous machines: JetPack SDK, DriveWorks SDK, Isaac SDK.

🧬 Bioinformatics / scientific (your "BioCUDA" mention):
- nvBIO: NVIDIA bioinformatics library for DNA/RNA sequence alignment, assembly, etc.
- cuQuantum: GPU-accelerated quantum simulation (state vector + tensor network).
- cuDF / cuML (RAPIDS): applicable to bioinformatics data science workflows.
- Clara Parabricks: GPU-accelerated genomics pipeline (variant calling, alignment).
- BioCUDA (community term): often refers to custom CUDA kernels for genomics / molecular dynamics.
- AMBER / GROMACS / NAMD GPU builds: molecular dynamics engines using CUDA backends.

🧮 High-level ecosystems:
- RAPIDS: cuDF (DataFrame), cuML (ML), cuGraph, cuSpatial, cuCIM (imaging).
- TensorRT: inference optimization and deployment engine (on top of cuDNN + cuBLAS).
- DeepStream: video analytics framework for AI + IoT.
- Omniverse: RTX-based simulation + rendering platform using CUDA + OptiX + MDL.
- Magnum IO: suite for multi-GPU and multi-node data transfer (NCCL, NVSHMEM, UCX, GPUDirect).

🔧 Developer / tooling stack:
- Nsight Systems: system-wide performance analysis.
- Nsight Compute: kernel-level profiler.
- Nsight Graphics: graphics debugging/profiling.
- CUDA-GDB: CUDA debugger.
- CUDA-MEMCHECK: memory error detection tool.

💡 Other specialized libraries:
- CUTLASS: template library for building custom GEMMs (used by PyTorch).
- cuDLA: Deep Learning Accelerator runtime (for Jetson).
- NVInfer / TensorRT: neural network inference optimization.
- CV-CUDA: open-source CV primitives for cloud inference.
- cuQuantum / cuStateVec / cuTensorNet: quantum simulation primitives.
- cuPHY: 5G baseband physical-layer acceleration.
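
To see why that network effect is so sticky, here's a minimal example of my own (not from the thread) of what "building on top of a CUDA library" looks like. Porting even this toy means swapping cublasSgemm for its hipBLAS counterpart (hipblasSgemm), rebuilding, and re-validating, and a real project has hundreds of such call sites spread across the libraries listed above:

    // Toy cuBLAS usage: C = A * B for 2x2 column-major matrices.
    // Expected output: C = [23 31; 34 46].
    // Build with: nvcc gemm.cu -lcublas
    #include <cstdio>
    #include <cublas_v2.h>
    #include <cuda_runtime.h>

    int main() {
        const int n = 2;
        float hA[4] = {1, 2, 3, 4};  // A = [1 3; 2 4] (column-major)
        float hB[4] = {5, 6, 7, 8};  // B = [5 7; 6 8]
        float hC[4] = {0};

        float *dA, *dB, *dC;
        cudaMalloc(&dA, sizeof(hA));
        cudaMalloc(&dB, sizeof(hB));
        cudaMalloc(&dC, sizeof(hC));
        cudaMemcpy(dA, hA, sizeof(hA), cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB, sizeof(hB), cudaMemcpyHostToDevice);

        cublasHandle_t handle;
        cublasCreate(&handle);
        const float alpha = 1.0f, beta = 0.0f;
        // C = alpha * A * B + beta * C
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &alpha, dA, n, dB, n, &beta, dC, n);

        cudaMemcpy(hC, dC, sizeof(hC), cudaMemcpyDeviceToHost);
        printf("C = [%.0f %.0f; %.0f %.0f]\n", hC[0], hC[2], hC[1], hC[3]);

        cublasDestroy(handle);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        return 0;
    }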
Replying to @Sentdex
I hear CUDA is a work of art?
Replying to @Sentdex
A MUST-read interview with a high-ranking $MSFT employee on data centers and what is happening right now ($NVDA / $AMD, liquid cooling, and HDDs):

1. The challenges $MSFT is facing right now are energy and liquid cooling. To improve its goodwill with municipalities, $MSFT is setting up wastewater treatment facilities near its data centers, which benefits the municipalities as well, not just $MSFT.

2. He mentions that they have been deploying a lot of $NVDA GB200s lately, though not as many as $META or X. There were some design challenges initially, but uptake with their customers is now pretty good. By and large, H100s are probably still their biggest pool.

3. They are seeing a slowdown in training relative to inference. Over the last 3-4 months there has been increased interest in saving costs with inference. $MSFT has built some toolkits to help convert CUDA models to $AMD's ROCm so that you can run them on an $AMD MI300X, and they are getting a lot of inquiries about $AMD's path and the 400X and 450X: »We're actually working with AMD on that to see what we can do to maximize that.«

4. According to him, $MSFT hasn't really brushed off OpenAI, but OpenAI is partnering with others and trying to get as much compute as it can. He questions how financially sustainable that becomes, as OpenAI is still hemorrhaging money, though its balance sheet is actually getting better month over month.

5. He doesn't think you can overbuild capacity at this point, as data centers take time to set up. He believes the tipping point of overbuild will come in 2029 or 2030, at least according to their projections.

6. He also clarifies that $MSFT is open to working with former bitcoin miners who want to convert to AI. Still, the biggest challenge with them is water and its availability, as many of their sites were not built for liquid cooling.

7. He does mention that there is an HDD shortage, because a few years ago many HDD manufacturers cut back production to focus on SSDs. That said, there is a ceiling on what $MSFT's Azure is willing to pay for hard drives, and they are pushing back against Seagate, Western Digital, Toshiba, and Samsung. He believes capacity is being added and that things will be better in the first half of 2026.

Found on @AlphaSenseInc