Moving a mountain is action.

South Korea
Joined February 2010
sangkilpark retweeted
AI Agents Ecosystem aiagentstore.ai/ecosystem
sangkilpark retweeted
Pre-training Objectives for LLMs

✓ Pre-training is the foundational stage in developing Large Language Models (LLMs).
✓ It involves exposing the model to massive text datasets and training it to learn grammar, structure, meaning, and reasoning before it is fine-tuned for specific tasks.
✓ The objective functions used during pre-training determine how effectively the model learns language representations.

→ Why Pre-training Matters
✓ Teaches the model general linguistic and world knowledge.
✓ Builds a base understanding of syntax, semantics, and logic.
✓ Reduces data requirements during later fine-tuning.
✓ Enables the model to generalize across multiple domains and tasks.

→ Main Pre-training Objectives

1. Causal Language Modeling (CLM)
✓ Also known as autoregressive training, used by models like GPT.
✓ Objective → Predict the next token given all previous tokens.
✓ Example:
  → Input: “The sky is” → Target: “blue”
✓ The model learns word sequences and context flow, ideal for text generation and completion.
✓ Formula (simplified):
  → Maximize P(w₁, w₂, ..., wₙ) = Π P(wᵢ | w₁, ..., wᵢ₋₁)

2. Masked Language Modeling (MLM)
✓ Introduced with BERT, a bidirectional training objective.
✓ Objective → Predict missing words randomly masked in a sentence.
✓ Example:
  → Input: “The [MASK] is blue.” → Target: “sky”
✓ Allows the model to see context from both left and right, capturing deeper semantic relationships.
✓ Formula (simplified):
  → Maximize P(masked_token | visible_tokens)

3. Denoising Autoencoding
✓ Used by models like BART and T5.
✓ Objective → Corrupt the input text (e.g., mask, shuffle, or remove parts) and train the model to reconstruct the original sentence.
✓ Encourages robust understanding and recovery of meaning from noisy or incomplete inputs.
✓ Example:
  → Input: “The cat ___ on the mat.” → Target: “The cat sat on the mat.”

4. Next Sentence Prediction (NSP)
✓ Used alongside MLM in early BERT training.
✓ Objective → Predict whether one sentence logically follows another.
✓ Example:
  → Sentence A: “He opened the door.” → Sentence B: “He entered the room.” → Label: True
✓ Helps the model learn coherence and discourse-level relationships.

5. Permutation Language Modeling (PLM)
✓ Used by XLNet, combining autoregressive and bidirectional learning.
✓ Objective → Predict tokens in random order rather than fixed left-to-right.
✓ Enables the model to capture broader context and dependencies without masking.

6. Contrastive Learning Objectives
✓ Used in multimodal and instruction-based pre-training.
✓ Objective → Maximize similarity between semantically related pairs (e.g., a caption and its image) and minimize similarity between unrelated pairs.
✓ Builds robust cross-modal and conceptual understanding.

→ Modern Combined Objectives
✓ Modern LLMs often merge multiple pre-training objectives for richer learning.
✓ Example:
  → T5 uses denoising + text-to-text generation.
  → GPT-4 expands causal modeling with instruction-tuned objectives and reinforcement learning from human feedback (RLHF).
✓ These hybrid objectives enable models to perform a wide range of generative and comprehension tasks effectively.

→ Quick tip
✓ Pre-training objectives teach LLMs how to predict, reconstruct, and reason over text.
✓ CLM → next-word prediction.
✓ MLM → masked token recovery.
✓ Denoising & NSP → structure and coherence.
✓ Contrastive → cross-domain learning.
✓ Together, they form the foundation for the deep understanding and fluency that define modern LLMs.

📘 Grab this ebook to Master LLMs: codewithdhanian.gumroad.com/…
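To make the difference between the first two objectives concrete, here is a minimal Python sketch (illustrative only, not from the post or any specific library) of how training targets are typically built for CLM versus MLM from the same token sequence; the ~15% mask rate follows the original BERT setup.

import random

tokens = ["The", "sky", "is", "blue", "today"]

# CLM: predict token i from tokens 1..i-1, i.e. shift-by-one targets.
clm_inputs = tokens[:-1]    # ["The", "sky", "is", "blue"]
clm_targets = tokens[1:]    # ["sky", "is", "blue", "today"]

# MLM: randomly mask ~15% of tokens and predict only the masked positions.
MASK, MASK_PROB = "[MASK]", 0.15
mlm_inputs, mlm_targets = [], []
for tok in tokens:
    if random.random() < MASK_PROB:
        mlm_inputs.append(MASK)
        mlm_targets.append(tok)    # loss is computed at this position
    else:
        mlm_inputs.append(tok)
        mlm_targets.append(None)   # position ignored by the loss

print(clm_inputs, clm_targets)
print(mlm_inputs, mlm_targets)

The CLM loss is applied at every position (left context only), while the MLM loss is applied only where a token was hidden, which is what lets the model use context from both sides.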
sangkilpark retweeted
Qwen Image Edit w/ Camera Control is wild 🤯 Quickly rotate the camera and switch between bird's-eye and worm's-eye views with just a few clicks. Here's how, plus 7 wild examples: 👇
sangkilpark retweeted
Builds data pipelines from a single SQL script
sangkilpark retweeted
A simple trick cuts your LLM costs by 50%!

Just stop using JSON and use this instead:

TOON (Token-Oriented Object Notation) slashes your LLM token usage in half while keeping data perfectly readable.

Here's why it works: TOON's sweet spot is uniform arrays with consistent fields per row. It merges YAML's indentation and CSV's tabular structure, optimized for minimal tokens.

Look at the example below.

JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

It's obvious how few tokens are being used to represent the same information!

To summarise, here are the key features:
💸 30–60% fewer tokens than JSON
🔄 Borrows the best from YAML & CSV
🤿 Built-in validation with explicit lengths & fields
🍱 Minimal syntax (no redundant braces, brackets, etc.)

IMPORTANT!! That said, for deeply nested or non-uniform data, JSON might be more efficient.

In the next tweet, I've shared some benchmark results demonstrating the effectiveness of this technique in reducing the number of tokens and improving retrieval accuracy with popular LLM providers.

Where do you think this could be effective in your existing workflows?

Find the relevant links in the next tweet!
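To illustrate the layout above, here is a small Python sketch (my own, not the official TOON library or its full spec) that serializes a uniform array of flat objects into that tabular form:

def to_toon(name, rows):
    # Assumes every row is a flat dict with the same fields (TOON's sweet spot).
    fields = list(rows[0].keys())
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(to_toon("users", users))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user

Field names and the array length appear once in the header instead of being repeated per row, which is where the token savings come from; as the post notes, this flat header stops helping once the data is deeply nested or ragged.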
sangkilpark retweeted
Scaling Agent Learning via Experience Synthesis
📝: arxiv.org/abs/2511.03773

Scaling training environments for RL by simulating them with reasoning LLMs!

Environment models + replay buffer + new tasks = cheap RL for any environment!

- Strong improvements over non-RL-ready environments and multiple model families!
- Works better in sim-2-real RL settings → Warm-start for high-cost environments

🧵1/7
sangkilpark retweeted
Multi-head attention in LLMs, visually explained:
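The visual itself isn't reproduced here, so as a stand-in, here is a small numpy sketch of the standard multi-head attention computation (random projection matrices in place of learned weights, no causal mask), not any particular model's implementation:

import numpy as np

def multi_head_attention(x, num_heads, seed=0):
    # x: (seq_len, d_model). Random projections stand in for learned weights.
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(4))

    def split_heads(t):
        # (seq, d_model) -> (heads, seq, d_head): each head attends in its own subspace
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ Wq), split_heads(x @ Wk), split_heads(x @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)       # (heads, seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # softmax over keys, per head
    heads = weights @ v                                       # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)  # re-join the heads
    return concat @ Wo                                        # final output projection

out = multi_head_attention(np.random.randn(6, 16), num_heads=4)
print(out.shape)   # (6, 16)

The key idea is that the model dimension is split into several smaller subspaces, each head computes its own attention pattern, and the head outputs are concatenated and projected back to the model dimension.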
sangkilpark retweeted
Qwen-Image-Edit-2509-Light_restoration app
sangkilpark retweeted
From silicon to software, here's a deep dive into Ironwood TPU’s co-designed AI stack → goo.gle/3LoTrHM
sangkilpark retweeted
Terminal-based window manager with mouse support
sangkilpark retweeted
What happens when you INSERT a row in Postgres?

Postgres needs to ensure that data is durable while maintaining good write performance + crash recovery ability. The key is in the Write-Ahead Log (WAL).

(1) Postgres receives the query and determines what data page to place it in. This could already be in memory (buffer pool), or it may have to load one from disk, or even create a new one.

(2) The new record is written to this page in memory only. The page is marked as “dirty”, meaning it needs to get flushed to disk in the future, but not immediately.

(3) A new record is inserted into the memory buffer for the WAL. It contains all the information needed to reconstruct the insert.

(4) The WAL is flushed to disk (via fsync or similar) to ensure the data resides in durable storage. After this succeeds, Postgres returns success to the client.

When you get a success at the client, the data has definitely been written to the sequential WAL (good write performance) but not necessarily to the table data file (less predictable I/O patterns). The latter happens later on via checkpointing, background jobs, or forced flushes due to memory page eviction.

If a server crash happens before the data is flushed, the log is replayed to restore committed data.

The WAL is the key to all of this! It facilitates high-performance I/O and crash recovery.
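To make the ordering concrete, here is a toy Python sketch (a simplification, not Postgres internals) of the sequence described above: modify the page in memory, append and fsync a WAL record, acknowledge the client, and leave the table-file flush to a later checkpoint.

import json, os

class MiniStorage:
    # Toy model of the flow above: a buffer pool plus an append-only WAL file.
    def __init__(self, wal_path="wal.log"):
        self.buffer_pool = {}     # page_id -> {"rows": [...], "dirty": bool}
        self.wal_path = wal_path

    def insert(self, page_id, row):
        # (1)+(2) find or create the page and modify it in memory only
        page = self.buffer_pool.setdefault(page_id, {"rows": [], "dirty": False})
        page["rows"].append(row)
        page["dirty"] = True
        # (3)+(4) append a WAL record and fsync it before acknowledging the client
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps({"op": "insert", "page": page_id, "row": row}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        return "success"          # the table data file still has not been written

    def checkpoint(self):
        # later: flush dirty pages to the data files, then mark them clean
        for page in self.buffer_pool.values():
            page["dirty"] = False

db = MiniStorage()
print(db.insert(page_id=1, row={"id": 1, "name": "Alice"}))

The sequential, fsynced WAL append is what makes the acknowledgment safe; replaying those records after a crash reproduces any committed changes that never reached the data files.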
sangkilpark retweeted
CamCloneMaster: replicating camera movements from reference videos without needing camera parameters; based on Wan2.1-1.3B; built using a large-scale Camera Clone Dataset. Solid results. camclonemaster.github.io/
sangkilpark retweeted
Stream processing engine using SQL, DuckDB, and Apache Arrow
sangkilpark retweeted
KubeDiagrams reads Kubernetes manifests, Helm charts, helmfiles or live cluster state and produces visual architecture diagrams (DOT, SVG, PNG, PDF, etc.), with support for custom resources, clustering, and interactive views ➤ ku.bz/kJ7zQXF13
sangkilpark retweeted
LLM Evaluation: Practical Tips at Booking.com booking.ai/llm-evaluation-pr…
sangkilpark retweeted
When you're really sick, make sure to show it! No one looks after my body but me. 🏥🤒💊 If you don't say anything, nobody will take care of you.
sangkilpark retweeted
Not expecting perfection, just doing it.
sangkilpark retweeted
🤖 Deep Agents JS

Deep Agents is now available in JS! Written on top of LangChain and LangGraph 1.0, this brings the power of agent harnesses to the JS ecosystem.

Comes with planning tools, subagents, and filesystem access.

Try it out now: npm i deepagents

Repo: github.com/langchain-ai/deep…
Docs: docs.langchain.com/oss/javas…