Posts

Showing posts from 2025

Fine-Tuning Language Models on NVIDIA DGX Spark

Complete How-To Guide. Copyright: Sanjay Basu.

Overview

This guide provides comprehensive instructions for fine-tuning open-source language models on the NVIDIA DGX Spark personal AI supercomputer. The DGX Spark's unique 128GB unified memory architecture enables local training of models that would traditionally require cloud infrastructure. Fine-tuning lets you customize pre-trained models for specific tasks, domains, or response styles while preserving their general capabilities. This guide covers three fine-tuning strategies: full fine-tuning for maximum customization, LoRA for memory-efficient adaptation, and QLoRA for training even larger models within memory constraints.

DGX Spark Hardware Advantages

The NVIDIA DGX Spark provides several key advantages for local AI development:

- 128GB Unified Memory: CPU and GPU share the same memory pool via NVLink-C2C, eliminating memory-transfer bottlenecks
- Grace Blackwell Architecture: Purpose-built for AI workloads with up to 1 PF...
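The excerpt above names LoRA as the memory-efficient strategy. As a minimal pure-Python sketch of the core idea (all dimensions, values, and the alpha/r scaling here are illustrative assumptions, not the guide's actual configuration): instead of updating a full weight matrix W, LoRA trains two small low-rank factors and adds their scaled product to the frozen weights.

```python
# Minimal sketch of the LoRA idea, framework-free and illustrative only:
# keep the base weight matrix W frozen, train small factors A (r x d)
# and B (d x r), and add scale * (B @ A) to W.

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r, alpha = 4, 2, 8                  # hidden size, rank, scaling (toy values)
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
A = [[0.1] * d for _ in range(r)]      # trainable factor, r x d
B = [[0.0] * r for _ in range(d)]      # trainable factor, d x r, init to zero

BA = matmul(B, A)                      # d x d update with rank <= r
scale = alpha / r
W_adapted = [[W[i][j] + scale * BA[i][j] for j in range(d)] for i in range(d)]

# Because B starts at zero, the adapted weights equal the base weights,
# so training begins from the pre-trained model's behavior.
print(W_adapted == W)  # True at initialization
```

The memory win is that only A and B (2 * r * d values) receive gradients and optimizer state, rather than the full d * d matrix.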

The Grammar of Structure

What the Langlands Program Might Tell Us About Learning Machines. Copyright: Sanjay Basu.

I. Introduction

There is a persistent mystery at the heart of mathematics. Objects that appear entirely unrelated, defined in different languages, studied by different communities, sometimes turn out to encode the same information. A question about prime numbers becomes equivalent to a question about symmetries of certain functions. A problem in algebra transforms into a problem in analysis. The translation is not metaphorical. It is exact. This phenomenon troubles people who encounter it for the first time. Mathematics is supposed to be about definitions and consequences. If you define two things differently, why should they be the same? And yet they are. Again and again. Robert Langlands, working at the Institute for Advanced Study in the late 1960s, proposed something ambitious: that these scattered coincidences were not accidents but symptoms of a deeper unity. His conjectures suggested that...

Run multiple LLMs on your DGX Spark with flashtensors

Leverage 128GB unified memory for instant model hot-swapping. Copyright: Sanjay Basu.

The Model Loading Problem

Waiting for a large AI model to initialize often means a long, frustrating delay. While weights trickle through multiple transfer bottlenecks, your GPU sits idle. For local AI setups, this startup latency is what separates a system that feels quick and responsive from one that feels sluggish and vexing. Now imagine running multiple large models on a single GPU and switching between them in seconds. That is exactly what flashtensors enables, and on the DGX Spark's 128GB unified memory architecture this capability becomes particularly powerful.

Why DGX Spark is Ideal for flashtensors

The DGX Spark's Grace Blackwell architecture provides unique advantages for flashtensors' direct memory streaming approach: the shared memory architecture removes the old bottleneck caused by data transfers between ...
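The hot-swapping idea described above can be sketched as a small resident-model pool. This is not the flashtensors API; the class and method names below are hypothetical, and it only illustrates why switching is cheap when weights already live in a shared memory pool: activation becomes a pointer change rather than a host-to-device copy.

```python
# Hypothetical sketch of model hot-swapping on a unified-memory machine.
# All names are illustrative; flashtensors' real interface differs.

class ModelPool:
    """Keeps preloaded weight blobs resident and activates one at a time."""

    def __init__(self):
        self._resident = {}   # model name -> weight blob kept in memory
        self.active = None    # (name, weights) of the currently active model

    def preload(self, name, weights):
        # Weights stay resident in the shared CPU/GPU memory pool.
        self._resident[name] = weights

    def activate(self, name):
        # On unified memory there is no separate device copy to pay for,
        # so switching models is just rebinding a reference in this sketch.
        self.active = (name, self._resident[name])
        return self.active

pool = ModelPool()
pool.preload("llama", b"...weights...")
pool.preload("qwen", b"...weights...")
print(pool.activate("qwen")[0])  # qwen
```

On a machine with separate host and device memory, `activate` would instead trigger a bulk transfer, which is exactly the startup bottleneck the post describes.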

The Synthetic Mind

At the Threshold of Consciousness, Computation, and the Questions We're Afraid to Ask. Copyright: Sanjay Basu.

Here's a confession that will irritate the techno-utopians and the AI doomers in equal measure. I spent Thanksgiving break trying not to think about artificial intelligence. I failed spectacularly. Between books that ostensibly had nothing to do with machine learning, between long walks through autumn leaves that should have cleared my head of tensor operations and attention mechanisms, between conversations with family members who still think "the cloud" is a weather phenomenon, the questions kept surfacing. Not the questions that dominate LinkedIn feeds and venture capital pitch decks. Not "Will AI take my job?" or "When will we achieve AGI?" Those are the wrong questions, asked by people who haven't yet realized they're asking the wrong questions. The real questions are older. Much older. They're the questions philosophers have wrestled with for millennia, now dressed in ...

Thanatos

Copyright: Sanjay Basu.

Freud's Dark Idea That Explains More About Life Than Death

Freud had a talent for dropping theoretical grenades into polite conversation. One of his most explosive? "That part of you that wants to die." Imagine saying that at a dinner party. People would reach for the wine faster than you can say psychoanalysis. But that's what Freud meant by Thanatos: the death drive. A quiet, persistent whisper inside us pulling toward dissolution, stillness, oblivion. Not in a dramatic gothic way, but in the subtle ways we sabotage progress, repeat bad patterns, and drift toward entropy when nobody's watching. Thanatos, in Freud's world, isn't some spooky shadow lurking in your bedroom at night. It's the reason you sometimes choose the option that harms you, confuses you, or makes no rational sense. It's gravity for the psyche. And like gravity, you barely notice it until you trip.

☠️ We're living in a golden age of self-optimisation. Mindfulness apps nudge us toward se...