Posts

Showing posts from January, 2025

Between Reason and Transcendence

Copyright: Sanjay Basu

My Journey Through Atheism to Vedanta

My journey to reconciling atheism and Vedanta began in the crucible of scientific rationalism. As a teenager, Darwin’s “Origin of Species” cracked open my worldview, revealing a universe that needed no divine watchmaker to tick forward through time. Freud’s psychological insights further dismantled my childhood religious certainties, suggesting that perhaps our gods were merely projections of our own psychological needs and fears. These early encounters with scientific thought prepared me for the powerful arguments of the New Atheists. I still remember the electric atmosphere of Christopher Hitchens’ lecture hall, where his razor-sharp wit cut through theological pretensions. Richard Dawkins’ evolutionary insights and Daniel Dennett’s neurological and philosophical clarity further reinforced my atheistic worldview. Their arguments resonated with my growing conviction that the universe required no supernatural explanatio...

On “Agents are Not Enough” — A Practitioner’s Perspective

arXiv:2412.16241v1 [cs.AI] 19 Dec 2024

I am deeply involved in working with and developing agents, and lately I have had a feeling that something is missing. Then I chanced upon this paper by Chirag and Ryen. I took a printout and embarked on my journey to Luxor, making notes in the margins as I read. Here are my consolidated notes.

Cognitive Architectures

The paper discusses the limitations and challenges of current cognitive architectures, such as SOAR and ACT-R, which were designed to model human cognition by integrating perception, memory, and reasoning. Sophisticated as these designs are, they have struggled with scalability and real-time performance, often incurring high computational costs and finding limited practical application.

Margin of page 2 ☝️

Figure 1: Envisioning a new eco-system with Agents, Sims, and Assistants (page 3).

The paper notes that while some simp...

Comprehensive Analysis of FLOP Calculations in Large Language Models

Copyright: Sanjay Basu

A Detailed Mathematical Framework

Abstract

This scholarly article presents a detailed examination of the methodologies and mathematical frameworks used to calculate floating-point operations (FLOPs) in training large language models (LLMs). We expand upon the fundamental principles while incorporating additional perspectives and practical examples, providing a comprehensive resource for researchers and practitioners in the field.

1. Breaking Down the Model Architecture: A Comprehensive Analysis

The foundation of FLOP calculation begins with a thorough understanding of the model’s architectural components. This section examines each key parameter and its role in the overall computational complexity.

1.1 Key Architectural Parameters

Number of Layers (L)

The depth of the model is represented by the number of transformer blocks. Each block consists of:
- Multi-head self-attention sublayer
- Feed-forward network sublayer
- Residual connections
- Layer normalization comp...
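As a back-of-the-envelope illustration of the kind of FLOP accounting the article describes, here is a minimal sketch. The parameter values and function names are hypothetical (not taken from the article), and it uses the widely cited approximation of roughly 6 FLOPs per parameter per training token (about 2 for the forward pass and 4 for the backward pass), rather than the article’s own derivation:

```python
# Rough training-FLOP estimate for a decoder-only transformer.
# All values below are illustrative assumptions, not figures from the article.

def param_count(L: int, d_model: int, vocab_size: int, ffn_mult: int = 4) -> int:
    """Approximate parameter count: attention + FFN per layer, plus embeddings."""
    attn = 4 * d_model * d_model              # Q, K, V, and output projections
    ffn = 2 * ffn_mult * d_model * d_model    # up-projection and down-projection
    embed = vocab_size * d_model              # token embedding matrix
    return L * (attn + ffn) + embed

def training_flops(n_params: int, n_tokens: float) -> float:
    """~6 FLOPs per parameter per training token (forward ~2, backward ~4)."""
    return 6 * n_params * n_tokens

# Illustrative 32-layer model with d_model = 4096 and a 32k vocabulary.
N = param_count(L=32, d_model=4096, vocab_size=32_000)
print(f"params ≈ {N / 1e9:.2f}B")                       # ≈ 6.57B parameters
print(f"FLOPs for 1T tokens ≈ {training_flops(N, 1e12):.3e}")
```

This deliberately ignores smaller terms (biases, layer-norm parameters, and the sequence-length-dependent cost of the attention score computation), which the article’s fuller framework accounts for.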