<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.dylanamartin.com/blog.xml" rel="self" type="application/atom+xml" /><link href="https://www.dylanamartin.com/" rel="alternate" type="text/html" /><updated>2026-03-12T01:40:23+00:00</updated><id>https://www.dylanamartin.com/blog.xml</id><title type="html">Dylan Martin</title><subtitle>Compacted Context is the personal website of Dylan Martin, where I publish essays about software engineering, career reflections, or whatever else I&apos;m thinking about, and digests of what I&apos;m reading (or occasionally watching).</subtitle><author><name>Dylan</name></author><entry><title type="html">Steelman: an adversarial reasoning tool for decision-making</title><link href="https://www.dylanamartin.com/2026/03/11/announcing-steelman.html" rel="alternate" type="text/html" title="Steelman: an adversarial reasoning tool for decision-making" /><published>2026-03-11T00:00:00+00:00</published><updated>2026-03-11T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2026/03/11/announcing-steelman</id><content type="html" xml:base="https://www.dylanamartin.com/2026/03/11/announcing-steelman.html"><![CDATA[<p>I’ve been thinking a lot about how I make decisions; especially the hard ones, where I have a strong opinion and I’m not totally sure if it’s right.  The kind where you walk into a meeting, lay out your case, and someone asks a question you hadn’t considered, and suddenly you’re on your back foot, revising your argument in real time.</p>

<p>That experience of having your position challenged well and coming out the other side with something sharper is genuinely valuable.  But it doesn’t scale.  You can’t always find the right person to push back on your thinking at the right time.  And most of us don’t seek out that kind of friction voluntarily.  What we <em>do</em> instead, increasingly, is reach for an AI — and the AI mostly tells us we’re right.  It validates, polishes, and helps us build on assumptions we never examined.  We walk away feeling sharper when really we just feel more comfortable.  I wanted something that would make me <em>less</em> comfortable with my position before I committed to it.</p>

<p>So I built <a href="https://steelman.dylanamartin.com/">Steelman</a>.</p>

<h2 id="what-it-does">What it does</h2>

<p>Steelman is an adversarial reasoning tool.  You state a position — “we should rewrite this service in Rust,” “single-payer healthcare is the only workable option,” “this essay’s thesis is airtight,”<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> whatever — and it puts your argument through a structured gauntlet.</p>

<p>Here’s how it works:</p>

<ol>
  <li>
    <p><strong>Claim decomposition.</strong>  You write your position, and the AI breaks it down into empirical claims (things that are verifiably true or false) and value judgments (trade-offs and priorities).  This is my favorite part, tbh: seeing your argument decomposed into its load-bearing components is clarifying in a way that’s hard to describe until you’ve experienced it.</p>
  </li>
  <li>
    <p><strong>Three rounds of adversarial challenge.</strong>  Three escalating personas target the weakest parts of your argument. Each round, you defend your position. The AI assesses whether your responses actually address the challenges and updates the status of each claim accordingly.</p>
  </li>
  <li>
    <p><strong>A Decision Record.</strong>  At the end, you get a structured document: your refined position, the challenges you faced, which claims survived and which didn’t, and (crucially) falsification criteria.  Conditions under which you’d change your mind.</p>
  </li>
</ol>

<p>The important design constraint: the AI never writes <em>for</em> you.  It decomposes, mirrors, challenges, and structures, but every word in the final Decision Record is yours.  I didn’t want a tool that generates opinions.  I wanted one that pressure-tests them.</p>

<h2 id="why-i-built-it">Why I built it</h2>

<p>Honestly, I’m worried about what’s happened to decision-making in the age of AI. I’m <a href="/2026/02/02/spinning-the-wheel.html">as guilty of this as anyone</a> — we’re all going full-bore into using these thinking machines, and mostly that’s great.  But the default mode of every major AI chat app is sycophancy: you state a position, and the model validates it, maybe adds some caveats for plausibility, and helps you build on a premise it never questioned.  It’s not that AI <em>can’t</em> help with decisions — it’s that the way we’re using it trains us to outsource the thinking rather than sharpen it.</p>

<p>Vaughn Tan has a <a href="https://vaughntan.org/aiux">great piece on this</a> — he argues that mainstream AI interfaces create a “seductive mirage” of talking to a meaningmaking entity, when really they’re just tools, and that we need to design AI experiences that clearly separate the subjective judgment work only humans can do from the non-meaningmaking work that machines are good at.  That framing resonated with me.  The right role for AI in decisions isn’t to <em>make</em> them for you — it’s to force you to make them better yourself.  Steelman is my attempt at that: an AI tool that stays on its side of the line, structuring and challenging your reasoning while every decision about what matters and what to believe remains yours.  The default mode shouldn’t be “yes, and.”  It should be “okay, but have you considered.”</p>

<h2 id="the-stack">The stack</h2>

<p>For the folks who care about this kind of thing: it’s a Next.js app using Claude (via the Vercel AI SDK) for the structured generation, Supabase for persistence, and Tailwind for the UI.  I used Zod schemas to constrain the AI outputs into predictable structures — claim objects, challenge objects, assessment objects — which was essential for making the multi-round flow feel deterministic rather than vibes-based.</p>
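<p>To make the shape of that concrete, here is a hypothetical, stripped-down version of the idea in plain TypeScript. The real app uses Zod schemas with the AI SDK’s structured generation; every name and field below is illustrative, not Steelman’s actual data model:</p>

```typescript
// Hypothetical sketch: the real app uses Zod's z.object()/z.enum() via the
// Vercel AI SDK. All names and shapes here are illustrative.
type ClaimKind = "empirical" | "value";
type ClaimStatus = "standing" | "weakened" | "refuted";

interface Claim {
  id: string;
  text: string;
  kind: ClaimKind;
  status: ClaimStatus;
}

const KINDS = ["empirical", "value"];
const STATUSES = ["standing", "weakened", "refuted"];

// Reject any model output that doesn't parse into the expected shape, so
// each round of the flow only ever sees well-formed claim objects.
function parseClaim(raw: unknown): Claim | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  const ok =
    typeof r.id === "string" &&
    typeof r.text === "string" &&
    KINDS.includes(r.kind as string) &&
    STATUSES.includes(r.status as string);
  return ok ? (r as unknown as Claim) : null;
}
```

<p>The point of the pattern: a model response either parses into a known claim shape or it gets rejected and retried, which is what keeps the multi-round flow predictable rather than vibes-based.</p>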

<h2 id="try-it-out">Try it out</h2>

<p>Steelman is currently in closed beta.  If you’re interested, you can <a href="https://steelman.dylanamartin.com/">sign up for the waitlist</a>.</p>

<p>I’m especially curious to hear from people who make a lot of high-stakes decisions — engineering managers, staff+ engineers, architects, but also founders, policy folks, anyone who writes arguments for a living — about whether this maps to how they actually think through problems.  The adversarial personas currently skew toward infrastructure and systems decisions, but Steelman works on any kind of argument, and I’d like to expand the persona set to cover more domains.</p>

<p>If you try it out and have thoughts, feel free to reach out to me via <a href="mailto:me@dylanamartin.com">email</a>.  I’m iterating on this actively and feedback from real users is worth more than any amount of me arguing with myself about what to build next (though Steelman is useful for that too).</p>

<p>We’re all going to keep using AI to help us think.  The question is whether it makes our thinking better or just makes us more confident.  Steelman is a bet that the right AI tool for decisions is one that challenges you, not one that agrees with you.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>These are deliberately terse examples for illustration.  In practice, the more detail you provide up front — context, constraints, prior art, why you believe what you believe — the better the adversarial challenges will be.  Steelman rewards specificity. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Dylan</name></author><category term="AI" /><category term="decision-making" /><category term="reasoning" /><category term="tools" /><category term="announcement" /><summary type="html"><![CDATA[I’ve been thinking a lot about how I make decisions; especially the hard ones, where I have a strong opinion and I’m not totally sure if it’s right. The kind where you walk into a meeting, lay out your case, and someone asks a question you hadn’t considered, and suddenly you’re on your back foot, revising your argument in real time.]]></summary></entry><entry><title type="html">Contra Yang, et al</title><link href="https://www.dylanamartin.com/2026/02/21/contra-yang-et-al.html" rel="alternate" type="text/html" title="Contra Yang, et al" /><published>2026-02-21T00:00:00+00:00</published><updated>2026-02-21T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2026/02/21/contra-yang-et-al</id><content type="html" xml:base="https://www.dylanamartin.com/2026/02/21/contra-yang-et-al.html"><![CDATA[<p><em>This morning I woke up to a text from <a href="https://www.colorado.edu/ebio/andrew-martin">my dad</a>, who was asking for my opinion on <a href="https://blog.andrewyang.com/p/the-end-of-the-office">this piece</a> from Andrew Yang. I wrote him a shorter response that contained a decent chunk of what I’m about to say, but it turns out I had a lot more to say about the topic, and when I finally got done writing it all down, I had what almost looked like a blog post. Figured I might as well flesh it out, and here we are.</em></p>

<p>I try to read these viral AI-displacement pieces with an open but critical eye, looking for what’s genuinely new versus what’s just repackaged anxiety, and trying to separate the claims that hold up under scrutiny from the ones that fall apart once you bring the temperature down a few degrees. I read Yang’s piece with that spirit in mind.</p>

<p>I think the basic point of Yang’s piece is right: white collar work is information processing, AI is good at information processing, and the stock market will reward companies that figure out how to do more with fewer people. I don’t think that’s a particularly original take, and people in my industry have been frothing about this (on Twitter, on LinkedIn) for literally years. Maybe it’s hitting mainstream politics finally.</p>

<p>But his speculations on timelines are pretty insane. “20-50% of 70 million white-collar jobs” gone in “the next several years,” millions displaced in 12-18 months – based on what? A conversation with one CEO? Talk about anecdotes laundered into predictions. And while influencers in more tech-native spheres aren’t innocent of making these types of claims too (e.g. AI 2027<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>), the general vibe of this discourse feels like shock and awe over substance. Straight-line extrapolation dressed up as forecasting.</p>

<p>That was my biggest complaint about the whole thing, really: the emotional engineering. The timbre. Yang frames someone in his family building a website in minutes as evidence that designers are obsolete, but anyone who ships software knows the demo is maybe 20% of the actual work. He cites mortgage delinquency charts as though AI is already cratering the housing market, but the actual NY Fed data tells a very different story<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. He names it “the Fuckening” because it “feels more visceral.” He’s made a career out of being the UBI politician; he links his book tour dates at the bottom. I don’t want to be uncharitable, but it’s worth noting that Yang’s financial incentives are perfectly aligned with maximizing alarm: the scarier the story, the more urgent the book feels, the more relevant the policy proposal becomes. That doesn’t make him wrong, but it does mean we should be especially careful about separating the signal from the sales pitch before engaging with the substance.</p>

<p>He also makes the classic non-tech mistake (intentional misdirection?) of framing AI demos like they’re real, load-bearing parts of software infrastructure. Yes, someone built a website in minutes. But the demo is maybe 20% (or less) of the actual work; the other 80% is edge cases, integration, compliance, error handling, all the stuff that makes things actually work in production. AI is still bad at that part, and I think there’s a meaningful reason it’s going to stay bad at it for a while that’s worth explaining.</p>

<p>The way these models improve is through evaluation: you need to be able to measure whether the model is getting better at a task in order to train it to be better at that task. For the demo stuff, evals are relatively straightforward. “Did the model produce working code that compiles and passes these test cases?” You can answer that programmatically. But most knowledge work isn’t like that. Most knowledge work is a bundle of tasks held together by judgment, context, and institutional memory, and the eval that would capture whether AI is doing <em>the whole job</em> well basically doesn’t exist.</p>
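<p>For illustration, here is what a programmatic eval of that kind reduces to: run the generated code against test cases and count the passes. This is a hypothetical TypeScript sketch, not any real benchmark harness, and every name in it is made up:</p>

```typescript
// Hypothetical eval harness: "did the model's code pass the tests?" is just
// execution plus counting. All names here are illustrative.
type TestCase = { input: number; expected: number };

// Score a model-generated function by the fraction of cases it passes.
function scoreSubmission(fn: (x: number) => number, cases: TestCase[]): number {
  const passed = cases.filter((c) => fn(c.input) === c.expected).length;
  return passed / cases.length;
}

// A "submission" that doubles its input scores 0.5 against these cases.
const score = scoreSubmission((x) => x * 2, [
  { input: 2, expected: 4 }, // passes
  { input: 3, expected: 7 }, // fails
]);
```

<p>Nothing remotely like this exists for “did the PR respect a constraint the payments team agreed to in Slack,” which is exactly the kind of check the messier work resists.</p>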

<p>This is Goodhart’s Law applied to AI capabilities: when a measure becomes a target, it ceases to be a good measure. AI benchmarks are saturating – SWE-bench scores went from 33% to over 70% in a single year<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> – and labs are increasingly optimizing for the benchmarks rather than for the messy, situated work the benchmarks are supposed to proxy for. Oxford researchers reviewed 445 AI benchmarks and found that most don’t actually measure what they claim to measure, suffering from vague definitions and absent statistical validation.<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup> The model gets better at the test without necessarily getting better at the job.</p>

<p>And the hardest parts of knowledge work are precisely the parts that resist measurement. You can’t easily write an eval for “did this integration handle the edge case that only surfaces when the legacy billing system sends malformed dates on leap years” or “did this PR account for the implicit constraint that the payments team agreed to in a Slack thread six months ago.” These failures are vast yet specific, context-dependent, and often only recognizable as failures when a real user hits them in a real environment.</p>

<p>The common response here is that bigger context windows will solve this: just give the model the entire codebase, the Slack history, the docs, and let it figure it out. And it’s true that context windows are growing fast. But the bottleneck isn’t having the context; it’s knowing which context matters. An experienced engineer reading a PR doesn’t scan every Slack thread from the last six months; they know, from years of working in this system, that this thread about the payments team’s implicit constraint is relevant while ten thousand others aren’t. That’s not a retrieval problem. It’s a salience problem, one that depends on a mental model of how the system actually works, who made what tradeoffs and why, and what’s likely to break downstream. Throwing more context at a model can actually make this worse, not better, because you’re increasing the noise without improving the model’s ability to identify the signal.</p>

<p>In other words, this is a measurement problem, and measurement problems are slow to solve. You can’t easily evaluate whether a model correctly identified the relevant context, which means you can’t easily train it to get better at that task (Goodhart’s Law again, just at a different layer). I think the next generation of companies building vertical AI tooling will start to crack specific domains, but the generic “AI replaces knowledge worker” story requires solving eval problems that the entire field is still struggling with.</p>

<p>But beside all of that, even if the tech were ready tomorrow, have you ever watched a big company try to adopt any new software? Procurement cycles, compliance reviews, legacy system integration, middle managers fighting to keep headcount. Most Fortune 500s are still finishing cloud migrations they started a decade ago. These predictions always come in too hot. ATMs were supposed to kill bank tellers, spreadsheets were going to eliminate accountants, the internet was going to make offices obsolete by 2005. Every time, the tech changed jobs more than it killed them and new roles showed up that nobody predicted. Our economy might reward signals of efficiency, but in practice the underlying processes take forever.</p>

<p>This time COULD be different because AI is software, not hardware: it scales way faster and deploys way cheaper. I take that seriously. There’s already data suggesting the economics of software are shifting: SaaS gross margins among public companies have been declining, dropping from around 78% in 2020 to 72% by 2023 as product commoditization and competition compress pricing.<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup> The traditional seat-based SaaS model is under real pressure; if an AI agent can access a database and execute a workflow directly, why are you paying $70/seat/month for a dashboard that sits between a human and that same database? That’s a real structural shift worth watching. But “the economics are shifting” and “definitely catastrophic in 18 months” are very different claims.</p>

<p>Software engineering is probably the industry where this conversation is loudest and most specific, which makes sense: it’s the one closest to the technology itself. It’s also the one I know best, so let me talk about what I’m seeing in software engineering (I’ve been <a href="/2025/11/07/spinning-plates.html">writing</a> <a href="/2025/11/24/racing-towards-bethlehem.html">about</a> <a href="/2026/02/02/spinning-the-wheel.html">this</a> for a minute).</p>

<p>Specifically, I want to address this question: is software engineering in general just going up one abstraction layer? There’s a version of this argument that sounds clean and reassuring. We went from assembly to C to Python to “just tell the AI what to build,” and every time the previous layer’s practitioners were fine because they moved up. And there’s something to that. Gergely Orosz at The Pragmatic Engineer wrote about how even the creator of Claude Code didn’t open an IDE for an entire month; all his committed code was AI-written.<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup> Senior engineers are already spending less time typing and more time shaping systems (defining specs, reviewing output, making architectural decisions). AI just pushes that trend to its logical conclusion.</p>

<p>But I think the abstraction-layer framing obscures something important about what’s actually valuable in software engineering right now. It’s not “knowing how to code” in the syntactic sense. AI can write a for loop. It can scaffold a React app. It can even do a pretty good first pass at a complex feature if you give it enough context. What it can’t do well is hold the full mental model of a production system in its head: the implicit constraints, the historical decisions, the understanding of why this particular service communicates with that particular database in this particular way, and what breaks if you change it. The Stanford study I’ll get to in a minute found something relevant here: employment for developers aged 22-25 dropped nearly 20% from its late 2022 peak, but employment for workers over 30 in the same AI-exposed roles actually grew 6-12%.<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup> The market is telling us something. The value isn’t in writing code; it’s in the tacit knowledge that comes from years of shipping code in messy real-world environments. AI is great at the codified stuff. The un-codified stuff is where humans still dominate, and it’s where the value is concentrating.</p>

<p>There’s a related point here that I think gets lost in the discourse: not all software engineering is created equal, and AI is going to hit different parts of the industry very differently. I work at a product-led company where engineers are expected to talk to customers, make product decisions, think about activation funnels, and ship features that move business metrics. That kind of work is ambiguous, cross-functional, and deeply contextual. It’s hard to automate because the “right answer” isn’t well-defined and changes constantly based on user behavior and market conditions.</p>

<p>Compare that to programming at e.g. a large insurance company, where software is already more of a commodity – maintaining internal CRUD apps, building reports against legacy databases, implementing well-specified business logic. That work has been getting squeezed for years, first by offshoring, then by low-code tools, now by AI. Or think about the kind of programming that happens at a consulting firm, where you’re building roughly similar applications for different clients over and over. AI eats that for breakfast because the patterns are repetitive and the specifications are relatively concrete.</p>

<p>This isn’t a new divide. The frontier of software engineering has always been different from the commodity middle. What’s changing is that AI is dramatically widening that gap. If your work is primarily translating well-understood requirements into code, you’re in trouble regardless of Yang’s timeline, because that’s exactly what AI does best<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup>. If your work involves navigating ambiguity, making judgment calls with incomplete information, and understanding complex sociotechnical systems, you’re probably fine for a long time. Arguably more valuable than ever, because AI is making the easy parts of your job faster while the hard parts remain stubbornly human.</p>

<p>Since we’re talking about data, let’s actually look at some, because the picture is more nuanced than Yang lets on (though it’s not exactly rosy either).</p>

<p>Morgan Stanley surveyed 935 executives across five sectors and found an average 4% net decline in headcount over 12 months, alongside an 11.5% productivity increase.<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">9</a></sup> Notably, U.S. companies actually reported a 2% net gain in jobs; the biggest pain was in the UK at 8% net loss, and concentrated among larger firms. Early-career positions were disproportionately affected, which tracks.</p>

<p>The Stanford Digital Economy Lab study is probably the most rigorous thing out there right now.<sup id="fnref:7:1" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup> Using ADP payroll data covering millions of workers, they found a 13% relative decline in employment for 22-25 year olds in the most AI-exposed occupations since late 2022. For software developers in that age range specifically, the drop was nearly 20% from peak. But (and this is the part Yang would leave out) they also found that employment for older workers in the same roles grew 6-12%, and that jobs where AI augments work rather than automates it haven’t seen similar declines. The adjustment is real, but it’s not uniform, and the “automation vs. augmentation” distinction matters enormously for predicting where this goes.</p>

<p>Challenger, Gray &amp; Christmas tracked 696,000 job cuts in the first five months of 2025, an 80% year-over-year jump.<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">10</a></sup> But they attribute this to a cocktail of tariffs, funding cuts, consumer spending shifts, and AI, not AI alone. The World Economic Forum’s 2025 report estimated 92 million jobs displaced by 2030 but 170 million new roles created, for a net gain.<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">11</a></sup> And a Harvard Business School professor studying this put it well: AI exposure overlaps with about 35% of tasks visible in labor market data, but the history of predicting employment effects from technology is “extraordinarily hard,” and the radiologists we were told to stop training in 2017 are busier than ever.<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">12</a></sup></p>

<p>What does all this tell us? The displacement is real, it’s measurable, and it’s hitting early-career workers first and hardest. But it’s also a 4% net headcount decline and a 13% relative employment drop in specific demographics, not the 20-50% apocalypse Yang is selling. The data supports “meaningful structural change that’s already underway and will accelerate” much more than it supports “the Fuckening.”</p>

<p>Plus, a lot of what he’s describing is just an acceleration of stuff that’s been happening for years. Knowledge work offshoring, junior roles getting squeezed, bad grad employment numbers. AI is pouring gasoline on existing fires, not starting new ones. There is something to this, though: economic transitions hurt the people who built their lives around stability. People who followed the script (school, useful degree, knowledge work career) are going to be disrupted. But I also think that’s just capitalism? Things change! We drive towards efficiency! I don’t think the mindset should ever be “learn a thing once and then coast on it”; the whole point is to be constantly examining yourself, updating your priors, and understanding that what worked in the past might not work in the future.</p>

<p>I want to be honest about the limits of that framing, though. “Stay curious and keep adapting” is easy advice for me to give. I’m in my thirties, I work at the frontier of this stuff, and my entire career has been built around the assumption that the tools and the landscape will keep changing. That’s a very different position than someone who’s 50, spent twenty years building expertise in a domain that’s about to get compressed, has a mortgage and kids in college, and is now being told to “upskill.” Yang is right that the social contract of “study hard, get a degree, get a stable career” is under real pressure, and I don’t think “just adapt” is a sufficient answer for everyone. The question of what we actually do for the people who can’t easily pivot is a real one, and I don’t have a clean answer for it. Yang’s answer is UBI, which is at least a concrete proposal, even if the way he’s selling it feels more like a campaign pitch than a policy discussion.</p>

<p>Maybe I’m coming across as too emotional myself. I’ve read a lot of these doomsday scenario-type pieces and they always feel like they’re trying to manipulate me rather than inform me. I don’t doubt that things are changing rapidly, maybe faster than ever, and I think I’m lucky to be in a frontier industry where this idea of adapting and changing and modifying my workflow is endemic. Frankly, the one true thing about software engineering has always been that it evolves and it rewards those who are intellectually open-minded and good at upskilling.</p>

<hr />

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p><a href="https://ai-2027.com/">“AI 2027”</a>, a speculative scenario piece by various AI industry figures. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>The NY Fed’s Q4 2025 report on mortgage delinquencies shows rising delinquencies are concentrated in lower-income zip codes and counties with rising unemployment — driven by income inequality and local labor/housing market conditions, not AI displacement — and are still normal by historical standards outside of pandemic-era lows. See <a href="https://libertystreeteconomics.newyorkfed.org/2026/02/where-are-mortgage-delinquencies-rising-the-most/">“Where Are Mortgage Delinquencies Rising the Most?”</a>, Liberty Street Economics, February 2026. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>SWE-bench Verified scores: top model solved 33% at launch in August 2024; leading models consistently above 70% by mid-2025. Via <a href="https://www.technologyreview.com/2025/12/15/1128352/rise-of-ai-coding-developers-2026/">MIT Technology Review</a>. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Oxford University review of 445 AI benchmarks, late 2025. Via <a href="https://aiforreal.substack.com/p/benchmark-vs-reality-understanding">AI For Real</a>. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>SaaS Capital, 2025 SaaS Valuation Report. Median gross margins among publicly traded SaaS firms declined from 78% (2020) to 72% (2023). See also <a href="https://www.marketdataforecast.com/market-reports/software-as-a-service-saas-market">Market Data Forecast SaaS Market Report</a>. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p>Gergely Orosz, <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what">“When AI Writes Almost All Code, What Happens to Software Engineering?”</a>, The Pragmatic Engineer, January 2026. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>Erik Brynjolfsson, Bharat Chandar, and Ruyu Chen, <a href="https://digitaleconomy.stanford.edu/wp-content/uploads/2025/08/Canaries_BrynjolfssonChandarChen.pdf">“Canaries in the Coal Mine? Six Facts about the Recent Decline in Employment for Young Workers”</a>, Stanford Digital Economy Lab, August 2025. See also coverage in <a href="https://fortune.com/2025/08/26/stanford-ai-entry-level-jobs-gen-z-erik-brynjolfsson/">Fortune</a> and <a href="https://time.com/7312205/ai-jobs-stanford/">TIME</a>. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:7:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>I want to steelman the counterargument here, since I’m still somewhat of a believer in the AI revolution. Context windows are growing, agents are getting persistent memory across sessions, and the ability of AI to hold larger and larger mental models of a system is improving fast. The gap I’m describing — between writing code and understanding the system the code lives in — will narrow. But I think even with perfect recall, the bottleneck shifts from “can the AI access the relevant information” to “can it figure out which information matters for this specific decision” — which is closer to judgment than memory, and a fundamentally harder capability to build. For now, and I think for a while, that judgment is the thing experienced engineers are actually selling. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:9" role="doc-endnote">
      <p>Morgan Stanley, <a href="https://www.morganstanley.com/insights/articles/ai-adoption-accelerates-survey-find">“AI Adoption Surges Driving Productivity Gains and Job Shifts”</a>. Survey of 935 corporate executives across five sectors in the US, Germany, Japan, and Australia. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>Challenger, Gray &amp; Christmas, via <a href="https://www.cnbc.com/2025/10/22/ai-taking-white-collar-jobs-economists-warn-much-more-in-the-tank.html">CNBC</a>, October 2025. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:11" role="doc-endnote">
      <p>World Economic Forum, <a href="https://www.weforum.org/publications/the-future-of-jobs-report-2025/">Future of Jobs Report 2025</a>. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:12" role="doc-endnote">
      <p>Christopher Stanton, Harvard Business School, via <a href="https://news.harvard.edu/gazette/story/2025/07/will-your-job-survive-ai/">Harvard Gazette</a>, July 2025. <a href="#fnref:12" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Dylan</name></author><category term="ai" /><category term="reflection" /><category term="work" /><category term="predictions" /><category term="business" /><summary type="html"><![CDATA[This morning I woke up to a text from my dad, who was asking for my opinion on this piece from Andrew Yang. I wrote him a shorter response that contained a decent chunk of what I’m about to say, but it turns out I had a lot more to say about the topic, and when I finally got done writing it all down, I had what almost looked like a blog post. Figured I might as well flesh it out, and here we are.]]></summary></entry><entry><title type="html">Spinning the Wheel</title><link href="https://www.dylanamartin.com/2026/02/02/spinning-the-wheel.html" rel="alternate" type="text/html" title="Spinning the Wheel" /><published>2026-02-02T00:00:00+00:00</published><updated>2026-02-02T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2026/02/02/spinning-the-wheel</id><content type="html" xml:base="https://www.dylanamartin.com/2026/02/02/spinning-the-wheel.html"><![CDATA[<p>A few months ago I wrote about <a href="/2025/11/07/spinning-plates.html">spinning plates</a> and <a href="/2025/11/24/racing-towards-bethlehem.html">racing toward bottlenecks</a>. The gist was that LLMs had changed how I work, I was faster but learning less, and I was trying to find a balance between leverage and atrophy.</p>

<p>I’ve stopped trying to find balance. I’m all in.</p>

<p>Over the holidays I refined my <a href="https://github.com/dmarticus/dotfiles/tree/main/ai">Claude Code setup</a>, went full-bore into multi-worktree setups with <a href="https://www.conductor.build/">Conductor</a>, and spent a lot of time iterating on my work process. On top of that, the models got better. The tooling caught up. And somewhere in there, I personally crossed a threshold. Now most of my work happens through agents. I’m living in Claude Code &amp; Conductor, spinning up sessions, watching them churn, merging the output. The 80/20 flip that Karpathy described happened to me too: 80% agent coding, 20% edits and touchups.<sup id="fnref:karpathy" role="doc-noteref"><a href="#fn:karpathy" class="footnote" rel="footnote">1</a></sup></p>

<p>It feels incredible. It feels like cheating. It feels like gambling.</p>

<h2 id="the-casino">The casino</h2>

<p>There’s a moment after you send a prompt where you’re just… waiting. The agent is running. You can see it thinking, reading files, making decisions. And there’s this little hit of anticipation: <em>what’s it going to do?</em> It’s the same dopamine loop as pulling a slot machine lever. Low effort, variable reward, endlessly repeatable.</p>

<p>Someone on Hacker News called it “doom tabbing”: the AI is already running, the bar to seeing what it does next is so low that you just… watch.<sup id="fnref:doomtab" role="doc-noteref"><a href="#fn:doomtab" class="footnote" rel="footnote">2</a></sup> A coworker described the opposite problem: you <em>can’t</em> just sit there, so you open Slack or try to multitask during the dead time. Either way you lose — watching keeps you in the dopamine loop, switching fragments your focus. Fifty times a day, both add up to a strange kind of fatigue. Pull the lever, spin the wheel, see what happens. The reward gets front-loaded; the difficult part – understanding what you built, debugging it six months later – gets pushed further out in time.</p>

<p>Ryan Broderick went even darker, calling generative AI an “edging machine”: it charges you for the thrill of feeling like you’re building something while caring more about the monetizable loop of engagement than the finished product.<sup id="fnref:garbageday" role="doc-noteref"><a href="#fn:garbageday" class="footnote" rel="footnote">3</a></sup> I don’t cosign his full doom take, but the framing stuck with me. There <em>is</em> something seductive about the loop. It simulates progress. It feels like making.</p>

<p>And then there’s “comprehension debt” – the tendency for the code in your codebase to become less and less understood over time because the AI one-shotted it and you just moved on.<sup id="fnref:comprehension" role="doc-noteref"><a href="#fn:comprehension" class="footnote" rel="footnote">4</a></sup> People counter that AI actually helps you <em>learn</em> — you can ask it to explain things, build mental models. I do this too. But when I ask for an explanation and then let it do the implementation, the understanding doesn’t stick the way it would if I’d written the code myself. It feels like learning in the moment. Whether it compounds into something durable, I’m not sure.</p>

<p>The casino is fun. When you’re on a heater, it really feels like you’re doing something. But the casino doesn’t care whether you understand what you built.</p>

<h2 id="the-fun">The fun</h2>

<p>And yet — work has never felt this fun.</p>

<p>I’ve always believed that energy management matters more than time management. If the work drains you, it doesn’t matter how many hours you have. And these tools have changed the energy equation. The drudgery is gone. The copying and pasting of compiler warnings, the boilerplate, the fill-in-the-blanks tedium – I just don’t do that anymore. What’s left is the creative part: deciding what to build, figuring out the shape of the solution, reviewing whether the output is good.</p>

<p>Karpathy, my coworkers, all the engineers I’ve talked to who use these tools – they’ve all noticed the same shift: strip out the fill-in-the-blanks drudgery and programming feels <em>more</em> fun, not less.</p>

<p>I also feel less stuck. When I hit a wall, I don’t have to grind through it alone. I can throw the problem at Claude, watch it try things, learn from what it attempts. There’s almost always a way to make some positive progress. That changes the emotional texture of the day. Less frustration, more momentum.</p>

<p>And the tenacity thing is real. Watching an agent relentlessly work at something – never tired, never demoralized, just trying approach after approach – is genuinely inspiring. I’ve seen Claude struggle with a problem for thirty minutes and then crack it. That stamina was always a bottleneck for me. Now it’s not.</p>

<h2 id="how-im-adapting">How I’m adapting</h2>

<p>I don’t have a clean answer to the “is this cheating?” question. But I have a working theory about how to stay a craftsman in the casino.</p>

<p>The shift I’ve made is this: I spend more time defining success criteria and less time doing the mechanical work of achieving them. Karpathy’s framing helped here. “Don’t tell it what to do, give it success criteria and watch it go.” The leverage comes from being declarative instead of imperative.</p>

<p>Boris Cherny, who created Claude Code, recently shared how his team uses the tool: start every complex task in plan mode, and pour your energy into the plan so Claude can one-shot the implementation.<sup id="fnref:bcherny" role="doc-noteref"><a href="#fn:bcherny" class="footnote" rel="footnote">5</a></sup> One person on his team has one Claude write the plan, then spins up a second Claude to review it as a staff engineer. Another says the moment something goes sideways, they switch back to plan mode and re-plan — don’t keep pushing. The pattern is the same — front-load the thinking, let the machine handle the doing.</p>

<p>My days have started to split into two modes. There’s contemplative time — defining goals, thinking through edge cases, building the reward function. That part is slow and focused. Then there’s execution time — spinning up agents, running them in parallel, triaging output. That part is fast and frenetic, caffeine-fueled, multi-stream.</p>

<h2 id="what-still-matters">What still matters</h2>

<p>The contemplative work is what makes the execution productive instead of just fun. Without it, I’m just pulling levers and hoping.</p>

<p>For frontend work, this means developing strong taste. Can I look at the output and <em>feel</em> whether it’s right? Does the UI make sense? Are the interactions smooth? I’ve been spending more time on what Jim Nielsen calls “sanding the UI” – the patient, iterative work of smoothing rough edges until something feels right.<sup id="fnref:sanding" role="doc-noteref"><a href="#fn:sanding" class="footnote" rel="footnote">6</a></sup> The agent can generate a component, but I’m the one who has to sand it.</p>

<p>For backend work, it means building robust test harnesses. Types that encode invariants. Property-based testing has been great for this – instead of writing specific test cases, I describe properties the code should always satisfy, and the framework generates hundreds of edge cases to throw at it. If the tests pass and the invariants hold, the code is probably fine. The work shifts from <em>writing</em> the code to <em>specifying</em> what correct code looks like. I build the acceptance criteria first – the tests, the types, the “what does correct look like?” – and only then let the agent loose against it.</p>
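<p>To make the shape of that work concrete, here’s a minimal hand-rolled sketch of the idea – a real framework like Hypothesis or proptest would handle input generation and shrink failures to minimal counterexamples, and the <code class="language-plaintext highlighter-rouge">apply_percentage_rollout</code> helper is hypothetical, not PostHog’s actual evaluation logic:</p>

```python
import random

def apply_percentage_rollout(user_id: str, rollout_pct: int) -> bool:
    """Hypothetical flag-evaluation helper: deterministic hash bucketing.

    Not real PostHog logic -- just enough behavior to state properties against.
    """
    return (hash(user_id) % 100) < rollout_pct

def check_determinism(trials: int = 500) -> bool:
    """Property: the same user gets the same answer every time."""
    rng = random.Random(42)
    for _ in range(trials):
        user = f"user-{rng.randrange(10**6)}"
        pct = rng.randrange(0, 101)
        if apply_percentage_rollout(user, pct) != apply_percentage_rollout(user, pct):
            return False
    return True

def check_boundaries(trials: int = 500) -> bool:
    """Property: a 0% rollout excludes everyone, 100% includes everyone."""
    rng = random.Random(7)
    for _ in range(trials):
        user = f"user-{rng.randrange(10**6)}"
        if apply_percentage_rollout(user, 0) or not apply_percentage_rollout(user, 100):
            return False
    return True
```

<p>The point isn’t these particular checks – it’s that the invariants, not the individual cases, are the specification the agent’s output has to satisfy.</p>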

<p>And domain expertise matters more, not less. There’s a popular narrative that AI helps you upskill quickly in unfamiliar domains — and that’s true when you’re <em>learning</em>. But for this modality, for being genuinely productive with these tools, your existing expertise is what makes it work. The better I understand the problem space, the earlier I can catch the agent going down a wrong path. When I’m working in code I know well, I can interrupt a bad approach in the first few seconds. When I’m in unfamiliar territory, I might not realize something’s off until it’s been spinning for ten minutes. The models still make mistakes – subtle conceptual errors that a hasty junior dev might make, wrong assumptions they run with instead of checking.<sup id="fnref:karpathy-mistakes" role="doc-noteref"><a href="#fn:karpathy-mistakes" class="footnote" rel="footnote">7</a></sup> You have to watch them like a hawk. It ends up looking like pattern-matching on failure modes before they compound.</p>

<p>These are the things I’m holding onto – taste, rigor, expertise. The parts that feel like they might still be craft. Whether they’re enough to keep it that way, I’m not sure.</p>

<h2 id="the-question-i-cant-answer">The question I can’t answer</h2>

<p>Derek Thompson wrote a piece called “The Monks in the Casino” about young men who’ve retreated from social risk into dopamine loops: gambling, speculation, variable rewards without vulnerability.<sup id="fnref:thompson" role="doc-noteref"><a href="#fn:thompson" class="footnote" rel="footnote">8</a></sup> The casino reshapes what feels normal. What starts as entertainment becomes the default texture of experience.</p>

<p>I keep thinking about how that logic spreads. Engineering used to feel like one of the more contemplative corners of work – long stretches of focused thought, deep understanding as the goal. Now the casino has arrived here too. The tools are incredible, and they’re also slot machines. The dopamine loop is built into the workflow. And I’m not sure how vigilant I need to be, or whether vigilance is even the right frame.</p>

<p>The question I keep asking myself is whether this is still craft.</p>

<p>Craft implies understanding. It implies that the maker could explain every decision, could reproduce the work, could teach someone else how to do it. When I ship something Claude mostly wrote, can I say that? Sometimes yes. Sometimes I’m not sure.</p>

<p>There’s a comforting story I could tell myself here — that craft is evolving, that the new skill is knowing what to ask for and how to evaluate the output, that judgment is the new execution. Maybe that’s true. But I notice how convenient it is. It’s exactly the kind of thing you’d say to avoid sitting with the harder question.</p>

<p>What if the answer is actually no? What if I’m slowly trading away the thing that made me good at this — the deep, hard-won understanding — for speed and fun? What if the speed is the bribe?</p>

<p>Recent research suggests this isn’t just paranoia. A randomized experiment from Anthropic found that AI assistance impaired developers’ conceptual understanding, code reading, and debugging abilities – without even delivering significant efficiency gains on average.<sup id="fnref:anthropic-skills" role="doc-noteref"><a href="#fn:anthropic-skills" class="footnote" rel="footnote">9</a></sup> Only the interaction patterns that involved genuine cognitive engagement preserved learning outcomes. Their conclusion: “AI-enhanced productivity is not a shortcut to competence.”</p>

<p>But knowing that doesn’t tell me what to do. I don’t want to stop using these tools – they’re too good, and the work is too fun. A friend texted me yesterday: “What a time to be alive and programming, eh?” It really is. I’m locked in at the casino, the games are as good as they’ve ever been, and I’m watching myself play more than ever. The best I can do is pay attention.</p>

<hr />

<p><em>Thanks to <a href="https://jurajmajerik.com/">Juraj Majerik</a> for reading a draft of this and for feedback. This is the third post in an unplanned series about AI-assisted development. Previously: <a href="/2025/11/07/spinning-plates.html">Spinning Plates</a>, <a href="/2025/11/24/racing-towards-bethlehem.html">Racing Towards Bethlehem</a>.</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:karpathy" role="doc-endnote">
      <p>Andrej Karpathy’s <a href="https://x.com/karpathy/status/2015883857489522876">thread on AI-assisted coding</a> (January 2026) captures a lot of what I’ve been experiencing. The whole thing is worth reading. <a href="#fnref:karpathy" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:doomtab" role="doc-endnote">
      <p>From a <a href="https://news.ycombinator.com/item?id=46784594">Hacker News comment</a> that stuck with me: “The end result is very akin to doom scrolling. Doom tabbing?” <a href="#fnref:doomtab" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:garbageday" role="doc-endnote">
      <p>Ryan Broderick, “<a href="https://www.garbageday.email/p/generative-ai-is-an-expensive-edging-machine">Generative AI is an expensive edging machine</a>,” Garbage Day. His take is darker than mine, but the “edging machine” framing is vivid. <a href="#fnref:garbageday" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:comprehension" role="doc-endnote">
      <p>Jeremy Wei <a href="https://x.com/jeremytwei/status/2015886793955229705">coined the term</a> in a reply to Karpathy, who responded: “Love the word ‘comprehension debt,’ haven’t encountered it so far, it’s very accurate.” <a href="#fnref:comprehension" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bcherny" role="doc-endnote">
      <p>Boris Cherny, “<a href="https://x.com/bcherny/status/2017742741636321619">Tips for using Claude Code</a>,” January 2026. <a href="#fnref:bcherny" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:sanding" role="doc-endnote">
      <p>Jim Nielsen, “<a href="https://blog.jim-nielsen.com/2024/sanding-ui/">Sanding UI</a>.” The metaphor is perfect: you can’t sand in one pass, you have to keep coming back with finer grit. <a href="#fnref:sanding" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:karpathy-mistakes" role="doc-endnote">
      <p>Karpathy again: “The mistakes have changed a lot – they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking.” <a href="#fnref:karpathy-mistakes" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:thompson" role="doc-endnote">
      <p>Derek Thompson, “<a href="https://www.derekthompson.org/p/the-monks-in-the-casino">The Monks in the Casino</a>,” November 2025. <a href="#fnref:thompson" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:anthropic-skills" role="doc-endnote">
      <p>Judy Hanwen Shen and Alex Tamkin, “<a href="https://www.anthropic.com/research/AI-assistance-coding-skills">How AI Assistance Impacts the Formation of Coding Skills</a>,” Anthropic, January 2026. The full <a href="https://arxiv.org/abs/2601.20245">paper</a> is worth reading if you’re thinking about how to preserve skill formation while using AI tools. <a href="#fnref:anthropic-skills" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Dylan</name></author><category term="ai" /><category term="reflection" /><category term="work" /><summary type="html"><![CDATA[A few months ago I wrote about spinning plates and racing toward bottlenecks. The gist was that LLMs had changed how I work, I was faster but learning less, and I was trying to find a balance between leverage and atrophy.]]></summary></entry><entry><title type="html">What I Talk About When I Talk About PostHog</title><link href="https://www.dylanamartin.com/2026/01/28/what-I-talk-about-when-I-talk-about-posthog.html" rel="alternate" type="text/html" title="What I Talk About When I Talk About PostHog" /><published>2026-01-28T00:00:00+00:00</published><updated>2026-01-28T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2026/01/28/what-I-talk-about-when-I-talk-about-posthog</id><content type="html" xml:base="https://www.dylanamartin.com/2026/01/28/what-I-talk-about-when-I-talk-about-posthog.html"><![CDATA[<p>I’ve been at PostHog for about eighteen months now. Long enough to ship meaningful work, long enough to break things in production, long enough to feel the weight of both. This is my attempt to write down what that’s been like.</p>

<h2 id="part-one-the-excellent">Part One: The Excellent</h2>

<p>The thing I keep coming back to is the combination of autonomy, opportunity, and impact. These three words get thrown around a lot in job postings, but at PostHog they actually mean something. I’ve worked on genuinely interesting problems that matter to the business, and I’ve been given real freedom in how I solve them. Most days I wake up excited about the work. That’s rare, and I don’t take it for granted.</p>

<p>The feature flags team sits at an interesting intersection: we’re responsible for infrastructure that needs to be fast and reliable (we serve billions of flag evaluations), but we’re also building product that developers interact with directly. I’ve gotten to do both. I <a href="https://posthog.com/blog/even-faster-more-reliable-flags">rewrote our evaluation service in Rust</a>, shaving latency and improving reliability. I’ve also shipped product features, worked on SDK improvements, and thought deeply about developer experience. The breadth is energizing.</p>

<p>The learning has been extraordinary. I came to PostHog wanting to write Rust professionally. At my previous startup, I’d read Luca Palmieri’s <em>Zero to Production in Rust</em> and knew the language was a good fit for the performance-critical work I wanted to do, but the existing tech stack and hiring concerns made it impractical. At PostHog, I finally got the chance. I’ve now built and shipped production Rust services handling real scale. I’ve learned how to operate distributed systems, how to debug cascading failures, how to think about reliability as a discipline rather than an afterthought. All of this through the lens of actual work, not side projects or tutorials.</p>

<p>And then there are the people. PostHog has assembled an extraordinary group of engineers from around the world. The talent density is intimidating in the best way; I’m constantly learning from my teammates. The company also invests heavily in bringing people together in person. In my eighteen months, I’ve done team meetups and offsites in Toronto, New York, Amsterdam, and San Francisco, plus company-wide offsites in Mykonos and Mexico. I have another one coming up in London. These trips aren’t just perks; they’re how a distributed team builds the trust and rapport that makes async collaboration actually work.<sup id="fnref:meetups" role="doc-noteref"><a href="#fn:meetups" class="footnote" rel="footnote">1</a></sup></p>

<h2 id="part-two-the-challenges">Part Two: The Challenges</h2>

<p>I want to be honest about this part, because I think the hard stuff is where the real learning happens.</p>

<p>The biggest challenge was rebuilding our feature flags evaluation engine while it was running in production. You can’t feature flag the feature flag service. Every change ships to everyone, immediately. This constraint forced me to think carefully about testing, validation, and rollout strategies. I built extensive test harnesses, shadow testing infrastructure, and tooling to validate behavior before shipping. It was some of the most disciplined engineering work I’ve done.</p>

<p>The rewrite itself went well. What came after was harder.</p>

<p>In late 2025, we had a series of incidents. Four outages in October alone, totaling over fourteen hours of customer impact. The technical details are in our <a href="https://github.com/PostHog/post-mortems/blob/main/2025-10-21-feature-flags-recurring-outages.md">post-mortems</a>, but the short version is: we discovered failure modes in the new service that we hadn’t anticipated. CPU resource sizing issues caused cascading failures. Connection pools exhausted under load. Retry logic amplified problems instead of containing them. Each incident taught us something, but the lessons came at a cost.</p>

<p>Those weeks were some of the hardest of my career. I wasn’t sleeping well. The feeling of letting customers down was awful; these are developers who depend on our service to ship their own products, and we were failing them. I took some time off. I seriously considered whether I wanted to keep doing this kind of work.</p>

<p>What got me through was the team and the culture. Folks from the leadership team reached out to me directly to check in and offer support – I typically don’t hear much from them, but they all reached out when I needed it, and that affected me more than I expected it would. PostHog practices blameless post-mortems, and they really mean it. After each incident, the question was never “who screwed up?” but “what allowed this to happen?” My coworker Phil Haack <a href="https://haacked.com/archive/2026/01/06/one-year-at-posthog/">wrote about this</a> in his own reflection on his first year. That approach made it possible to actually learn from the failures instead of just feeling bad about them.</p>

<p>The incidents also forced us to rethink our team structure. Before, we had one feature flags team with a sprawling scope: SDKs, product UI, platform infrastructure. After, we split into two focused teams. Phil now leads the Flags Platform team, laser-focused on performance, reliability, and architecture. I lead the Feature Flags product team, focused on the configuration UI, cohorts, early access features, and SDKs. The split lets each team go deep on their domain without feeling pulled in competing directions.</p>

<p>Looking back, I think the hardest part wasn’t the technical debugging. It was sitting with the uncertainty while we were still figuring things out. There’s a specific kind of dread that comes from knowing something is broken, knowing people are affected, and not yet knowing why. Learning to function in that state, to keep investigating methodically instead of panicking, was its own kind of growth.</p>

<h2 id="part-three-whats-next">Part Three: What’s Next</h2>

<p>For my first eighteen months, I was hired as a product engineer but spent most of my time on platform work. Rust, performance, reliability, infrastructure. I loved it, and I’m proud of what we built. But I’m ready for a change.</p>

<p>I’m shifting my focus toward product engineering. I want to get closer to our users and think more directly about business impact. Feature flags is already a sticky product, which is solid for something that isn’t a daily-use feature. But our activation rate is lower than I’d like. There’s a gap between people who express interest during onboarding and people who actually end up using the product. That gap feels like an opportunity.</p>

<p>I want to understand why people bounce. I want to make the product so good that trying it feels effortless. I want PostHog to be the obvious choice for teams who care about feature flags and want to understand what those flags actually do.</p>

<p>This is different work than writing Rust services. It’s more ambiguous, more user-facing, more tied to metrics I can’t fully control. I’m excited to learn how to be good at it.</p>

<hr />

<p>If any of this resonates, we’re hiring.</p>

<p>Phil’s <a href="https://posthog.com/teams/flags-platform">Flags Platform team</a> is looking for <a href="https://posthog.com/careers/backend-engineer">backend engineers</a> who want to tackle hard problems at scale: Rust, distributed systems, reliability engineering. If you want to work on infrastructure that serves billions of requests and learn from some genuinely excellent engineers, this is the role.</p>

<p>My <a href="https://posthog.com/teams/feature-flags">Feature Flags product team</a> is hiring <a href="https://posthog.com/careers/product-engineer">product engineers</a> who care about developer experience and want to ship features that users actually love. If you’re energized by the intersection of product thinking and technical depth, come work with us.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:meetups" role="doc-endnote">
      <p>PostHog gives every team a meetup budget to get together in person several times a year, separate from the company-wide offsites. It’s one of those policies that sounds nice on paper but genuinely changes how the work feels. Hard to overstate how much easier it is to collaborate async with someone after you’ve spent a week working alongside them. <a href="#fnref:meetups" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Dylan</name></author><category term="career" /><category term="reflection" /><category term="software engineering" /><category term="posthog" /><summary type="html"><![CDATA[I’ve been at PostHog for about eighteen months now. Long enough to ship meaningful work, long enough to break things in production, long enough to feel the weight of both. This is my attempt to write down what that’s been like.]]></summary></entry><entry><title type="html">New Year, New Me</title><link href="https://www.dylanamartin.com/2026/01/27/new-year-new-me.html" rel="alternate" type="text/html" title="New Year, New Me" /><published>2026-01-27T00:00:00+00:00</published><updated>2026-01-27T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2026/01/27/new-year-new-me</id><content type="html" xml:base="https://www.dylanamartin.com/2026/01/27/new-year-new-me.html"><![CDATA[<p>I redesigned this site over the weekend. If you’re reading this, you’re looking at the new version.</p>

<p>The old design was fine. It worked, it loaded fast, it was readable. But it had the classic “developer blog” energy: purely functional, zero personality. I wanted something that felt more like a design studio homepage and less like a default Jekyll theme with the serial numbers filed off.</p>

<h2 id="what-i-was-going-for">What I was going for</h2>

<p>I spend a lot of time looking at personal sites. The ones I keep coming back to share a few traits:</p>

<ul>
  <li>
    <p><strong>Monospace as a design choice.</strong> The best personal sites I’ve seen use monospace because it creates a specific mood: technical, deliberate, slightly editorial. I wanted that same energy; Berkeley Mono in particular has enough personality to carry body text while still feeling sharp at small sizes for nav labels and metadata.</p>
  </li>
  <li>
    <p><strong>Strong typographic hierarchy.</strong> Big bold headings, tight letter-spacing, uppercase labels for navigation and section headers. The kind of thing where you can squint at the page and still understand its structure. Type size, weight, and spacing do the heavy lifting; color and decoration stay minimal.</p>
  </li>
  <li>
    <p><strong>Structural borders.</strong> I like sites where the borders do real work: separating navigation from content, delineating sidebar sections, anchoring lists. Intentional 2px lines that say “this is a boundary.” The nav and footer get strong borders; everything else stays subtle.</p>
  </li>
  <li>
    <p><strong>Light and dark, automatically.</strong> I’ve had dark mode on this site since <a href="/2020/12/04/implementing-dark-mode-for-my-website.html">2020</a>, but the old palette was an afterthought. This time both modes are first-class citizens via <code class="language-plaintext highlighter-rouge">prefers-color-scheme</code>, with warm off-whites and deep charcoals.</p>
  </li>
  <li>
    <p><strong>Sidebars that earn their space.</strong> Each sidebar section is a discrete card with its own purpose: stats, links, subscribe options, fun facts. If a page has a sidebar, the sidebar has a job.</p>
  </li>
</ul>
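
<p>Under the hood, the two-mode setup is just custom properties plus a <code class="language-plaintext highlighter-rouge">prefers-color-scheme</code> media query – a sketch, with the dark-mode values swapped for illustration rather than copied from the site’s actual palette:</p>

```css
/* Light palette by default; these values match the site's custom properties. */
:root {
  --bg: #fafaf8;   /* warm off-white */
  --text: #1a1a2e;
}

/* Dark palette kicks in automatically from the OS preference.
   The swap here is illustrative, not the real charcoal values. */
@media (prefers-color-scheme: dark) {
  :root {
    --bg: #1a1a2e;
    --text: #fafaf8;
  }
}

body {
  background: var(--bg);
  color: var(--text);
}
```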

<h2 id="the-font">The font</h2>

<p>I switched everything to <a href="https://berkeleygraphics.com/typefaces/berkeley-mono/">Berkeley Mono</a>. I’ve been using it in my editor for a while and I think it’s the best monospace font available right now. It has enough character to carry long-form prose and it looks great at the small sizes I use for navigation and section labels.</p>

<p>Self-hosting was straightforward: four <code class="language-plaintext highlighter-rouge">@font-face</code> declarations pointing at <code class="language-plaintext highlighter-rouge">.otf</code> files, with <code class="language-plaintext highlighter-rouge">font-display: swap</code> so text renders immediately in a fallback face instead of flashing invisible while the font loads. The fallback stack goes TX-02, JetBrains Mono, SF Mono, Fira Code, Cascadia Code, so it degrades gracefully.</p>
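
<p>For reference, one such declaration looks roughly like this – the file path and weight are illustrative, not copied from the site:</p>

```css
/* One of four hypothetical @font-face declarations. */
@font-face {
  font-family: "Berkeley Mono";
  src: url("/assets/fonts/BerkeleyMono-Regular.otf") format("opentype");
  font-weight: 400;
  font-style: normal;
  font-display: swap; /* show fallback text immediately, swap in the webfont when ready */
}

/* Fallback stack, in the order described above. */
body {
  font-family: "Berkeley Mono", "TX-02", "JetBrains Mono", "SF Mono",
    "Fira Code", "Cascadia Code", monospace;
}
```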

<h2 id="the-layout">The layout</h2>

<p>The grid is simple: a main content area and a 260px sidebar on desktop, collapsing to a single column on mobile. CSS Grid makes this trivially easy:</p>

<div class="language-css highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">.page-content.with-sidebar</span> <span class="p">{</span>
  <span class="py">grid-template-columns</span><span class="p">:</span> <span class="m">1</span><span class="n">fr</span> <span class="n">var</span><span class="p">(</span><span class="n">--sidebar-width</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
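
<p>The single-column collapse on mobile is the matching media query – the breakpoint value here is illustrative:</p>

```css
@media (max-width: 768px) {
  .page-content.with-sidebar {
    grid-template-columns: 1fr; /* sidebar stacks under the content */
  }
}
```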

<p>Every page that benefits from a sidebar gets one. The writing index has stats and subscribe links. Individual posts have an info table with publish date, word count, and slug. The homepage has social links, feeds, and fun facts. Pages like speaking and projects use the full width.</p>

<h2 id="the-details">The details</h2>

<p>A few smaller decisions that I think matter:</p>

<p><strong>Navigation</strong> is uppercase, bold, and letterspaced. More design studio masthead than list of links. The nav and footer both use 2px borders against the strong color, so the page has clear top and bottom anchors.</p>

<p><strong>Section headers</strong> (“Currently”, “Previously”, “Interests” on the homepage) are <code class="language-plaintext highlighter-rouge">display: inline-block</code> with a 2px underline. This gives them visual weight while keeping them compact.</p>

<p><strong>The post list</strong> has a bold top border and per-post word counts inline with the date. I added aggregate word count stats to the sidebar too: partly because I’m curious, partly because it’s the kind of thing I like seeing on other people’s sites.</p>

<p><strong>Sidebar sections</strong> are bordered cards with uppercase headings and a subtle bottom border inside each card. Small thing, but it makes the sidebar feel intentional.</p>

<h2 id="content-changes">Content changes</h2>

<p>While I was in there, I made some structural changes too:</p>

<ul>
  <li><strong>Renamed “Blog” to “Writing”</strong> since that’s more accurate and I like how it reads in the nav.</li>
  <li><strong>Renamed “Talks” to “Speaking”</strong> for the same reason.</li>
  <li><strong>Created a Digest page</strong> by pulling the reading list out of the Media page and giving it its own home. It felt buried before.</li>
  <li><strong>Added a Uses page</strong> because I’ve always liked <code class="language-plaintext highlighter-rouge">/uses</code> pages on other developers’ sites (h/t to <a href="https://usesthis.com/">UsesThis.com</a>).</li>
  <li><strong>Added word counts</strong> to the writing index, the stats sidebar, and each post’s info box.</li>
</ul>

<h2 id="the-tools">The tools</h2>

<p>This is still a Jekyll site hosted on GitHub Pages. One CSS file, a few Liquid templates, some HTML. The whole design system lives in CSS custom properties:</p>

<div class="language-css highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">:root</span> <span class="p">{</span>
  <span class="py">--bg</span><span class="p">:</span> <span class="m">#fafaf8</span><span class="p">;</span>
  <span class="py">--text</span><span class="p">:</span> <span class="m">#1a1a2e</span><span class="p">;</span>
  <span class="py">--accent</span><span class="p">:</span> <span class="m">#3d5af1</span><span class="p">;</span>
  <span class="py">--border</span><span class="p">:</span> <span class="m">#d0d0d0</span><span class="p">;</span>
  <span class="py">--border-strong</span><span class="p">:</span> <span class="m">#1a1a2e</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Dark mode is a <code class="language-plaintext highlighter-rouge">prefers-color-scheme: dark</code> media query that swaps those values. Your OS decides.</p>
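
<p>A sketch of that swap (these dark values are illustrative, not the site’s real palette):</p>

<div class="language-css highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/* Hypothetical dark palette; same custom properties, new values */
@media (prefers-color-scheme: dark) {
  :root {
    --bg: #1a1a2e;
    --text: #fafaf8;
    --accent: #7b8cff;
    --border: #3a3a4e;
    --border-strong: #fafaf8;
  }
}
</code></pre></div></div>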

<p>The spirit here, as it’s been since I made the <a href="https://github.com/dmarticus/dmarticus.github.io/pull/1/changes#diff-3729493d031f7e2d26243070815ce0be4cc97590732407d8bcb15735452f0afbR1-R17">first commit</a> to this site, is basically <a href="https://motherfuckingwebsite.com/">motherfuckingwebsite.com</a>: no JS, no build step, no dependencies beyond what a browser already gives you. HTML, CSS, and content. The whole site loads fast, works everywhere, and I can understand every line of it.</p>]]></content><author><name>Dylan</name></author><category term="website" /><category term="design" /><category term="css" /><summary type="html"><![CDATA[I redesigned this site over the weekend. If you’re reading this, you’re looking at the new version.]]></summary></entry><entry><title type="html">Dotfiles</title><link href="https://www.dylanamartin.com/2026/01/04/dotfiles.html" rel="alternate" type="text/html" title="Dotfiles" /><published>2026-01-04T00:00:00+00:00</published><updated>2026-01-04T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2026/01/04/dotfiles</id><content type="html" xml:base="https://www.dylanamartin.com/2026/01/04/dotfiles.html"><![CDATA[<p>I was talking to my buddy <a href="https://cameron.otsuka.systems/">Cameron</a> about all of the custom Claude Code stuff I’ve been tinkering with (I talk about this in <a href="/2025/11/07/spinning-plates.html">Spinning Plates</a> and <a href="/2025/11/24/racing-towards-bethlehem.html">Racing towards Bethlehem</a>), and he asked me if he could see some of the agent stuff I’ve written. This made me realize that I’ve never actually published the dotfiles where I keep all my configurations. His question, plus recently reading <a href="https://www.jmduke.com/posts/dotfiles.html">Justin’s post about this</a>, inspired me to clean up the code and <a href="https://github.com/dmarticus/dotfiles">publish my dotfiles</a>.
Maybe folks will find them useful, but even if I’m the only one who does, I’m glad they’re public.</p>]]></content><author><name>Dylan</name></author><category term="ai" /><category term="work" /><category term="dotfiles" /><category term="setup" /><summary type="html"><![CDATA[I was talking to my buddy Cameron about all of the custom Claude Code stuff I’ve been tinkering with (I talk about this in Spinning Plates and Racing towards Bethlehem), and he asked me if he could see some of the agent stuff I’ve written. This made me realize that I’ve never actually published the dotfiles where I keep all my configurations. His question, plus recently reading Justin’s post about this, inspired me to clean up the code and publish my dotfiles. Maybe folks will find them useful, but even if I’m the only one who does, I’m glad they’re public.]]></summary></entry><entry><title type="html">We Have New York at Home</title><link href="https://www.dylanamartin.com/2025/12/19/we-have-new-york-at-home.html" rel="alternate" type="text/html" title="We Have New York at Home" /><published>2025-12-19T00:00:00+00:00</published><updated>2025-12-19T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2025/12/19/we-have-new-york-at-home</id><content type="html" xml:base="https://www.dylanamartin.com/2025/12/19/we-have-new-york-at-home.html"><![CDATA[<p>I spent a month living and working in New York recently, and I loved it — not for any one specific high, but for how much better my day-to-day life felt. The shape of my days wasn’t even that different from Seattle: I still worked, grabbed coffee, ran errands, met up with friends. It just felt easier to be out in it. More “let’s hit the town tonight” baked into a random Tuesday.</p>

<p>What I’m trying to keep is the state it put me in: leaving the house at any opportunity, walking places by default, keeping a running backlog of things to try, and treating any day like it might be worth doing something.</p>

<p>I’ve lived in Seattle for more than 8 years and I still love it here — the nature access is world class, and I find the city beautiful. But after living here for so long (the longest I’ve ever lived somewhere since before I went to college), I’ve noticed that it’s easy to settle into a routine and stop exploring. So, I’m writing this down partly so I don’t spend the winter wishing I was still in New York, and partly to see if I can make these behaviors stick.</p>

<h2 id="walking-as-continuity">Walking as continuity</h2>

<p>In New York, walking is how the day moves. And because you’re on foot, you experience the space between things.</p>

<p>That matters more than I expected. Walking makes time feel contiguous. You get little hits of texture: a new storefront, a poster for a show, a line outside a place you didn’t know existed, weather that forces you to actually notice the season.</p>

<p>In Seattle it’s too easy to turn life into teleportation. Efficient, private, slightly dead.</p>

<p>So I’m trying to walk more as a way to keep the city “on.” If it’s plausibly walkable, I want my default to be: fine, I’ll walk.</p>

<h2 id="passive-transit-is-goated">Passive transit is goated</h2>

<p>In New York, the whole city felt 30 minutes away. I know it’s the most tired take of all time, but the ubiquity and consistency of the subway was perpetually delightful — I’d show up basically anywhere, ride, arrive. I could read, write, play Balatro, or just stare into the middle distance and let my brain idle. That’s time you don’t get back when you’re driving.</p>

<p>And it wasn’t just commuting. Going out at night, meeting someone across town, running errands — it was all the same. You just… go. No parking, no route decisions, no low-level vigilance. I didn’t realize how much I liked that until I had it constantly.</p>

<p>Seattle isn’t that. Link is expanding and buses run, but coverage is spotty, frequency varies, and for most trips driving is still faster. I’m lucky — I live on a bus line that goes straight to my office and my gym, so I’ve started taking it for both. Some days the app lies and I just drive. But when it works, I show up less scattered, and I’ve already had 20 minutes to read or do nothing. That’s worth protecting.</p>

<h2 id="keep-a-running-list-of-spots-to-hit">Keep a running list of spots to hit</h2>

<p>This was the big one, and it’s something my friends have done for years while I balked, mostly out of laziness. We all have to learn our lessons in our own time.</p>

<p>But yeah, in New York, I kept a list of places I wanted to go (and <a href="https://maps.app.goo.gl/A6E3R6HkmzVETfHfA">I made a list of places I did go</a>): coffee shops, bars, restaurants, museums, neighborhoods, specific dishes. It was a backlog, and it had the obvious effect of making me more excited to go out.</p>

<p>It turned “what should we do tonight?” from an empty question into a menu. I wasn’t inventing a plan from scratch when I was tired; I was selecting something I’d already pre-approved when I had energy.</p>

<p>Back in Seattle, I realized how easy it is to fall into the same loop unless I stay current on what’s new. New stuff opens quietly. Scenes shift. Neighborhoods evolve. If I don’t capture that anywhere, I default to the same places because they’re good and easy and already in my head.</p>

<p>So now I keep a “Seattle backlog” on purpose:</p>

<ul>
  <li>Restaurants and bars I’ve heard about</li>
  <li>Coffee shops and bakeries</li>
  <li>Bookstores, galleries, small venues</li>
  <li>Parks, viewpoints, hiking spots</li>
  <li>Single specific items worth leaving the house for</li>
</ul>

<p>And I treat it like an object I maintain casually. Friend recommends a place? List. I walk past something interesting? List. I see a poster? List.</p>

<h2 id="bring-the-energy">Bring the energy</h2>

<p>I’ve never been one to shy away from going out, and I’ve been accused of being “high-energy” by friends and foes alike. But even I’ve succumbed to the Seattle-specific trap (which is especially bad in winter): waiting to feel like going out.</p>

<p>Seattle is quieter, and the city’s energy doesn’t exactly do you favors here. In New York, it’s easy to get swept along — there’s always something happening, and it feels like the default setting is “sure, why not.” In Seattle, you can blink and it’s 9pm and you’re still on the couch, perfectly comfortable (and the bars close in 2 hours anyway so what’s the point).</p>

<p>Rain also makes that feel rational. “Cozy” becomes ideology. And sometimes staying in is correct. But sometimes it’s just inertia that’s learned how to speak softly.</p>

<p>The thing is, the biggest hurdle isn’t the weather or the city — it’s me. If I actually decide to go out, there are always places in Seattle that are a good time. Despite my flippancy, there’s always a bar with a vibe, a restaurant that hits, a show somewhere, a friend who’s down.</p>

<p>One thing I liked about myself in New York is that I didn’t feel that complacency as much. I’d just decide the night was happening. Pick a place. Go. I’m trying to keep that: put on real clothes, make the plan, leave the house.</p>

<h2 id="you-aint-gonna-need-it">You ain’t gonna need it</h2>

<p>Our place in New York was small and tasteful, but lacking in many of the creature comforts of home. By the end, though, that felt more like a feature than a bug; living with fewer things exposed yet another obvious truth: extra stuff is mostly maintenance.</p>

<p>More clothes, more objects, more “just in case” adds up. Physical clutter; attention debt.</p>

<p>I was surprised to notice this because my Seattle setup felt perfect to me before I left. My wife and I love our place. It’s beautiful, it’s great for entertaining, the view is good. My Eight Sleep mattress has ruined all other beds for me. But like anything else, stuff accumulates, and it’s weirdly hard to let things go once they’ve been added to the fold. Living with less helped me notice the accumulation.</p>

<p>I want my place to feel light enough that it doesn’t pre-tire me, and nice enough that I’m not collecting stuff just to compensate for anything.</p>

<h2 id="the-point-of-all-this">The point of all this</h2>

<p>I’m making a bet that Seattle already has a lot of the raw material for me to build a daily lifestyle similar to what I had in New York. The only real question is whether I keep choosing it, and if it sticks. I haven’t abandoned the thought of moving to New York altogether.</p>]]></content><author><name>Dylan</name></author><category term="reflection" /><category term="seattle" /><category term="lifestyle" /><category term="personal" /><summary type="html"><![CDATA[I spent a month living and working in New York recently, and I loved it — not for any one specific high, but for how much better my day-to-day life felt. The shape of my days wasn’t even that different from Seattle: I still worked, grabbed coffee, ran errands, met up with friends. It just felt easier to be out in it. More “let’s hit the town tonight” baked into a random Tuesday.]]></summary></entry><entry><title type="html">Racing towards Bethlehem</title><link href="https://www.dylanamartin.com/2025/11/24/racing-towards-bethlehem.html" rel="alternate" type="text/html" title="Racing towards Bethlehem" /><published>2025-11-24T00:00:00+00:00</published><updated>2025-11-24T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2025/11/24/racing-towards-bethlehem</id><content type="html" xml:base="https://www.dylanamartin.com/2025/11/24/racing-towards-bethlehem.html"><![CDATA[<p>After I published <a href="/2025/11/07/spinning-plates.html">Spinning Plates</a>, my old coworker <a href="https://danielbachhuber.com/">Daniel</a> left <a href="https://www.linkedin.com/feed/update/urn:li:activity:7393663853852557312?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7393663853852557312%2C7393672175381045248%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287393672175381045248%2Curn%3Ali%3Aactivity%3A7393663853852557312%29">a comment</a> I couldn’t stop thinking about:</p>

<blockquote>
  <p>“If your individual velocity has increased, how are you handling the other bottlenecks in the system — code review being an obvious one?”</p>
</blockquote>

<p>Around the same time, I’d been revisiting Ordep’s <a href="https://ordep.dev/posts/writing-code-was-never-the-bottleneck">Writing Code was Never the Bottleneck</a> essay, which opens with a line that feels like a koan:</p>

<blockquote>
  <p>“Writing lines of code was never the bottleneck in software engineering.<br />
The actual bottlenecks were, and still are, code reviews, knowledge transfer, testing, debugging, and the human overhead of coordination.”</p>
</blockquote>

<p>Right. Exactly. That’s been the shape of the job for decades. When writing becomes nearly free, all the work you can’t automate steps out of the shadows.</p>

<p>And yet something <em>has</em> shifted for me. Not the bottlenecks themselves, but the work wrapped around them. That’s what this post is trying to unpack.</p>

<h2 id="the-bottlenecks-are-still-the-bottlenecks">The bottlenecks are still the bottlenecks</h2>

<p>Ordep’s core point is, in my experience, correct: reviewing is hard, debugging is hard, understanding intent is hard, maintaining shared mental models is hard, and making judgment calls is hard. His second point lands even harder:</p>

<blockquote>
  <p>“The marginal cost of adding new software is approaching zero.<br />
But the price of understanding, testing, and trusting that code? Higher than ever.”</p>
</blockquote>

<p>I feel this. I can ship more working code in an afternoon than I used to ship in a week, but verifying it, reasoning about its behavior, and protecting the shape of the system hasn’t gotten cheaper. If anything, it has expanded the amount of code I’m implicitly responsible for.</p>

<h2 id="llms-as-connective-tissue-not-output-machines">LLMs as connective tissue, not output machines</h2>

<p>That said, the work around these bottlenecks has changed. The tools are not solving the hard parts, but they make it easier to reach them. Over the last six months, I have been surprised by how helpful LLM-based tools are as navigational aids rather than generators. They each fill a different gap. <a href="https://www.greptile.com/">Greptile</a> gives me a second look at my own PRs and catches high-level issues that are easy to miss once you have been staring at a diff for too long. <a href="https://0github.com/">0github</a> is now my starting point when I review someone else’s changes; its heat-map diff and “risk” slider point me to the sections that deserve real attention. And I have been leveraging Claude Code as a way to understand new parts of the codebase. When I explore a new subsystem, I will ask it to outline the key files and invariants so I can go straight to the important pieces instead of wandering through the directory tree.</p>

<p>The common thread in all of these tools is simple: they cut down the time it takes to gather context. I spend less energy on the overhead and more on the actual reasoning. They do not replace the difficult parts, but they make it easier to get to them. That was the part I did not expect — the bottlenecks stayed where they were, but the road leading to them became much smoother.</p>

<h2 id="naming-the-tension">Naming the tension</h2>

<p>The smoother road comes with its own tradeoffs, though. After I posted last time, my dad wrote me a long note that helped me put words to the thing I had been circling. He is a <a href="https://www.colorado.edu/ebio/andrew-martin">professor of evolutionary biology</a>, and his message was very him: part philosophy, part evolutionary metaphor. He sent a series of questions and observations:</p>

<blockquote>
  <p>“Where is the balance?<br />
Is there a shifting baseline?<br />
Drift is easy; drift erodes capacity.<br />
You’re describing a moving human–machine interface.<br />
Being intentional about what you want to learn is a daily practice.<br />
This is an adaptationist mindset.”</p>
</blockquote>

<p>That last line made the whole thing click. If the environment is shifting under my feet, then my habits have to adapt with it. The tools make it easier to reach the real work, but they also make it easier to skip the parts that build intuition. The moment I let them shortcut the reasoning, the learning curve flattens. Drift is quiet at first, then it accelerates, and once it starts, it is hard to undo.</p>

<h2 id="what-responsible-use-looks-like-for-me">What responsible use looks like for me</h2>

<p>All of this raised a practical question for me: if the road to the bottlenecks is smoother, how do I make sure I am still doing the part of the work that actually builds skill? That is where I started drawing a line for myself. I use LLMs to accelerate navigation, mapping, summarization, risk-surfacing, tracing, onboarding, interpreting test failures, and spotting suspicious patterns. In other words, anything that helps me figure out where to look and what deserves attention. I avoid using them to speed up correctness, design, invariants, architecture, root-cause debugging, or tradeoff decisions. Those parts only improve through repetition and deliberate attention. I am not chasing purity here; I just do not want to weaken the muscles that matter. If a tool helps shrink the search space, I am happy to use it. If it tempts me to ignore the search space entirely, that is where I step back.</p>

<h2 id="the-way-has-never-felt-faster">The way has never felt faster</h2>

<p>I agree with Ordep: writing code was never the bottleneck, and it still isn’t. Understanding, reviewing, coordinating, and verifying remain the real constraints, and they still require a human brain fully switched on. What feels new is that LLMs finally help with the work around those constraints, the parsing and mapping and sense-making that used to take most of my energy. They make the bottleneck easier to see and easier to reach, even though they don’t change what happens once I’m there. They don’t lower the cost of judgment, but they lower the cost of arriving at the moment where judgment is required. If I can keep adaptation ahead of drift, it improves more than my output; it improves the way I approach problems. The constraints have not moved, but I get to them sooner and with less wandering. The pace has changed. We are not slouching toward the next bottleneck; we are moving straight into it, and I have to decide whether I am meeting that speed on purpose or just being carried along.</p>]]></content><author><name>Dylan</name></author><category term="ai" /><category term="reflection" /><category term="work" /><summary type="html"><![CDATA[After I published Spinning Plates, my old coworker Daniel left a comment I couldn’t stop thinking about:]]></summary></entry><entry><title type="html">Spinning Plates</title><link href="https://www.dylanamartin.com/2025/11/07/spinning-plates.html" rel="alternate" type="text/html" title="Spinning Plates" /><published>2025-11-07T00:00:00+00:00</published><updated>2025-11-07T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2025/11/07/spinning-plates</id><content type="html" xml:base="https://www.dylanamartin.com/2025/11/07/spinning-plates.html"><![CDATA[<p>Over the last six months, the way I work has changed more than it did in the previous six years.</p>

<p>I have a feeling this isn’t just a “me” thing. Companies and tech influencers are loudly pushing AI-first workflows, YouTube talks and blog posts explain “how I use LLMs” as if that’s now a core part of your stack, and experienced devs keep reporting the same weird combo: they <em>feel</em> faster while also suspecting they’re learning less.<sup id="fnref:bignote" role="doc-noteref"><a href="#fn:bignote" class="footnote" rel="footnote">1</a></sup></p>

<p>Even as I write this, I have Claude Code spun up in another terminal pane, churning out unit tests for a chunk of code that I mostly didn’t write but that’s probably 95% of the way to production-ready.</p>

<p>It’s incredible.</p>

<p>I genuinely don’t know how I feel about it.</p>

<h2 id="tldr">TL;DR</h2>

<ul>
  <li>LLMs have transformed my work into lots of shallow touches on many tasks instead of long focus sessions on a few things.</li>
  <li>I’m shipping more than ever, but I retain less and feel less ownership.</li>
  <li>I now spend as much time optimizing my workflow as I do thinking about the code itself, and the ROI is objectively good and subjectively cursed.</li>
  <li>My deep-focus muscle is weaker; hard problems feel harder.</li>
  <li>I’m increasingly inspired by Andrej Karpathy’s “tab complete first, type it out to learn” approach and Dan Abramov’s worries about people not wanting to learn at all, and I’m trying to decide how far I want to lean into each side.</li>
</ul>

<h2 id="my-new-model-of-programming">My new model of programming</h2>

<p>My old “good day” looked like this:</p>

<ol>
  <li>load one tricky problem into my head,</li>
  <li>struggle with it,</li>
  <li>get stuck,</li>
  <li>have a small breakthrough,</li>
  <li>ship a PR.</li>
</ol>

<p>The feedback loop was long but satisfying: effort in, understanding out. Most of the joy lived in steps 2–4; that’s where I actually learned things. The throughput wasn’t insane, but at least I felt like I owned what I shipped.</p>

<p>Now, a typical “good” day looks more like:</p>

<ul>
  <li>write a short-to-medium-length prompt about a bug or feature;</li>
  <li>let Claude Code write a first pass over 3–4 files;</li>
  <li>skim the output, fix the obviously wrong bits, add logging, run tests;</li>
  <li>repeat that loop across a few tasks, plus reviews, plus a doc or two.</li>
</ul>

<p>By the end of the day, my GitHub looks <a href="https://github.com/dmarticus/">absolutely cracked</a>. Lots of green. On a good day I can crank out multiple features that are well tested and work as soon as they hit production.</p>

<p>Under the hood, though, the feedback loop has been re-sized:</p>

<ol>
  <li>define the shape of what I want,</li>
  <li>write a prompt and hand it context,</li>
  <li>triage whatever comes back,</li>
  <li>iterate until it compiles / passes / looks fine.</li>
</ol>

<p>There’s still skill there—prompt design, taste, knowing what “smells wrong” — but the locus of difficulty moves. It’s less “can <em>I</em> solve this?” and more “can I supervise this?” The reward structure changes too: less “wow, I learned something deep” and more “nice, another plate spun and didn’t fall.” Still satisfying, just in a more managerial way.</p>

<p>And when the model drops a solution fully formed, part of the joy evaporates. The problem gets solved; I don’t get the same little “oh sick, I figured that out” hit. When I scroll through my own diffs, I’ll sometimes realize I can’t actually explain why I did X instead of Y in a given PR without re-reading the whole thing.</p>

<p>The work has my name on it; it just doesn’t always feel like it has my fingerprints. Intellectually, I know that programming has always been “assembly on top of other people’s abstractions.” Emotionally, the gap between “I shipped this” and “I understand this” has never felt quite this wide.</p>

<h2 id="meta-work-all-the-way-down">Meta-work all the way down</h2>

<p>The really cursed bit has been where I’ve redirected my effort in the name of plate-spinning optimization.</p>

<p>Over the last few months I’ve spent a <em>lot</em> of time:</p>

<ul>
  <li>writing and re-writing my <a href="https://gist.github.com/dmarticus/a1a37e816b334f05f4dc90e4965a0b3d">CLAUDE.md</a> rules to hydrate Claude Code with exactly the right context,</li>
  <li>changing my editor and my Raycast snippets so “send this chunk + spec + failures” is one keystroke,</li>
  <li>structuring projects in ways that are maximally legible to an LLM (Hacker News was not too happy with my <a href="https://news.ycombinator.com/item?id=43305919">first pass at adding .cursorrules to my main work repo</a>),</li>
</ul>

<p>and less time doing the traditional “learn new thing from first principles” grind.</p>

<p>2020-me would’ve side-eyed the shit out of this behavior (I read a book called <em>Haskell Programming from First Principles</em> in 2020, ffs). But the annoying truth is that the ROI is stupidly good. Small tweaks in how I feed context into models translate into noticeably more throughput. It’s very <a href="https://www.sei.cmu.edu/library/busting-the-myths-of-programmer-productivity/">“busting the myths of programmer productivity”-coded</a>: tooling and environment beat raw keystrokes. This is just an extreme version of the same idea.</p>

<p>It does mean my “craft” feels different. Less “I write elegant code” and more “I design systems so other things can write pretty good code.”</p>

<p>That’s defensible if you frame it as leverage. It’s just weird to wake up and realize your job has quietly shifted from “builder” to “foreman” without anyone explicitly deciding that.</p>

<h2 id="the-deep-work-tax">The deep work tax</h2>

<p>The part that actually scares me is what these changes have been doing to my ability to go deep.</p>

<p>Some problems are still very much not “vibe-codeable”: bizarre perf issues, gnarly data modeling, pathological failure modes where I really do have to build a mental model and play it forward in my head. Those are the problems that historically made me feel like a real engineer.</p>

<p>I am noticeably slower at them now.</p>

<p>If I treat the LLM as a teammate—ask it questions, use it as a rubber duck, let it sketch branches I then reason about—that helps. I’m still doing the thinking; it’s just assisting. But the second I let it slide into “do the thinking for me,” my learning curve flattens. I get answers without much intuition attached.</p>

<p>There’s also the raw attention issue. Dev work has been “spinning plates while someone throws more plates at you” for a while: context switching, Slack, video calls, random interrupts. LLMs quietly make that worse:</p>

<ul>
  <li>starting a new “little task” is cheap, so it’s trivial to keep stacking plates;</li>
  <li>I can make a marginal amount of progress in shallow mode, so my brain learns it doesn’t need to fully load anything.</li>
</ul>

<p>When I finally try to go deep, all I seem to have left are short bursts of focus, and it’s been a long time since I’ve consistently been able to enter a real “flow state.”</p>

<p>Part of why I’m writing this is just to admit that out loud: my focus is worse, and I don’t like that trend.</p>

<h2 id="learning-from-karpathy">Learning from Karpathy</h2>

<p>I listened to the recent <a href="https://www.dwarkesh.com/p/andrej-karpathy">interview between Dwarkesh Patel and Andrej Karpathy</a> while I was in the middle of all this, and I liked the whole thing, but two parts really lodged in my brain since they felt relevant to this “how do I work with LLMs without lobotomizing myself” sentiment I’ve been feeling.</p>

<p>The first was how he actually uses them. In the section where he <a href="https://www.youtube.com/watch?v=lXUZvyajciY&amp;t=2142s">talks about how he codes day-to-day</a>, walking through how he built <a href="https://github.com/karpathy/nanochat">Nanochat</a>, he describes his workflow as almost entirely in-editor tab completion—his “sweet spot” for using LLMs—rather than giant freeform prompts in a chat box.</p>

<p>The second was how he encourages people to learn from Nanochat itself. The repo is framed as an educational, hackable, end-to-end LLM stack. His recommended way to study it (and his other materials) is very deliberate:</p>

<ul>
  <li>keep the reference repo or notebook on one side of your screen;</li>
  <li>on the other side, keep your own empty project;</li>
  <li><em>type everything in yourself</em> when you’re learning a new concept;</li>
  <li>no copy-paste, ideally no fancy completions, at least for the first pass.</li>
</ul>

<p>The goal is to build a real mental model instead of outsourcing it to autocomplete. It’s very “understand the thing, don’t just trust the abstraction.”</p>

<p>I find that mix really compelling: high-leverage tab completion for production work, almost monastic manual typing for learning. And I keep wondering whether I should push myself in that direction (less “prompt in a side pane”, more “use completions as a compressed spec”) plus explicit “no LLMs, just type” time when I’m adding new primitives to my brain.</p>

<h2 id="or-just-learn-period">Or just learn, period</h2>

<p>Contrast that with Dan Abramov, <a href="https://bsky.app/profile/danabra.mov/post/3lzewnojls226">who recently wrote about feeling pessimistic about educational content</a>: there’s this sense that people don’t actually want to <em>learn</em> anymore, they want pasteable answers and “LLM will do it” workflows instead.</p>

<p>That hits a little too close.</p>

<p>It’s not that I don’t want to learn—I do—but the system I’ve built around myself makes it extremely easy to default to “Eh, I’ll just prompt it.” And in a world where content creators are noticing demand shift away from deep explanations toward “just give me the fix,” it’s very easy to let my own habits drift the same way.</p>

<p>Karpathy’s “type it all out” energy feels like a deliberate counter-move to that. A personal protest: yes, I <em>could</em> vibe code this, but instead I’m going to sit here and actually understand it, line by line.</p>

<p>I don’t really want to become the person Dan is worried about writing for.</p>

<h2 id="when-the-plates-slow-down">When the plates slow down</h2>

<p>There’s also this amplitude thing: the highs are higher, the lows are lower.</p>

<p>On a good plate-spinning day, I feel unstoppable. I’m:</p>

<ul>
  <li>shipping easy features end-to-end,</li>
  <li>reviewing pull requests,</li>
  <li>giving interview feedback,</li>
  <li>triaging bugs,</li>
  <li>keeping up with team chatter.</li>
</ul>

<p>The bottleneck becomes “How fast can I read, parse, and respond?” and that’s always been one of my strengths, so it feels fantastic — everything aligns with what I’m good at.</p>

<p>But when anything in that loop slows down (unclear spec, weird bug, flaky infra, life stuff), everything suddenly feels heavier. The plate stack is tall, and now I’m noticing how many of them are wobbling.</p>

<p>It stops being “Today was kind of unproductive” and turns into “I failed to keep the system at max output,” which is a wild standard to hold myself to, and a fast path to burnout if I let it ossify.</p>

<h2 id="where-do-i-go-from-here">Where do I go from here?</h2>

<p>So what do I actually do with all of this, besides write a neurotic blog post about plate metaphors?</p>

<p>Here’s my current plan; future-me can come back and roast me if I bail on any of it:</p>

<ol>
  <li>
    <p>Treat LLMs like power tools, not autopilot.<br />
For routine work—CRUD, migrations, boilerplate tests, glue code—I’m fine going full plate-spinner: let the machines churn as long as tests are green. For hard problems, default to “rubber duck that can autocomplete,” not “Boss, do my job.”</p>
  </li>
  <li>
    <p>Steal Karpathy’s split-brain approach.<br />
During normal dev, lean more into in-editor tab completion and inline comments as the primary interface, instead of bouncing to a giant chat window for everything. During learning time, pick one repo / concept, put it on one side of the screen, and type it all out manually on the other. No copy-paste, minimal autocomplete, maximal attention.</p>
  </li>
  <li>
    <p>Protect deep work like a real deliverable.<br />
Carve out a few hours a day where I don’t prompt at all. No “just one quick suggestion,” no “let me see what the model thinks.” Just me, the code, and my own brain. The goal isn’t purity; it’s to keep my intuition from atrophying.</p>
  </li>
  <li>
    <p>Optimize systems, not days.<br />
Try to evaluate weeks and months instead of chasing “max output” every single day. Did I learn anything deep? Did I move important things, or just many things? If a week has some sludgy, low-output days but a big conceptual gain, that should count as a win.</p>
  </li>
  <li>
    <p>Be okay with forgettable work.<br />
Not every PR needs to be a personal growth moment. Some code can just… exist. The key is being intentional about which problems I’m using to level up, and which ones I’m fine letting the machines mostly handle (my old coworker Glenn has been <a href="https://www.statetransition.co/p/the-rise-of-disposable-software">writing about this</a> on his Substack for a while now, ahead of the curve as always).</p>
  </li>
</ol>

<p>Beyond these steps, I don’t really have a clean conclusion. I’m very much in the middle of this, same as everyone else.</p>

<p>But I’m increasingly convinced that if my whole career turns into “telling a black box what to do,” I’ll burn out — even if the metrics say I’m 3x more productive.</p>

<p>For the hard stuff, I want to keep taking some plates down, sitting with them, and remembering what it feels like to hold one thing still for a while.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:bignote" role="doc-endnote">
      <p>Gergely Orosz writes about this and much more in his July 2025 post, <a href="https://newsletter.pragmaticengineer.com/p/software-engineering-with-llms-in-2025"><em>Software Engineering with LLMs in 2025</em></a>. <a href="#fnref:bignote" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Dylan</name></author><category term="ai" /><category term="reflection" /><category term="work" /><summary type="html"><![CDATA[Over the last six months, the way I work has changed more than it did in the previous six years.]]></summary></entry><entry><title type="html">Show them something good</title><link href="https://www.dylanamartin.com/2024/10/29/show-them-something-good.html" rel="alternate" type="text/html" title="Show them something good" /><published>2024-10-29T00:00:00+00:00</published><updated>2024-10-29T00:00:00+00:00</updated><id>https://www.dylanamartin.com/2024/10/29/show-them-something-good</id><content type="html" xml:base="https://www.dylanamartin.com/2024/10/29/show-them-something-good.html"><![CDATA[<p>The other day, I was listening to a podcast featuring one of my favorite thinkers, Harry Miree - a drummer/YouTuber known for his work with LoCash, Boom City, and most recently as the tour drummer for HARDY. Harry’s introspective nature offers a unique perspective on a career path I’ve always admired from afar: the professional musician (professional in the sense that he gets paid to play in bands that don’t bear his name)<sup id="fnref:bignote" role="doc-noteref"><a href="#fn:bignote" class="footnote" rel="footnote">1</a></sup>.</p>

<p>A particular segment of the podcast struck me. Harry and the host were discussing aspiring musicians who move to Nashville to break into the music industry:</p>

<blockquote>
  <p>“I remember asking everyone [what they were doing in Nashville], the Barista or the Uber driver or whatever… they’re all just like ‘I moved here to play drums, I’ve been here a couple years, and I’m just waiting on the gig’. And I would ask all of them ‘where can I check out your playing; where can I see you or listen to you?’ and they all went ‘well there’s a demo, it’s on cassette though and the thing is the bass player was like sick that day so this time is not that good and you know…’ I mean there’s all this qualifying! Nobody said like ‘oh right here – look at this video’. And I hate qualifying things… if I have to send something to someone and I have to qualify it, I’m not going to send it; I’ll just redo it until it doesn’t need an explanation… I realized right away that everybody I met here that didn’t have a gig was doing this qualifying thing, and I thought ‘cool, I’m not even going to say I’m a drummer to anybody until I have a video that I like of what it looks like and sounds like and feels like for me to play the drums’. So with all the little money I had, I paid this lovely videographer $25 an hour to come to my house and just point a Canon Rebel at me and I recorded myself playing a song, and I did it once a week…”
(<a href="https://www.youtube.com/watch?v=ihyw65GtO4k">source</a>)</p>
</blockquote>

<p>The key takeaway? The best way to break into an industry is to <em>demonstrate what your work looks like</em>, allowing potential employers to see your capabilities firsthand.</p>

<p>This concept resonates deeply with me, both personally and professionally. While I don’t make music for a living, I do make software<sup id="fnref:bignote2" role="doc-noteref"><a href="#fn:bignote2" class="footnote" rel="footnote">2</a></sup>, and my experience has shown that creating and publishing quality work has been invaluable when seeking new opportunities. I’ve secured numerous interviews (two of which led to new jobs!) where the hiring team mentioned noticing me because of my GitHub or this blog. On the flip side, as someone involved in recruiting and hiring engineers, I always look forward to exploring candidates’ websites, blogs, and other online artifacts. In my experience, a strong open-source project can absolutely tip the scales in a candidate’s favor during hiring decisions.</p>

<p>Outside of work, I volunteer as a career mentor for my alma mater<sup id="fnref:bignote3" role="doc-noteref"><a href="#fn:bignote3" class="footnote" rel="footnote">3</a></sup>. One piece of advice I frequently offer is to “publish the things you build.” For students just starting their careers with limited work experience, the best way to stand out among the growing crowd of CS graduates is to publish projects that showcase their skills. It’s never been easier to build something cool with software and put it online. I don’t want the students I work with to make the same mistake as the musicians Harry mentions – having plenty of talent but failing to compile a demonstration of that talent in a digestible way. As with many things, the easiest way to start standing out from the crowd is to <a href="https://x.com/shaiyanhkhan/status/1754197898814689379">just do things</a>.</p>

<p>However, sometimes just doing things isn’t enough on its own. After hearing Harry’s interview, I read <a href="https://newsletter.posthog.com/p/how-not-to-be-boring#you-cant-80-20-everything">this post</a><sup id="fnref:bignote4" role="doc-noteref"><a href="#fn:bignote4" class="footnote" rel="footnote">4</a></sup> by my boss, <a href="https://x.com/james406">James Hawkins</a>, which succinctly summarized another concept that Harry’s interview had me pondering – when it comes to personal branding, it’s often better to go the extra mile and do something <em>truly good</em>. Harry discusses this at length in the podcast, lamenting the trade-off musicians face between creating “great art” and producing something that’s good enough to release. He observes that many musicians fail to showcase their abilities effectively because the process of doing something good is challenging, and the creative burden of not knowing when something is good enough to publish often prevents people from creating anything at all.</p>

<p>I believe this rings true – personal branding is a valuable asset for anyone in their career, and I agree with both James and Harry that it’s worth making something good if you’re going to put in the effort at all. However, as I encourage the students I mentor, every journey starts somewhere. Getting into the habit of publishing things that are “good enough” and consistently improving is an excellent way to enhance your skills overall. It’s not always easy to put yourself out there, but I don’t think it should be! Challenging tasks are worth doing, and I believe that investing extra time in enhancing your personal brand with examples of your work tends to have outsized positive impacts on your life and career – the return on investment can be substantial.</p>

<p>While nothing I’ve said here is groundbreaking or particularly novel, I felt compelled to write about it because these ideas have been noodling around in my mind since watching the podcast. I figured the best way to process my thoughts would be to take my own advice and just do the thing. After all, that’s what personal branding and growth are all about – consistently putting your work out there and trying to improve with each iteration.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:bignote" role="doc-endnote">
      <p>I love music and for a while thought pretty seriously about trying to do it as a vocation – I went to college on a music scholarship, played in the jazz band my entire time there, and have played in various bands throughout my life. I eventually abandoned my musician dreams to settle for a more stable career, but I’ve always had immense respect for folks who devote themselves fully to a creative pursuit in a professional capacity. <a href="#fnref:bignote" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bignote2" role="doc-endnote">
      <p>There are obvious differences between professional musicians and professional software developers, but I think the parallels between the creative pursuit of making music and the creative pursuit of building good software hold strongly enough that this perspective is pretty transferrable. <a href="#fnref:bignote2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bignote3" role="doc-endnote">
      <p>I partner with the <a href="https://www.whitman.edu/career-prep/career-and-community-engagement-center">Whitman Career Center</a>, where I advise computer science undergrads on how to break into tech. <a href="#fnref:bignote3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bignote4" role="doc-endnote">
      <p>I know, I know – I’m a corporate shill, but I love my job and my coworkers. <a href="https://posthog.com/careers">Come join if you want to love your job, too</a>. <a href="#fnref:bignote4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Dylan</name></author><category term="career" /><category term="software engineering" /><category term="personal branding" /><summary type="html"><![CDATA[The other day, I was listening to a podcast featuring one of my favorite thinkers, Harry Miree - a drummer/YouTuber known for his work with LoCash, Boom City, and most recently as the tour drummer for HARDY. Harry’s introspective nature offers a unique perspective on a career path I’ve always admired from afar: the professional musician (professional in the sense that he gets paid to play in bands that don’t bear his name)1. I love music and for a while thought pretty seriously about trying to do it as a vocation – I went to college on a music scholarship, played in the jazz band my entire time there, and have played in various bands throughout my life. I eventually abandoned my musician dreams to settle for a more stable career, but I’ve always had immense respect for folks who devote themselves fully to a creative pursuit in a professional capacity. &#8617;]]></summary></entry></feed>