Posts tagged "post"

5 posts

Update: 12 Apr 2025

After a few more iterations with Claude, the website is looking a lot better but it isn't working that much better. The Archives link at the top right throws up a poorly formatted page.1 The huge, black "Approved Thoughts" at the top of the page and the smaller blue version in the header are redundant.2 There are many other issues to be fixed.

I also managed to get the CMS set up so that I could log in and see all the posts in the CMS.3 It is useless though because the posts I create there do not show up on the site. So there is some solid debugging needed before I can get more ambitious here. Still, I am quite chuffed that I have this set up and working.

Between the markets and a potential relocation, I have not devoted any time to this website. Hopefully that changes soon.

I have also decided on another project I want to try my hand at. More as I get started.

Thought for the day: If you ask "do I feel like doing this now", you won't do it. Certainly not regularly. First, figure out how to stop the question.

Footnotes

  1. This is now fixed.

  2. This is also now fixed, although I am not sure if it is an improvement.

  3. This was almost fixed but broke again and I have given up for now. When on the road, I can add posts directly via GitHub, so I am good for now.

A few thoughts on LLMs

I need to sit down and combine some of these thoughts into a coherent piece but for now, I just want to dump it all here for reference later. I doubt any of these are original and I have certainly seen some elsewhere.

  1. Market structure for LLM makers may end up being like the airline market. High fixed cost to set up, hard to create product differentiation that users care about (other than price), lots of competitors entering, at least in part due to the "prestige" of owning one. Lots of utility for consumers but hardly any profit for producers.

  2. An LLM with an infinite context window, one that can contain all my life, will be an entirely different product than an LLM with a limited context window. You can never have enough when it comes to context windows.

  3. Notwithstanding 1, "personality" makes a huge difference in the experience of working with an LLM and the ability to create the right one could determine whether a model can dominate a market or a niche. Claude Sonnet 3.5 absolutely had that special sauce in my experience. We need a lot more of it.

  4. This:

    I don’t see how we’re going to avoid a situation where the internet become lousy with AI-created, pseudo academic writing filled with made up facts and quotes, which will then get cemented into “knowledge” as those articles become the training fodder for future models.

    But combine it with the fact that if we all start getting our answers from LLMs, the online content & ad-based business model goes kaput and then what is the incentive for people to put up good content on the web? None.

  5. But anyone who has good proprietary, verified, high-quality data & content will potentially control the value for the customer even as base LLMs become a commodity. So does more data and content start going behind paywalls? If it doesn't, it becomes training data and cannot be monetized.

The One Where I(?) Set Up a Website

This is the website in question. Bookmark it. Or don’t.

Many people are asking if this can even be called a website. These people are losers. Sad. The finest people I know think it’s a great website, maybe the best website in the history of the world. Nobody knows websites like I do.

Anyhoo.

It all began when I decided to shift this letter to a website last week. Having purchased a few web domains before, it took me only a few minutes to buy approvedthoughts.com from Cloudflare. Unlike GoDaddy etc., who offer promotional price bundles and then jack them up on renewal, Cloudflare offers at-cost registration + renewals, and I have had good experiences with some of their other services, so it was an easy choice.

For all my talk about using AI models, I hadn’t really used them for any type of programming project / tech support roles. I figured this was the perfect opportunity for me to find out whether I could actually create something useful with the help of one of these models.

For context, while I have an undergraduate degree in Computer Applications, I have never programmed professionally and frankly did barely any programming during the course itself. I am probably a bit better than someone who has never done any programming, but not by a lot.

For context also, I could obviously just spend $20 on a paid service that would be absolutely trivial to set up and manage but where’s the fun in that?

This is how I started on Claude 3.5 Sonnet:

I have recently purchased a domain for a website. I used cloudflare for purchasing the domain. I would now like to set up a Wordpress or other website at the domain. Walk me through the steps of doing this.

The Sonnet is the “most intelligent model” by Anthropic. It is the go-to model at the moment for a lot of the people I follow and the only model I pay for.

The project went through multiple chats because after a while, in each chat, I would start getting this message:

This chat is getting long.

Here, for the first time, I used a trick I had recently seen (I think) on Twitter.

ok - I have saved and deployed on cloudflare. However I am getting the message that this chat has become too long. Can you provide a status update that I can use with another AI chatbot to continue this work?

And it did! I pasted the update in another chat and the conversation continued there as if it was the same chat!

You can check out the first chat here. I think / hope it has no confidential information. I won’t be sharing the remaining chats because they do have some of my account credentials. This one should provide enough context for most of what I am discussing here though.

So How’d it Go?

Well you’ve seen the “website”, so clearly not as well as I’d hoped. That said, I am continuing with the project and based on what I have learned so far, I am hopeful I will eventually get it working.

It is still remarkable that I got this far, considering that I have zero experience with the technologies / services involved. Claude successfully walked me through setting up 3 different services (Cloudflare, GitHub and Hugo) and had them work with each other to serve up the site.

All of this happened within a couple of hours and with me blindly following its instructions - copying and pasting code where it asked me to, choosing the options it asked me to and only making high level decisions, like which theme I liked for the blog, which require no technical sophistication.

For the most part, all of this was “one-shot”, meaning it told me to do something, in plain English, I did the thing and it all worked. Where it required a second shot, it was because I hadn’t provided the full context e.g. that I was using a Mac and not a PC. And once I clarified that, it adjusted the directions for all subsequent steps.

In fact, because I use Claude almost exclusively, I had forgotten how much better it is than some of the other models at being almost “human”. This chat is not very different from how I would interact with an IT support engineer at work. Just randomly dropping bits of context or asides and having it be incorporated correctly into subsequent interactions as relevant.

In one of the subsequent chats, it seemed to me like Claude was stuck or going around in circles, without fixing the problem, so I decided to switch to the newly launched DeepSeek-R1 reasoning model. While the R1 made immediate progress where Claude was stuck, the experience of working with the R1 was just so much harder.

Unlike Claude, which “understood” that I was not technically sophisticated, and only gave tasks of a manageable size at each stage, the R1 would just throw a laundry list of complex tasks at me.

For example, there’s a set of 3 commands I needed to run in the Terminal app on my Mac after every change to the code. This would “push” the changes to GitHub. Each time, the commands changed only slightly. Claude would provide these commands every time, unprompted. R1 would not.
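The post does not say which three commands these were. A common trio for pushing changes to GitHub is `git add`, `git commit` and `git push`, so the sketch below assumes that sequence. The function names, the branch name `main` and the sample commit message are all made up for illustration. It only prints the commands unless you turn off the dry run.

```python
import shlex
import subprocess


def push_commands(message, branch="main"):
    """Build the three git commands (an assumed sequence, not taken from the chats).

    Only the commit message changes from one run to the next.
    """
    return [
        ["git", "add", "."],                # stage every changed file
        ["git", "commit", "-m", message],   # record the change with a message
        ["git", "push", "origin", branch],  # send the commit to GitHub
    ]


def run_all(commands, dry_run=True):
    """Print each command, or actually run it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical commit message; prints the commands without running them.
    run_all(push_commands("Fix archives page layout"))
```

Wrapping the commands like this means only the commit message has to be typed each time, which is the "minor change" part.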

Claude also seemed to just understand better where we were in the overall project at all times. So if it asked me to do three things and if I went back with a question about the first thing, it would help me fix that and then go back to the remaining two items in the list. R1 did not seem to have as good an “understanding” of the overall project and did not seem to “appreciate” that if I was asking a question about step 2 of 5, that steps 3-5 still needed to be done after it answered the question about step 2.

Of course, the answer to step 2 would have 5 steps, so now I have to scroll back and forth to figure out which step of which step I am on. So despite being a “stronger” model, I found it discouraging and just much harder with the R1 to have to keep track of where we were in the project.

What went wrong?

I suspect the website would be looking a lot better if I had just followed Claude’s guidance. In fact, even now, I think it is at a stage where I can post these weekly updates there with some extra effort.

But I knew I was going to be traveling for a few weeks and so after the site was ready for me to start posting content, I asked Claude if we could also set up a CMS (content management system) for the website. That way I could post from anywhere instead of just from my desktop and include links and images in the posts with minimal effort.

Things went off the rails pretty much immediately. Without getting into a lot of detail, we encountered a bug while setting up authorization for the CMS. Claude kept suggesting various tweaks but nothing worked. It felt to me like we were going around in circles, trying the same things over and over again.

I even asked Claude to look at the entire conversation and figure out if it was going around in circles. It gave itself a clean chit. Of the 6 or so hours I have spent on this project, my guess is 4-5 have been trying to resolve this issue.

My suspicion is that Claude is trained on an older version of the various services / projects I am using and therefore could not fix these errors.

Eventually I gave up with Claude and took the project over to DeepSeek R1. It made some progress, identifying some errors Claude had made, e.g. key files located in wrong folders, wrong parameters etc. However since neither of the models can really take over and look at the code themselves, they may not be able to fix issues caused by my mistakes in implementing their directions.

Maybe AI agents will fix this issue in future. In any event, the project is on pause as I travel and I am wondering whether I should start from scratch with DeepSeek instead of trying to fix the current version.

What did I learn?

A lot.

Despite the hiccup, I feel comfortable that these models can help me do projects that I just couldn’t have done before. I intend to try more projects, and of course, finish this one.

Different models have very different “skills” and it’s important to figure out which model is the best for a given task.

While Claude is the easiest to talk to, almost like a person, it really helps to understand how to prompt other models to get the best out of them. Providing more context and better instructions can result in much better results.

Larger context windows will change everything. A model that can keep your entire life in context will be multiple orders of magnitude more helpful than the current models.

I am a little more skeptical than before about LLMs becoming generally intelligent or superintelligent. Their ability to deal with problems that are not in their training data seems limited. Maybe agentic models will overcome some of this but breakthroughs in, for example, fundamental physics seem unlikely. Obviously I am extrapolating from a tiny sample but that’s the direction in which I have updated my priors.

What’s next?

I have a few ideas I want to work on once (not if!) I finish the website. I wonder if I can set up a web server on my old Mac and run a few basic services off it. For example, a URL shortening service (like Bitly) or a QR code generation service.

This from Simon Willison is my inspiration. 14 projects in 1 week. No rocket science but just small, useful things here and there. Or, if I am more ambitious, that custom Spaced Repetition app built by Andy Matuschak that I mentioned a couple of weeks ago.

Is truly customized software possible? I hope it is.

Newsletter vs. Website

Having done a throat-clearing piece the first week and used up, in the second week, the one writing idea I actually had when starting the newsletter, this week I was finally forced to think about the direction I want to take with this letter.

I enjoyed heading down the rabbit hole of how to learn better last week. I already knew that there was much I needed to learn about teaching after struggling to homeschool the boys for a year recently. Last week it became clear that I also had to learn a lot about learning.

As it happens, this is also a space where a lot is changing (yes, AI, duh!) and one that is even otherwise relevant to me given the presence of two (reluctant) learners that I am (even more reluctantly) trying to support.

So it seemed like a natural beat to walk.

I subscribe to a few Substacks and as I trawled through them I realised that the only reason I subscribed to each of them was because they were written by experts who also happen to have an amazing way with words. Here’s a sample:

  1. Money Stuff by Matt Levine - GOAT of newsletters. It’s not even close. And daily? Get outta here…

  2. Don't Worry About the Vase by Zvi Mowshowitz - I can’t believe it’s free.

  3. Bits about Money by Patrick McKenzie (Patio11) - How is a software engineer teaching me about my job?

In other words, I have a very high bar for who I let into my inbox. I could write about how to learn better but clearly I am not the preeminent expert on the topic; I am barely getting started. It would be hard for me to clear my own bar for quite some time, if ever. And if I am going to sustain this, I need to enjoy what I am doing. I need to be able to experiment, make messes and generally figure things out.

It occurred to me that a better medium for that kind of a goal might be a website rather than spamming people in their inboxes. It will give me the freedom to play around with the format, topics, content & timing and make the aforementioned messes without worrying too much.

Now my wife thinks I am trying to wiggle out of my New Year’s resolution with this talk, but frankly, she barely knows me, so we can safely ignore that possibility.

So here’s the plan: I will continue writing here on a weekly basis till the website is set up and then move everything over there.

Spaced Repetition

What is Spaced Repetition (SR)?

Claude says:

Spaced repetition is a learning technique that involves reviewing information at gradually increasing intervals. Here's how it works:

When you first learn something new, you review it fairly soon (like the next day). If you remember it well, you wait a bit longer before the next review (maybe 3 days). If you still remember it well, you extend the interval further (perhaps a week, then two weeks, then a month).

The key principles behind spaced repetition are:

  1. The spacing effect: Information is better remembered when studying is spread out over time rather than crammed into a single session.
  2. The testing effect: Actively recalling information strengthens memory more effectively than passive review.
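To make the schedule in Claude's description concrete, here is a minimal sketch. The gaps (1, 3, 7, 14 and 30 days) come from the description above. Going back to a one-day gap after a forgotten card is my assumption, since the description does not say what happens then. Real apps such as Anki use more elaborate scheduling, so treat this as a toy. The names `INTERVALS_DAYS` and `review` are made up.

```python
from datetime import date, timedelta

# Gaps between reviews, in days: next day, 3 days, a week, two weeks, a month.
INTERVALS_DAYS = [1, 3, 7, 14, 30]


def review(streak, remembered, today):
    """Update a card after a review and return (new_streak, next_due_date).

    streak is the number of successful reviews in a row before this one.
    A forgotten card starts over with a one-day gap (an assumption).
    """
    streak = streak + 1 if remembered else 0
    # Once the list runs out, keep using the longest gap.
    gap = INTERVALS_DAYS[min(streak, len(INTERVALS_DAYS) - 1)]
    return streak, today + timedelta(days=gap)


if __name__ == "__main__":
    # A card learned today is first reviewed after the shortest gap.
    day = date.today() + timedelta(days=INTERVALS_DAYS[0])
    streak = 0
    for remembered in [True, True, False, True]:
        streak, next_day = review(streak, remembered, day)
        print(day, "remembered" if remembered else "forgot", "-> next review", next_day)
        day = next_day
```

The app's whole job is this bookkeeping: it tracks where each card is in the list so you don't have to.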

This may not sound revolutionary. After all, some version of this is how we prepared for our exams. But the important thing is that unlike in the 90s, there’s an app for that (Anki, Mochi & others). The app handles the spacing and repetition, making it more practical, reliable and scalable.

When I first learned about and used SR apps years ago, I thought, “Wow! It would have been great to have this in school. This will be great for the boys in a few years.”

I never used the technique until last year when I used Anki for homeschooling the kids in Hindi. You can get many ready-made decks (free & paid) on the Anki website, but creating your own (necessary for specific requirements) was a chore, and I stopped after a few months.

I thought SR was for students until I saw a podcast with Dwarkesh Patel and Dan Shipper in which Dwarkesh talks about using it to learn more broadly. It wasn’t limited to coursework or vocabulary but included concepts. He claims it allows for better understanding of linkages and patterns over time.

But what made my jaw drop was that he used Claude / ChatGPT to create the cards! Of course. Just throw the material you are learning into Claude, ask it to create SR cards for the material and you’re done.

It may need some editing or creating additional cards, but it’s generally good enough. It can even output in Anki file format. Rope in an AI and the amount of knowledge you can include in your SR system just explodes.
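As a rough idea of what "Anki file format" can mean in practice: Anki can import plain text files where each line is a card and the fields are separated by tabs. The sketch below turns a list of (front, back) pairs into that kind of text. The function name and the sample cards are made up, and the cards would normally come from the AI's output.

```python
import csv
import io


def cards_to_tsv(cards):
    """Turn (front, back) pairs into tab-separated text, one card per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for front, back in cards:
        writer.writerow([front, back])
    return buffer.getvalue()


if __name__ == "__main__":
    # Sample cards, written by hand for illustration.
    sample = [
        ("What is the spacing effect?",
         "Studying spread out over time is remembered better than cramming."),
        ("What is the testing effect?",
         "Actively recalling information strengthens memory more than passive review."),
    ]
    print(cards_to_tsv(sample), end="")
```

Save the output to a `.txt` file and it should be ready for Anki's import dialog, where you map the first field to the front of the card and the second to the back.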

While researching this piece, I revisited Alexey Guzey’s original post that introduced me to the concept. I didn’t remember this, but the first thing he talks about is using SR for instilling novel thought patterns.

Front card: saying no

Back card: If I want to say no, I will stop and make sure this is not just status quo bias (coz it probably is)

Status quo bias sucks. Does this one solve it? Beautifully so. Although several months had passed before it finally kicked in, the number of times I noticed saying “no” out of status quo bias, then having this thought come up and make me retroactively reverse the initial “no” is in the tens already.

Will this work? I am not sure, but between how Dwarkesh is using SR and this, I feel the case for having my own SR system is growing stronger. Hopefully, I’ll have an update in a few weeks.

How do you train?

Tyler Cowen’s post inspired by David Perell planted another bug in my brain.

“Athletes train. Musicians train. Performers train. But knowledge workers don’t.”

Recently, one of my favorite questions to bug people with has been “What is it you do to train that is comparable to a pianist practicing scales?”

If you don’t know the answer to that one, maybe you are doing something wrong or not doing enough. Or maybe you are (optimally?) not very ambitious?

Here’s his answer. Of course, I didn’t know the answer to that one.

Watching Andy Matuschak study Quantum Mechanics with Dwarkesh Patel (who else?) led to a few realisations.

Having the right tools and workflows and being good at them is a huge productivity unlock. In the video, Andy uses custom software that allows him to do much more, faster. He reads, creates SR cards, and reviews material on the fly.

He also knows how to put mathematical symbols in his notes, making useful cards so much more quickly.

What is the difference in learning between someone manually revising from a textbook before an exam (as I used to do) and someone using a custom SR system, AI-generated SR cards, and a workflow to push those cards daily? Huge.

This year, I plan to practice getting better with the keyboard. How long can I work on my desktop without touching the mouse? Can I switch between windows and apps, and use functions within apps?

Sure, I can do Ctrl + C and Ctrl + V. But can I effortlessly do Alt + E + S + F (Paste Special > Formulas in Excel)? Can I do 50 other menu actions across multiple programs? Can I set up workflows for routine processes and use AI to automate?

The second thing that struck me was the depth of his learning approach. Sure, it’s a tough topic, but he spent 20+ min on the first page, asking questions, drawing connections, testing himself, and revising. I don’t recall engaging with my coursework so diligently.

The last thing that occurred to me is how important and hard it is to learn these hacks/tricks/workflows. Some people just know and others never do.

Here’s a great resource for this “tacit knowledge.” I’m glad someone recognised the importance of the concept, named it, and created a knowledge base.
