What We’re Reading (Week Ending 12 January 2025) - 12 Jan 2025
Reading helps us learn about the world and it is a really important aspect of investing. The late Charlie Munger even went so far as to say that “I don’t think you can get to be a really good investor over a broad range without doing a massive amount of reading.” We (the co-founders of Compounder Fund) read widely across a range of topics, including investing, business, technology, and the world in general. We want to regularly share the best articles we’ve come across recently. Here they are (for the week ending 12 January 2025):
1. The art of outlasting: What we can learn from timeproof Japanese businesses – Eric Markowitz
Japan is home to an extraordinary number of shinise, or long-established businesses. A 2008 study found that Japan had over 21,000 companies older than 100 years, including more than 3,000 that had crossed the 200-year mark. These firms are not just historical artifacts — they are vibrant examples of how to endure and thrive in a rapidly changing world. Their strategies — balancing tradition with adaptability, patience with practicality — are a masterclass in long-term thinking that today’s entrepreneurs and executives would be wise to study…
…What ties these stories together is an approach to business that’s almost rebellious in its patience. While the modern world glorifies disruption and speed, Japan’s ancient companies remind us that longevity is often about playing the long game. It’s about building something so solid, so aligned with its environment, that it can weather any storm. But let’s not romanticize this too much. Strip away the poetry of water metaphors and ancient traditions, and you’ll find ruthless pragmatism at the core of these businesses’ survival.
When Japan’s post-war construction boom faded, Kongo Gumi didn’t just stick to temples — they pivoted hard into office buildings and apartments while maintaining their temple maintenance business as a hedge. During the lean years of the 1990s recession, Hōshi Ryokan cut costs to the bone while refusing to lay off staff, with family members taking deep pay cuts to keep their centuries-old workforce intact. Okaya transformed from selling samurai swords to becoming a global steel trader, making calculated bets on new technologies and markets while keeping their supply chain relationships rock solid.
These companies didn’t just drift through history — they clawed their way through wars, depressions, and cultural upheavals, making brutal choices about what to preserve and what to sacrifice. Their longevity wasn’t achieved through Zen-like detachment, but through gritted teeth and white-knuckled adaptability.
2. Notes on China – Dwarkesh Patel
I got quite mixed messages about the state of public opinion in China. This is to be expected in a society where you can’t establish common knowledge. One person told me that the new generation is quite nationalist, unlike the older reform generation which personally experienced the catastrophes of Mao and the tangible benefits of liberalization. He made the rather insightful point that this tilt in Chinese public opinion increasingly gives lie to the American talking point, “We’re against the CCP, not the Chinese people.” In fact, he went on to say that the current regime is way more liberal than what would result from an election in China.
Another person told me that these Chinese nationalists were only a vocal minority, similar to the wokes in America circa 2020. While they make up only about 10% of the population, they aggressively shout down others on Weibo (China’s Twitter equivalent). Most people find them annoying but feel uncomfortable confronting them directly. This matches what a student who graduated from a top university there told me – the vast majority of his classmates are simply apolitical. And in our own interactions with locals, we saw little evidence of widespread nationalism. In fact, when my Chinese-speaking trip mate would mention he was from the UK to taxi drivers, they would often respond enthusiastically: “Oh wonderful, we love the UK!”…
…We chatted up quite a lot of young people on night life streets. I was struck by how many young people expressed feeling stressed or overwhelmed. We met a musician in Chengdu who was writing songs about youth anxiety. We chatted up some modeling school students – even they complained about the intense pressure they felt. We met a guy who had studied in Australia but returned to China during COVID. He explained that many of his friends with prestigious degrees are moving away from Shanghai and Beijing. Yes, the pay there can be twice as high as in second or third tier cities, but the competitiveness is insane. And in order to actually land the high skilled positions, they have to work truly insane hours (9-9-6 is not a myth). He said that many of his friends were opting for less ambitious, lower-paying careers in smaller cities, where the rent is lower and the pressure is manageable…
…I’m still puzzled by how China can have both a demographic collapse and massive youth unemployment. You’d think with fewer young people being born, the ones who are around would be in high demand. One explanation I heard while there is that there are plenty of menial jobs available, but today’s educated youth – who’ve gone through high school and college – just won’t take the low-skilled positions their parents and grandparents did. Meanwhile, there’s a real shortage of the high-skilled jobs that would actually match their education and aspirations. It’s a mismatch between the jobs available and the jobs young people feel qualified for and willing to do…
…The biggest surprise from talking to Chinese VCs and people at AI labs was how capital constrained they felt. Moonshot AI, one of China’s leading AI labs, raised $1 billion at a $3 billion valuation. Meanwhile, xAI’s new cluster alone will cost $3-4 billion.
The tech ecosystem feels quite shell shocked from the 2021 crackdown. One VC half-jokingly asked if I could help him get his money out of China. If you keep your money in China, you’re basically stuck choosing between terrible options. You can either accept a measly 2% yield from state banks, or throw it into China’s perpetually struggling stock market. This helps explain why valuations for Chinese companies are chronically low – the exit opportunities just suck. Even if you build (or invest in) something great, there’s no guarantee the company will be able to raise the next round. And even if you do raise again and succeed, the government might randomly cancel your IPO. And even if you somehow make it to the public markets, Chinese equities have been performing terribly anyways. It’s a good reminder of how easy it is to completely wreck an innovation ecosystem that depends on risk-taking investors.
3. Is AI progress slowing down? – Arvind Narayanan and Sayash Kapoor
To be clear, there is no reason to doubt the reports saying that many AI labs have conducted larger training runs and yet not released the resulting models. But it is less clear what to conclude from it. Some possible reasons why bigger models haven’t been released include:
- Technical difficulties, such as convergence failures or complications in achieving fault tolerance in multi-datacenter training runs.
- The model was not much better than GPT-4 class models, and so would be too underwhelming to release.
- The model was not much better than GPT-4 class models, and so the developer has been spending a long time trying to eke out better performance through fine tuning.
To summarize, it’s possible that model scaling has indeed reached its limit, but it’s also possible that these hiccups are temporary and eventually one of the companies will find ways to overcome them, such as by fixing any technical difficulties and/or finding new data sources…
…Industry leaders don’t have a good track record of predicting AI developments. A good example is the overoptimism about self-driving cars for most of the last decade. (Autonomous driving is finally real, though Level 5 — full automation — doesn’t exist yet.) As an aside, in order to better understand the track record of insider predictions, it would be interesting to conduct a systematic analysis of all predictions about AI made in the last 10 years by prominent industry insiders.
There are some reasons why we might want to give more weight to insiders’ claims, but also important reasons to give less weight to them. Let’s analyze these one by one. It is true that industry insiders have proprietary information (such as the performance of as-yet-unreleased models) that might make their claims about the future more accurate. But given how many AI companies are close to the state of the art, including some that openly release model weights and share scientific insights, datasets, and other artifacts, we’re talking about an advantage of at most a few months, which is minor in the context of, say, 3-year forecasts.
Besides, we tend to overestimate how much additional information companies have on the inside, whether in terms of capability or (especially) in terms of safety. Insiders warned for a long time that “if only you knew what we know…” but when whistleblowers finally came forward, it turned out that they were mostly relying on the same kind of speculation that everyone else does.
Another potential reason to give more weight to insiders is their technical expertise. We don’t think this is a strong reason: there is just as much AI expertise in academia as in industry. More importantly, deep technical expertise isn’t that important to support the kind of crude trend extrapolation that goes into AI forecasts. Nor is technical expertise enough — business and social factors play at least as big a role in determining the course of AI. In the case of self-driving cars, one such factor is the extent to which societies tolerate public roads being used for experimentation. In the case of large AI models, we’ve argued before that the most important factor is whether scaling will make business sense, not whether it is technically feasible…
…As an example, Sutskever had an incentive to talk up scaling when he was at OpenAI and the company needed to raise money. But now that he heads the startup Safe Superintelligence, he needs to convince investors that it can compete with OpenAI, Anthropic, Google, and others, despite having access to much less capital. Perhaps that is why he is now talking about running out of data for pre-training, as if it were some epiphany and not an endlessly repeated point.
To reiterate, we don’t know if model scaling has ended or not. But the industry’s sudden about-face has been so brazen that it should leave no doubt that insiders don’t have any kind of crystal ball and are making similar guesses as everyone else, and are further biased by being in a bubble and readily consuming the hype they sell to the world…
…Inference scaling is useful for problems that have clear correct answers, such as coding or mathematical problem solving. In such tasks, at least one of two related things tends to be true. First, symbolic reasoning can improve accuracy. This is something LLMs are bad at due to their statistical nature, but can overcome by using output tokens for reasoning, much like a person using pen and paper to work through a math problem. Second, it is easier to verify correct solutions than to generate them (sometimes aided by external verifiers, such as unit tests for coding or proof checkers for mathematical theorem proving).
In contrast, for tasks such as writing or language translation, it is hard to see how inference scaling can make a big difference, especially if the limitations are due to the training data. For example, if a model works poorly in translating to a low-resource language because it isn’t aware of idiomatic phrases in that language, the model can’t reason its way out of this.
The early evidence we have so far, while spotty, is consistent with this intuition. Focusing on OpenAI o1, it improves compared to state-of-the-art language models such as GPT-4o on coding, math, cybersecurity, planning in toy worlds, and various exams. Improvements in exam performance seem to strongly correlate with the importance of reasoning for answering questions, as opposed to knowledge or creativity: big improvements for math, physics and LSATs, smaller improvements for subjects like biology and econometrics, and negligible improvement for English.
Tasks where o1 doesn’t seem to lead to an improvement include writing, certain cybersecurity tasks (which we explain below), avoiding toxicity, and an interesting set of tasks at which thinking is known to make humans worse…
…We think there are two reasons why agents don’t seem to benefit from reasoning models. First, such models require different prompting styles than regular models, and current agentic systems are optimized for prompting regular models. Second, as far as we know, reasoning models so far have not been trained using reinforcement learning in a setting where they receive feedback from the environment, be it code execution, shell interaction, or web search. In other words, their tool use ability is no better than the underlying model before learning to reason…
…The furious debate about whether there is a capability slowdown is ironic, because the link between capability increases and the real-world usefulness of AI is extremely weak. The development of AI-based applications lags far behind the increase of AI capabilities, so even existing AI capabilities remain greatly underutilized. One reason is the capability-reliability gap — even when a certain capability exists, it may not work reliably enough that you can take the human out of the loop and actually automate the task (imagine a food delivery app that only works 80% of the time). And the methods for improving reliability are often application-dependent and distinct from methods for improving capability. That said, reasoning models also seem to exhibit reliability improvements, which is exciting.
Here are a couple of analogies that help illustrate why it might take a decade or more to build products that fully take advantage of even current AI capabilities. The technology behind the internet and the web mostly solidified in the mid-90s. But it took 1-2 more decades to realize the potential of web apps. Or consider this thought-provoking essay that argues that we need to build GUIs for large language models, which will allow interacting with them with far higher bandwidth than through text. From this perspective, the current state of AI-based products is analogous to PCs before the GUI.
4. Waymo still doing better than humans at preventing injuries and property damage – Andrew J. Hawkins
The study is the product of the collaboration between Waymo and insurer Swiss Re, which analyzed liability claims related to collisions from 25.3 million fully autonomous miles driven by Waymo in four cities: Phoenix, San Francisco, Los Angeles, and Austin. They then compared those miles to human driver baselines, which are based on Swiss Re’s data from over 500,000 claims and over 200 billion miles traveled.
They found that the performance of Waymo’s vehicles was safer than that of humans, with an 88 percent reduction in property damage claims and a 92 percent reduction in bodily injury claims. Across 25.3 million miles, Waymo was involved in nine property damage claims and two bodily injury claims. The average human driving a similar distance would be expected to have 78 property damage and 26 bodily injury claims, the company says.
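As a rough check on how the claim counts above relate to the quoted percentages, here is a minimal Python sketch. It assumes the reduction is simply one minus the ratio of Waymo's claims to the claims expected from a human driver over the same distance. The function name `claim_reduction` is ours, and the study itself may work from unrounded baselines, so treat this as an illustration and not as the study's method.

```python
def claim_reduction(observed: int, expected: int) -> float:
    """Percent reduction in claims relative to the expected human baseline.

    Assumption: reduction = 1 - observed / expected, using the rounded
    counts quoted in the article.
    """
    return (1 - observed / expected) * 100


# Counts quoted in the article for 25.3 million miles.
property_damage = claim_reduction(observed=9, expected=78)
bodily_injury = claim_reduction(observed=2, expected=26)

print(f"Property damage: {property_damage:.1f}% fewer claims")
print(f"Bodily injury: {bodily_injury:.1f}% fewer claims")
```

Under this assumption the figures come out to roughly 88.5% and 92.3%, in line with the 88 percent and 92 percent reductions reported.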
Waymo’s vehicles also performed better when compared to new vehicles equipped with all the latest safety tech, including automatic emergency braking, lane-keep assist, and blind spot detection. When compared to this group, Waymo’s autonomous driving system showed an 86 percent reduction in property damage claims and a 90 percent reduction in bodily injury claims.
5. SITALWeek #454 – Brad Slingerlend
I think we are approaching the point where we can start to estimate the value of AI for developers and the companies/consumers who are going to buy the next wave of innovative applications. I think the salient question for AI (and, frankly, humanity!) is: How much AI reasoning can you get for a human-equivalent salary? In other words, for a certain salary, how much compute power will it take to match or outperform a human (assuming the AI can collaborate with other humans/AIs using the same methods and tools a human would)…
… LLMs are shifting from a pure token-in/token-out model to a test-time scaling model, which may offer us better inroads for estimating costs. Essentially, they are thinking harder before spitting out a reply; thus, rather than just predicting the next words in a response using a probability model (see You Auto-Complete Me), they are doing some deep thinking to arrive at more accurate, useful answers. This is a major leap in capability that comes with a major leap in cost. OpenAI raised prices for their o1 model to $200/mo (Pro subscription) from $20 (Plus subscription). For developers, use of o1’s advanced reasoning API comes at 3-4x the cost of their “general purpose” GPT-4o. If o1 were priced at a typical Western office worker wage of $40/hr, the reasoning of the model would equate to around 5 hours of work per month. We also don’t know if the $200/mo price point is profitable for OpenAI or if they are just relying on Microsoft to further subsidize their business model (which brings us back to the principal-agent problem I started this section off with). So, all of my hand waving here seems to imply you can get a decent amount of human-equivalent reasoning for an amount of money in the realm of human labor cost. If true, after a few more years of advancements in semiconductors and AI models, we should have markedly affordable “human reasoning as a service”, an explosion in demand, and a wide range of outcomes for how much human supervision of AI will be required (it may be that human jobs stay relatively flat, but each human is 2x productive, then 4x, etc.).
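The back-of-the-envelope comparison above can be sketched in a few lines of Python. The $200 monthly price and $40 hourly wage come from the excerpt. The 160-hour working month is our own assumption, added only to show what share of a full-time worker those hours represent, and the function name is hypothetical.

```python
def human_equivalent_hours(monthly_price: float, hourly_wage: float) -> float:
    """Hours of human labor that the same monthly spend would buy."""
    return monthly_price / hourly_wage


O1_PRO_MONTHLY_PRICE = 200.0    # USD per month, from the excerpt
OFFICE_WORKER_WAGE = 40.0       # USD per hour, from the excerpt
FULL_TIME_HOURS_PER_MONTH = 160.0  # assumption: 40 hours/week for 4 weeks

hours = human_equivalent_hours(O1_PRO_MONTHLY_PRICE, OFFICE_WORKER_WAGE)
share_of_full_time = hours / FULL_TIME_HOURS_PER_MONTH

print(f"Human-equivalent hours per month: {hours:.1f}")
print(f"Share of a full-time month: {share_of_full_time:.1%}")
```

This reproduces the 5 hours per month in the excerpt. Under the 160-hour assumption, that is about 3% of one full-time worker's month. The sketch compares prices only and says nothing about how much useful work the model actually completes in that time.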
Following this logic, at current AI reasoning costs, companies would need to lay off one human for every AI human equivalent they hire and would probably lose more skill/knowledge than they gain. In other words, based on my attempts to guess the cost of replacing human reasoning, today’s AI offerings aren’t likely compelling enough. In a couple years, however, maybe you will be able to lay off one human and hire a handful of AIs, which, by collaborating with each other and humans, may yield superior results. Even today, extremely high-value tasks, such as in-depth research or stock market predictions, may be able to take advantage of the high-cost test-time scaling AI models. And, if any of this math is in the realm of reason, you can easily see that AI may not require such high-value-add applications to be cost effective in the near to medium future. The proof will come within the next couple of years as today’s entrepreneurs develop the next generation of apps leveraging LLMs and overtaking human capabilities: If these apps are at price points that outcompete human employees, a significant wave of change could come much faster to society.
Disclaimer: None of the information or analysis presented is intended to form the basis for any offer or recommendation. We currently have a vested interest in Alphabet (parent of Google and Waymo) and Microsoft. Holdings are subject to change at any time.