What We’re Reading (Week Ending 17 November 2024) - 17 Nov 2024
Reading helps us learn about the world, and it is a really important aspect of investing. The late Charlie Munger even went so far as to say, "I don't think you can get to be a really good investor over a broad range without doing a massive amount of reading." We (the co-founders of Compounder Fund) read widely across a range of topics, including investing, business, technology, and the world in general. We want to regularly share the best articles we've come across recently. Here they are (for the week ending 17 November 2024):
1. OpenAI Shifts Strategy as Rate of ‘GPT’ AI Improvements Slows – Stephanie Palazzolo, Erin Woo and Amir Efrati
In May, OpenAI CEO Sam Altman told staff he expected Orion, which the startup’s researchers were training, would likely be significantly better than the last flagship model, released a year earlier.
Though OpenAI had only completed 20% of the training process for Orion, it was already on par with GPT-4 in terms of intelligence and abilities to fulfill tasks and answer questions, Altman said, according to a person who heard the comment.
While Orion’s performance ended up exceeding that of prior models, the increase in quality was far smaller compared with the jump between GPT-3 and GPT-4, the last two flagship models the company released, according to some OpenAI employees who have used or tested Orion.
Some researchers at the company believe Orion isn’t reliably better than its predecessor in handling certain tasks, according to the employees. Orion performs better at language tasks but may not outperform previous models at tasks such as coding, according to an OpenAI employee. That could be a problem, as Orion may be more expensive for OpenAI to run in its data centers compared to other models it has recently released, one of those people said.
The Orion situation could test a core assumption of the AI field, known as scaling laws: that LLMs would continue to improve at the same pace as long as they had more data to learn from and additional computing power to facilitate that training process…
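For context, the best-known quantitative form of this assumption is the compute-optimal scaling law fitted in DeepMind's 2022 Chinchilla paper (Hoffmann et al.), which models training loss as a function of parameter count N and training tokens D, roughly:

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}$$

Here E is an irreducible loss floor, and the fitted exponents α and β both come out near 0.3. The power-law terms are why "more data plus more compute" keeps paying off, but they also build in steep diminishing returns: each further step down in loss requires a multiplicative increase in N and D, and hence in training cost, which is exactly the economic wall Noam Brown describes next.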
…However, OpenAI researcher Noam Brown said at the TEDAI conference last month that more-advanced models could become financially unfeasible to develop.
“After all, are we really going to train models that cost hundreds of billions of dollars or trillions of dollars?” Brown said. “At some point, the scaling paradigm breaks down.”…
…Orion was trained in part on AI-generated data, produced by other OpenAI models, including GPT-4 and recently released reasoning models, according to an OpenAI employee. However, such synthetic data, as it is known, is leading to a new problem in which Orion may end up resembling those older models in certain aspects, the employee said.
2. Prejudice And China – Louis-Vincent Gave
This led us to the comments made in September by Ford chief executive officer Jim Farley. Freshly returned from a visit to China, Farley told The Wall Street Journal that the growth of the Chinese auto sector poses an existential threat to his company, and that “executing to a Chinese standard is now going to be the most important priority.”
By any measure, this is an earth-shattering statement.
Making cars is complicated. Not as complicated as making airliners or nuclear power plants. But making cars is still the hallmark of an advanced industrial economy. So, the idea that China is suddenly setting the standards that others must now strive to meet is a sea-change compared with the world we lived in just five years ago…
…This brings me to the simplest, most obvious, and likeliest explanation why most CEOs and investors missed how China leapfrogged the West in industry after industry over the last five years: during that time, no one from the West bothered to visit China…
…Unlike Japan in the 1990s, China has not seen its banking system go bust and lose its ability to fund new projects. On the contrary, the surge in loans to industry over the past few years lies at the heart of China’s booming industrial productivity…
…This is another key difference between China today and Japan in the 1990s. China today is not only more efficient and more productive than a decade ago, it is probably more efficient and more productive than most other major industrial economies. And it boasts a very attractive cost structure. Until a few years ago, you would need to check your bank balance before going out for dinner in Tokyo. Today, you can stay in the Four Seasons in Beijing or Shanghai for less than US$250 a night. Perhaps the best illustration of how Japan’s past is a very poor guide to China’s present is the difference in their trade balances; a reflection of how different their competitiveness has been…
…This is not to understate the magnitude of the Chinese property bust. The rollover in real estate has been a massive drag on growth and on animal spirits over the past five years. But on this front, there is another key difference between China and Japan: in China, the contraction of real estate was the policy. It was not the unfortunate consequence of policies gone wrong. Reallocating capital away from real estate and towards industry was a stated goal of the government…
…There seem to be at least three separate visions of China.
The first is the China you read about in much of the Western media: a place of despond and despair. It is permanently on the cusp of social disorder and revolution, or it would be, were it not an Orwellian nightmare of state surveillance, supervision and repression that strangles creativity and stifles progress. This is the place that Westerners who have never visited China typically imagine, because it is the place portrayed by the media…
…The second is the vision of China you get from talking to Chinese millennials in tier-one cities. This version of China recalls the “lost decades” of Japanese deflationary depression…
…This brings me to the third vision of China: that it is only just beginning to leapfrog the West in a whole range of industries. This vision is starting to show up in the perception of Western brands in China, and in their sales. For example, Apple's iPhones no longer figure in the five best-selling smartphone models in China. And Audi's new electric cars made and sold in China will no longer carry the company's iconic four-circle logo; the branding is now perceived to be more of a hindrance than a benefit.
To put it another way, following years of investment in transport infrastructure, education, industrial robots, the electricity grid and other areas, the Chinese economy today is a coiled spring. So far, the productivity gains engendered by these investments have manifested themselves in record trade surpluses and capital flight—into Sydney and Vancouver real estate, and Singapore and Hong Kong private banking.
This has mostly been because money earners’ confidence in their government has been low. From bursting the real estate bubble, through cracking down on big tech and private education, to the long Covid lockdowns, in recent years the Chinese government has done little to foster trust among China’s wealthy. It’s small surprise, then, that many rich Chinese have lost faith in their government’s ability to deliver a stable and predictable business environment.
This brings me to the recent stimulus announcements and the all-important question of whether the measures rolled out will prove sufficient to revitalize domestic confidence in a meaningful way. Will it even be possible to lift confidence as long as the Damocles' sword of a wider trade conflict with the US and yet more sanctions hangs over the heads of Chinese businesses?
From this perspective, perhaps the most bullish development for China would be for the new US administration (regardless of who sits behind the Resolute desk) to come in and look to repair the damage done to relations by the 2018 semiconductor sanctions and the 2021 Anchorage meeting…
…When it comes to China’s relevance to investors, there are four ways of looking at things.
- China can be uninvestible and unimportant. This is the pool that most investors have been swimming in for the last few years. But this is akin to saying that China is like Africa. It simply doesn’t pass the smell test. Instead of sliding into irrelevance, China’s impact on the global economy only continues to grow.
- China can be uninvestible but important. This is essentially what Jim Farley, fresh back from his China trip, told The Wall Street Journal.
- China can be investible but unimportant. This is the space Japan inhabited for a couple of decades, and into which Europe seems to be gently sliding. However, the idea that China today is where Japan has been for the last three decades is grossly misplaced on many fronts, including the competitiveness of its economy, its overall cost structure, and its weight in global indexes.
- China can be investible and important. This is what David Tepper of Appaloosa Management argued on CNBC following the announcement of China’s stimulus (see Changing Narratives Around The World). For now, this is still a minority view, at least among Western investors. Not that Western investors matter all that much. What truly matters is whether Chinese investors themselves start rallying to this view. If they do, the unfolding bull markets in Chinese equities and the renminbi could really have legs.
3. $2 H100s: How the GPU Rental Bubble Burst – Eugene Cheah
ChatGPT was launched in November 2022, built on the A100 series. The H100s arrived in March 2023. The pitch to investors and founders was simple: Compared to A100s, the new H100s were 3x more powerful, but only 2x the sticker price.
If you were faster to ramp up on H100s, you too could build a bigger, better model, and maybe even leapfrog OpenAI to Artificial General Intelligence – if you had the capital to match their wallet!
With this desire, tens of billions to hundreds of billions of dollars were invested into GPU-rich AI startups to build this next revolution. Which led to…
The sudden surge in H100 demand
Market prices shot through the roof: the original H100 rental rates started at approximately $4.70 an hour but were soon going for over $8, driven by desperate founders rushing to train their models to convince investors for their next $100 million round…
…For most of 2023, H100 prices felt like they would stay above $4.70 forever (unless you were willing to make a huge upfront down payment).
At the start of 2024, the H100 prices reached approximately $2.85 across multiple providers…
…In Aug 2024, if you're willing to bid for a small slice of H100 time (days to weeks), you can start finding H100 GPUs for $1 to $2 an hour.
We are looking at a ≥40% price drop per year, especially for small clusters. NVIDIA's marketing projection of $4 per GPU hour across 4 years has evaporated in under 1.5 years.
And that is horrifying, because it means someone out there is potentially left holding the bag – especially so if they just bought new GPUs…
…The average H100 SXM GPU in a data center costs $50k or more to set up, maintain, and operate (that is, most of the CAPEX), and that excludes electricity and cooling OPEX costs…
…If the price falls below $1.65/hour, as an infra provider you are doomed to make losses on the H100 over the five years, especially if you just bought the nodes and cluster this year…
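To make those numbers concrete, here is a minimal back-of-envelope sketch in Python (ours, not the article's exact model; the 70% utilization rate is an assumption) that recovers a breakeven rental rate close to the article's $1.65/hour figure from its ~$50k CAPEX estimate:

```python
# Back-of-envelope H100 rental economics. The $50k CAPEX and 5-year horizon
# come from the article; the 70% utilization rate is an assumption.
HOURS_PER_YEAR = 24 * 365                  # 8,760 hours

capex_per_gpu = 50_000                     # USD to set up, maintain, and operate one H100
lifetime_years = 5                         # depreciation horizon
utilization = 0.70                         # assumed fraction of hours actually rented out

rentable_hours = HOURS_PER_YEAR * lifetime_years * utilization
breakeven_rate = capex_per_gpu / rentable_hours
print(f"Breakeven rental rate: ${breakeven_rate:.2f}/hour")   # -> ~$1.63/hour

# Revenue at NVIDIA's projected $4/hour over 4 years, same utilization
projected = 4.00 * HOURS_PER_YEAR * 4 * utilization
print(f"4-year revenue at $4/hour: ${projected:,.0f}")        # -> ~$98,100

# Revenue at today's ~$2/hour spot prices over the full 5 years
spot = 2.00 * rentable_hours
print(f"5-year revenue at $2/hour: ${spot:,.0f}")             # -> ~$61,300
```

Under these assumptions, a GPU rented at ~$2/hour barely clears its CAPEX over five years, and below the ~$1.65/hour breakeven it never does, which is the squeeze described above.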
…Many infrastructure providers, especially the older ones, were not naive about this. They had been burnt firsthand by massive GPU rental price drops after a major price pump in the crypto days; they had seen this cycle before.
So for this cycle, last year they pushed heavily for 3-5 year upfront commitments and/or payments in the $4+ price range (typically with 50% to 100% upfront). Today, they push the $2.85+ price range, locking in their profits…
…When a model creator is done training a model, it has no more use for the cluster. What does it do? It resells the compute and starts recouping some of the costs…
…This ended up creating a triple whammy that reduced the demand for H100s!
1. Fine-tuning is significantly cheaper than training from scratch.
a. Fine-tuning demands significantly less compute (typically 4 nodes or fewer, often just a single node) than training from scratch (16 nodes or more for models of 7B parameters and up).
b. This industry-wide switch essentially killed a large part of the demand for smaller clusters.
2. Scaling back on foundation model investment (at small/mid-tier)
a. In 2023, there was a huge wave of small and medium foundation models in the text and image space.
b. Today, however, unless you are absolutely confident you can surpass Llama 3, or you are bringing something new to the table (e.g. a new architecture, 100x lower inference cost, 100+ languages), approximately no new foundation-model companies are being founded from scratch.
c. In general, the small and medium open models created by the bigger players (Facebook, etc.) make it hard for smaller players to justify training foundation models, unless they have a strong differentiator (tech or data) or plans to scale to larger models.
d. This has lately been reflected with investors as well: there has been a sharp decline in funding for new foundation-model creators, with the vast majority of smaller groups having switched over to fine-tuning (a sentiment compounded by the recent less-than-desired exits of multiple companies).
e. By my estimate, there are at present approximately, worldwide:
<20 large model creator teams (i.e. 70B++; these teams may create small models as well)
<30 small/medium model creator teams (7B – 70B)
f. Collectively, there are fewer than 50 teams worldwide who would be in the market for 16 nodes of H100s (or many more) at any point in time to do foundation model training.
g. There are more than 50 clusters of H100 worldwide with more than 16 nodes.
3. Excess capacity from reserved nodes is coming online
a. Consider the cluster owners, especially the various foundation-model startups and VCs, who made long reservations in the initial "land grab" of 2023.
b. With the switch to fine-tuning, and the very long wait times for H100s (which peaked at six months or more), it is entirely possible that many of these groups had already made the upfront payment before they made the change, essentially making their prepaid hardware "obsolete on arrival".
c. Alternatively, those whose hardware arrived in time to train their first few models came to the same realization: it would be better to fine-tune their next iteration of models instead of building their own from scratch.
d. In both cases, they end up with unused capacity, which comes online via "Compute Resellers" joining the market supply…
…Both AMD and Intel may be late to the game with their MI300X and Gaudi 3 respectively.
We have tested and verified this ourselves, having used these systems. They are generally:
- Cheaper than an H100 in purchase cost
- Have more memory and compute than an H100, and outperform it on a single node
- Overall, they are great hardware!
The catch? They have minor driver issues in training and are entirely unproven in large multi-node cluster training.
Which, as we covered, is largely irrelevant to the current landscape for anyone but those <50 teams: the market for H100s has been moving towards inference and single-node or small-cluster fine-tuning.
These GPUs have been proven to work for all of those workloads, which are the use cases the vast majority of the market is asking for.
These two competitors are full drop-in replacements, with working off-the-shelf inference code (e.g. vLLM) or fine-tuning code for most common model architectures (primarily Llama 3, followed by others)…
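As a concrete illustration of "off-the-shelf", here is a minimal vLLM inference sketch; the model id is just an illustrative choice, and on AMD or Gaudi hardware you would install the matching vendor build of vLLM, but the calling code stays the same:

```python
# Minimal off-the-shelf inference with vLLM (model choice is illustrative).
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")   # any supported open-weights model
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Why are H100 rental prices falling?"], params)
print(outputs[0].outputs[0].text)
```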
…Given that open-weights models have entered the GPT-4-class arena, falling H100 prices will be the multiplier that unlocks open-weights AI adoption.
It will be more affordable for hobbyists, AI developers, and engineers to run, fine-tune, and tinker with these open models.
This is especially true if there is no major leap to GPT5++, because it will mean that the gap between open-weights and closed-source models will blur.
This is strongly needed, as the market is currently not sustainable: there is a lack of value capture at the application layer from paying users (value that would trickle down to the platform, model, and infra layers).
In a way, everyone is building shovels (including us), while applications with paying users are not being built (and are not collecting revenue and value). But when AI inference and fine-tuning become cheaper than ever, that can potentially kick off the AI application wave, if it has not already slowly started.
4. Politics, Portfolios & Perspective: Investing in a Crazy Election Year – Alliance Wealth Advisors
How we feel about the economy is directly correlated to whether or not the party we most closely identify with is in power. This is regardless of what the economic data actually tells us. In other words, our emotions get the best of us and cloud our ability to stay objective…
…In the past two presidential elections, there were many “expert” predictions claiming that electing both Donald Trump and Joe Biden would cause a significant stock market correction. Yet, both presided over stock market highs at various times. Anyone who made changes to their portfolio based on those election outcomes suffered a serious opportunity cost that will impact them for a long time…
…Politics aside, the stock market is a complex adaptive system, influenced by countless variables interacting with one another in constantly evolving ways. Companies are dynamic and run by smart people who learn to adapt to new environments. History has shown that companies can react to all kinds of changes and have always been able to grow their earnings over time. When they do, stock prices tend to follow.
5. Writes and Write-Nots – Paul Graham
I’m usually reluctant to make predictions about technology, but I feel fairly confident about this one: in a couple decades there won’t be many people who can write…
…The reason so many people have trouble writing is that it’s fundamentally difficult. To write well you have to think clearly, and thinking clearly is hard…
…Till recently there was no convenient escape valve for the pressure created by these opposing forces. You could pay someone to write for you, like JFK, or plagiarize, like MLK, but if you couldn’t buy or steal words, you had to write them yourself. And as a result nearly everyone who was expected to write had to learn how.
Not anymore. AI has blown this world open. Almost all pressure to write has dissipated. You can have AI do it for you, both in school and at work.
The result will be a world divided into writes and write-nots…
…Is that so bad? Isn’t it common for skills to disappear when technology makes them obsolete? There aren’t many blacksmiths left, and it doesn’t seem to be a problem.
Yes, it’s bad. The reason is something I mentioned earlier: writing is thinking. In fact there’s a kind of thinking that can only be done by writing.
Disclaimer: None of the information or analysis presented is intended to form the basis for any offer or recommendation. We currently have a vested interest in Apple. Holdings are subject to change at any time.