by Lila Tretikov, Madison Faulkner, and James Kaplan | Mar 05, 2025
There are three components to building generative AI models: data, compute, and talent. Today, we’re entering a paradigm shift in the market dynamics and relative importance of these three components that we believe will spawn a new wave of companies training their own models.
The old paradigm, driven by OpenAI and the principles it internalized (summarized in the Bitter Lesson), was that foundation models are primarily a game of scaling compute rather than building human knowledge into models. Compute was king, even over data. These beliefs rested on two assumptions: that a general-purpose model would outperform domain-specific models, and that model training is highly capital intensive. This kicked off a race believed to have at most a few winners and a multi-billion-dollar entry ticket (which is why companies like OpenAI, Anthropic, xAI, and SSI have raised capital so aggressively).
Still, just over two years since the launch of ChatGPT, this paradigm is showing cracks:
Pre-training scaling laws no longer seem to hold, while a new scaling law for test-time compute / reasoning is emerging
The large model providers have seemingly tapped out the data that’s available on the internet
Model distillation has democratized the most powerful models soon after launch, most famously demonstrated by DeepSeek
DeepSeek also showed the potential of unsupervised reinforcement learning to take a well-crafted dataset and very cheaply produce a model that outperforms on specific tasks
The implication is that we’re in a new age of model training. We have gone from a handful of research labs leveraging scale economies on compute to train very powerful general-purpose models to a potential democratization of models, in which any company with a unique dataset can train its own task- or domain-specific models.
The next generation of AI infrastructure is here, and it serves both companies building generative AI and companies using it. We believe this new wave of model users needs optimization infrastructure across data, models, training, and inference to achieve DeepSeek-like performance and efficiency, which is why we were so compelled by Ceramic Founder and CEO Anna Patterson’s vision. Today, Ceramic’s platform delivers 2-3x the training performance of DeepSeek.
Simply put, Ceramic wants to give every AI-native company the ability to build and optimize its infrastructure with the acceleration power of DeepSeek, through partnerships with core compute providers like Lambda and AMD. Ceramic is building an enterprise stack designed to simplify the training and fine-tuning of foundation models. The software platform can train models with long contexts on clusters of any size, enabling enterprises to develop, train, and scale their own AI models faster than traditional methods allow. Ceramic massively improves GPU utilization regardless of the underlying hardware (technically, for a given dataset and a given target step count or validation loss, Ceramic uses fewer TFLOPs and thus takes less time to train).
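To make the utilization point concrete, here is a back-of-envelope sketch (our illustration, not a description of Ceramic’s internals). It uses the standard ~6 × parameters × tokens approximation of training FLOPs; the model size, token budget, cluster size, and utilization figures below are hypothetical.

```python
# Back-of-envelope training-time estimate (illustrative only, not Ceramic's method):
# total FLOPs ~= 6 * params * tokens for a dense transformer, and wall-clock time
# falls as GPU utilization (MFU) rises for the same dataset and target.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * n_params * n_tokens

def train_time_days(total_flops: float, n_gpus: int,
                    peak_flops_per_gpu: float, mfu: float) -> float:
    """Wall-clock days = total FLOPs / achieved cluster throughput."""
    achieved_flops_per_sec = n_gpus * peak_flops_per_gpu * mfu
    return total_flops / achieved_flops_per_sec / 86_400

total = train_flops(n_params=70e9, n_tokens=2e12)   # hypothetical 70B model, 2T tokens
for mfu in (0.30, 0.55):                            # hypothetical utilization levels
    days = train_time_days(total, n_gpus=2_048,
                           peak_flops_per_gpu=1e15, mfu=mfu)
    print(f"MFU {mfu:.0%}: ~{days:.1f} days of training")
```

Under these assumed numbers, raising utilization from 30% to 55% cuts the same training run from roughly 16 days to under 9, which is the relationship the parenthetical above describes.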
Through a new approach to data preprocessing, Ceramic reranks training data, aligning micro-batches by topic. This enhances attention efficiency and prevents the token-masking issues found in other models.
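As a minimal sketch of what topic-aligned micro-batching can look like in general (our illustration, not Ceramic’s implementation), assume each document already carries a precomputed topic label, for example from an upstream clustering pass:

```python
# Illustrative topic-aligned micro-batching (generic sketch, not Ceramic's code).
from collections import defaultdict
from typing import Iterator

def topic_aligned_microbatches(docs: list[dict], batch_size: int) -> Iterator[list[dict]]:
    """Group documents by topic, then yield micro-batches drawn from a single
    topic so that sequences packed together are semantically related."""
    by_topic: dict[str, list[dict]] = defaultdict(list)
    for doc in docs:
        by_topic[doc["topic"]].append(doc)
    for topic_docs in by_topic.values():
        for i in range(0, len(topic_docs), batch_size):
            yield topic_docs[i:i + batch_size]

# Toy example: four documents, two topics, micro-batches of two.
docs = [
    {"text": "A grade-school word problem ...", "topic": "math"},
    {"text": "A Python snippet ...",            "topic": "code"},
    {"text": "Another word problem ...",        "topic": "math"},
    {"text": "A shell one-liner ...",           "topic": "code"},
]
for batch in topic_aligned_microbatches(docs, batch_size=2):
    print([d["topic"] for d in batch])   # ['math', 'math'] then ['code', 'code']
```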
Ceramic is the only platform that can train large models on long-context data, providing unrivaled quality and performance. The company outperforms all reported benchmarks for long-context model training, maintaining high efficiency even for 70B+ parameter models. This is achieved through novel parallelizations that allow model training to scale to 10k+ GPUs without the diminishing marginal returns that typically come from adding more GPUs.
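For intuition only, here is a minimal illustration of the general idea behind sequence (context) parallelism, one common way long contexts are spread across devices; this is a generic sketch, not Ceramic’s parallelization scheme, and the context length and device count are hypothetical.

```python
# Generic sequence-sharding illustration (not Ceramic's scheme).
def shard_sequence(token_ids: list[int], n_devices: int) -> list[list[int]]:
    """Split one long sequence into contiguous, near-equal chunks, one per device,
    so no single GPU has to hold the activations for the full context."""
    chunk = -(-len(token_ids) // n_devices)   # ceiling division
    return [token_ids[i:i + chunk] for i in range(0, len(token_ids), chunk)]

# Hypothetical 1M-token context spread across 8 GPUs.
sequence = list(range(1_000_000))
shards = shard_sequence(sequence, n_devices=8)
print([len(s) for s in shards])   # 8 shards of 125,000 tokens each
```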
In testing, Ceramic trained reasoning models on its platform that achieved a 92% exact-match Pass@1 score on GSM8K, outperforming Meta’s Llama 70B (79%) and DeepSeek R1 (84%), according to Ceramic’s internal benchmarks.
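For readers unfamiliar with the metric, exact-match Pass@1 on GSM8K-style data is typically computed with one sampled answer per problem, with credit only when the final numeric answer matches the reference exactly. The scoring sketch below is ours; the figures above are Ceramic’s.

```python
# Illustrative exact-match Pass@1 scoring for GSM8K-style answers.
def extract_final_answer(text: str) -> str:
    """GSM8K reference answers follow a '####' marker; take what comes after it
    and normalize away commas and surrounding whitespace."""
    return text.split("####")[-1].strip().replace(",", "")

def pass_at_1(completions: list[str], references: list[str]) -> float:
    """Fraction of problems whose single sampled completion exactly matches
    the reference's final answer."""
    correct = sum(
        extract_final_answer(c) == extract_final_answer(r)
        for c, r in zip(completions, references)
    )
    return correct / len(references)

# Toy example with two problems: one correct, one wrong -> 50%.
preds = ["... so the total is #### 42", "... therefore the answer is #### 17"]
refs  = ["#### 42", "#### 18"]
print(f"Pass@1 = {pass_at_1(preds, refs):.0%}")
```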
Ceramic believes the inference ecosystem will innovate rapidly over the next couple of years and will support the complexity of deployed inference optimization, including easing the transition between training and inference workloads.
We are excited by Ceramic’s killer team. Anna is a pioneering computer scientist in AI, holds a PhD, was a VP of Engineering in AI at Google, and is one of the industry’s leading experts in search engine infrastructure. She started at Google in 2004 and left to start Cuil, a clustering-based search engine that was acquired back into Google. In addition to TeraGoogle and Cuil, she founded Recall (now part of the Wayback Machine). Anna has also built three of the biggest search engines of all time as measured by pages indexed. During her Google tenure, she was the GM of Google Books, later becoming one of the founding members of Google’s AI team. Later, she founded Gradient Ventures, Google’s AI-focused early-stage venture fund. Anna’s view into scaling deep learning infrastructure for training and inference over the past two decades gives her a unique lens to optimize the next decade of generative training and inference.
***
Today, we are thrilled to announce our lead investment in Ceramic’s seed financing alongside various strategic investors such as IBM and Samsung Next, and look forward to partnering with Anna and the Ceramic team as they continue working to deliver fast and cost-effective AI model training for enterprises.
Disclaimer
The information provided in this blog post is for educational and informational purposes only and is not intended to be investment advice, or recommendation, or as an offer to sell or a solicitation of an offer to buy an interest in any fund or investment vehicle managed by NEA or any other NEA entity. New Enterprise Associates (NEA) is a registered investment adviser with the Securities and Exchange Commission (SEC). However, nothing in this post should be interpreted to suggest that the SEC has endorsed or approved the contents of this post. NEA has no obligation to update, modify, or amend the contents of this post nor to notify readers in the event that any information, opinion, forecast or estimate changes or subsequently becomes inaccurate or outdated. In addition, certain information contained herein has been obtained from third-party sources and has not been independently verified by NEA. Any statements made by founders, investors, portfolio companies, or others in the post or on other third-party websites referencing this post are their own, and are not intended to be an endorsement of the investment advisory services offered by NEA.
NEA makes no assurance that investment results obtained historically can be obtained in the future, or that any investments managed by NEA will be profitable. To the extent the content in this post discusses hypotheticals, projections, or forecasts to illustrate a view, such views may not have been verified or adopted by NEA, nor has NEA tested the validity of the assumptions that underlie such opinions. Readers of the information contained herein should consult their own legal, tax, and financial advisers because the contents are not intended by NEA to be used as part of the investment decision making process related to any investment managed by NEA.