Calaveras AI

We provide pre-training datasets and RL environments.

Our products are trusted by almost every frontier lab in the United States.

Learn how we can make your AI models more capable.

Schedule a Call >>>

## PRODUCTS

1. STEM_PRE_TRAINING_DATA

We found a 100B+ token code dataset that nobody else had found and sold it to four of the top six AI companies in the world (as measured by Chat Arena).

Since then, we've collected petabytes of data and trillions of tokens to fuel the next generation of frontier AI models.

We can provide both off-the-shelf pre-training data and custom procurement with rapid turnaround.

2. REALISTIC_STEM_RL_ENVIRONMENTS

We offer robust verifiable RL environments targeting core STEM capabilities.

Our environments model real-world white-collar tasks.

We can also offer custom RL environment design for your needs.

## TEAM

ALANA XIANG

Co-founder, CEO

SOPHIA WISDOM

Co-founder, CTO

## HIRING

We are looking for engineers who deeply understand the systems they use. If this sounds like you, apply here.

## CONTACT US

Email us at [email protected], follow us on Twitter/X, or schedule a call.