Tim is an AI innovator who is hell-bent on helping businesses make faster and better decisions with data. After originating Lyft's dynamic pricing algorithms in 2014, he refactored...
Rob is a polymath whose education and professional career have allowed him to indulge in his principal joys: learning about the world, and enabling progress through marvelous tech...
Paul is a creative business leader and occasional hacker whose career has spanned finance and technology. He led the Capital One Innovation Lab in San Francisco, where his organization...
If your data is in a standard relational database, check out our developer page. We describe the different levels of integration possible, depending on how much you want to limit your data movement.
If your architecture is more complex (e.g. Hadoop) or lives in a third-party platform (e.g. Databricks), contact us and we'd be happy to do a quick consult.
We support the move away from the dominance of tech monoliths in AI. ChatGPT is a closed service that requires the use of the Microsoft/OpenAI LLM and (in the case of private instances) Microsoft Azure. We appreciate the value that these organizations have created, but we don't want to feed all of our private data to them.
Our products allow you to keep your data where it is. This removes the risk inherent in any kind of data movement outside your VPN.
Additionally, our architecture gives you choice: it allows you to leverage any of the LLMs being developed by the global community.
Datacakes is not building our own foundation LLM. But we are actively developing ways to tune and combine existing LLMs to improve performance.
Our product lets owners of large, structured datasets easily connect that data to LLMs for cleaning and analysis, without requiring any data preparation or export. You can try this out on our demo platform, Cubie.
A couple of notes:
1. For owners of sensitive data, we also offer easy setup of a self-hosted compute environment. This ensures that even when code produced by the LLM is run on your data during analysis, the execution happens inside your walls.
2. We also give you choice among LLM services; you can plug in any commercial or open source LLM available to you. We do recommend that you try our instruct service, which contains a factorization layer that we developed to improve the quality and reliability of the response.
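The pattern behind these notes can be sketched in a few lines. In this hypothetical example (the function names are ours, and the LLM call is stubbed out), only the table schema is shared with an LLM service; the query the model returns is executed locally, so raw rows never leave your environment:

```python
import sqlite3

def get_schema(conn: sqlite3.Connection) -> str:
    """Collect CREATE TABLE statements -- the only context shared externally."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(r[0] for r in rows)

def ask_llm_for_sql(schema: str, question: str) -> str:
    # Stand-in for a real LLM call (commercial, open-source, or self-hosted).
    # The model sees the schema and the question -- never the data itself.
    return "SELECT category, SUM(amount) AS total FROM sales GROUP BY category"

def answer(conn: sqlite3.Connection, question: str):
    schema = get_schema(conn)               # only metadata leaves your walls
    query = ask_llm_for_sql(schema, question)
    return conn.execute(query).fetchall()   # execution stays local

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("a", 10.0), ("a", 5.0), ("b", 2.0)])
print(answer(conn, "Total sales by category?"))
```

With a self-hosted compute environment, the `answer` step above runs entirely inside your infrastructure; only the schema and the question cross the boundary to whichever LLM you plug in.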