Character.ai has a really exciting application of LLMs that lets you have conversations with archetypes (e.g. Einstein). It excites us because their core users love the product, often using it for two hours a day (source).
On the research side, they are pushing the limits of how to extend an LLM’s memory so that their users feel like the person they are talking to remembers them. Today, the AI community mostly does this by storing conversation snippets in a vector DB, searching it at run-time, and adding the retrieved snippets to the LLM prompt. It will be exciting to see how Character.ai changes that approach, as their CEO claims they are close to materially extending memory.
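The snippet-retrieval pattern described above can be sketched in a few lines. This is only an illustration of the general idea, not Character.ai's method: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory `SnippetMemory` class stands in for a vector DB, and all names and the prompt layout are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stopword list so that filler words do not dominate the match.
STOPWORDS = {"a", "i", "my", "the", "is", "and", "for", "with", "what",
             "should", "as", "on", "in", "am", "she"}


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SnippetMemory:
    """Hypothetical in-memory stand-in for a vector DB of conversation snippets."""

    def __init__(self):
        self._items = []

    def add(self, snippet):
        # Store the snippet alongside its vector.
        self._items.append((snippet, embed(snippet)))

    def search(self, query, k=2):
        # Rank stored snippets by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [snippet for snippet, _ in ranked[:k]]


def build_prompt(memory, user_message, k=2):
    """Look up relevant snippets at run-time and prepend them to the prompt."""
    recalled = memory.search(user_message, k=k)
    lines = ["Relevant things the user said earlier:"]
    lines += [f"- {s}" for s in recalled]
    lines += ["", f"User: {user_message}", "Assistant:"]
    return "\n".join(lines)


memory = SnippetMemory()
memory.add("My dog is named Biscuit and she loves the beach.")
memory.add("I work as a nurse on night shifts.")
memory.add("I am training for a marathon in October.")

prompt = build_prompt(memory, "What should I pack for a beach day with my dog?", k=1)
print(prompt)
```

In a real system the count vectors would be replaced by learned embeddings and the list scan by an approximate nearest-neighbour index, but the flow (store, search, stuff into the prompt) stays the same.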
Their founder (Noam) is also one of the original authors of the ‘Attention Is All You Need’ paper, which introduced the transformer, and is credited in that paper with proposing multi-head attention. The team recently raised a whopping $150M Series A and appears to be using capital very efficiently, so we’d expect them to have a significant runway.
Role: Research Engineer
Location: Palo Alto, California
Who I’d Message: Sam Shleifer would be an immediate pick for me. He was a founding engineer at Character, so he’ll understand the problems, culture and company well. He’s got extensive experience as a researcher in big tech and at notoriously cool places like Hugging Face, so we think he’d be well positioned to speak to you about the research challenges they are facing. James Groeneveld also has an exciting background. He was formerly an engineer and appears to be a self-taught researcher. We always love scrappy engineers who use things like fast.ai to get started and teach themselves how to be effective researchers.
Cohere is currently a direct competitor to OpenAI, building foundational models. They are also pursuing areas like augmented search and building additional foundational capabilities for AI practitioners. Just like Character.ai, their founder is also one of the original authors of the “Attention Is All You Need” paper.
We think Cohere is in a competitive space with Anthropic and OpenAI but that there is room for multiple foundational players. They recently brought on the ex-CFO of YouTube (Martin Kon) to be their President and COO and he’s been instrumental in recent fundraising and customer partnerships. Our friends and past clients at Cohere feel that he’s been a very positive addition.
Role: Member of Technical Staff
Location: Remote or Toronto, Canada
Who I’d Message: Sara Hooker heads AI at Cohere and has an extensive and impressive research background. She’s been at Cohere from the earlier days and we feel will be able to talk extensively about the work they are doing and the competitive landscape. Before reaching out, we’d recommend reading some of her research.
DeepMind doesn’t need much explanation, but their work on AlphaFold is pretty awesome. DeepMind recently merged with Google Brain so there’s one AI research org at Alphabet now rather than three (DeepMind, Google Brain, Google).
Role: We keep seeing headcount shifting between teams and a lack of reliable information on who’s hiring. So instead, we’d recommend reaching out to specific researchers or managers, building a connection and getting the scoop from them on who is hiring.
Who I’d Message: Sameera Ponda is a research manager who leads a half-dozen projects across many really hot areas like document retrieval and RL. Because she leads many teams, she’s influenced many different papers. Check them out here before reaching out.
We also saw that Karen Gu’s team is doing cool research. She’s been at DeepMind for many years and we’d be excited to ask her about how things have changed and where she sees Google’s AI strategy going at large. She mentions that her team is hiring, which would encourage me to reach out. You can read about Karen’s research on instance-dependent noisy labels here.
Anthropic is a foundational model company created by ex-OpenAI engineers and researchers. They recently raised a $300M round at a $4.1B valuation.
Role: Evergreen Research Scientist and Evergreen Research Engineer. Since these are evergreen (i.e. always posted), we’d recommend reaching out to specific researchers at Anthropic to learn about which team is hiring before submitting an application.
Location: San Francisco or Hybrid
Who I’d Message: Peter Lofgren has an epic background. He’s a rare person who can ship major infrastructure projects and also take an AI concept from research to production. He’s open about finding the talent at Anthropic excellent and about the fact that they are hiring, so I’d hypothesize he might be more receptive to chatting.
Catherine Olsson also has an epic background. She’s got really exciting AI research under her belt, has been thinking deeply about the impact of AI on humanity, and is also an engineer.
Amazon is releasing new AI products including a direct competitor to GitHub Copilot called CodeWhisperer. They are integrating AI across their entire stack, from hardware in AWS to software.
Role: We’d recommend reaching out to managers or researchers directly instead of applying to specific postings. In this kind of market, the networked approach is 10x more effective than applying cold online.
Who I’d Message: Yan Liang has done some very practical research lately called Ask and Verify. When you’ve got a query and a set of products and want to do matching and information extraction, Yan’s team breaks the problem down into two parts. It’s a simple but effective approach. It’s unclear if she’s actively hiring, but we often see that managers at Amazon have good exposure to other teams that are hiring because recruiters work across multiple teams. I’d also consider reaching out to Charlene (a recruiter who specializes in research science) directly.
Adept.ai builds custom LLMs to take actions (first model is called ACT-1). For example, you ask in natural language for a task to be done like “create notes from my last 3 sales calls and add the notes alongside contact information for each attendee into my CRM”. Adept will then execute a set of agents from web browsing to using your CRM APIs to complete the task. They haven’t publicly launched a product yet but we are excited to see it (hopefully) soon.
Role: Not currently listed. Adept recently raised $350M so we expect more hiring to happen soon.
Location: San Francisco, California
Who I’d Message: Ammol is a Senior Researcher who’s been with Adept since its founding. Adept’s had some changes at the wheel with some of the founders splitting off and creating a separate company. Talking to someone who has been at Adept since the beginning increases your chances of getting the full picture.
Runway ML has been a forerunner in the image and video space, particularly with building products for creatives and consumers. In late 2022, Runway raised $50M at a $500M valuation. Although less capitalized than some of the other startups on this list, we think they are very well positioned in the image and video space.
Roles: Machine Learning Research Team Lead | Research Director
Who I’d Message: Parmida. Although relatively new to the company, she has a good balance of engineering and research and cares deeply about democratizing AI for all. For those who skew less to engineering and more to research, Jonathan’s background seems like a good fit. For Team Lead and Director roles, a first conversation with a peer team member (vs. your future manager) can often give you great insights into whether you think you’d be a good leader at the company and how to position yourself for the role.
One of the most successful early adopters of GPT-3, Jasper AI builds tools for marketers to do copywriting. I think they will need to branch out beyond their niche (although it’s been a very successful niche so far) to stay competitive but it seems like their CEO is acutely aware of the space. An additional plus is that the leadership team at Jasper openly talks about having families and maintaining a good balance.
Jasper raised $125M at a $1.5B valuation in 2022. Since then, they’ve faced some challenges to their model since GPT-4 is capable of creating good marketing copy out of the box and users are getting better at formulating their own prompts (i.e. bypassing Jasper). But they’ve likely got enough runway to keep expanding and building value adds on top so we’d still have a chat with them.
Role: AI Scientist. This is an AI Researcher role, but focused on taking state-of-the-art research and applying it to real-world problems. We see this role as the right fit for someone who loves both research and engineering but skews towards engineering. If your goal is to publish papers with state-of-the-art research, this probably isn’t the right fit.
Location: Remote or Austin
Who I’d Message: Excited about where Jasper is headed more broadly? Saad helped create the AI team at Jasper. He is probably a good fit for thinking about how Jasper will fit into the ecosystem and most of the “business side problems”.
If you are more interested in the tech stack or research, there are two people to consider. Amanda is new to Jasper but has been working on LLMs as a Data Scientist, first at Fractal and now at Jasper. Dhruva is on the ML engineering side and can likely speak to how their systems are scaling. Neither is a pure researcher, but Jasper organizes their teams into pods where each pod has a PM, Engineers, and Researchers (amongst others), so I’m optimistic that they will know about the research that is going on as well.