In Defense of Thin Wrappers

GroupMe first launched as an SMS-only application. Every group was assigned a unique phone number that you could add to your contacts as Family, College Friends, or Music Crew. You’d add members to the group with a series of SMS commands. When you sent a message to your group’s phone number, a text would be relayed to everyone else in the group. All of this was built using Twilio, which at that time had found a way to abstract away all the complexity of integrating with telecommunications infrastructure so application developers could focus on building great user experiences. We would never have existed without Twilio, but it also led to a real problem with our business: we paid for every single message we sent. The average group size was six people, so we paid when someone sent a message to the group phone number, and then we paid to relay that message five times to everyone else in the group. One message meant paying for it six times on average. We also paid every month to lease the group phone numbers.
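
To make that arithmetic concrete, here is a rough sketch of the cost model in Python. The function names and every price in it are hypothetical placeholders chosen for illustration, not what GroupMe actually paid or what Twilio charged.

```python
def billed_messages(group_size: int) -> int:
    """Messages billed for one message sent to a group.

    One inbound message to the group number, plus one outbound relay
    to each of the other members.
    """
    if group_size < 1:
        raise ValueError("a group needs at least one member")
    inbound = 1
    relays = group_size - 1
    return inbound + relays


def monthly_cost(groups: int, messages_per_group: int, group_size: int,
                 price_per_sms: float, number_lease: float) -> float:
    """Variable SMS cost plus the monthly lease on each group's number.

    All prices are assumptions supplied by the caller.
    """
    sms = groups * messages_per_group * billed_messages(group_size) * price_per_sms
    leases = groups * number_lease
    return sms + leases


# A six-person group pays for every message six times over.
print(billed_messages(6))
```

Both terms scale with the number of groups, which is why viral growth translated directly into a growing bill.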

This became an extremely expensive endeavor for a free service with no means of monetization. The product grew virally. For every group that was created, one user would go off and create their own, adding five new people to the network. Every group and message sent was a variable cost, and we were beholden to Twilio’s prices. We tried negotiating but could never get prices to a place that wouldn’t put us out of business. My co-founder Steve even proposed doing an equity swap with Twilio to align our respective fates, which was a wonderful idea but sadly rejected. Our only chance of survival was to raise enough money from VCs to subsidize our text messaging product while we found ways to drive down SMS costs and migrate our user base to an over-the-top mobile messaging application similar to WhatsApp.

To get off Twilio, we first had to understand how to get closer to telecommunications infrastructure, or “the metal,” as industry veterans called it. We hired consultants who ramped us up and helped us identify two companies, Bandwidth and Level3, that Twilio was building its service on top of while it hammered out deals directly with telcos. These companies were not developer-friendly, and we had to task Brandon Keene with the mission-critical responsibility of migrating GroupMe off Twilio and figuring out how to rebuild all of our SMS infrastructure while maintaining acceptable service levels for our users. We also had to play Bandwidth and Level3 off each other to negotiate bulk pricing that wouldn’t put us out of business and would enable us to scale for the years ahead. We were in our early 20s and had no clue what we were doing.

We miraculously managed to cut a deal with Bandwidth, migrated off of Twilio, and bought ourselves enough time to wait for most of our users to switch to the native mobile app where we didn’t have to pay exorbitant SMS costs as the service scaled.

Lately, I have seen many companies that remind me of this GroupMe experience. They are building consumer-facing applications that sit on top of LLMs, primarily OpenAI, and when they get some form of traction and grow, variable costs start skyrocketing. Similar to GroupMe, costs very quickly ramp up to hundreds of thousands a month, but now it’s inference instead of SMS. Over time, these costs will come down for application developers. Market competition, open source, and locally hosted models will all make inference more affordable, but it’s unclear if we are operating on a timeframe of one year or five.

For most application developers, it’s not really an option to not use these models. Consumers are growing to expect the type of functionality and features they deliver. Once things start working, there will inevitably be some form of scaling and expensive inference costs that are meaningfully higher than what pre-LLM companies have experienced. Having to deal with these issues when you are growing is really hard. You effectively have to rebuild the engine of your machine mid-flight. It’s hard enough to improve your user experience continuously, hire people, do performance management, and run your company. Adding the capital sink of inference costs creates a whole new series of challenges.

This means it is incumbent upon founders to get ahead of this issue. Several things feel like best practices now:

Plan on using more than one model when you start, and begin to diversify at the first signs of inflection. Being beholden to a single LLM is likely a recipe for disaster. You have no leverage and are subject to pricing whims. Plan to be multi-model. This doesn’t mean starting with three integrations out of the gate; it just means knowing who you’ll expand with and having an idea as to when you’ll do it and what the process will look like.
Learn how to route prompts to the right models. Not all models are created equal, and some are better at certain things than others. One of our portfolio companies has a wizard-like mastery of this. There are now many companies that act as an intermediary between applications and underlying LLMs, but I think if AI is a core part of your value proposition you need to master this yourself and can’t outsource it.
Find your Brandon. Someone inside your company needs to shoulder responsibility for owning your LLM strategy and executing the plan. Like all mission-critical things, accountability is everything.
Find a group of advisors who know how this all works. While LLMs feel reasonably new to a lot of people starting companies, there are experts out there who are excellent at helping assess which models best fit your needs, and understand how to think about competitive pricing and prompt routing. It’s probably a good idea to have a circle of 2-3 advisors who have some skin in the game that you can turn to with specific questions, both strategic and tactical.
Hone your business development chops. You’re going to be in a constant conversation with model providers asking for things: pricing, integrations, access to private betas, etc. These relationships matter. Invest in them.
Raise a little more money than you think you need as a buffer so you’re not always caught on your heels reacting to these costs. When things really work it means your costs will escalate faster than expected. Extra capital on your balance sheet may provide you with some peace of mind.
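
The first two practices above, planning to be multi-model and routing prompts to the right model, can be sketched roughly as follows. This is a minimal illustration and not any particular company's approach: the model names, prices, task labels, and stub callables are all hypothetical stand-ins for real provider clients.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelRoute:
    name: str                    # hypothetical model identifier
    cost_per_1k_tokens: float    # hypothetical price, for ordering only
    good_at: frozenset           # task labels this model handles well
    call: Callable[[str], str]   # wraps the provider's client


def pick_routes(task: str, routes: list[ModelRoute]) -> list[ModelRoute]:
    """Models suited to the task come first, cheapest first; the rest are fallbacks."""
    return sorted(routes, key=lambda r: (task not in r.good_at, r.cost_per_1k_tokens))


def complete(task: str, prompt: str, routes: list[ModelRoute]) -> tuple[str, str]:
    """Try each candidate in order and return (model name, response)."""
    errors = []
    for route in pick_routes(task, routes):
        try:
            return route.name, route.call(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{route.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def _stub(name: str) -> Callable[[str], str]:
    """Stand-in for a working provider client."""
    return lambda prompt: f"[{name}] {prompt}"


def _down(prompt: str) -> str:
    """Stand-in for a provider that is currently unavailable."""
    raise ConnectionError("provider unavailable")


ROUTES = [
    ModelRoute("small-model", 0.1, frozenset({"classify"}), _stub("small-model")),
    ModelRoute("large-model", 1.0, frozenset({"classify", "reason"}), _stub("large-model")),
    ModelRoute("flaky-model", 0.05, frozenset({"reason"}), _down),
]

print(complete("classify", "Is this spam?", ROUTES))
print(complete("reason", "Explain the tradeoff.", ROUTES))
```

Keeping the routing table in your own code, even one this small, is what lets you add a second provider or shift traffic when pricing changes without touching the rest of the application.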

I’m sure there’s a lot more to add to this list, but it’s a start. This is likely the status quo for the next several years so it’s good for entrepreneurs to be aware of the current state of affairs and have a plan for it. Exciting times come with exciting challenges.