Now with AI, if you discover a technique that works, either intuitively, or through some measured automation, you can actually deploy it instantly.
It enables not only a degree of empathy and personalization which was not possible before, but as a leader of a company and a brand ambassador, you can try things more quickly. Then when you find a magical moment, you can operationalize it.
—Bret Taylor, Chairman of the Board, OpenAI, CEO of Sierra
A tech stack is the collection of tools, technologies, and services used to build or run something.
In the past that usually meant the software and infrastructure a company used to power its apps or websites. But it's not just companies that have tech stacks anymore. You'll have one too.
If you read to the end you get an Easter Egg this week.
Your Tech Stack
As AI grows and scales, it will make sense to belong to an ecosystem. If you're an Apple person you'll probably be part of the ChatGPT ecosystem, since it is more tightly coupled with the iPhone.
If you are an Android person you’ll probably start integrating with Gemini, since that is Google’s product.
But if you're smart, you'll build your own tech stack based on the strengths and weaknesses of each model and application. Rather than one generalized everything-app, you'll have a toolbox of models to choose from.
Take my setup for example.
I use ChatGPT for deep conversations, image generation, and advanced voice interactions. When I’m driving I treat it like an interactive co-pilot. I can interrupt it, ask clarifying questions, and dive deep into an idea the moment it hits.
Claude is my go-to for code generation. It’s fast and thoughtful and it handles structured logic well.
For research that requires more in-depth nuance and source triangulation, I turn to Grok.
When I need to quickly find a fact or check something lightweight, Perplexity gets me there faster than a traditional search engine.
And for fun I use Suno to generate music that fits my mood or projects.
These aren’t just tools. They’re collaborators. Each one fills a different niche.
That’s the key shift. It’s no longer about using one assistant to do everything. It’s about assembling a constellation of models and services, each with its own strength and personality.
Your tech stack becomes an extension of your mind.
This also means that the essential skill in a world of AI is not just prompt engineering. It's knowing how to orchestrate a team of models.
The best results come from chaining tools together in creative ways. You might brainstorm a business idea with ChatGPT, write the backend code with Claude, ask Grok for a SWOT analysis, and then rehearse your presentation with GPT Advanced Voice. That’s not science fiction. That’s just Wednesday.
We are entering an era where individual leverage will be defined by how well you architect your personal stack. This is as big a shift as the invention of the personal computer. But instead of a single machine, you now have a team of agents working for you across modalities.
The difference between the future haves and have-nots may not come down to access. It may come down to orchestration.
OpenAI
ChatGPT is still the best frontier model out there, in my opinion. Some people swear by Gemini, but early adoption has become a moat for me.
The reasoning model (o4-mini-high) is great for more complex tasks, the interface is clean, the latency is low, and it generally outputs what you want on the first try, especially if you put thought into your prompt.
Three things set OpenAI apart:
Advanced Voice
Superior Image Gen
Excellent Custom Instructions and Memory
Advanced Voice
The advanced voice feature allows you to speak directly to the interface with almost no latency between input and response. The voice is clear and not uncanny, and allows you to interrupt it if it starts rambling about something you don’t care about.
It's an endlessly patient tutor or brainstorming partner, willing and able to search the web for anything it doesn't know off the top of its head. This feature alone would be enough to justify the $20-a-month subscription.
Superior Image Gen
GPT-4o has the best image gen I have seen so far. Kicked off by the recent Ghibli phenomenon on 𝕩, what makes 4o's image gen special is how accurate it is, and how it can edit existing photos.
Most image gen relies on generative technology and an attention mechanism. The model will recognize entities in a prompt and try to create the image based on the associations it has been trained on.
What makes 4o special is that it combines the attention mechanism with a diffusion model. What you see in the interface is a blank slate of Gaussian noise which slowly forms into what you asked for.
Like Michelangelo crafting David, the model strips away everything from the image that is not your prompt. Then if you ask for any edits, it can just alter the photo, instead of having to create an entirely new one.
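The Michelangelo analogy maps neatly onto how diffusion works. Here is a toy sketch of the idea, not OpenAI's actual pipeline: start from pure Gaussian noise and repeatedly subtract a predicted-noise term, so the image converges toward what the prompt describes. In a real model a neural network predicts the noise; here a simple linear step stands in.

```python
import random

random.seed(0)

# Toy "prompt": the target the model is being steered toward.
target = [0.2, 0.8, 0.5]

# Start from a blank slate of Gaussian noise, like the blur in the interface.
image = [random.gauss(0.0, 1.0) for _ in target]

# Each denoising step strips away a little of what is "not your prompt".
# A real diffusion model *learns* to predict the noise; here we just compute it.
for step in range(50):
    predicted_noise = [img - tgt for img, tgt in zip(image, target)]
    image = [img - 0.1 * n for img, n in zip(image, predicted_noise)]

# After enough steps, the noise has resolved into (approximately) the target.
print([round(x, 2) for x in image])
```

Each pass keeps 90% of the current image and pulls 10% toward the target, which is why the picture in the interface sharpens gradually instead of appearing all at once.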
With the increased ability to get letters correct, and this editing capability, it is far and away my favorite image gen on the market.
Excellent Custom Instructions and Memory
OpenAI lets you put your very own system prompts into the interface to customize how the model responds to you.
This allows each turn with the model to be maximized and reduces any friction I might encounter. For example, I dislike moral lectures, appeals to authority, a sycophantic tone, and any mention of the AI reminding me it's an AI.
After implementing these custom instructions, I don’t have any of these problems anymore.
What's an even bigger game changer is memory. ChatGPT has gotten very good at remembering all your conversations. And not only that, it's gotten very good at analyzing how you talk to it. Are you terse? Do you dive deep? Are you curious? What kinds of topics do you enjoy?
If you really want to do some shadow work, use this prompt:
Please psychoanalyze me. Be critical and clinical.
Overall, I'd say ChatGPT is my go-to app that I build a lot of my ecosystem on top of. But it's not a one-stop shop.
Claude Sonnet
Claude is the geeky stepbrother of ChatGPT. I've long thought they have a branding problem, one not helped by the fact that they named their premier models after what sounds like a troubled French thespian in a Marcel Proust novel.
This is underscored by the fact that their three premier models (Haiku, Sonnet, Opus) are plays on literature, scaled by parameter count.
Haiku is the smallest and fastest (like a haiku). Sonnet is the medium one, and Opus is the largest and most powerful (like a magnum opus).
I consider myself a proud nerd, but even this type of stuff is deep lore geek shit and won’t win the hearts and minds of the normies, many of whom have never even heard of Studio Ghibli, let alone Hyperion.
Maybe that’s why Anthropic, the company who makes Claude, has leaned heavily into coding and the enterprise API market, almost entirely giving up on the consumer market which has been captured by ChatGPT.
Claude is great for speed, efficiency, strong multitasking capabilities, and error correction, especially in handling large or complex codebases. People often prefer Claude Code for quick, focused coding tasks and appreciate its streamlined interface.
I’ll give you an example. Last week a vendor sent us some multimodal data. The PDFs they provided did not correspond with the unique identifiers we had on file.
After asking them for a mapping file, they provided a list of unique identifiers which were embedded in the JSON they had provided, but still did not correspond to what we had readily available in our database.
I was able to use Claude to write some code to extract the JSON data, map all the unique identifiers with the multimodal data we needed, and export the output into a txt file. I am not remotely technical and was able to execute this in a matter of minutes.
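For flavor, the script Claude wrote looked something like this. Everything here (the field names, the file names, the data itself) is a made-up stand-in for the vendor's actual payload:

```python
import json

# Hypothetical stand-in for the vendor's JSON (theirs came with the PDFs).
vendor_json = """
[
  {"vendor_id": "V-001", "internal_id": "A17", "file": "report_a.pdf"},
  {"vendor_id": "V-002", "internal_id": "B42", "file": "report_b.pdf"},
  {"vendor_id": "V-003", "internal_id": "Z99", "file": "report_c.pdf"}
]
"""
records = json.loads(vendor_json)

# The identifiers we actually have in our database.
our_ids = {"A17", "B42"}

# Map each identifier we recognize to the vendor's file, skipping the rest.
mapping = {
    r["internal_id"]: r["file"]
    for r in records
    if r["internal_id"] in our_ids
}

# Export the mapping to a plain txt file, one tab-separated pair per line.
with open("mapping.txt", "w") as f:
    for internal_id, filename in sorted(mapping.items()):
        f.write(f"{internal_id}\t{filename}\n")

print(mapping)  # {'A17': 'report_a.pdf', 'B42': 'report_b.pdf'}
```

Twenty-odd lines of glue code like this is exactly the kind of chore that used to require a ticket to the engineering team.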
Claude may sound like a troubled thespian, but he’s my technical co-founder and I wouldn’t have it any other way.
Grok
Elon Musk’s infamous Grok is where I go when my prompt violates the terms of service on puritanical ChatGPT, or when I want to do any kind of deep research.
The image gen is not nearly as good so far, but anything ChatGPT won't answer, Grok is more than happy to. I've also grown fond of the Grok option on 𝕏 that gives you context on a meme or statement in an x-eet.
But far and away the best use of Grok is through Deep Research, and there’s a very specific reason for this.
Coming in at over 100,000 GPUs, xAI's Colossus supercomputer cluster is the largest in the world, housed in a 750,000-square-foot facility in Memphis, Tennessee, a former Electrolux plant.
Elon Musk's xAI constructed the initial Colossus supercomputer cluster in just 122 days, outpacing all previous industry estimates.
No one could have done what he did, plain and simple.
After completion of the first 100,000 GPU phase, the expansion to 200,000 GPUs took an additional 92 days. This rapid deployment was made possible by repurposing an existing facility in Memphis and leveraging an incredible founder-mode project timeline.
But more important than the pace of standing up the data center, more important even than the size of the cluster, is Elon’s ability to make them all coherent.
The whole reason OpenAI had not done the same was the prevailing wisdom that GPU coherence ran into diminishing marginal returns.
Basically, the more GPUs you tried to use, the more entropy was introduced into the system. Musk figured out how to build the 100,000-GPU cluster and make it coherent.
Coherence ensures that all GPUs can efficiently share and synchronize data and model parameters, which is essential for training massive AI models like Grok 3 at scale.
Without it, training would suffer from delays, errors, and poor hardware utilization, undermining the value of such a large investment.
The main difficulty lies in orchestrating high-speed, low-latency communication across such a vast number of GPUs. xAI addressed this by using a novel Ethernet-based architecture (NVIDIA Spectrum-X) and providing each GPU with a dedicated 400GbE network interface, minimizing contention and maximizing bandwidth.
Synchronizing so many GPUs also introduces complex software challenges, requiring advanced distributed training stacks to manage load balancing, fault tolerance, and real-time error correction.
Additionally, the system must be able to handle significant power and thermal fluctuations, necessitating robust cooling and power management solutions.
Together, these innovations enable Colossus to maintain reliable, high-speed synchronization at a scale never before achieved in supercomputing anywhere in the world.
For this reason, xAI’s Deep Research offering is second to none.
Deep Research, as it relates to xAI, refers to an advanced report generation system that leverages large language models (LLMs) as autonomous agents to iteratively search, analyze, and synthesize information in response to a user query.
The process is designed to produce comprehensive, detailed reports that resemble the work of a human research analyst (think BCG or McKinsey).
Unlike simple search or retrieval, Deep Research involves multiple rounds of querying and reasoning, with the LLM refining its findings, evaluating sources, and structuring the output for clarity and depth.
In the case of xAI’s Grok, Deep Research typically involves a few iterations for outlining and then for each report section, enabling the system to tackle complex topics and deliver nuanced, multi-faceted insights.
This approach represents a significant evolution from basic search or single-pass report generation, emphasizing iterative analysis and synthesis to achieve higher-quality, more reliable outputs.
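That outline-then-sections loop is easy to picture in code. This is my own toy sketch of the pattern, not xAI's implementation; `search` and `ask_llm` are stubs standing in for real web search and model calls:

```python
def search(query: str) -> list[str]:
    # Stub: a real agent would hit the web and rank sources here.
    return [f"[source about {query}]"]

def ask_llm(prompt: str) -> str:
    # Stub: a real agent would call the model here.
    return f"[draft for: {prompt}]"

def deep_research(topic: str, n_sections: int = 3) -> str:
    # Round 1: draft an outline (a real agent would ask the model for this).
    outline = [f"{topic}: part {i + 1}" for i in range(n_sections)]

    report = [f"Report on {topic}"]
    for section in outline:
        # Per section: gather sources, draft, then refine in a second pass.
        sources = search(section)
        draft = ask_llm(f"Write '{section}' using {sources}")
        refined = ask_llm(f"Critique and improve: {draft}")
        report.append(refined)
    return "\n\n".join(report)

print(deep_research("GPU coherence"))
```

The point is the shape of the loop: multiple rounds of search, drafting, and self-critique per section, rather than one pass over one query.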
For any kind of research or deeper dive you want to do on any topic, Deep Research can take up to ten minutes and return a report that would put any McKinsey analyst to shame.
Perplexity
Perplexity has replaced Google search for me. Why would you want ten blue links when you can get a sourced answer?
The secret lies in Retrieval-Augmented Generation (RAG): the system searches the internet to find sources, then uses an attention-based language model to generate a summary of the information, citing where it got everything from.
It basically does the Google search for you, compiles all the necessary information, and returns to you with a summary and bibliography.
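Here is a stripped-down version of that retrieve-then-generate loop, with a three-document "internet" and a stubbed model in place of Perplexity's real index and LLM (all names and documents here are invented):

```python
# Toy corpus standing in for the web index a real RAG system searches.
CORPUS = {
    "doc1": "Perplexity was founded in 2022 as a search startup.",
    "doc2": "RAG pairs a retriever with a generative model.",
    "doc3": "Bananas are rich in potassium.",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    # Real systems use a search index and embeddings; we score by word overlap.
    words = set(query.lower().split())
    ranked = sorted(
        CORPUS,
        key=lambda doc_id: len(words & set(CORPUS[doc_id].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def generate(query: str, doc_ids: list[str]) -> str:
    # Stub for the LLM: summarize by quoting sources, citing each one.
    summary = " ".join(f"{CORPUS[d]} [{d}]" for d in doc_ids)
    return f"Q: {query}\nA: {summary}"

answer = generate("when was Perplexity founded",
                  retrieve("when was Perplexity founded"))
print(answer)
```

Retrieval narrows the web down to a few relevant sources; generation turns those sources into the summary-with-bibliography you see in the app.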
Perplexity started out in late 2022 as a clever but simple wrapper around OpenAI's GPT-3.5, basically adding a slick interface and some search tricks to the model.
At first, it didn’t have its own tech, just a clean UX.
But as the AI arms race heated up, Perplexity hustled to build its own infrastructure, creating a massive search index and experimenting with open-source models like Mistral and Llama 2, instead of relying on OpenAI.
The real turning point came when DeepSeek-R1 was open-sourced by its Chinese creators.
Perplexity post-trained DeepSeek to strip out the Marxist censorship embedded in its training data, and DeepSeek "R1 1776" (get it?) was born.
This move wasn’t just technical. It was a statement about transparency and freedom of information, letting the Deepseek model tackle previously off-limits topics like Taiwan or Chinese politics without dodging the question or parroting official lines.
By open-sourcing 1776 and making it available to anyone, Perplexity threw down the gauntlet against sanitized, closed source AI, including OpenAI and Google, and signaled it wanted to be more than just another GPT wrapper.
Perplexity also has a Deep Research function, but I prefer Grok’s for the reasons laid out above. Perplexity for me is a quick, lightweight way to search for facts and figures I would have gone to Google for in the past.
Suno
Okay, this one is a bit of a bonus.
Suno's story is classic startup chaos with a side of music-nerd obsession. The founders, bored with wrangling financial transcripts at Kensho in Cambridge, ditched the corporate grind in 2022 to chase a wilder idea: what if AI could crank out not just speech, but full-blown, radio-ready songs?
Their first try was Bark, a text-to-speech toy, but user demand for music pushed them to build something way bigger.
Suno's early models were rough, but by late 2023 they'd built a system that could spit out entire tracks (vocals, instruments, the works) from a single prompt, using a Frankenstein's monster of transformer tricks and relentless data wrangling.
The catch? They were cagey about their training data, and soon the music industry came knocking with lawsuits, accusing them of hoovering up copyrighted tracks to teach their AI how to sing like Michael Jackson or ABBA.
Instead of backing down, Suno doubled down, rolling out v4 and then v4.5, each version making AI music sound even more uncannily real, and adding features like virtual singers, remastering, and genre-mashing collabs with real producers.
Now, Suno’s at the center of an AI music arms race, both loved and loathed for making anyone a songwriter in seconds and blowing up the old rules of who gets to make music.
I use it to make hype songs for myself, curated songs for my nephew's birthday, and serenades for my dates (if they're nerdy like me).
Like most AI tools, you'll be shocked that the music it makes actually kind of slaps.
So the real question for the future is not whether everyone will have their own tech stack. It’s how good you are at building yours.