Why AI Workloads Are Fueling a Move Back to Postgres

The New Stack

I have spent the last couple of years watching the database landscape move through waves of excitement and disappointment. Vector, graph, multi-model and NoSQL systems all took turns in the spotlight. Each wave promised simpler development and new possibilities. Some delivered. Some did not. Most made sense in their moment.

Then AI arrived. AI did not simply stretch existing systems. It broke the assumptions that shaped the last generation of managed database services. It exposed hidden trade-offs that were easy to ignore when workloads were lighter and changes were slower. It also pushed teams to rethink how they work with data.

Today, I see a clear shift in the market: Teams are moving back to Postgres. More and more new applications start with Postgres in the stack. If an engineer is building a new application today, they are very likely going to use Postgres; it is the most popular database system in 2025 by far. Postgres is becoming the database of AI.

I want to explain why this shift is happening (at least in my humble opinion), why Postgres is quietly becoming the anchor of modern AI development, and why many teams should consider leaving fully managed databases behind. This is not about nostalgia or self-hosting in the old sense. It is about a new model that keeps the benefits of managed services while giving teams the performance, cost control and data locality they need for the next decade. That model is BYOC (Bring Your Own Cloud).

How AI Workloads Broke the Managed Database Model

The entire managed database ecosystem grew during a period of predictable workloads. Lift-and-shift migrations into the cloud were the backbone of the growth of services like Amazon Relational Database Service (RDS) and Azure SQL Managed Instance. First, you lift-and-shift onto a plain Elastic Compute Cloud (EC2) instance; then you move to RDS. It was a straightforward playbook, and everyone followed it. A no-brainer.
Most applications behaved like classic Software as a Service (SaaS) products. They had modest working sets. They used straightforward online transaction processing (OLTP) patterns. They scaled gradually. They relied heavily on network-attached storage, autoscaling groups and stable indexing structures. Performance was usually good enough. Latency was acceptable. Costs were manageable.

And then AI showed up. AI workloads behave very differently. They are bursty. They rely on heavy parallelism. They use vector search and high-dimensional embeddings. They ingest large datasets continuously. They require frequent experiments, fast cloning and many isolated environments. They also require tight proximity between compute nodes and data storage. The gap between old and new patterns creates friction that managed databases can no longer hide.

I speak with engineering teams every week, and they all describe similar experiences. They try to scale a managed Postgres instance during a model rollout. They hit IOPS limits. They hit throttling windows. They see latency spikes at the exact moment they need predictability. They also see cost blowups, because the only way to remain safe is to overprovision every environment. These problems accumulate slowly at first. Then they become unmanageable once AI workloads reach production scale. This is the moment when teams start questioning the managed model itself.

The Convergence on Postgres for Modern Development

Almost every major database vendor now talks about PostgreSQL compatibility. Some treat it as simple marketing. They feel FOMO and want to “jump on the Postgres ship.” It is unclear how their offerings add value to the already competitive Postgres market, but they make the jump first and worry about the go-to-market strategy later. Others rebuild their entire engine around Postgres, because they anticipate developers’ needs. Developers want a stable and well-understood SQL system. They want strong transactions.
They want predictable joins. They want broad tooling support. They want a database that does not lock them into a single company or architecture. They want open source.

Postgres delivers all of this without forcing teams into a specialized model. It is flexible enough to serve as an OLTP engine. It can handle analytics. It can store vectors. It can run time-series workloads. It can serve as a cache. It has extensions for almost everything. It has decades of refinement that newer systems cannot match. And it is production-proven and rock solid.

AI strengthens this convergence. AI teams want fewer moving parts. They want simpler pipelines. They want transactional safety combined with analytical capability, because they do not have time to figure out new database architectures. They want to move fast in this emerging market. They want vector search without maintaining a separate vector store. They want to test new features on real data without complex data-sync jobs. They want to query across data models. Postgres gives them the opportunity to unify these workloads in one place.

I see more teams removing entire layers of their data stack because they realize that Postgres can handle the vast majority of their needs with the right infrastructure behind it. They get lower latency. They get fewer operational surprises. They get a simpler development workflow. Most importantly, they get a single, well-understood data system that fits both the application and the AI pipeline. The shift is not theoretical. It is visible in product roadmaps across the industry.

Why Managed Postgres Cannot Handle AI Scale

We have now established that Postgres is the new center of gravity.
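The unified-workload point above, vector search living next to relational data, can be made concrete. Below is a hedged sketch of what the pgvector extension's `<->` (Euclidean distance) operator computes; the SQL schema is purely illustrative (table and column names are my assumptions, not from the article), and the pure-Python function only mimics the ORDER BY ranking.

```python
import math

# Illustrative SQL only; names like `items` and `embedding` are assumptions.
PGVECTOR_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
-- nearest neighbors to a query embedding, ranked by Euclidean distance:
SELECT id FROM items ORDER BY embedding <-> '[1,0,0]' LIMIT 2;
"""

def l2(a, b):
    # Euclidean distance between two embeddings, what `<->` computes
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(rows, query, k):
    # Rank (id, embedding) rows by distance to the query, like the ORDER BY
    return [row_id for row_id, emb in sorted(rows, key=lambda r: l2(r[1], query))][:k]

rows = [
    (1, [1.0, 0.0, 0.0]),
    (2, [0.0, 1.0, 0.0]),
    (3, [0.9, 0.1, 0.0]),
]
print(knn(rows, [1.0, 0.0, 0.0], 2))  # the two nearest embeddings: [1, 3]
```

Because this runs in the same transactional engine as the rest of the schema, the embeddings stay consistent with the application data they describe, which is exactly the "no separate vector store" appeal.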
The next question is where and how to run it. For years, the default answer was simple: Use RDS. Use Aurora. Use Cloud SQL. The pitch was equally simple: Let someone else run Postgres. “The days of DBAs are gone,” they said. Most developers liked this idea. It removed infrastructure responsibility from the critical path. It reduced operational overhead. It shifted the burden of managing databases to the cloud vendor.

But the model has a hidden constraint. A managed database means a one-size-fits-all solution. Users depend heavily on network storage. They accept network latency. They accept fixed IOPS limits. They accept multisecond cold starts. They accept the cost structure that comes with these designs. These trade-offs made sense 10 years ago. But why should you have to pay for IOPS in 2025? The pricing model still treats IOPS as scarce, even though modern Non-Volatile Memory Express (NVMe) storage changes the equation.

AI workloads demand extremely fast storage and predictable performance. They also require large and frequent database clones for testing and experimentation. Managed databases struggle in both areas. The internal storage layers of managed systems create unavoidable bottlenecks, and the cloning mechanisms depend on snapshot-restore cycles or full-blown physical copies. Both approaches are slow and expensive, especially at scale.

Once a team hits these limits, the only fix is overprovisioning. You keep increasing the instance size. You maintain oversized replicas. You run full staging environments 24 hours a day, even when they sit idle. Your costs grow faster than your product. This is the opposite of what teams want in the AI era, and it is the point where they begin looking for alternatives that give them the full power of Postgres without the restrictions of managed systems.

The Rise of BYOC Postgres

I see a new pattern emerging across teams building serious AI features. They want Postgres in their own cloud account.
They want control over compute and storage. They want to colocate data with GPUs. They want unlimited IOPS. But first and foremost, they still want the benefits of an automated experience that gives them backups, replication and monitoring.

This is the BYOC model. It is not traditional self-hosting. It is a managed platform that runs inside your own cloud environment. You keep full control over infrastructure. You keep your cloud discounts. You keep your security posture. You also keep control over where data physically lives, which matters for data residency and regulatory requirements.

The model aligns naturally with compliance frameworks like SOC 2, HIPAA, GDPR and CCPA. Data never leaves your account. Encryption is handled with your own keys, and key management integrates with your existing key management service (KMS) setup. Tenant isolation follows the same boundaries you already trust across the rest of your infrastructure. The platform takes care of operational complexity like backups, replication, upgrades and failure handling, while you stay in control of policies, access and audit boundaries. For many teams, this is the first time managed Postgres actually fits their security and compliance model instead of fighting it.

How Data Locality and Local Storage Improve Performance

Adding to the BYOC benefits, with the right tooling this model resolves the performance problem by removing the networked-storage bottleneck. Solutions such as Vela let you deploy Postgres on the same instance as your storage, leveraging the speed of local NVMe devices attached to the instance. Using distributed simplyblock storage under the hood, it provides resilience and scalability, as well as copy-on-write functionality, which are otherwise not available with local storage. And all of that is deployed and managed in your own cloud; all you need to do is provision a cloud instance with local NVMe devices.

The results? Storage latency drops into the microsecond range.
IOPS limits disappear. Parallel ingestion becomes not only practical but required to reach the...
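A quick back-of-envelope shows why that latency difference matters for latency-bound database work. The figures below are assumed ballpark numbers (roughly 1 ms per I/O on network-attached volumes, roughly 100 µs on local NVMe), not measurements from the article.

```python
# Assumed ballpark latencies, not measurements: network-attached storage
# round trips sit around a millisecond; local NVMe reads around 100 µs.
NETWORK_ATTACHED_S = 1e-3
LOCAL_NVME_S = 100e-6

def dependent_reads_per_sec(latency_s):
    # A chain of dependent reads (e.g. a B-tree index descent, where each
    # read determines the next page to fetch) is capped at 1/latency,
    # regardless of how many IOPS the device could do in parallel.
    return 1.0 / latency_s

for name, lat in [("network-attached", NETWORK_ATTACHED_S),
                  ("local NVMe", LOCAL_NVME_S)]:
    print(f"{name}: ~{dependent_reads_per_sec(lat):,.0f} dependent reads/s per backend")
```

Under these assumptions a single backend doing dependency-bound reads gets roughly an order of magnitude more headroom on local NVMe, which is why colocating Postgres with its storage pays off for AI-style ingestion and search.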

Source: This article was originally published on The New Stack
