I assume that is their main database for everything, not just for pub/sub. One of the big benefits of doing it that way is that you have proper transaction handling across jobs and their related data.
Potential and actual usage aren't related. They might have a lot of records and reads/writes, but maybe the actual pub/sub isn't that intensive. They seem to be using the same DB for everything.
I think the point of interest was 32 cores to handle what sounds like 10 messages per second at most. That's not really a ton of throughput... It's certainly a valid point that an awful lot of use cases don't need Twitter-scale firehoses or Google-size Hadoop clusters.
Ah, the database does a lot more than just pub/sub - especially since the high traffic pub/sub goes through redis. I guess my point was that we never regretted setting up postgres as the "default job queue" and it never required much engineering work to maintain.
For example, it handles Stripe webhooks when users change their pricing tier - if you drop that message, users would be paying for something they wouldn't receive.
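To make the "transactions across jobs and their related data" benefit concrete, here's a sketch of that kind of guarantee, with hypothetical `subscriptions` and `jobs` tables (names are illustrative, not from the original system): the job row commits atomically with the data change, so a processed webhook can never be half-applied.

```sql
-- Hypothetical schema; table and column names are illustrative.
BEGIN;

-- Apply the state change from the Stripe webhook...
UPDATE subscriptions
   SET tier = 'pro'
 WHERE user_id = 42;

-- ...and enqueue the follow-up work in the same transaction.
INSERT INTO jobs (kind, payload)
VALUES ('provision_tier', '{"user_id": 42, "tier": "pro"}');

COMMIT;  -- both rows become visible together, or neither does
```

With a separate broker you'd need two-phase commit or an outbox pattern to get the same guarantee; here it falls out of ordinary transactional behavior.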
It said nothing about the distribution of traffic. It might well be thousands and thousands of pub sub messages at some point of the day and 0 for others.
you 100% could, and this thread feels like the twilight zone with how many people are advocating for using an RDBMS for (what seems like) most people's queuing needs.
I'm not underestimating anything. I am advocating for the right tool for the job. I have a hard time believing, despite the skewed sample size in this thread, that most people think using postgres as a message queue for most cases makes the most sense.
I've personally written a real-time back-of-house order-tracking system with Rails and Postgres pub/sub (no Redis!), and a record-synchronization queuing system built on a table and some clever lock semantics that has been running in production for several years now -- and which marketing relies upon as it oversees 10+ figures of yearly topline revenue.
Neither of those projects were FAANG scale, but they work fine for what is needed and scale relatively cleanly with postgres itself.
Besides, in a lot of environments corporate will only approve the use of certain tools. And if you already have one approved that does the job, then why not?
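For readers unfamiliar with it, the "postgres pubsub" mentioned above is presumably Postgres's built-in LISTEN/NOTIFY; a minimal sketch (channel name and payload are made up):

```sql
-- Session A: subscribe to a channel (the channel name is arbitrary).
LISTEN order_updates;

-- Session B: publish a payload; every listening session is notified
-- when the publishing transaction commits.
NOTIFY order_updates, 'order 1234 ready for pickup';

-- Equivalent function form, which is convenient inside triggers:
SELECT pg_notify('order_updates', 'order 1234 ready for pickup');
```

Because notifications are delivered on commit, subscribers never see messages from rolled-back transactions -- something a separate broker can't give you for free.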
Most senior+ engineers that I know would hear that and recoil. Getting "clever" with concurrency handling in your home-rolled queuing system is not something that coworkers, especially more senior coworkers, will appreciate inheriting, adapting, and maintaining. Believe me.
I get that you're trying to flex some cool thing that you built, but it doesn't really have any bearing on the concept of "most cases" because it's an anecdote. Queuing systems are a thing for a reason, and in most cases, using them makes more sense than writing your own.
> Most senior+ engineers that I know would hear that and recoil. Getting "clever" with concurrency handling in your home-rolled queuing system is not something that coworkers, especially more senior coworkers, will appreciate inheriting, adapting, and maintaining. Believe me.
I am both a "senior+ engineer" that has inherited such systems and an author of such systems. I think you're overreacting.
Concurrency Control (i.e., "lock semantics") exists for a reason: correctness. Using it for its designed purpose is not horror. Yes, like any tool, you need to use it correctly. But you don't just throw away correctness because you don't want to learn how to use the right tool properly.
I have inherited poorly designed concurrency systems (in the database); yes, I recoiled in horror and did not appreciate it. So you know what I did? I fixed the design, and documented it to show others how to do it correctly.
I have also inherited out-of-the-box "Queuing Systems" that could not possibly be correct because they weren't integrated into the DB's built-in and already-used correctness machinery: transactions and concurrency control. Those were always more horrific than poorly-implemented in-DB solutions. Integrating two disparate stores is always more trouble than just fixing one single source.
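To make the point concrete, here's a minimal sketch of the kind of in-database concurrency control being defended, against a hypothetical `accounts` table: the row lock serializes the read-modify-write so two concurrent workers can't both act on the same stale value.

```sql
BEGIN;

-- Take a row lock; a concurrent transaction running the same
-- statement blocks here until we commit, so it can never read
-- (and act on) a stale balance.
SELECT balance FROM accounts WHERE id = 7 FOR UPDATE;

-- Decide what to do in application code, then apply the change.
UPDATE accounts SET balance = balance - 100 WHERE id = 7;

COMMIT;  -- releases the row lock
```

That's the whole trick: correctness comes from one `FOR UPDATE`, not from a pile of application-level cleverness.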
----
> I get that you're trying to flex some cool thing that you built, but it doesn't really have any bearing on the concept of "most cases" because it's an anecdote. Queuing systems are a thing for a reason, and in most cases, using them makes more sense than writing your own.
I get that you're trying to flex that you use turnkey queueing systems, but it doesn't really have any bearing on the concept of "most cases", because all you've presented are assertions without backing. Queuing systems are good for a specific kind of job, but when you need relational logic you'd better use one that supports it. And despite what MongoDB and the NoSQL crowd have been screaming hoarsely for the past decade, in most cases, you have relational logic.
Well, you'd have to see it before you judge. It's super simple, like 5 or 10 lines total, and handles 1000x+ the traffic it sees. In any case, concurrency is nothing to be afraid of. Do they not teach dining philosophers any more?
My point is that postgres is a swiss army knife and you and anyone else would be remiss to not fully understand what it is capable of and what you can do with it. Entire classes of software baggage can be eliminated for "most" use cases. One could even argue that reaching for all these extra fancy specialized tools is a premature optimization. Plus, who could possibly argue against having fewer moving parts?
I guess the clever lock semantics are SKIP LOCKED, which is designed to support efficient queues. The cleverness is inside PostgreSQL rather than in the application, other than the cleverness of knowing about this feature. https://www.2ndquadrant.com/en/blog/what-is-select-skip-lock...
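The pattern from the linked post, sketched against a hypothetical `jobs(id, payload, done)` table: each worker atomically claims the oldest unclaimed job, and `SKIP LOCKED` makes workers pass over rows a peer has already locked instead of blocking on them.

```sql
-- One worker's claim; real systems would typically mark the job
-- 'running' here and 'done' only after processing succeeds.
UPDATE jobs
   SET done = true
 WHERE id = (
         SELECT id
           FROM jobs
          WHERE NOT done
          ORDER BY id
          LIMIT 1
            FOR UPDATE SKIP LOCKED  -- skip rows other workers hold
       )
RETURNING id, payload;
```

Run the same statement from N workers and each gets a distinct job with no duplicate delivery and no lock contention -- the "cleverness" really is inside PostgreSQL.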
No, you are misunderstanding. People are saying Postgres does message brokering quite well. That makes it the right tool for the job for many people. You have a hard time believing it, but people who have actually done it are saying otherwise. This is your misunderstanding.
There is also the issue of having to have up to n experts for n different "best tools". Programmer/devops time is expensive; the tool choice is not the only (and often the least) cost to consider.
Making every system as complex as you can with tech you are not really familiar with: is that a good plan for your small team? Under 100 people, your company does not have 100 devops engineers to make sure all these 'best of breed' tools are actually managed properly in production. If a service on top of Postgres dies, I will find out why very quickly; on Kafka, even though I have used it a bunch of times, I usually have no clue -- just restart and pray. Why would I force myself to use another tool when Postgres actually works well enough? Resume-driven development?
Sometimes I agree with "best tool for the job" -- when the constraints make something a very clear winner. If the difference is marginal for the particular case at hand, I pick what I/we know (I would actually argue that IS the best tool for the job; in absolute "what could happen in the future" terms it probably is not).
Postgres happens to be a very good hammer, thank you very much. You should try it sometime.
But seriously though, postgres's relational logic implementation makes for a very good queueing system for most cases. It's not a hack that's bolted on top. I know that's how quite a few "DBs" are designed and implemented, and maybe you've been burned by too many of them, but Postgres is solid. I've seen it inside and out.
I think you're helping bring balance to the enthusiasm here for using Postgres as a multi-purpose tool. However, there is a lot of room for you and the advocates favoring Postgres to both be right about tooling. I adopted RabbitMQ because I decided I didn't want to grow into needing it by dealing with many of the problems that motivated engineers to bring RabbitMQ into existence. However, I probably would have been fine with Postgres-pubsub, or Redis-pubsub/streams, both databases that I already used for their general purpose and have established capabilities for messaging. I noticed your earlier agreement with the person who mentioned using Firebase, and Firebase is yet another multi-purpose tool good enough at many things but still not better than the customized domain systems.
If you agree with the claim for Firebase, others can now agree about Supabase. It's all Postgres beneath, though.
Agree with your point about multiple tools being good enough, but IMO Firebase is not one of them. In my experience, despite its claim of excellent scaling, it performs worse than even a small Postgres instance. It's good at the "real-time subscriptions", but that's about it.
Limit the types of server used to reduce system complexity. If you can have all your business state in the same place, ops are much easier.
Kafka does more for streaming data, but doesn't do squat for relational data. You always need a database, but you sometimes can get by without a queuing system.
I imagine their scaling problem isn't messages/day, it's probably lots of concurrent, persistent connections. And I don't think a connection pooler would work with this job queue setup.
I certainly appreciate the sentiment though I'm pretty sure I don't have the same reliability and uptime guarantees on my little Rpi3/MQTT/NodeRed/SQLite/ESP8266 home system :-)
That said, it's been running for upwards of 4 years and accumulated an insane number of temperature readings inside and above heating vents (heat source is heat pump)
    SELECT count() as count FROM temperatures : msg : Object
    { _msgid: "421b3777.908118", topic: "SELECT count() as count FROM …", payload: 23278637 }
Ok, I need therapy for my data hoarding - 23 million temp samples is not a good sign :-)
It's a hobby diversion, so minimal effort is a factor. That, and that SQL query comes back in seconds. The initial experimentation was with ESP8266s, and the MQTT/NodeRed/SQLite stack played a supporting role.
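For scale, the setup above amounts to something like this (the schema is a guess at what the hobby system stores; SQLite syntax):

```sql
-- Illustrative schema for the temperature readings described above.
CREATE TABLE IF NOT EXISTS temperatures (
    id       INTEGER PRIMARY KEY,
    sensor   TEXT NOT NULL,   -- e.g. 'vent_livingroom'
    taken_at TEXT NOT NULL,   -- ISO-8601 timestamp
    celsius  REAL NOT NULL
);

INSERT INTO temperatures (sensor, taken_at, celsius)
VALUES ('vent_livingroom', '2023-01-01T12:00:00', 21.5);

-- The count query from the post, in its standard spelling:
SELECT count(*) AS count FROM temperatures;
```

Tens of millions of rows in a table this narrow is well within SQLite's comfort zone on a Pi-class machine, which matches the "comes back in seconds" observation.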
My experience with SQLite is that it can take you a long ways before needing to look elsewhere.
The IoT hubs are embedded systems, built with a minimal memory footprint and overhead; 512 MB of RAM is typical, sometimes less. Here is an example: https://www.gl-inet.com/products/gl-s1300/
That means you can't have Docker and different versions of Java, Node, and .NET all running in parallel.
You run a single process, and SQLite is a library, so the SQL engine and the database are built right in. Your 'budget' is something like 100 MB of RAM, because other stuff has to run too.
All the time-series databases I know are large, memory-hungry hippos, built for distributed compute/Kubernetes. Just a very different use case. If one were built with minimalism in mind, then it could be used.
Are you implying that given the specs, hundreds of thousands of messages per day is not good enough? I think you are, or at least that is what I was thinking myself.
Only for hundreds of thousands of messages per day, that's way too big of a server. But if you look on the rest of the thread, it doesn't do only that.
Anyway, for a server that only does pub/sub with ACID guarantees, those specs are so large that some other bottleneck would certainly be hit before they mattered. So it wouldn't be strange if somebody got one that couldn't even handle that load; it would just mean there is some issue somewhere we don't see.
I guess the point is that the scale is actually not that large, but that's perfectly ok because most problems will never need that large scale either.
In fact, the article makes a very good point: just doing it in Postgres is great, it doesn't really scale (because of ACID), and adapting it for scale once you actually need to will lead to a better design than what you would get if you started optimizing without any information.
People are jumping on this. The question is: do the resource requirements outlined align with the usage you described, or is that a combined workload? By combined workload, I mean working set plus messaging. It's not a useful exercise to criticize a multifaceted service based on a single use case. Full disclosure: I'm not a Postgres user, nor am I invested in the tech.
> The postgres instance now runs on 32 cores and 128gb of memory and has scaled well.
Am I the only one?