As someone who works in a data domain, I'd say it's unlikely ads are served on a single-conversation basis in the near future, if they even are today. Any modern data org like advertising optimizes conversion metrics (presumably either increasing profit via higher CPI or increasing revenue by growing the advertising TAM).
Introducing context beyond the immediate conversation history will improve conversion rates and allow targeted advertising toward wider topics or higher-CPI topics (like financial products), hence it's inevitable.
What's with the dismissiveness? The author is a senior staff engineer at a huge company & has worked in this space for years. I'd suspect they've done their diligence...
I'm also curious how you went from a non-platformized approach to adopting this platform. What were the important insights for strategizing, prioritizing, and motivating teams to lift existing pipelines into the new system? Open-ended question.
- the inability to back-test new real-time features. People were forced to log-and-wait for months to create training sets. Chronon reduces this to hours or days.
- the difficulty of building the lambda system (batch pipeline, streaming pipeline, index, serving endpoint) for every feature group. In Chronon, you simply set a flag on your feature definition to spin up the whole lambda system.
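For illustration, a feature definition along these lines looks roughly like the following (a sketch paraphrased from Chronon's open-source Python API; the table/topic names are hypothetical and exact field names may differ by version):

```python
# Sketch of a Chronon GroupBy definition; names like "data.purchases"
# and "events.purchases" are illustrative, not real datasets.
from ai.chronon.api.ttypes import Source, EventSource, Window, TimeUnit
from ai.chronon.group_by import GroupBy, Aggregation, Operation
from ai.chronon.query import Query, select

purchases_source = Source(events=EventSource(
    table="data.purchases",    # batch table backing the offline pipeline
    topic="events.purchases",  # Kafka topic backing the streaming pipeline
    query=Query(
        selects=select("user_id", "purchase_price"),
        time_column="ts",
    ),
))

purchases_by_user = GroupBy(
    sources=[purchases_source],
    keys=["user_id"],
    aggregations=[
        Aggregation(
            input_column="purchase_price",
            operation=Operation.SUM,
            windows=[Window(length=7, timeUnit=TimeUnit.DAYS),
                     Window(length=30, timeUnit=TimeUnit.DAYS)],
        ),
    ],
    # This one flag is the "spin up the lambda system" part: it provisions
    # the streaming job, the index, and the serving endpoint alongside
    # the batch pipeline.
    online=True,
)
```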
How do you/Airbnb handle deeply linked features (2-hop+) that are also latency-sensitive? Maybe I'm missing something, but I don't see how that fits the transformation DSL described in Chronon.
For our org, those are by far the most complicated to handle. Graph DBs scale poorly for this, while storing the state in stream-processing jobs is far too large/expensive. Those features would also be built on top of API sources, which leads us to the unfortunate log-and-wait approach for our most important features.
In the API itself, you can specify the chain links by specifying the source.
To be precise, a GroupBy (the aggregation primitive) can have a Join (the enrichment primitive) as a source. In other words, you can enrich first, then aggregate, and continue this chain indefinitely.
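A rough sketch of what that chaining could look like (assuming a `JoinSource` wrapper as in Chronon's chaining support; the join name, column names, and exact field names here are assumptions, so check the docs for your version):

```python
# Hypothetical sketch of chaining: enrich with a Join, then aggregate.
from ai.chronon.api.ttypes import Source, JoinSource
from ai.chronon.group_by import GroupBy, Aggregation, Operation
from ai.chronon.query import Query, select

# First hop: a previously defined Join that enriches raw purchase events
# with, e.g., merchant-level features (purchase_enrichment_join is
# hypothetical and would be defined elsewhere in the feature repo).
enriched = Source(joinSource=JoinSource(
    join=purchase_enrichment_join,
    query=Query(selects=select("user_id", "merchant_risk_score")),
))

# Second hop: aggregate over the enriched output, keyed by user.
risk_by_user = GroupBy(
    sources=[enriched],
    keys=["user_id"],
    aggregations=[Aggregation(input_column="merchant_risk_score",
                              operation=Operation.AVERAGE)],
    online=True,
)
```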
> Graph DBs are kind of scaling poorly
That makes sense. Scaling these on the read side is much harder than pre-computing on the write side, which is what Chronon allows you to do.
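To make the read-vs-write trade-off concrete, here's a toy, library-free sketch (not Chronon code, and the feature itself is invented): instead of traversing user → merchant edges at read time, each incoming event fans its delta out to every affected key at write time, so serving is a single lookup.

```python
from collections import defaultdict

# Toy write-side pre-aggregation for a 2-hop feature:
# "total spend of users who follow merchant M".
follows = defaultdict(set)                    # user -> merchants they follow
user_spend = defaultdict(float)               # user -> lifetime spend
merchant_follower_spend = defaultdict(float)  # pre-computed 2-hop aggregate

def on_follow(user, merchant):
    follows[user].add(merchant)
    # Fold the new follower's existing spend into the merchant aggregate.
    merchant_follower_spend[merchant] += user_spend[user]

def on_purchase(user, amount):
    user_spend[user] += amount
    # Fan the delta out to every merchant the user follows: the 2-hop
    # cost is paid here, on the write path.
    for merchant in follows[user]:
        merchant_follower_spend[merchant] += amount

def serve(merchant):
    # The read path is a single O(1) lookup; no graph traversal
    # at request time.
    return merchant_follower_spend[merchant]

on_follow("alice", "m1")
on_purchase("alice", 30.0)
on_follow("bob", "m1")
on_purchase("bob", 12.5)
serve("m1")  # 42.5
```

The trade-off is exactly the one discussed above: write amplification (one purchase touches every followed merchant) in exchange for cheap, latency-friendly reads.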
This isn't really a drop-in replacement; they don't offer transforms out of the box.
Admittedly, some of the transforms proposed in this article are a little simple and don't represent the full space of feature-engineering requirements for all large orgs.
Actually, Feast does support transformations, depending on the source. It supports transforming data on demand and via streaming. It doesn't support batch transformations only because, technically, that should just be an upload, but we can revisit that decision.