How do you/Airbnb handle deeply linked features (2-hop+) that are also latency sensitive? Maybe I'm missing something, but I can't see how that works with the transformation DSL described in Chronon.
For our org, those are by far the most complicated to handle. Graph DBs scale poorly for this, while storing the state inside stream processing jobs is far too large/expensive. These features would also be built on top of API sources, which leads us to the unfortunate "log & wait" approach for our most important features.
In the API itself, you can express the chain links by specifying the source.
To be precise: a GroupBy (the aggregation primitive) can have a Join (the enrichment primitive) as its source. In other words, you can enrich first, then aggregate, and continue this chain indefinitely.
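To make the chaining concrete, here is a toy sketch in plain Python (not the actual Chronon API; the function and field names are illustrative): hop 1 is the Join step that enriches raw events with a dimension table, hop 2 is the GroupBy step that aggregates the enriched rows, and the output could itself feed another Join.

```python
from collections import defaultdict

def enrich(events, dim, key):
    """Join step: attach dimension attributes to each event."""
    return [{**e, **dim.get(e[key], {})} for e in events]

def aggregate(rows, group_key, value_key):
    """GroupBy step: sum value_key per group_key."""
    out = defaultdict(float)
    for r in rows:
        out[r[group_key]] += r[value_key]
    return dict(out)

# Example 2-hop feature: per-merchant-category spend.
events = [
    {"user": "u1", "merchant": "m1", "amount": 10.0},
    {"user": "u1", "merchant": "m2", "amount": 5.0},
]
merchants = {"m1": {"category": "food"}, "m2": {"category": "food"}}

enriched = enrich(events, merchants, "merchant")      # hop 1: Join
features = aggregate(enriched, "category", "amount")  # hop 2: GroupBy
# features == {"food": 15.0}
```

In the real system the join and the aggregation would each be a declared pipeline stage rather than an in-memory loop, but the shape of the chain is the same.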
> Graph DBs are kind of scaling poorly
That makes sense. Scaling these on the read side is much, much harder than pre-computing on the write side, which is what Chronon lets you do.
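A toy contrast of the two approaches (plain Python, names are illustrative, not Chronon's API): pre-computing on the write side means every event updates the final answer as it arrives, so the latency-sensitive read path degenerates to an O(1) key-value lookup instead of a multi-hop traversal at request time.

```python
class WriteSideCounter:
    """Pre-compute on write: each incoming event folds into the
    stored aggregate, so serving is a single KV lookup."""

    def __init__(self):
        self.totals = {}

    def on_event(self, key, amount):
        # Streaming write path: update the pre-computed value.
        self.totals[key] = self.totals.get(key, 0) + amount

    def fetch(self, key):
        # Latency-sensitive read path: O(1) lookup, no traversal.
        return self.totals.get(key, 0)

c = WriteSideCounter()
c.on_event("u1", 10)
c.on_event("u1", 5)
# c.fetch("u1") == 15
```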
Offline is pretty easy to get started with. It should take less than a week to set up for new use-cases across the company, and once offline is set up you can begin building training sets.
Online is a bit more involved: you need a month or more to test that your KV store scales against the read and write traffic coming from Chronon.
Bighead is the model training and inference platform.
Chronon is a full rewrite of Zipline with
1) a different underlying algorithm for time-travel to address scalability concerns.
2) a different serde and fetching strategy to address latency concerns.
I noticed Airflow is the backing orchestration service. Was there any consideration of other orchestration tools? I know Airbnb has at least two internally, but also that Airflow is still the predominant one for the data org.
I'm also curious how you went from a non-platformized approach to adopting this platform: what were the important insights for strategizing, prioritizing, and motivating teams to lift existing pipelines into the new thing? Open-ended question.
The two biggest pain points that motivated teams were:
- the inability to back-test new real-time features. People were forced to log-and-wait for months to create training sets; Chronon reduces this to hours or days.
- the difficulty of building the lambda system (batch pipeline, streaming pipeline, index, serving endpoint) for every feature group. In Chronon, you simply set a flag on your feature definition to spin up the lambda system.
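As a sketch of that flag, a feature definition might look like the following config fragment. The import paths and field names follow Chronon's open-source Python API as I understand it; treat the exact names, and the `purchases` source, as assumptions rather than a verified definition.

```python
from ai.chronon.group_by import GroupBy, Aggregation, Operation

purchase_features = GroupBy(
    sources=[purchases],  # assumed: a batch table paired with its streaming topic
    keys=["user_id"],
    aggregations=[
        Aggregation(input_column="amount", operation=Operation.SUM),
    ],
    # This single flag asks the platform to stand up the whole lambda
    # stack: batch upload, streaming job, KV index, serving endpoint.
    online=True,
)
```

Without the flag, the same definition only materializes offline tables for training-set generation.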