
Tree Traversals in TypeScript

In a search engine, we might want to slice and dice the search results that come back along three different facets: content type, content level, and time frame.

// what is a facet?
type ContentType = 'videoCourse' | 'interactiveCourse' | 'project'
type Level = 'beginner' | 'intermediate' | 'advanced'
type TimeFrame = 'lastSixMonths' | 'lastYear' | 'older'

When we apply a filter using one of these facets, we want it to affect the counts when we break results down along some other facet, but not when we look at the counts for the filtered facet itself. For example, when filtering to intermediate courses, we don't want that filter applied while counting how many beginner courses there are, or the answer would always be zero.

// what does an aggregation look like?
type EsAggregation = {
  aggregations: {
    contentType: {
      buckets: {
        key: ContentType
        levels: {
          buckets: {
            key: Level
            timeFrame: {
              buckets: {
                key: TimeFrame
                doc_count: number
              }[]
            }
          }[]
        }
      }[]
    }
  }
}
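
To make that shape concrete, here is a small made-up aggregation to run the rest of the code against; the exampleAgg name and all of the numbers are invented purely for illustration.

// a made-up aggregation response, just to have something concrete to point at
const exampleAgg: EsAggregation = {
  aggregations: {
    contentType: {
      buckets: [
        {
          key: 'videoCourse',
          levels: {
            buckets: [
              {
                key: 'beginner',
                timeFrame: {
                  buckets: [
                    { key: 'lastSixMonths', doc_count: 2 },
                    { key: 'lastYear', doc_count: 3 },
                    { key: 'older', doc_count: 7 }
                  ]
                }
              },
              {
                key: 'advanced',
                timeFrame: {
                  buckets: [
                    { key: 'lastSixMonths', doc_count: 1 },
                    { key: 'lastYear', doc_count: 1 },
                    { key: 'older', doc_count: 4 }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
}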

We know that down at the leaf nodes of this tree we have all of the information we need to produce any of the counts we are interested in. In particular, a leaf node (after processing) might look something like this.

// an aspirational type. All the context we could have down at the bottom of the tree.
type Leaf = {
  contentType: ContentType
  level: Level
  timeFrame: TimeFrame
  count: number
}

If we can convert the Elasticsearch aggregation into a flat list of leaf nodes in this form, it becomes much easier to ask questions about counts broken down by any given facet.

const flatten = ({
  aggregations: {
    contentType: { buckets: ctBuckets }
  }
}: EsAggregation): Leaf[] =>
  ctBuckets.reduce<Leaf[]>(
    (ctLeafs, { key: contentType, levels: { buckets: levelBuckets } }) =>
      levelBuckets.reduce(
        (levelLeafs, { key: level, timeFrame: { buckets: timeBuckets } }) =>
          timeBuckets.reduce(
            (timeLeafs, { key: timeFrame, doc_count: count }) => [
              ...timeLeafs,
              { contentType, level, timeFrame, count }
            ],
            levelLeafs
          ),
        ctLeafs
      ),
    []
  )

The idea here is that at each step down the tree, we pull the information we want from that level into scope, and it remains available at the leaf level through closure. When we reach a leaf, we reach up to the timeBucket in context for a TimeFrame and a count, to the levelBucket for a Level, and to the ctBucket for a ContentType.
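
Run against the made-up exampleAgg from above, flatten would produce one leaf per innermost time bucket, each one carrying the content type and level it was nested under:

flatten(exampleAgg)
// => [
//   { contentType: 'videoCourse', level: 'beginner', timeFrame: 'lastSixMonths', count: 2 },
//   { contentType: 'videoCourse', level: 'beginner', timeFrame: 'lastYear', count: 3 },
//   { contentType: 'videoCourse', level: 'beginner', timeFrame: 'older', count: 7 },
//   { contentType: 'videoCourse', level: 'advanced', timeFrame: 'lastSixMonths', count: 1 },
//   { contentType: 'videoCourse', level: 'advanced', timeFrame: 'lastYear', count: 1 },
//   { contentType: 'videoCourse', level: 'advanced', timeFrame: 'older', count: 4 }
// ]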

This process could also be seen as “denormalizing” the tree. We are taking information that is held in common by the interior nodes of the tree, and duplicating it down to every leaf node. The result is a flat structure that is much easier to iterate over, but that contains duplicate information. What if we just wanted to describe how to iterate over the structure without explicitly flattening it?

// What context do we want from a leaf?
type Reducer<T> = (state: T, leaf: Leaf) => T

// How do we put the reducer in that context?
const reduce = <T>(
  {
    aggregations: {
      contentType: { buckets: ctBuckets }
    }
  }: EsAggregation,
  reducer: Reducer<T>,
  initialState: T
): T =>
  ctBuckets.reduce<T>(
    (ctState, { key: contentType, levels: { buckets: levelBuckets } }) =>
      levelBuckets.reduce(
        (levelState, { key: level, timeFrame: { buckets: timeBuckets } }) =>
          timeBuckets.reduce(
            (timeState, { key: timeFrame, doc_count: count }) =>
              reducer(timeState, { timeFrame, level, contentType, count }),
            levelState
          ),
        ctState
      ),
    initialState
  )

This is structurally the same, but now generic over the reducing operation. The type parameter lets us tie together the initial state type, the output type, and the signature of the reducer. In fact, we can get flatten back by providing an appropriate reducer.

const flatten = (agg: EsAggregation): Leaf[] =>
  reduce<Leaf[]>(agg, (leafs, leaf) => [...leafs, leaf], [])

What the reduce function is allowing us to do here is separate the complex logic of how we iterate over this structure from the comparatively simple logic of what we want to do with the result of that iteration. We have created a context in which we have all of the information we need, and the reduce function allows us to run arbitrary code within that context. Now it becomes easy to get the counts broken down by any given facet.

// counts broken down by a facet
type CountsBy<T extends string> = { [key in T]: number }

const countsByTime = (agg: EsAggregation): CountsBy<TimeFrame> =>
  reduce(
    agg,
    (countsSoFar, { timeFrame, count }) => ({
      ...countsSoFar,
      [timeFrame]: countsSoFar[timeFrame] + count
    }),
    { lastSixMonths: 0, lastYear: 0, older: 0 }
  )
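
With the made-up exampleAgg from earlier, this just sums each time bucket across the whole tree:

countsByTime(exampleAgg)
// => { lastSixMonths: 3, lastYear: 4, older: 11 }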

The related functions for the other facets are a little trickier because our time bins overlap, and our reducer needs to account for that to avoid double counting.

const countsByLevel = (agg: EsAggregation): CountsBy<Level> =>
  reduce(
    agg,
    (countsSoFar, { level, timeFrame, count }) =>
      timeFrame === 'older'
        ? {
            ...countsSoFar,
            [level]: countsSoFar[level] + count
          }
        : countsSoFar,
    { beginner: 0, intermediate: 0, advanced: 0 }
  )
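
A breakdown by content type can follow exactly the same pattern. Here is a sketch, assuming the same 'older'-bucket convention that countsByLevel relies on (the countsByContentType name is chosen here just for illustration):

// counts broken down by content type, using the same 'older'-bucket trick
const countsByContentType = (agg: EsAggregation): CountsBy<ContentType> =>
  reduce(
    agg,
    (countsSoFar, { contentType, timeFrame, count }) =>
      timeFrame === 'older'
        ? {
            ...countsSoFar,
            [contentType]: countsSoFar[contentType] + count
          }
        : countsSoFar,
    { videoCourse: 0, interactiveCourse: 0, project: 0 }
  )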

How do we ask about the counts given some set of filters? Simple: we just add that filtering logic to our reducing function.

type Filter = {
  contentType: ContentType[]
  level: Level[]
  timeFrame: TimeFrame[]
}
const inFilter = ({ contentType, level, timeFrame }: Leaf, filter: Filter): boolean =>
  (filter.contentType.length === 0 || filter.contentType.includes(contentType)) &&
  (filter.level.length === 0 || filter.level.includes(level)) &&
  (filter.timeFrame.length === 0 || filter.timeFrame.includes(timeFrame))

const filteredCountsByLevel = (agg: EsAggregation, filter: Filter): CountsBy<Level> =>
  reduce(
    agg,
    (countsSoFar, leaf) =>
      inFilter(leaf, { ...filter, level: [] })
        ? {
            ...countsSoFar,
            [leaf.level]: countsSoFar[leaf.level] + leaf.count
          }
        : countsSoFar,
    { beginner: 0, intermediate: 0, advanced: 0 }
  )

Again, we separate our concerns: one function for deciding whether a leaf satisfies a filter, one function for counting up over leaves, and one function for finding the leaves we want to traverse. Each individual function is easy to describe and therefore easy to test. Even better, together they give us a vocabulary that makes their composition easy to describe: "count the items under each leaf of the tree that satisfies the filter". Setting the level in the filter to the empty array might initially seem like a wart (and maybe it is; you can have your own aesthetic preferences), but it also has a nice clean interpretation: "when breaking down by level, pretend there isn't a level filter."
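
As a usage sketch against the made-up exampleAgg, here is a level breakdown with a content-type and time-frame filter applied; note that the level portion of the filter is ignored for this particular breakdown:

filteredCountsByLevel(exampleAgg, {
  contentType: ['videoCourse'],
  level: ['intermediate'], // ignored when breaking down by level
  timeFrame: ['older']
})
// => { beginner: 7, intermediate: 0, advanced: 4 }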

Typechecked playground demonstrating the code in this article