java - Why don't I see any output from the Kafka Streams reduce method?
Given the following code:

    KStream<String, Custom> stream = builder.stream(Serdes.String(), customSerde, "test_in");

    stream
        .groupByKey(Serdes.String(), customSerde)
        .reduce(new CustomReducer(), "reduction_state")
        .print(Serdes.String(), customSerde);

I have a println statement inside the apply method of the Reducer, and it prints when I expect the reduction to take place. However, the final print statement shown above displays nothing. Likewise, if I use the to method rather than print, I see no messages in the destination topic.

What do I need after the reduce statement to see the result of the reduction? If one value is pushed to the input, I don't expect to see anything. If a second value with the same key is pushed, I expect the reducer to apply (which it does), and I expect the result of the reduction to continue to the next step in the processing pipeline. As described, I'm not seeing anything in the subsequent steps of the pipeline, and I don't understand why.
As of Kafka 0.10.1.0, all aggregation operators use an internal de-duplication cache to reduce the load on the result KTable changelog stream. For example, if you count and process two records with the same key directly after each other, the full changelog stream would be <key:1>, <key:2>.

With the new caching feature, the cache receives <key:1> and stores it, but does not send it downstream right away. When <key:2> is computed, it replaces the first entry in the cache. When the cache sends entries downstream depends on the cache size, the number of distinct keys, the throughput, and your commit interval. It happens either on cache eviction of a single key entry or on a complete flush of the cache (sending all entries downstream). Thus, the KTable changelog might show only <key:2> (because <key:1> got de-duplicated).
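To make the de-duplication concrete, here is a minimal plain-Java sketch of the idea. It is not the Kafka Streams implementation: the class `DedupCacheSketch` and its methods are hypothetical names invented for illustration, and the flush is triggered by hand rather than by a commit interval or cache eviction.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a de-duplicating cache; NOT Kafka Streams code.
public class DedupCacheSketch {
    // Latest value per key; a newer update overwrites the older one.
    private final Map<String, Long> cache = new LinkedHashMap<>();
    // Records that were actually forwarded to the next processing step.
    private final List<String> downstream = new ArrayList<>();

    // Store an update in the cache without forwarding it.
    public void put(String key, long value) {
        cache.put(key, value);
    }

    // Forward every cached entry downstream and empty the cache.
    public void flush() {
        for (Map.Entry<String, Long> e : cache.entrySet()) {
            downstream.add("<" + e.getKey() + ":" + e.getValue() + ">");
        }
        cache.clear();
    }

    public List<String> downstream() {
        return downstream;
    }

    public static void main(String[] args) {
        DedupCacheSketch sketch = new DedupCacheSketch();
        sketch.put("key", 1L);
        sketch.put("key", 2L);                     // overwrites <key:1>
        System.out.println(sketch.downstream());   // prints []
        sketch.flush();
        System.out.println(sketch.downstream());   // prints [<key:2>]
    }
}
```

Until the flush happens, nothing reaches the downstream step, which matches the symptom in the question: the reducer runs, but print and to show nothing yet.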
You can control the size of the cache via the Streams configuration parameter StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG. If you set the value to zero, you disable caching completely, and the KTable changelog will contain all updates (effectively providing the pre-0.10.1.0 behavior).
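As a rough sketch of that setting, the snippet below builds the configuration with plain `java.util.Properties` so that it runs without the Kafka dependency. It assumes the string key `cache.max.bytes.buffering`, which is the value behind `StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG`; the class name `CacheConfigSketch` is hypothetical, and a real application also needs settings such as the application id and bootstrap servers, which are omitted here.

```java
import java.util.Properties;

// Hypothetical helper for illustration; only the config key comes from Kafka Streams.
public class CacheConfigSketch {
    // Same string as StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG.
    static final String CACHE_MAX_BYTES_BUFFERING = "cache.max.bytes.buffering";

    // Build properties that disable the de-duplication cache.
    public static Properties disableCaching() {
        Properties props = new Properties();
        // Zero bytes of cache: every update is forwarded downstream.
        props.put(CACHE_MAX_BYTES_BUFFERING, 0L);
        return props;
    }

    public static void main(String[] args) {
        Properties props = disableCaching();
        System.out.println(props.get(CACHE_MAX_BYTES_BUFFERING)); // prints 0
    }
}
```

These properties would then be passed to the KafkaStreams constructor together with the rest of the application's configuration. Disabling the cache is handy for debugging a topology like the one in the question, at the cost of more records in the changelog and downstream topics.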
The Confluent documentation contains a section explaining the cache in more detail.