hive - IIS Logs Streaming to Hadoop real time -


I am trying a POC in Hadoop for log aggregation. We have multiple IIS servers hosting at least 100 sites. I want to stream the logs continuously to HDFS, parse the data, and store it in Hive for further analytics.

1) Is Apache Kafka the correct choice, or Apache Flume?

2) After streaming, is it better to use Apache Storm to ingest the data into Hive?

Please help with any suggestions or information on this kind of problem statement.

Thanks.

You can use either Kafka or Flume, or combine both, to get the data into HDFS, but you will need to write code. There are also open-source data flow management tools available for which you don't need to write code, e.g. NiFi and StreamSets.

You don't need to use separate ingestion tools; you can use the data flow tools directly to put the data into a Hive table. Once the table is created in Hive, you can do your analytics by running queries.
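Whichever tool moves the data, the parse step looks much the same. Here is a minimal Python sketch of it, assuming the servers write the W3C extended log format, where a `#Fields:` directive names the space-separated columns of the lines that follow. The function name `parse_w3c_lines` and the sample field list are only for illustration; the real field set depends on how logging is configured on each IIS site.

```python
from typing import Dict, Iterable, Iterator, List


def parse_w3c_lines(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log entry, keyed by the names in the #Fields: directive."""
    fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines carry metadata; only #Fields: matters for parsing.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        # Skip entries seen before any #Fields: line or with a mismatched column count.
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


# Hypothetical sample; a real log usually has more columns.
sample = [
    "#Software: Microsoft Internet Information Services 8.5",
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2016-05-01 10:15:32 GET /index.html 200 15",
]
records = list(parse_w3c_lines(sample))
```

Each record is a plain dict of strings, so it can be written out as delimited text or JSON for a Hive table to read. The `#Fields:` line can appear again in the middle of a file (for example after a service restart), which is why the sketch re-reads it every time instead of hard-coding the columns.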

Let me know if you need anything else on this.

