- Mastering Hadoop 3
- Chanchal Singh Manish Kumar
- 2025-04-04 14:54:50
Summarization patterns
The summarization pattern is used widely across domains. It groups similar records together by key and then performs an operation on each group, such as calculating a minimum, maximum, count, average, median, or standard deviation, or building an index. For example, we might want to calculate the total amount of money our website has made by country. As another example, we might want the average number of times users log in to our website. One more example is finding the minimum and maximum number of users by state. MapReduce works with key-value pairs, so operations on keys are common. The mapper emits key-value pairs, and the values for each key are aggregated in the reducer. The following are a few commonly used examples of the summarization pattern.
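Before looking at specific cases, the mapper/reducer flow described above can be sketched in a few lines. This is a minimal in-memory simulation in plain Python, not the Hadoop API: the sample `ORDERS` records and the function names `map_order`, `shuffle`, `reduce_summary`, and `run_job` are hypothetical and exist only for illustration. It assumes each record is a `(country, amount)` pair and computes the revenue-by-country summary mentioned in the text.

```python
from collections import defaultdict

# Hypothetical input records: (country, order_amount).
ORDERS = [
    ("IN", 120.0),
    ("US", 80.0),
    ("IN", 30.0),
    ("US", 20.0),
    ("DE", 50.0),
]


def map_order(record):
    """Mapper: emit a (key, value) pair with the grouping field as the key."""
    country, amount = record
    yield country, amount


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_summary(key, values):
    """Reducer: aggregate every value collected for a single key."""
    return key, {
        "count": len(values),
        "total": sum(values),
        "min": min(values),
        "max": max(values),
        "average": sum(values) / len(values),
    }


def run_job(records):
    """Run map, shuffle, and reduce in sequence over the input records."""
    pairs = (pair for record in records for pair in map_order(record))
    return dict(reduce_summary(k, v) for k, v in shuffle(pairs).items())


summary = run_job(ORDERS)
for country, stats in sorted(summary.items()):
    print(country, stats)
```

In a real job the grouping is done by the framework across machines rather than by a local dictionary, but the division of work is the same: the mapper chooses the key, and the reducer sees all values for that key and summarizes them.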