Map/Reduce
Map/Reduce is designed for huge data sets that have to be indexed, categorized, sorted, culled, analyzed, and so on. Looking through every record or file in a serial environment can take a very long time. Map/Reduce distributes the data across a large cluster and hands out tasks so that each node works on its own piece of the data set independently and in parallel. This allows big data to be processed in relatively little time.
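As a rough sketch of the idea (plain Python with the standard multiprocessing module and a made-up data set and chunk size, rather than a real cluster or framework), the data is split into pieces, each piece is processed independently in parallel, and the partial results are combined at the end:

from multiprocessing import Pool

def process_chunk(records):
    # Work on one piece of the data set independently (here, just count records).
    return len(records)

if __name__ == "__main__":
    data = list(range(1_000_000))        # stand-in for a huge data set
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]

    with Pool() as pool:
        partial_results = pool.map(process_chunk, chunks)   # parallel "map" step

    total = sum(partial_results)         # combine the partial results ("reduce" step)
    print(total)                         # 1000000

On a real cluster the chunks live on different machines and the partial results are combined across the network, but the shape of the computation is the same.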
Laundromat analogy of Map/Reduce
Imagine that your data is a large pile of laundry. You first sort the laundry into loads of similar colors and wash each load; then you dry the washed clothes grouped by similar material (denims, towels, shirts, etc.).
Serial Operation: one washer and one dryer handle every load, one after another, so the total time grows with the number of loads.
Map/Reduce operation: each color load goes into its own washing machine and all of the machines run at the same time (the map step); the washed clothes are then regrouped by material, and each group goes into its own dryer (the reduce step).
Word Count example of Map/Reduce
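The classic illustration is counting how often each word appears in a body of text. The sketch below uses plain Python rather than a particular framework such as Hadoop, with a made-up list of input lines: the map step emits a (word, 1) pair for every word, the pairs are grouped (shuffled) by word, and the reduce step sums the counts for each word.

from collections import defaultdict

def map_phase(line):
    # Emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(word, counts):
    # Sum all of the partial counts for one word.
    return word, sum(counts)

lines = [
    "the quick brown fox",
    "the lazy dog",
]

# Map: process each line independently (on a real cluster, in parallel).
mapped = [pair for line in lines for pair in map_phase(line)]

# Shuffle: group all of the pairs by word.
grouped = defaultdict(list)
for word, count in mapped:
    grouped[word].append(count)

# Reduce: combine the counts for each word.
results = dict(reduce_phase(word, counts) for word, counts in grouped.items())
print(results)   # {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 1}

On a real cluster the map calls run on many machines at once and the shuffle moves each word's pairs to the machine that reduces it, but the logic is the same.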
Other Potential uses of Map/Reduce
Since Map/Reduce takes a large data set and breaks it down into smaller pieces that can be processed independently, here are some potential uses:
- indexing large data sets in a database
- image recognition in large images
- processing geographic information system (GIS) data, e.g. combining vector data with point data (Kerr, 2009)
- analyzing unstructured data
- analyzing stock data
- machine learning tasks