Distributed Storage?

Amazon S3: An Example of a Distributed Storage System

What is Amazon s3 and how it works…

How do these replicas get created in the first place?

When a piece of data gets written to the system by a client program, its replicas are synchronously created into various other nodes in the process.

With so many redundancies in place, how does it still fail?

First off… that depends on your definition of failure…

How is a redundancy fixed then?

With cloud systems, this typically means replacing a replica with a new instance.

Hadoop History

As the World Wide Web grew in the late 1900s and early 2000s, search engines and indexes were created to help locate relevant information amid the text-based content. In the early years, search results were returned by humans. But as the web grew from dozens to millions of pages, automation was needed. Web crawlers were created, many as university-led research projects, and search engine start-ups took off (Yahoo, AltaVista, etc.).

Why is Hadoop important?

1. Ability to store and process huge amounts of any kind of data, quickly. With data volumes and varieties constantly increasing, especially from social media and the Internet of Things (IoT), that’s a key consideration.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store