
References


Last updated 6 years ago

  • White, Tom. Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale, O'Reilly Media.

  • Aven, Jeffrey. Hadoop in 24 Hours, Sams Teach Yourself, Pearson Education.

  • Carpenter, Jeff; Hewitt, Eben. Cassandra: The Definitive Guide: Distributed Data at Web Scale, O'Reilly Media.

  • Chodorow, Kristina. MongoDB: The Definitive Guide: Powerful and Scalable Data Storage, O'Reilly Media.

  • Hadoop in Real World

  • Sundog Education by Frank Kane, The Ultimate Hands-On Hadoop - Tame your Big Data!

  • AWS whitepaper, Lambda Architecture for Batch and Real-Time Processing on AWS with Spark Streaming and Spark SQL

  • https://db-engines.com/en/

  • https://www.domo.com/learn/data-never-sleeps-5

  • https://hortonworks.com/apache/hdfs/

  • https://spark.apache.org/

  • https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html

  • https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html

  • https://pandaforme.gitbooks.io/introduction-to-cassandra/content/understand_the_cassandra_data_model.html

  • https://databricks.com/blog/2015/06/22/understanding-your-spark-application-through-visualization.html

  • https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch01.html

  • https://www.python-course.eu/lambda.php

  • http://xyz.insightdataengineering.com/blog/

  • http://searchdatamanagement.techtarget.com/definition/data-engineer

  • https://www.oreilly.com/ideas/questioning-the-lambda-architecture