
Links


Last updated 6 years ago

Course Materials

https://legacy.gitbook.com/book/juheck/hadoop-and-big-data/details

Course S3 bucket (no public access): https://s3-us-west-1.amazonaws.com/julienheck/hadoop/

Databricks CE account creation page: https://accounts.cloud.databricks.com/registration.html#signup/community

Spark Demo Notebook: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/demo_spark.dbc

Spark Exercises:
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_rdd.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_dataframes.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_dataframes2.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/exercise_spark_dataframes3.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/Classroom-Setup.dbc
https://s3-us-west-1.amazonaws.com/julienheck/hadoop/7_spark/DBTest-Setup-Stub.dbc

Datasets

movielens 100k

crime data Los Angeles

u.data: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.data

u.item: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.item

u.user: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.user

u.genre: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.genre

u.occupation: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/ml-100k/u.occupation

Crime Data from 2010 to present: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/Crime_Data_from_2010_to_Present.csv

crime-data-la: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/crime_data_la.csv

crime-data-code-name: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/crime_data_code_name.csv

crime-data-area-name: https://s3-us-west-1.amazonaws.com/julienheck/hadoop/datasets/crime_data_la/crime_data_area_name.csv