I am quite new to Apache Spark and have been researching how to get started with Apache Spark + Python (PySpark) + the MongoDB connector. This is what I found.
1. Read up on the basics of Apache Spark, get it running locally, and write a sample Spark app that reads a CSV file by following https://www.tutorialspoint.com/apache_spark/index.htm
2. If you use macOS, you can install Spark by following https://notadatascientist.com/install-spark-on-macos/
3. More related tutorials and documentation:
- https://www.youtube.com/watch?v=-hb-oV_yz54 + sample code in the GitHub repository https://github.com/datyrlab/apache-spark/blob/master/05-02-convert-string-to-date.py
- https://programmer.group/read-and-write-operations-on-mongodb-on-sparksql-python-version.html
- https://www.mongodb.com/blog/post/getting-started-with-mongodb-pyspark-and-jupyter-notebook
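
To make the steps above a bit more concrete, here is a minimal sketch of what a first local PySpark app could look like: it writes a tiny made-up CSV file, reads it back into a DataFrame, and prepares the options for a MongoDB read. Everything in it is my own assumption rather than something taken from the linked tutorials: the function names, the sample data, and the choice of MongoDB Spark Connector 10.x, which as far as I know uses the `mongodb` format with the `connection.uri`, `database` and `collection` options (older 3.x connectors use different names). The MongoDB part also needs the connector package on Spark's classpath and a running MongoDB, so it is left commented out.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(path):
    """Write a tiny, made-up CSV file and return the number of data rows."""
    rows = [("name", "city"), ("Ann", "Bangkok"), ("Bob", "Chiang Mai")]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows) - 1  # exclude the header row


def mongo_read_options(uri, database, collection):
    """Build read options, assuming MongoDB Spark Connector 10.x option names."""
    return {
        "connection.uri": uri,
        "database": database,
        "collection": collection,
    }


def run_csv_demo():
    """Start a local Spark session, read the sample CSV, and print it."""
    # Imported here so the helpers above work even without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")        # run Spark locally on all available cores
        .appName("csv-demo")       # hypothetical app name
        .getOrCreate()
    )

    csv_path = Path(tempfile.mkdtemp()) / "people.csv"
    write_sample_csv(csv_path)

    # header=True uses the first row as column names;
    # inferSchema=True lets Spark guess the column types.
    df = spark.read.csv(str(csv_path), header=True, inferSchema=True)
    df.printSchema()
    df.show()

    # Assumed MongoDB read (needs the connector package and a running MongoDB):
    # opts = mongo_read_options("mongodb://localhost:27017", "test", "people")
    # mongo_df = spark.read.format("mongodb").options(**opts).load()

    spark.stop()
```

Calling `run_csv_demo()` from a Python shell or a script should be enough to try it once PySpark is installed (for example with `pip install pyspark`).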