End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL ...
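As a rough illustration of the Bronze, Silver, Gold layering named above, here is a minimal sketch in plain Python rather than Spark or Databricks, so it runs without a cluster. The records, field names, the `orders_feed` source tag, and the three helper functions are all hypothetical and are not taken from the project itself.

```python
from collections import defaultdict

# Hypothetical raw records; field names are assumptions for illustration only.
RAW_ROWS = [
    {"order_id": "1", "region": "east", "amount": "10.50"},
    {"order_id": "2", "region": "west", "amount": "n/a"},    # malformed amount
    {"order_id": "1", "region": "east", "amount": "10.50"},  # duplicate record
    {"order_id": "3", "region": "east", "amount": "4.50"},
]


def to_bronze(rows):
    """Bronze: keep raw records unchanged, only tag them with their source."""
    return [{**row, "_source": "orders_feed"} for row in rows]


def to_silver(bronze):
    """Silver: cast types, drop malformed rows, deduplicate by order_id."""
    seen, clean = set(), []
    for row in bronze:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if row["order_id"] in seen:
            continue  # skip repeated orders
        seen.add(row["order_id"])
        clean.append(
            {
                "order_id": int(row["order_id"]),
                "region": row["region"],
                "amount": amount,
            }
        )
    return clean


def to_gold(silver):
    """Gold: aggregate into an analytics-ready summary (revenue per region)."""
    totals = defaultdict(float)
    for row in silver:
        totals[row["region"]] += row["amount"]
    return dict(totals)


bronze = to_bronze(RAW_ROWS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

In a Databricks setting each layer would typically be a persisted table written by a Spark job; the functions here only mirror the division of responsibilities between the layers.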
The National Cancer Institute Imaging Data Commons (IDC) is a cloud-based repository of publicly available cancer imaging data, co-located with analysis and exploration tools. As part of the NCI ...