What is a metadata driven pipeline? Wikipedia defines metadata as “data that provides information about other data.” As a developer, we can create a non-parameterized pipeline and/or notebook to solve a business problem. However, if we must solve the same problem a hundred times, the amount of code can get unwieldly. A better way to solve this problem is to store metadata in the delta lake. This data will drive how the Azure Data Factory and Spark Notebooks execute.
Business Problem
Our manager has asked us to create a metadata driven solution to ingest CSV files from external storage into