
Azure Data Factory

Azure Data Factory is a managed, serverless ETL and orchestration service for the Azure cloud, built around a drag & drop UI. It is a good product, but still immature in some areas, mainly the UI itself.

Advantages

  • Drag & drop functionality makes simple tasks easy.
  • Managed service: there is no in-house infrastructure to manage for running or orchestrating ETL processes.
  • Integrates well with other Azure services, external APIs, and on-premises data stores.
  • DataFlow within Data Factory is essentially "managed Databricks": it scales well for heavy data processing tasks and is, again, configurable through the UI.
  • Exports to an ARM template for automation.
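
Because the export is plain JSON, it can be inspected with ordinary scripting tools. The following is a minimal sketch, assuming the exported template lists each pipeline as a resource of type Microsoft.DataFactory/factories/pipelines with its activities under properties.activities. The sample template, the pipeline and activity names, and the summarise_pipelines helper are hypothetical and only for illustration.

```python
import json

# Hypothetical, trimmed-down stand-in for an exported template; a real export
# contains many more resources and properties.
SAMPLE_TEMPLATE = json.loads("""
{
  "contentVersion": "1.0.0.0",
  "parameters": {"factoryName": {"type": "string"}},
  "resources": [
    {
      "name": "[concat(parameters('factoryName'), '/CopySalesData')]",
      "type": "Microsoft.DataFactory/factories/pipelines",
      "apiVersion": "2018-06-01",
      "properties": {
        "activities": [
          {"name": "CopyFromBlob", "type": "Copy"},
          {"name": "TransformSales", "type": "ExecuteDataFlow"}
        ]
      }
    },
    {
      "name": "[concat(parameters('factoryName'), '/SalesBlobStore')]",
      "type": "Microsoft.DataFactory/factories/linkedServices",
      "apiVersion": "2018-06-01",
      "properties": {"type": "AzureBlobStorage"}
    }
  ]
}
""")

PIPELINE_TYPE = "Microsoft.DataFactory/factories/pipelines"


def summarise_pipelines(template):
    """Map each pipeline resource name to the names of its activities."""
    summary = {}
    for resource in template.get("resources", []):
        # Skip linked services, datasets, triggers and other resource types.
        if resource.get("type") != PIPELINE_TYPE:
            continue
        activities = resource.get("properties", {}).get("activities", [])
        summary[resource["name"]] = [a["name"] for a in activities]
    return summary


if __name__ == "__main__":
    for name, activities in summarise_pipelines(SAMPLE_TEMPLATE).items():
        print(name, "->", ", ".join(activities))
```

A script like this could serve as a quick inventory or review check in an automated build, before the template is deployed to another environment.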

Disadvantages

  • There are still several bugs in the UI, which can cause frustration while debugging.
  • Working from the UI is very "manual", but developing jobs as code is too slow, because every change means editing the ARM template, redeploying, and running.
  • There are few online resources from the wider community, so there is less to learn from others' experiences. This should change as adoption increases.
  • The price can be higher than other solutions when coupled with "DataFlow", yet DataFlow is where the bulk of transformations happen. DataFlow is better suited to heavy processing tasks; for basic or simple transformations, consider other options.
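
To make the jobs-as-code loop concrete, the "change the ARM template" step could look roughly like the sketch below. It assumes the same template layout as an export (pipelines as resources with activities under properties.activities) and assumes each activity keeps its retry count under policy.retry. The set_activity_retry helper and all names in the sample are hypothetical.

```python
import copy

PIPELINE_TYPE = "Microsoft.DataFactory/factories/pipelines"


def set_activity_retry(template, activity_name, retries):
    """Return a copy of the template with the retry count changed on matching activities."""
    updated = copy.deepcopy(template)  # leave the caller's template untouched
    changed = 0
    for resource in updated.get("resources", []):
        if resource.get("type") != PIPELINE_TYPE:
            continue
        for activity in resource.get("properties", {}).get("activities", []):
            if activity.get("name") == activity_name:
                # Assumed location of the retry setting inside an activity.
                activity.setdefault("policy", {})["retry"] = retries
                changed += 1
    if changed == 0:
        raise ValueError(f"no activity named {activity_name!r} found")
    return updated


# Hypothetical, heavily trimmed template used only to exercise the helper.
template = {
    "resources": [
        {
            "name": "[concat(parameters('factoryName'), '/CopySalesData')]",
            "type": PIPELINE_TYPE,
            "properties": {
                "activities": [
                    {"name": "CopyFromBlob", "type": "Copy", "policy": {"retry": 0}}
                ]
            },
        }
    ]
}

updated = set_activity_retry(template, "CopyFromBlob", 3)
print(updated["resources"][0]["properties"]["activities"][0]["policy"])  # {'retry': 3}
```

Even with the edit scripted, redeploying the template and triggering a run remain separate steps, which is where most of the turnaround time goes.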

Note

It's worth remembering that Data Factory is under active development, and it has great potential as a product once these "teething bugs" are ironed out. These observations were last updated on 14th February, 2020.