Azure Data Factory – Copy Data – Logging

This post demonstrates the configuration and runtime results for Azure Data Factory Copy Data logging.

Locating the Copy Data Logging Configuration

The logging configuration for the “Copy data” activity is found in the “Settings” tab. Continuing with the Pipeline from https://blog.westmorr.com/2024/01/07/azure-data-factory-blob-to-azure-data-lake/, open the “Settings” tab and check the “Enable logging” checkbox.

Afterwards, the “Logging settings” controls are enabled:

There are four settings to configure:

  • Storage connection name: a pointer to a Linked Service
  • Logging level
    • Info: will log all copied files, skipped files, and skipped rows.
    • Warning: will log skipped files and skipped rows only
  • Logging mode
    • Reliable: will flush each log record immediately once data is copied, which may impact the performance and throughput of the Copy Data activity
    • Best effort: will flush logs in batches of records
  • Folder path: the output folder within the specified Storage container where the logs will be stored
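For reference, these UI controls correspond to a `logSettings` object in the Copy Data activity's JSON definition. The sketch below builds that fragment as a Python dictionary and prints it as JSON. The property names follow Microsoft's documentation for copy activity session logs as I understand it, so verify them against the JSON view of your own pipeline. The linked service name `LoggingStorage` and the path `logs/copy` are placeholders, not values from this pipeline.

```python
import json


def build_log_settings(linked_service: str, folder_path: str,
                       level: str = "Warning", reliable: bool = False) -> dict:
    """Build a logSettings fragment for a Copy Data activity (sketch only)."""
    if level not in ("Info", "Warning"):
        raise ValueError("level must be 'Info' or 'Warning'")
    return {
        "enableCopyActivityLog": True,           # the "Enable logging" checkbox
        "copyActivityLogSettings": {
            "logLevel": level,                   # "Logging level"
            "enableReliableLogging": reliable,   # True = Reliable, False = Best effort
        },
        "logLocationSettings": {
            "linkedServiceName": {               # "Storage connection name"
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "path": folder_path,                 # "Folder path"
        },
    }


# Placeholder values for illustration only.
settings = build_log_settings("LoggingStorage", "logs/copy", level="Info")
print(json.dumps(settings, indent=2))
```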

After a successful run, ADF automatically creates a “copyactivity-logs” folder underneath the folder specified in the “Folder path”. Another subdirectory is then created which matches the “Name” setting for the Copy Data activity.

Then, finally, one more subdirectory is added which matches the Activity run ID:
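The folder layout described above can be sketched as a small helper that assembles the expected log location. The folder path, activity name, and run ID used here are made-up placeholders. Only the “copyactivity-logs” segment and the nesting order come from the behavior described in this post.

```python
from pathlib import PurePosixPath


def log_folder(folder_path: str, activity_name: str, run_id: str) -> PurePosixPath:
    """Return the folder where the logs for one activity run are expected."""
    return PurePosixPath(folder_path) / "copyactivity-logs" / activity_name / run_id


# Placeholder values for illustration only.
path = log_folder("logs/copy", "Copy Blob to ADLS",
                  "00000000-0000-0000-0000-000000000000")
print(path)
```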

Copy Data – Logging level – Info

After navigating into the logging container and drilling into the subfolders, a text file can be downloaded (it is actually a CSV). The “Info” details are available for review and include file size, last modified date, and an MD5 hash, among other metadata. Also, the time-to-run can be calculated from the “Timestamp” field:
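As a rough illustration of the time-to-run calculation, the sketch below reads a log as CSV and subtracts the earliest “Timestamp” from the latest. It assumes a header row of `Timestamp,Level,OperationName,OperationItem,Message` and timestamps with seven fractional digits, so adjust it to match your actual file. The sample rows are invented for the example and are not output from this pipeline.

```python
import csv
import io
from datetime import datetime

# Invented sample rows; the column layout is an assumption.
SAMPLE_LOG = '''Timestamp,Level,OperationName,OperationItem,Message
2024-01-07 18:21:05.1234567,Info,FileWrite,"sample.csv","(illustrative message)"
2024-01-07 18:21:07.7654321,Info,FileWrite,"sample.csv","(illustrative message)"
'''


def parse_timestamp(value: str) -> datetime:
    """Parse a log timestamp, trimming fractional seconds to 6 digits."""
    # strptime's %f accepts at most six fractional digits.
    head, _, frac = value.strip().partition(".")
    return datetime.strptime(f"{head}.{(frac or '0')[:6]}", "%Y-%m-%d %H:%M:%S.%f")


def run_duration_seconds(log_text: str) -> float:
    """Seconds between the first and last log entries."""
    rows = csv.DictReader(io.StringIO(log_text))
    stamps = [parse_timestamp(row["Timestamp"]) for row in rows]
    return (max(stamps) - min(stamps)).total_seconds()


print(run_duration_seconds(SAMPLE_LOG))
```

Note that this measures only the span between logged events, which may be shorter than the activity duration reported by ADF.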

Copy Data – Logging level – Warning

As noted above, only skipped files and skipped rows will be logged when the Logging level is set to “Warning”.

In order to force a failure with this specific data set, I updated the Source Mapping for “Height (Meters)” to be Int32, and then deleted the value from the first row. The activity shifts the following value (a string) into the “Height (Meters)” column and that causes the row to be skipped. The row is skipped because I also set the “Fault tolerance” setting to “Skip incompatible rows”. You can read more about fault tolerance at https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance.
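To make the skip behavior concrete, here is a minimal sketch of the idea behind “Skip incompatible rows”: try to convert the mapped column to a 32-bit integer, and set the row aside when the conversion fails. This is not how ADF implements fault tolerance internally. The column values and the “Name” column are hypothetical stand-ins for the real data set.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def to_int32(value):
    """Convert text to an int, rejecting values outside the Int32 range."""
    number = int(value)  # raises ValueError for non-numeric text
    if not INT32_MIN <= number <= INT32_MAX:
        raise ValueError(f"{number} is outside the Int32 range")
    return number


def copy_with_skip(rows, int_column):
    """Split rows into (copied, skipped) based on one Int32-mapped column."""
    copied, skipped = [], []
    for row in rows:
        try:
            copied.append({**row, int_column: to_int32(row[int_column])})
        except (ValueError, TypeError):
            skipped.append(row)  # incompatible row: skip it rather than fail
    return copied, skipped


# Hypothetical rows: the first one has a string shifted into the numeric column.
rows = [
    {"Name": "Item A", "Height (Meters)": "some text"},
    {"Name": "Item B", "Height (Meters)": "120"},
]
copied, skipped = copy_with_skip(rows, "Height (Meters)")
print(len(copied), "copied,", len(skipped), "skipped")
```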

Here is a screenshot of the Source Mapping update:

Here is a screenshot of the updated source CSV file with changes to cause the failure:

After applying these updates, ADF lists the skipped row as expected in the activity details:

Finally, the log file contains the Warning level data as desired:
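If you want to pull the skipped rows out of a downloaded log programmatically, a sketch like the following filters on the “Level” column. The header names and the `TabularRowSkip` operation name are assumptions about the log layout, and the sample rows are invented, so check them against your own file.

```python
import csv
import io

# Invented sample rows; the column layout is an assumption.
SAMPLE_LOG = '''Timestamp,Level,OperationName,OperationItem,Message
2024-01-07 18:30:00.0000000,Info,FileWrite,"sample.csv","(illustrative message)"
2024-01-07 18:30:01.0000000,Warning,TabularRowSkip,"Item A,,some text","(illustrative message)"
'''


def warning_entries(log_text: str) -> list:
    """Return the log rows whose Level is Warning."""
    reader = csv.DictReader(io.StringIO(log_text), skipinitialspace=True)
    return [row for row in reader if row["Level"].strip() == "Warning"]


for entry in warning_entries(SAMPLE_LOG):
    print(entry["OperationName"], "->", entry["OperationItem"])
```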

Conclusion

Azure Data Factory Copy Data logging is configured within the “Settings” tab for Copy Data activities. In this post, both “Info” and “Warning” levels were covered along with “Fault tolerance”.



