Targets Examples
This directory contains example Targets pipelines and helper scripts for R-based data workflows.
Files
- _targets.R - Main pipeline definition file (required for all Targets pipelines)
- run_pipeline.R - Script to run the pipeline from command line
- inspect_pipeline.R - Script to inspect pipeline status and dependencies
- README.md - This file
Setup
- Install Targets:
install.packages("targets")
- Install optional dependencies:
install.packages(c("dplyr", "readr", "tarchetypes", "visNetwork"))
# Optional: for plotting
install.packages("ggplot2")
Note: visNetwork is required for tar_visnetwork() to visualize the dependency graph.
- Verify installation:
library(targets)
packageVersion("targets")
Running the Example
Basic Usage
- Copy _targets.R to your project directory (or work in this directory)
- Run the pipeline in R:
library(targets)
tar_make()
- View the dependency graph:
tar_visnetwork()
- Read a target:
tar_read(processed_data)
Using Helper Scripts
Run the pipeline:
Rscript run_pipeline.R
Run specific targets:
Rscript run_pipeline.R processed_data summary_stats
Inspect the pipeline:
Rscript inspect_pipeline.R
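The contents of run_pipeline.R are not reproduced in this README. As a rough sketch only, a script that forwards command-line arguments to tar_make() might look like the following; the argument handling is an assumption, and the real script may differ.

```r
#!/usr/bin/env Rscript
# Hypothetical sketch of run_pipeline.R; the real script may differ.
library(targets)

# Target names given after the script name, e.g.
#   Rscript run_pipeline.R processed_data summary_stats
requested <- commandArgs(trailingOnly = TRUE)

# NULL asks tar_make() to consider every target in the pipeline.
names_arg <- if (length(requested) > 0) requested else NULL

# do.call() passes the value of names_arg itself rather than the
# variable name, because tar_make() captures `names` as an expression.
do.call(tar_make, list(names = names_arg))
```

Run from the directory that contains _targets.R, this builds the requested targets together with any upstream targets they need.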
Understanding the Pipeline
Pipeline Structure
The _targets.R file defines a pipeline with five targets:
- raw_data - Generates or loads raw data
- processed_data - Transforms the raw data
- summary_stats - Calculates summary statistics
- save_results - Saves processed data to a CSV file
- data_plot - Creates a visualization (optional, requires ggplot2)
Target Dependencies
Targets automatically tracks dependencies:
- processed_data depends on raw_data
- summary_stats depends on processed_data
- save_results depends on processed_data
- data_plot depends on processed_data
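The actual _targets.R is not reproduced in this README. As a rough sketch only, a pipeline with the five targets and the dependencies listed above might look like this. The column names, the output path results/processed_data.csv, and every command body are assumptions made for illustration.

```r
# _targets.R (hypothetical sketch; the real file may differ)
library(targets)

tar_option_set(
  packages = c("dplyr", "readr"),
  error = "continue" # keep running other targets if one fails
)

list(
  # Simulated input; the id and value columns are assumptions
  tar_target(
    raw_data,
    data.frame(id = 1:10, value = c(2, 9, 4, 7, 1, 8, 6, 3, 10, 5))
  ),
  # Mentioning raw_data in the command is what creates the dependency
  tar_target(processed_data, dplyr::filter(raw_data, value > 5)),
  tar_target(
    summary_stats,
    dplyr::summarise(processed_data, mean_value = mean(value), n = dplyr::n())
  ),
  # A file target returns the path of the file it wrote
  tar_target(
    save_results,
    {
      dir.create("results", showWarnings = FALSE)
      path <- "results/processed_data.csv" # assumed output location
      readr::write_csv(processed_data, path)
      path
    },
    format = "file"
  ),
  # Optional plot; needs ggplot2 to be installed
  tar_target(
    data_plot,
    ggplot2::ggplot(processed_data, ggplot2::aes(id, value)) +
      ggplot2::geom_point()
  )
)
```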
Key Commands
# Run the pipeline
tar_make()
# Run specific targets
tar_make(names = c("processed_data", "summary_stats"))
# View dependency graph
tar_visnetwork()
# Read a target
tar_read(processed_data)
# List all targets
tar_manifest()
# Check outdated targets
tar_outdated()
# View pipeline metadata
tar_meta()
# Load targets into environment
tar_load(c(processed_data, summary_stats))
# Clean pipeline (remove all targets)
tar_destroy()
Pipeline Features
Incremental Execution
Targets only runs targets that are out of date. If you modify the command for raw_data, that target and everything downstream of it are re-run, while targets that do not depend on it are skipped.
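To see this behavior in isolation, the sketch below builds a three-target toy pipeline in a temporary directory, then edits one command and asks which targets are outdated. The toy data and commands are assumptions and are not taken from this directory's _targets.R.

```r
library(targets)

# Work in a throwaway directory so no real project is touched.
demo_dir <- tempfile("targets-demo-")
dir.create(demo_dir)
old_wd <- setwd(demo_dir)

# First version of the pipeline, written to _targets.R by tar_script().
tar_script({
  list(
    tar_target(raw_data, data.frame(value = 1:10)),
    tar_target(processed_data, raw_data[raw_data$value > 5, , drop = FALSE]),
    tar_target(summary_stats, mean(processed_data$value))
  )
}, ask = FALSE)
tar_make()
outdated_after_build <- tar_outdated()

# Second version: only the processed_data command changes (5 becomes 3).
tar_script({
  list(
    tar_target(raw_data, data.frame(value = 1:10)),
    tar_target(processed_data, raw_data[raw_data$value > 3, , drop = FALSE]),
    tar_target(summary_stats, mean(processed_data$value))
  )
}, ask = FALSE)
outdated_after_edit <- tar_outdated()

# Expected to name processed_data and summary_stats, but not raw_data.
print(outdated_after_edit)

setwd(old_wd)
```

Calling tar_make() again at that point would rebuild only the targets reported as outdated.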
Dependency Tracking
Targets automatically infers dependencies from your code. If processed_data uses raw_data, the dependency is automatically tracked.
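One way to see what the static analysis picks up is tar_deps(), which lists the symbols found in an expression. The command below is an illustrative assumption, not the one used in this pipeline. The result includes every symbol found, and Targets only turns those that match other targets or global objects into dependencies.

```r
library(targets)

# Symbols that static code analysis finds in a candidate command.
deps <- tar_deps(subset(raw_data, value > 5))
print(deps)

# raw_data is among them, so a target with this command would be
# treated as depending on a target named raw_data.
"raw_data" %in% deps
```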
File Tracking
The save_results target uses format = "file", so its command returns the path of the file it writes and Targets tracks the contents of that file. If the file is modified or deleted externally, the target will be marked as outdated.
Error Handling
The pipeline is configured with error = "continue" to continue execution even if one target fails.
Customization
Adding New Targets
Add targets to the list() in _targets.R:
tar_target(
  name = new_target,
  command = {
    # Your code here
    processed_data %>% filter(value > 5)
  },
  packages = "dplyr"
)
Using External Data Files
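Add a file target for the input path and have raw_data read from it (raw_file is an illustrative name). A target with format = "file" must return the file path, not the data:
tar_target(
  name = raw_file,
  command = "data/my_data.csv",
  format = "file" # Track the input file for changes
),
tar_target(
  name = raw_data,
  command = readr::read_csv(raw_file, show_col_types = FALSE)
)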
Parallel Execution
Configure parallel execution with a crew controller (requires the crew package):
tar_option_set(
  controller = crew::crew_controller_local(workers = 4) # Number of parallel workers
)
Troubleshooting
Common Issues
- Package not found: Ensure all required packages are installed and listed in tar_option_set(packages = ...)
- Target not found: Check that the target name is spelled correctly and exists in _targets.R
- Dependency errors: Use tar_visnetwork() to visualize dependencies and identify issues
- Outdated targets: Use tar_outdated() to see which targets need to be updated
Getting Help
- Check Targets logs in the _targets/ directory
- Use tar_meta(fields = error) to view error messages
- Consult the Targets Documentation
Additional Resources
For comprehensive documentation, tutorials, and additional resources, see the Targets documentation page.