Flyte Entity Identifiers
Flyte Entity Identifiers uniquely identify every entity within the Flyte platform, covering both definitional components (like tasks and workflows) and their corresponding runtime executions. Traceability, API interactions, and lifecycle management of your data pipelines all depend on this identification system.
Definitional Entity Identifiers
The core Identifier class serves as the primary mechanism for uniquely identifying definitional entities such as tasks, workflows, and launch plans. Each Identifier is composed of five key attributes:
- resource_type: Specifies the type of Flyte entity being identified. This is an enum value from ResourceType.
- project: The project to which the entity belongs.
- domain: The domain within the project where the entity resides.
- name: The unique name of the entity within its project and domain.
- version: The specific version of the entity. Flyte supports versioning for all definitional entities, allowing for iterative development and deployment.
The ResourceType class defines the available types:
- UNSPECIFIED: A default or unknown resource type.
- TASK: Identifies a Flyte task definition.
- WORKFLOW: Identifies a Flyte workflow definition.
- LAUNCH_PLAN: Identifies a Flyte launch plan definition.
Example: Creating a Task Identifier
from flytekit.models.core.identifier import Identifier, ResourceType
# Create an identifier for a specific task
task_id = Identifier(
    resource_type=ResourceType.TASK,
    project="my-project",
    domain="development",
    name="my_data_processing_task",
    version="v1.0.0",
)
print(f"Task Identifier: {task_id}")
# Expected output: TASK:my-project:development:my_data_processing_task:v1.0.0
The to_flyte_idl() and from_flyte_idl() methods convert between the Python object representation and Flyte's internal protobuf (IDL) format, which is needed when interacting with the Flyte Admin service.
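The round-trip pattern behind to_flyte_idl() and from_flyte_idl() can be illustrated without a Flyte installation. The sketch below is a stand-in only: SketchIdentifier, to_message, and from_message are hypothetical names, and a plain dict takes the place of the protobuf message. It is not flytekit's implementation.

```python
from dataclasses import dataclass, asdict


# Hypothetical stand-in for an identifier class; it only mirrors the five
# attributes described above and is NOT part of flytekit.
@dataclass(frozen=True)
class SketchIdentifier:
    resource_type: str
    project: str
    domain: str
    name: str
    version: str

    def to_message(self) -> dict:
        """Serialize to a plain dict (standing in for a protobuf message)."""
        return asdict(self)

    @classmethod
    def from_message(cls, message: dict) -> "SketchIdentifier":
        """Rebuild the Python object from its serialized form."""
        return cls(**message)


original = SketchIdentifier(
    "TASK", "my-project", "development", "my_data_processing_task", "v1.0.0"
)
restored = SketchIdentifier.from_message(original.to_message())

# A lossless round trip yields an object equal to the one we started with.
round_trip_ok = restored == original
print(round_trip_ok)
```

With flytekit itself, the analogous round trip would presumably be Identifier.from_flyte_idl(task_id.to_flyte_idl()), using the two methods named above.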
Execution Identifiers
Flyte also provides a set of specialized identifiers for tracking the runtime execution of workflows, nodes, and tasks. These identifiers are inherently hierarchical, reflecting the nested structure of Flyte executions.
Workflow Execution Identifier
The WorkflowExecutionIdentifier uniquely identifies a specific instance of a workflow execution. It is defined by:
- project: The project where the workflow execution occurred.
- domain: The domain within the project where the workflow execution occurred.
- name: The unique name assigned to this particular workflow execution.
Example: Creating a Workflow Execution Identifier
from flytekit.models.core.identifier import WorkflowExecutionIdentifier
workflow_exec_id = WorkflowExecutionIdentifier(
    project="my-project",
    domain="development",
    name="my_workflow_run_12345",
)
print(f"Workflow Execution Identifier: {workflow_exec_id}")
Node Execution Identifier
The NodeExecutionIdentifier identifies a specific execution of a node within a workflow. It is composed of:
- node_id: The unique identifier of the node within the workflow definition.
- execution_id: The WorkflowExecutionIdentifier of the parent workflow execution.
Example: Creating a Node Execution Identifier
from flytekit.models.core.identifier import NodeExecutionIdentifier
node_exec_id = NodeExecutionIdentifier(
    node_id="my_task_node",
    execution_id=workflow_exec_id,  # Using the workflow_exec_id from the previous example
)
print(f"Node Execution Identifier: {node_exec_id}")
Task Execution Identifier
The TaskExecutionIdentifier pinpoints a specific attempt of a task execution within a node. This is the most granular execution identifier and includes:
- task_id: The Identifier of the task definition being executed.
- node_execution_id: The NodeExecutionIdentifier of the parent node execution.
- retry_attempt: An integer indicating the specific retry attempt (0 for the first attempt, 1 for the first retry, and so on).
Example: Creating a Task Execution Identifier
from flytekit.models.core.identifier import TaskExecutionIdentifier
# Using task_id and node_exec_id from previous examples
task_exec_id = TaskExecutionIdentifier(
    task_id=task_id,
    node_execution_id=node_exec_id,
    retry_attempt=0,  # First attempt
)
print(f"Task Execution Identifier: {task_exec_id}")
Signal Identifier
The SignalIdentifier uniquely identifies a signal within a workflow execution. Signals are typically used for external inputs or gates that pause workflow execution until a condition is met. It is composed of:
- signal_id: A user-provided name for the signal (often corresponding to a gate node).
- execution_id: The WorkflowExecutionIdentifier of the workflow execution that the signal belongs to.
Example: Creating a Signal Identifier
from flytekit.models.core.identifier import SignalIdentifier
# Using workflow_exec_id from a previous example
signal_id = SignalIdentifier(
    signal_id="wait_for_approval",
    execution_id=workflow_exec_id,
)
print(f"Signal Identifier: {signal_id}")
Common Use Cases and Best Practices
Flyte Entity Identifiers are critical for:
- Interacting with the Flyte Admin API: All API calls to fetch, update, or manage Flyte entities and executions rely on these identifiers as primary keys. For example, retrieving a specific task definition or monitoring a workflow's status requires providing the correct Identifier or WorkflowExecutionIdentifier.
- Monitoring and Observability: Tools and dashboards built on top of Flyte leverage these identifiers to track the progress, status, and logs of individual tasks, nodes, and workflows.
- Debugging and Troubleshooting: When an execution fails, the detailed hierarchical identifiers allow pinpointing the exact task attempt that encountered an issue.
- Building Custom Flyte Clients and Integrations: Any custom application that needs to programmatically interact with Flyte will extensively use these identifier objects.
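The primary-key role mentioned in the first item can be sketched with the standard library alone. In the example below, EntityKey and registry are hypothetical stand-ins (not flytekit classes or the Flyte Admin API); the point is only that the five identifier attributes together select exactly one definition, so two versions of the same task can coexist.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an identifier. frozen=True makes instances
# hashable, so they can serve as dictionary keys.
@dataclass(frozen=True)
class EntityKey:
    resource_type: str
    project: str
    domain: str
    name: str
    version: str


# Hypothetical in-memory store keyed by the full identifier.
registry = {}

v1 = EntityKey("TASK", "my-project", "development", "my_data_processing_task", "v1.0.0")
v2 = EntityKey("TASK", "my-project", "development", "my_data_processing_task", "v1.0.1")
registry[v1] = "first definition"
registry[v2] = "second definition"

# A separately constructed key with the same five values finds the same entry.
lookup = EntityKey("TASK", "my-project", "development", "my_data_processing_task", "v1.0.0")
found = registry[lookup]
print(found)
```

Changing any one attribute, the version included, produces a different key, which is why a lookup needs all five values.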
Best Practices:
- Consistency: Maintain consistent naming conventions for project, domain, and name across your Flyte entities to ensure clarity and ease of management.
- Versioning: Always use meaningful versions for definitional entities. This enables reproducible runs and safe updates without affecting ongoing executions.
- Hierarchical Navigation: Understand the nested nature of execution identifiers. To get a TaskExecutionIdentifier, you typically need its parent NodeExecutionIdentifier, which in turn needs its parent WorkflowExecutionIdentifier. This structure is fundamental to how Flyte organizes execution data.
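The nesting of execution identifiers can be sketched as follows. The dataclasses are simplified, hypothetical stand-ins that reuse the field names listed earlier (with task_name replacing the full task Identifier for brevity); they are not the flytekit classes, and describe is an illustrative helper.

```python
from dataclasses import dataclass


# Simplified stand-ins mirroring the fields described above (NOT flytekit classes).
@dataclass(frozen=True)
class WorkflowExecId:
    project: str
    domain: str
    name: str


@dataclass(frozen=True)
class NodeExecId:
    node_id: str
    execution_id: WorkflowExecId  # parent workflow execution


@dataclass(frozen=True)
class TaskExecId:
    task_name: str  # simplification: the real field holds a full task Identifier
    node_execution_id: NodeExecId  # parent node execution
    retry_attempt: int


def describe(task_exec: TaskExecId) -> str:
    """Walk from a task attempt up to its workflow execution."""
    node = task_exec.node_execution_id
    workflow = node.execution_id
    return (
        f"{workflow.project}/{workflow.domain}/{workflow.name}"
        f" -> {node.node_id}"
        f" -> {task_exec.task_name} (attempt {task_exec.retry_attempt})"
    )


wf = WorkflowExecId("my-project", "development", "my_workflow_run_12345")
node = NodeExecId("my_task_node", wf)
first_retry = TaskExecId("my_data_processing_task", node, retry_attempt=1)

summary = describe(first_retry)
print(summary)
```

Because each level holds a reference to its parent, the most granular identifier carries enough information to locate the node and workflow execution it belongs to.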