Memory Profiling with Memray Plugin
The Memray Plugin provides a seamless way to profile the memory usage of Python functions using the Memray library. It helps developers identify memory leaks, optimize memory consumption, and understand allocation patterns within their applications by generating detailed, interactive reports.
Core Capabilities
The memray_profiling decorator enables comprehensive memory tracking for any Python function. It captures detailed allocation information and generates insightful HTML reports.
- Detailed Memory Tracking: The plugin leverages memray.Tracker to record memory allocations during function execution. This includes:
  - Native Traces: Capture native stack frames alongside Python frames to pinpoint memory allocations originating from C/C++ extensions or underlying system calls. Enable this by setting native_traces=True.
  - Python Allocator Tracing: Trace allocations made by Python's internal allocators as independent events, providing deeper insight into Python's memory management. Set trace_python_allocators=True to activate.
  - Fork Tracking: Continue tracking memory usage in subprocesses created via os.fork(). This is crucial for applications that spawn child processes. Enable with follow_fork=True.
  - Memory Interval Control: Adjust the frequency of resident set size updates using memory_interval_ms. This parameter dictates how often the total virtual memory allocated by the process is recorded, influencing the granularity of memory usage graphs in reports.
- Automated Report Generation: After profiling, the plugin automatically generates interactive HTML reports.
  - Reporter Selection: Choose between flamegraph (default) and table reporters using the memray_html_reporter parameter. Flame graphs visually represent memory usage across the call stack, while table reports offer a tabular breakdown of allocations.
  - Custom Reporter Arguments: Pass additional command-line arguments directly to the chosen Memray reporter via memray_reporter_args. This allows for fine-grained control over report generation, such as filtering or sorting options. For example, memray_reporter_args=["--leaks"] can be used with the table reporter to show only leaked allocations.
- Organized Output: All profiling data (.bin files) and generated HTML reports are stored in a dedicated memray_bin directory, ensuring a clean separation of profiling artifacts.
- Integrated Report Display: The generated HTML report content is captured and passed to a Deck component, facilitating direct display within an integrated environment or UI.
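The reporter options above come down to one memray command line. The following is a minimal sketch, not the plugin's implementation: build_reporter_command is a hypothetical helper that assembles the arguments in the same order as the plugin's command (python -m memray REPORTER -o HTML ARGS BIN). It also assumes every extra argument is a standalone long option such as --leaks, which is a stricter check than the plugin applies.

```python
import sys
from typing import List, Optional

SUPPORTED_REPORTERS = ("flamegraph", "table")


def build_reporter_command(
    reporter: str,
    bin_filepath: str,
    html_filepath: str,
    reporter_args: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper: build the argv list for a Memray HTML reporter."""
    if reporter not in SUPPORTED_REPORTERS:
        raise ValueError(f"{reporter} is not a supported html reporter.")
    extra = list(reporter_args or [])
    # Assumption: each extra argument is a long option that starts with "--".
    if not all(isinstance(arg, str) and arg.startswith("--") for arg in extra):
        raise ValueError(f"unrecognized arguments for {reporter} reporter: {extra}")
    # Same argument order as the plugin: reporter, output file, extra options, input file.
    return [sys.executable, "-m", "memray", reporter, "-o", html_filepath, *extra, bin_filepath]


cmd = build_reporter_command(
    "table", "memray_bin/f.bin", "memray_bin/table.f.html", ["--leaks"]
)
print(" ".join(cmd[1:]))
```

Returning a list rather than a joined string lets the command be handed to subprocess.run without shell quoting concerns.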
Usage
To profile a function, apply the memray_profiling decorator to it.
import time
import os
import sys
import memray
from typing import Optional, Callable, List

# Assume ClassDecorator and Deck are defined elsewhere in the system.
# For demonstration, we mock them here.
class ClassDecorator:
    def __init__(self, task_function: Optional[Callable] = None, **kwargs):
        self.task_function = task_function
        # Keep the keyword arguments so the decorator can be rebuilt around the function.
        self.decorator_kwargs = kwargs

    def __call__(self, *args, **kwargs):
        if self.task_function is not None:
            return self.execute(*args, **kwargs)
        # Used as @decorator(...): the only positional argument is the function to wrap.
        return self.__class__(args[0], **self.decorator_kwargs)


class Deck:
    def __init__(self, title: str, content: str):
        print(f"--- Deck: {title} ---")
        # In a real system, this would render the HTML content
        print(f"HTML content snippet: {content[:200]}...")
# The actual memray_profiling class from the codebase
class memray_profiling(ClassDecorator):
    def __init__(
        self,
        task_function: Optional[Callable] = None,
        native_traces: bool = False,
        trace_python_allocators: bool = False,
        follow_fork: bool = False,
        memory_interval_ms: int = 10,
        memray_html_reporter: str = "flamegraph",
        memray_reporter_args: Optional[List[str]] = None,
    ):
        if memray_html_reporter not in ["flamegraph", "table"]:
            raise ValueError(f"{memray_html_reporter} is not a supported html reporter.")
        if memray_reporter_args is not None and not all(
            isinstance(arg, str) and "--" in arg for arg in memray_reporter_args
        ):
            raise ValueError(
                f"unrecognized arguments for {memray_html_reporter} reporter. Please check https://bloomberg.github.io/memray/{memray_html_reporter}.html"
            )
        self.native_traces = native_traces
        self.trace_python_allocators = trace_python_allocators
        self.follow_fork = follow_fork
        self.memory_interval_ms = memory_interval_ms
        self.dir_name = "memray_bin"
        self.memray_html_reporter = memray_html_reporter
        self.memray_reporter_args = memray_reporter_args if memray_reporter_args else []
        super().__init__(
            task_function,
            native_traces=native_traces,
            trace_python_allocators=trace_python_allocators,
            follow_fork=follow_fork,
            memory_interval_ms=memory_interval_ms,
            memray_html_reporter=memray_html_reporter,
            memray_reporter_args=memray_reporter_args,
        )

    def execute(self, *args, **kwargs):
        if not os.path.exists(self.dir_name):
            os.makedirs(self.dir_name)
        bin_filepath = os.path.join(
            self.dir_name,
            f"{self.task_function.__name__}.{time.strftime('%Y%m%d%H%M%S')}.bin",
        )
        with memray.Tracker(
            bin_filepath,
            native_traces=self.native_traces,
            trace_python_allocators=self.trace_python_allocators,
            follow_fork=self.follow_fork,
            memory_interval_ms=self.memory_interval_ms,
        ):
            output = self.task_function(*args, **kwargs)
        self.generate_flytedeck_html(reporter=self.memray_html_reporter, bin_filepath=bin_filepath)
        return output

    def generate_flytedeck_html(self, reporter, bin_filepath):
        html_filepath = bin_filepath.replace(
            self.task_function.__name__, f"{reporter}.{self.task_function.__name__}"
        ).replace(".bin", ".html")
        memray_reporter_args_str = " ".join(self.memray_reporter_args)
        # Mock os.system for demonstration
        # In a real scenario, this would execute the memray command
        print(f"Executing: {sys.executable} -m memray {reporter} -o {html_filepath} {memray_reporter_args_str} {bin_filepath}")
        # Simulate successful execution and file creation
        with open(html_filepath, "w", encoding="utf-8") as f:
            f.write(f"<html><body><h1>Memray {reporter.capitalize()} Report for {self.task_function.__name__}</h1><p>Generated with args: {memray_reporter_args_str}</p></body></html>")
        if os.path.exists(html_filepath):  # Simulate success
            with open(html_filepath, "r", encoding="utf-8") as file:
                html_content = file.read()
            Deck(f"Memray {reporter.capitalize()}", html_content)
        else:
            print(f"Failed to generate HTML report at {html_filepath}")

    def get_extra_config(self):
        return {}
# Example 1: Basic memory profiling with default flamegraph report
@memray_profiling()
def process_data_basic(size: int):
    """Allocates a list of integers."""
    data = [i for i in range(size)]
    return len(data)


# Example 2: Profiling with native traces and a table report
@memray_profiling(native_traces=True, memray_html_reporter="table", memray_reporter_args=["--leaks"])
def process_data_advanced(num_objects: int):
    """Creates objects, some of which stay allocated after the function returns (for demonstration)."""
    class MyObject:
        def __init__(self, value):
            self.value = value
            self.large_data = bytearray(1024 * 1024)  # 1MB per object

    objects = []
    for i in range(num_objects):
        objects.append(MyObject(i))
    # Return the first half. These objects outlive the function, so they are
    # still allocated when tracking stops; the rest are released on return.
    return objects[:num_objects // 2]


# Run the examples
if __name__ == "__main__":
    import shutil

    print("--- Running basic profiling ---")
    result_basic = process_data_basic(100000)
    print(f"Basic processing complete. Result: {result_basic}")

    print("\n--- Running advanced profiling with native traces and table report ---")
    # Clean up previous memray_bin directory for a fresh run
    if os.path.exists("memray_bin"):
        shutil.rmtree("memray_bin")
    result_advanced = process_data_advanced(5)  # Create 5 objects; 2 are returned and stay alive
    print(f"Advanced processing complete. Result (first half of objects): {len(result_advanced)}")

    # Clean up generated files
    if os.path.exists("memray_bin"):
        shutil.rmtree("memray_bin")
When process_data_basic or process_data_advanced executes, the plugin automatically:
- Creates a memray_bin directory if it doesn't exist.
- Starts a memray.Tracker instance, configured with the specified parameters.
- Executes the decorated function.
- Stops the tracker, saving raw profiling data to a .bin file (e.g., memray_bin/process_data_basic.YYYYMMDDHHMMSS.bin).
- Generates an HTML report (e.g., memray_bin/flamegraph.process_data_basic.YYYYMMDDHHMMSS.html) using the chosen reporter and arguments.
- Passes the HTML content to the Deck component for display.
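The file naming in the steps above can be sketched as follows. This is an illustration under stated assumptions rather than the plugin's code: artifact_paths is a hypothetical helper, and it builds the HTML path directly instead of deriving it from the .bin path by string replacement as the plugin does.

```python
import os
import time
from typing import Tuple


def artifact_paths(
    func_name: str, reporter: str, dir_name: str = "memray_bin"
) -> Tuple[str, str]:
    """Hypothetical helper: return the (.bin, .html) paths for one profiling run."""
    # Second-resolution timestamp, matching the YYYYMMDDHHMMSS pattern above.
    stamp = time.strftime("%Y%m%d%H%M%S")
    bin_path = os.path.join(dir_name, f"{func_name}.{stamp}.bin")
    html_path = os.path.join(dir_name, f"{reporter}.{func_name}.{stamp}.html")
    return bin_path, html_path


bin_path, html_path = artifact_paths("process_data_basic", "flamegraph")
print(bin_path)
print(html_path)
```

Because the timestamp has one-second resolution, two runs of the same function within the same second map to the same file names.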
Common Use Cases
- Identifying Memory Leaks: Pinpoint exactly where memory is allocated and not properly released, leading to increasing memory consumption over time. The table reporter with the --leaks argument is particularly useful here.
- Optimizing Memory Footprint: Understand which parts of the code consume the most memory and identify opportunities to reduce allocations or use more memory-efficient data structures. Flame graphs provide an excellent visual aid for this.
- Benchmarking Memory Performance: Compare the memory usage of different algorithms or implementations for a given task to choose the most efficient one.
- Debugging Out-of-Memory Errors: When an application crashes due to excessive memory usage, the plugin helps trace back to the root cause by showing the allocation history leading up to the error.
- Analyzing Native Extension Memory: With native_traces=True, investigate memory allocations originating from C/C++ extensions, which are often harder to debug with pure Python tools.
Considerations
- Performance Overhead: Memory profiling introduces some overhead due to the instrumentation required to track allocations. Use the plugin judiciously, typically during development, testing, or specific debugging sessions, rather than in production environments unless absolutely necessary.
- Reporter Argument Validation: The plugin performs basic validation for memray_reporter_args: each argument must be a string that contains --. It does not require the -- to be at the start of the argument, and it does not validate the semantic correctness of the arguments for the specific Memray reporter. Refer to the official Memray documentation for the flamegraph and table reporters to ensure valid arguments are passed.
- Deck Component Integration: The plugin relies on an external Deck component to display the generated HTML reports. Ensure this component is properly configured and available in your environment for the reports to be rendered.
- Temporary Files: The plugin creates .bin files containing raw profiling data and .html files for reports. These files are stored in the memray_bin directory. Manage these files as needed, especially in CI/CD pipelines or environments with strict storage policies.
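One way to manage the accumulated artifacts is to prune them by age. The sketch below is not part of the plugin: prune_artifacts and its max_age_seconds parameter are hypothetical, and the one-week default is an arbitrary assumption. It only touches .bin and .html files directly inside the given directory.

```python
import os
import time
from typing import List


def prune_artifacts(
    dir_name: str = "memray_bin", max_age_seconds: float = 7 * 24 * 3600
) -> List[str]:
    """Hypothetical helper: delete .bin/.html files older than max_age_seconds."""
    removed: List[str] = []
    if not os.path.isdir(dir_name):
        return removed
    cutoff = time.time() - max_age_seconds
    for name in sorted(os.listdir(dir_name)):
        path = os.path.join(dir_name, name)
        is_artifact = name.endswith((".bin", ".html")) and os.path.isfile(path)
        # Compare the last modification time against the age cutoff.
        if is_artifact and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(path)
    return removed


if __name__ == "__main__":
    for path in prune_artifacts():
        print(f"removed {path}")
```

In a CI/CD pipeline, a step like this could run after the reports have been archived or uploaded.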