Develop your Node

A node is a directory named after its unique <node_id>. It has the following minimal structure:

<node_id>/
├── code/
│   └── main.py
├── data/
│   └── <data files>
└── logs.txt

The code folder contains the node's source code, with main.py serving as the entry point.
The data folder stores output results generated by the node.
The logs.txt file records execution logs.
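
As a quick sanity check, the layout above can be verified with a few lines of Python. This is only a sketch: the helper name missing_node_entries is hypothetical and is not part of fusionpipe.

```python
from pathlib import Path

# Entries taken from the minimal layout described above.
REQUIRED_ENTRIES = ("code/main.py", "data", "logs.txt")


def missing_node_entries(node_dir):
    """Return the required entries that are absent from node_dir."""
    root = Path(node_dir)
    return [entry for entry in REQUIRED_ENTRIES if not (root / entry).exists()]
```

An empty list means the folder contains everything the minimal structure requires.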

The main.py script is executed by the pipeline when the node runs. It can access data from parent nodes and may include calls to Python scripts, Jupyter notebooks, MATLAB scripts, or other executable code.

Initial Setup

When a node is created from the UI, a dedicated Python virtual environment is automatically created using uv. This environment is located in a .venv folder inside your node's directory.

To work on your node's code, you first need to navigate to its directory and activate the virtual environment.

Navigate to the node folder:

You can copy the path from the UI, then use it in your terminal:
```
cd <path_to_your_node>
```

Activate the virtual environment:

From within the node's directory, run:
```
source .venv/bin/activate
```
Your terminal prompt should now indicate that you are in the virtual environment.
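
If the prompt is ambiguous, the active interpreter can also be checked from Python. The sketch below relies only on the standard library. The helper names are hypothetical, and the .venv location is the one described above.

```python
import sys
from pathlib import Path


def in_virtual_environment():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def uses_node_venv(node_dir):
    """True when the interpreter's prefix is <node_dir>/.venv."""
    return Path(sys.prefix).resolve() == (Path(node_dir) / ".venv").resolve()


print("virtual environment active:", in_virtual_environment())
```

Calling uses_node_venv with the path of your node tells you whether the interpreter comes from that node's own environment rather than from another one.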

Development Workflow

Here’s a typical workflow for developing the logic for your node:

Develop your code:

You can write your analysis, simulation, or machine learning code in a Jupyter notebook, a Python script, or even a MATLAB script. You can find examples in the examples directory of the project.

Integrate with main.py:

The main.py file is the entry point for your node's execution within the pipeline. You need to modify it to call the code you developed. The file already contains examples of how to call different types of scripts.
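
The examples that ship in main.py are the reference. As an illustration only, one way to call a separate Python script from main.py is through subprocess. The helper name run_python_script and the script name my_analysis.py are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_python_script(script_path):
    """Run a Python script with the interpreter that is running main.py."""
    script = Path(script_path).resolve()
    # check=True raises CalledProcessError if the script exits with a
    # non-zero status, so a failing script also fails the node run.
    return subprocess.run(
        [sys.executable, str(script)],
        check=True,
        cwd=script.parent,  # relative paths in the script resolve next to it
    )


# Usage inside main.py (hypothetical script name):
# run_python_script(Path(__file__).parent / "my_analysis.py")
```

Using sys.executable keeps the child process in the same virtual environment as main.py.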

Test your node locally:

Before running the node as part of the full pipeline, you can test it in isolation. From your node's code directory, run:
```
uv run python main.py
```
This command uses the node's dedicated virtual environment to run your main.py script, simulating how the pipeline will execute it. Make sure you have all the necessary dependencies installed in the virtual environment.

Run the node from the pipeline:

Once you are satisfied with your local tests, you can run the node from the pipeline's user interface. This will execute the node in the correct order based on its dependencies.

User API

To access data from other nodes or manage the current node's data, fusionpipe provides a simple API. Here are the main functions you can use in your scripts:

get_node_id(): Retrieves the ID of the current node. The ID follows the format n_<datetime>_<random_4digit_integers>.
get_all_parent_node_folder_paths(node_id): Returns a list of folder paths for all parent nodes of the specified node. This is how you access the output data from the nodes that run before yours.
get_folder_path_node(): Retrieves the folder path of the current node. This is useful for saving your node's output to its data subfolder.
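
The sketch below shows one way these functions might be combined: read every file from the parents' data folders, then write a result into the current node's data folder. The two helpers are hypothetical and take plain paths, so they can be tried outside the pipeline. The fusionpipe calls appear only in comments, because the exact import path is not given here and should be taken from the generated main.py.

```python
import json
from pathlib import Path


def list_parent_data_files(parent_folders):
    """Collect every file found in the data subfolder of each parent node."""
    files = []
    for folder in parent_folders:
        data_dir = Path(folder) / "data"
        if data_dir.is_dir():
            files.extend(sorted(p for p in data_dir.iterdir() if p.is_file()))
    return files


def save_summary(node_folder, parent_files, filename="summary.json"):
    """Write a small JSON summary into the current node's data subfolder."""
    data_dir = Path(node_folder) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    out_path = data_dir / filename
    summary = {"inputs": [str(p) for p in parent_files]}
    out_path.write_text(json.dumps(summary, indent=2))
    return out_path


# Inside a node, the arguments would come from the API functions listed above
# (import them from fusionpipe as the generated main.py does):
# node_id = get_node_id()
# parents = get_all_parent_node_folder_paths(node_id)
# save_summary(get_folder_path_node(), list_parent_data_files(parents))
```

Keeping the path handling in small functions like these makes the node logic easy to test locally with temporary folders.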

Using Jupyter

If you prefer to develop using Jupyter Notebook or JupyterLab, you can set up a dedicated kernel for your node. This ensures that your notebook uses the same environment and dependencies as the pipeline.

To create and set up the Jupyter kernel, navigate to your node's code directory and run:

uv run python init_node_kernel.py

This will create a new Jupyter kernel with the same name as your node's ID. You can then select this kernel in your Jupyter environment.
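
To confirm that the kernel was registered, you can inspect the output of jupyter kernelspec list --json. The sketch below parses that JSON. The helper name kernel_is_registered is hypothetical, and the sketch assumes the jupyter command is available in the active environment.

```python
import json


def kernel_is_registered(kernelspec_json, node_id):
    """Check whether a kernel named after node_id appears in the listing."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    # Jupyter treats kernel names case-insensitively, so compare in lower case.
    return node_id.lower() in {name.lower() for name in specs}


# Usage, assuming the `jupyter` command is on the PATH:
# import subprocess
# listing = subprocess.run(
#     ["jupyter", "kernelspec", "list", "--json"],
#     capture_output=True, text=True, check=True,
# ).stdout
# print(kernel_is_registered(listing, "<node_id>"))
```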

For more advanced topics, such as developing a node in conjunction with a custom Python package, see the guide "Best practices for pipeline and package development".