
fusionpipe is a lightweight pipeline orchestrator designed to streamline data analysis, simulations, and machine learning workflows, fostering better collaboration among users. It enables rapid prototyping with minimal interface complexity while scaling seamlessly to full production systems.

The guiding principle behind fusionpipe is that "fast iteration is key to data science exploration." As a result, the backend interface is intentionally kept minimal, allowing you to develop your code with as little overhead as possible—just as you normally would.

fusionpipe consists of two core components:

  • Node
  • Pipeline

Node

A node is a directory with a unique <node_id> that adheres to the following minimal structure:

<node_id>/
├── code/
│   └── main.py
├── data/
│   └── <data files>
└── logs.txt

  • The code folder contains the node's source code, with main.py serving as the entry point.
  • The data folder stores output results generated by the node.
  • The logs.txt file records execution logs.
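The layout above can be reproduced by hand with a few lines of Python. The following is only an illustrative sketch, not part of fusionpipe: the function name `create_node`, the placeholder content of main.py, and the example node id are all assumptions made for this example.

```python
from pathlib import Path
import tempfile


def create_node(root: Path, node_id: str) -> Path:
    """Create the minimal node layout: code/main.py, data/ and logs.txt.

    Hypothetical helper for illustration; fusionpipe may create nodes differently.
    """
    node = root / node_id
    (node / "code").mkdir(parents=True, exist_ok=True)
    (node / "data").mkdir(exist_ok=True)

    # main.py is the entry point; write a placeholder only if none exists yet.
    main_py = node / "code" / "main.py"
    if not main_py.exists():
        main_py.write_text('print("hello from node")\n')

    # Empty log file, to be filled at execution time.
    (node / "logs.txt").touch()
    return node


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        node = create_node(Path(tmp), "node_example")
        # List everything that was created, relative to the node directory.
        for path in sorted(node.rglob("*")):
            print(path.relative_to(node))
```

Calling the helper twice on the same node id leaves an existing main.py untouched, so it is safe to re-run while prototyping.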

The main.py script is executed by the pipeline when the node runs.

Accessing the data of parent nodes is the only interface you need to integrate in order to convert your code into a pipeline, and convenience user APIs are provided for that.
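To make the idea concrete, here is a rough sketch of a node's logic that reads a file from a parent's data folder and writes its own result. It does not use fusionpipe's convenience API, which is not described on this page. It assumes, purely for illustration, that nodes are sibling directories under a common folder; `parent_data_dir`, `run`, and the file names `values.txt` and `mean.txt` are hypothetical.

```python
from pathlib import Path


def parent_data_dir(node_dir: Path, parent_id: str) -> Path:
    # ASSUMPTION: parent and child nodes are sibling directories.
    # fusionpipe's own user API should be preferred for locating parents.
    return node_dir.parent / parent_id / "data"


def run(node_dir: Path, parent_id: str) -> Path:
    """Read numbers produced by the parent and store their mean in this node."""
    source = parent_data_dir(node_dir, parent_id) / "values.txt"  # hypothetical file
    numbers = [float(token) for token in source.read_text().split()]

    # Outputs go to the node's own data folder.
    output = node_dir / "data" / "mean.txt"
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(f"{sum(numbers) / len(numbers)}\n")
    return output
```

Everything except the lookup of the parent's data folder is ordinary analysis code, which is the point of keeping the interface small.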

A node may include calls to Python scripts, Jupyter notebooks, MATLAB scripts, or other executable code.

Pipeline

A pipeline is a directed acyclic graph (DAG) that connects multiple nodes. Each node can have multiple parent and child nodes, enabling the creation of complex workflows. The pipeline orchestrator handles node execution based on dependencies, ensuring that parent nodes are processed before their children.
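The "parents before children" rule corresponds to a topological ordering of the DAG. As a minimal sketch of that idea (not fusionpipe's actual scheduler), the standard-library `graphlib` module (Python 3.9+) can compute such an order. The node names and the `execution_order` function are made up for this example.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(parents: dict[str, set[str]]) -> list[str]:
    """Return node ids ordered so that every parent precedes its children.

    `parents` maps each node id to the set of its parent node ids.
    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(parents).static_order())


if __name__ == "__main__":
    # Hypothetical pipeline: "report" depends on two branches that share a root.
    pipeline = {
        "load": set(),
        "clean": {"load"},
        "simulate": {"load"},
        "report": {"clean", "simulate"},
    }
    print(execution_order(pipeline))

    # A cycle makes the graph invalid as a pipeline.
    try:
        execution_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("cycle detected")
```

Several valid orders can exist for the same graph (here, "clean" and "simulate" may run in either order), so only the parent-before-child relation should be relied upon.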

Pipeline Example

Start using it

Depending on your role, there are several ways to get started with fusionpipe:

  • user on managed instance: In this case, fusionpipe is already available as a service on your machine/VM. Ask your maintainer how to access it, then follow the user guide to develop your first node.
  • user with local installation/developer: In this case, you install fusionpipe on your own machine, either to use it personally or to contribute to it. Follow the single user installation guidelines.
  • maintainer: In this case, you install fusionpipe on your bare-metal cluster as an administrator, to give access to multiple users. Follow the multiple users installation guidelines.