A collaborative industrial project that uses a Deep Reinforcement Learning agent to efficiently allocate tasks to nodes in an adaptive distributed embedded system. The agent additionally handles critical tasks to ensure fail-safety compliance and optimizes message passing among tasks, solving an NP-hard combinatorial problem in linear time.
Make sure you have the latest version of Python 3.9 installed, then install the dependencies:
pip install -r requirements.txt
Navigate to the src folder and run:
Format:
python main.py --config [PATH_CONFIG_1] [PATH_CONFIG_2] [--Param_Header1] [Param_Value1] ... and so on
Example:
python main.py --train false --model_path ../experiments/models/p1/trnc_c/early_term_1000 --config utils/configs/problem_1.yaml utils/configs/experiment_trnc_c.yaml --experiment_name custom_experiments --run_name first_inference
Note: Every parameter in the existing configuration files is modifiable; any of them can be overridden as a command-line argument by prefixing its name with a double dash (--).
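For example, assuming one of the loaded configuration files defines an entry named learning_rate (a hypothetical parameter name used here purely for illustration), it could be overridden directly from the command line:
python main.py --config utils/configs/problem_1.yaml utils/configs/experiment_tn.yaml --learning_rate 0.0003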
Note 2: For the Act-Replace category (either inference or training), set the parameter invalid_action_replacement to true in default.yaml.
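A minimal sketch of the corresponding entry in default.yaml (the exact location and nesting of the key inside the file may differ):

```yaml
# default.yaml (excerpt, sketch only -- actual nesting may differ)
invalid_action_replacement: true
```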
Note 3: For the Act-Mask category (either inference or training), replace PPOModel with MaskablePPOModel in main.py by importing it from the Maskable PPO implementation: from models.maskable_ppo import MaskablePPOModel
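A minimal sketch of what that swap might look like in main.py; only the MaskablePPOModel import is taken from this README, while the original PPOModel import path and the constructor arguments are assumptions for illustration:

```python
# main.py (sketch) -- swap the PPOModel import and instantiation for Act-Mask.
# The original import path and constructor arguments below are assumed.

# from models.ppo import PPOModel                  # original import (assumed path)
from models.maskable_ppo import MaskablePPOModel   # Act-Mask replacement

# model = PPOModel(config)                         # original instantiation (assumed)
model = MaskablePPOModel(config)                   # use the maskable variant instead,
                                                   # where config is the loaded configuration
```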
Training example:
python main.py --config utils/configs/problem_1.yaml utils/configs/experiment_tn.yaml --experiment_name custom_experiments --run_name first_train
Note: You may also provide your own custom configuration file.
The study defines three problem sets and multiple configuration variants to evaluate the performance of RL agents in a CADES (Configurable Adaptive Distributed Execution System). These scenarios aim to emulate the dynamic and unpredictable conditions of real-world systems.
- Problem 1: A static system configuration with fixed tasks and nodes. This serves as a baseline to evaluate basic performance.
- Problem 2: Introduces variability in task numbers and costs while keeping nodes constant. It reflects fluctuating task demands with stable hardware resources.
- Problem 3: Adds complexity by varying tasks, their costs, and the number of nodes. This scenario includes potential node downtimes, representing real-life challenges with dynamic task demands and resource failures.
Problem No. | Tasks (#) | Task Cost | Nodes (#) | Node Capacity |
---|---|---|---|---|
1 | 12 | 4 | 6 | 12 |
2 | 8 to 10 | 4 to 6 | 6 | 12 |
3 | 8 to 10 | 4 to 6 | 6 to 8 | 10 to 12 |
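For orientation, here is a hypothetical sketch of how the Problem 1 setting above might be expressed in a problem configuration file such as utils/configs/problem_1.yaml; the field names below are invented for illustration and do not reflect the repository's actual schema:

```yaml
# Hypothetical sketch only -- field names are illustrative, not the real schema
num_tasks: 12
task_cost: 4
num_nodes: 6
node_capacity: 12
```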
To capture different scenarios that may arise during the reconfiguration of a CADES, we propose several distinct configuration variants:
- TN: Tasks and nodes are available, but no replicas or communication are required. Represents non-critical, independent task execution scenarios.
- TRN: Adds replicas for critical tasks but no communication. Focuses on fault tolerance for critical tasks.
- TRNC: Includes tasks, nodes, replicas, and communication, divided into:
- TRNC A: Communication among non-critical tasks.
- TRNC B: Communication among critical tasks.
- TRNC C: Combines communication for both critical and non-critical tasks.
Each of these variants captures different levels of complexity, reflecting the diverse operational conditions that a CADES may encounter during reconfiguration.
Category | Tasks (T) | Nodes (N) | Replicas (R) | Communication (C) |
---|---|---|---|---|
TN | ✔ | ✔ | ✘ | ✘ |
TRN | ✔ | ✔ | ✔ | ✘ |
TRNC A | ✔ | ✔ | ✔ | Non-critical tasks only |
TRNC B | ✔ | ✔ | ✔ | Critical tasks only |
TRNC C | ✔ | ✔ | ✔ | Both non-critical and critical tasks |
Different invalid action handling strategies are employed to conduct a comparative study of their effects on different configuration problems. These techniques are referenced in the results section and are summarized as follows:
- Early-Term: Early Termination; the episode is terminated as soon as the agent selects an invalid action.
- Act-Replace: Action Replacement; an invalid action is replaced with a valid one instead of terminating the episode.
- Act-Mask: Action Masking; the logits of invalid actions are masked so they cannot be sampled (see the sketch after this list).
These strategies are evaluated to understand their impact on solving different configuration problems effectively.
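To make the Act-Mask idea concrete, here is a minimal, self-contained sketch of logits masking; it is illustrative only and does not reproduce the repository's MaskablePPOModel implementation:

```python
import numpy as np

def mask_logits(logits: np.ndarray, valid: np.ndarray) -> np.ndarray:
    """Set the logits of invalid actions to -inf so their probability becomes zero."""
    return np.where(valid, logits, -np.inf)

def softmax(x: np.ndarray) -> np.ndarray:
    z = x - np.max(x)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Example: 4 candidate actions, of which actions 1 and 3 are invalid in this state
logits = np.array([1.2, 0.4, -0.3, 2.0])
valid = np.array([True, False, True, False])
probs = softmax(mask_logits(logits, valid))
# probs[1] and probs[3] are exactly 0, so the agent can never sample an invalid action
print(probs)
```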
The combination of problem sets and configuration variants provides a comprehensive framework for evaluating the RL agent's ability to handle dynamic, real-world challenges in a CADES. These scenarios test the agent's fault tolerance, adaptability, and task allocation efficiency under varying levels of complexity.
| Problem No. | Strategy | TN | TRN | TRNC A | TRNC B | TRNC C |
|---|---|---|---|---|---|---|
| 1 | Early-Term | **100** | 98 | 93 | **97** | **88** |
| | Act-Replace | 99 | 97 | **94** | 95 | **88** |
| | Act-Mask | **100** | **100** | 59 | 60 | 57 |
| 2 | Early-Term | **100** | 96 | **97** | **99** | 91 |
| | Act-Replace | 97 | **99** | 96 | 92 | 90 |
| | Act-Mask | **100** | **99** | 91 | 95 | **96** |
| 3 | Early-Term | 94 | 93 | 88 | 86 | 83 |
| | Act-Replace | **96** | **98** | **90** | **88** | **88** |
| | Act-Mask | 91 | 93 | 80 | 86 | 84 |
Note: Bolded values indicate the highest performance in each category (TN, TRN, TRNC A, TRNC B, TRNC C) for that specific Problem Number.
Detailed results can be found in the paper.
- Optimizing the deep reinforcement learning agent to fulfill message passing among tasks more efficiently
- Trying other RL algorithms, e.g., Q-learning
- Applying efficient reward shaping strategies