Releases: AliceO2Group/Control
v0.18.90
This release improves the performance and reliability of task termination. It also delivers bug fixes and an update to the ODC integration.
Task termination:
- [executor] Tweak timeout system for the task kill procedure
- [executor] Always use the shell PGID as backup PID for killing
- [executor] Further tweak kill timeouts
ODC integration:
- [odcshim] Correctly handle early GetState calls
Miscellaneous:
- [core] Allow underscore in configuration path
- [misc] Gather more metrics in the test script
- [occ] Avoid using private namespace in gRPC
v0.18.2
This is a patch release that fixes some inconsistent state reporting behavior when a task moves to ERROR during the STOP transition. It also fixes a related issue that caused the environment destroy operation to fail.
- [core] remove goroutine from transition_configure
- [core] handle workflow state error
- [core] Fix hanging force shutdown
- [executor] Improve executorcmd log output
v0.18.1
v0.18.0
This release brings an in-depth overhaul of the task manager component of the AliECS core, resulting in significant performance improvements. It also delivers numerous bug fixes, as well as integration improvements for the configuration system.
Task manager refactor:
- [core] draft of managerV2
- [core] Rebase and add TriggerHooks
- [core] Refactor scheduler into task.Manager
- [core] Fix pending teardown with asynchronous ReleaseTasks
- [core] remove mutex from TaskMan, use mutex for roster, classes and Task
- [core] Remove obsolete warning from repo.Manager
- [core] merge manager with managerv2
- [core] fix discrepancy between task and environment states
- [core] fix infinite loop due to wfState
Configuration:
- [core] Expose deployment constants (including consul_endpoint) to env
- [core] Push environment_id as property to all tasks
- [core] Only append --control-port parameter to FairMQ tasks
- [occ] Always prefer OCC_CONTROL_PORT variable to --control-port param
- [occ] Ensure the run number is always a uint32_t in the C++ interface
Miscellaneous:
- [build] Use latest FSM
- [coconut] Remove usage of UUID
- [core] Correct task ID generation
- [core] Force shutdown reply when all processes are gone
- [core] Make log output more relevant
- [misc] New benchmark script
- [misc] Test script workflow template default to @master
- [odcshim] Add support for ODC partitioning
v0.17.82
This release brings an in-depth overhaul of the task manager component of the AliECS core, resulting in significant performance improvements. It also delivers numerous bug fixes, as well as integration improvements for the configuration system.
Task manager refactor:
- [core] draft of managerV2
- [core] Rebase and add TriggerHooks
- [core] Refactor scheduler into task.Manager
- [core] Fix pending teardown with asynchronous ReleaseTasks
- [core] remove mutex from TaskMan, use mutex for roster, classes and Task
- [core] Remove obsolete warning from repo.Manager
- [core] merge manager with managerv2
- [core] fix discrepancy between task and environment states
- [core] fix infinite loop due to wfState
Configuration:
- [core] Expose deployment constants (including consul_endpoint) to env
- [core] Push environment_id as property to all tasks
- [core] Only append --control-port parameter to FairMQ tasks
- [occ] Always prefer OCC_CONTROL_PORT variable to --control-port param
- [occ] Ensure the run number is always a uint32_t in the C++ interface
Miscellaneous:
- [build] Use latest FSM
- [coconut] Remove usage of UUID
- [core] Correct task ID generation
- [core] Force shutdown reply when all processes are gone
- [core] Make log output more relevant
- [misc] New benchmark script
- [misc] Test script workflow template default to @master
- [odcshim] Add support for ODC partitioning
v0.17.81
This release brings an in-depth overhaul of the task manager component of the AliECS core, resulting in significant performance improvements. It also delivers numerous bug fixes, as well as integration improvements for the configuration system.
Task manager refactor:
- [core] draft of managerV2
- [core] Rebase and add TriggerHooks
- [core] Refactor scheduler into task.Manager
- [core] Fix pending teardown with asynchronous ReleaseTasks
- [core] remove mutex from TaskMan, use mutex for roster, classes and Task
- [core] Remove obsolete warning from repo.Manager
- [core] merge manager with managerv2
- [core] fix discrepancy between task and environment states
- [core] fix infinite loop due to wfState
Miscellaneous:
- [build] Use latest FSM
- [coconut] Remove usage of UUID
- [core] Correct task ID generation
- [core] Force shutdown reply when all processes are gone
- [core] Make log output more relevant
- [core] Only append --control-port parameter to FairMQ tasks
- [core] Push environment_id as property to all tasks
- [misc] New benchmark script
- [occ] Always prefer OCC_CONTROL_PORT variable to --control-port param
- [occ] Ensure the run number is always a uint32_t in the C++ interface
v0.17.80
This release brings an in-depth overhaul of the task manager component of the AliECS core, resulting in significant performance improvements. It also delivers numerous bug fixes, as well as integration improvements for the configuration system.
Task manager refactor:
- [core] draft of managerV2
- [core] Rebase and add TriggerHooks
- [core] Refactor scheduler into task.Manager
- [core] Fix pending teardown with asynchronous ReleaseTasks
- [core] remove mutex from TaskMan, use mutex for roster, classes and Task
- [core] Remove obsolete warning from repo.Manager
- [core] merge manager with managerv2
Miscellaneous:
- [build] Use latest FSM
- [coconut] Remove usage of UUID
- [core] Correct task ID generation
- [core] Force shutdown reply when all processes are gone
- [core] Make log output more relevant
- [core] Only append --control-port parameter to FairMQ tasks
- [misc] New benchmark script
- [occ] Always prefer OCC_CONTROL_PORT variable to --control-port param
v0.17.2
This is a patch release that works around an issue where aliswmod delays task launch. It also improves the machine ID source for the unique task/environment ID generator.
- [common] Acquire machine-id for Sonyflake generator
- [executor] Increase gRPC dial timeout to workaround slow aliswmod issue
v0.17.1
v0.17.0
This release brings numerous reliability and integration improvements. Important user-facing changes include shortened Environment and Task ID strings, and richer InfoLogger output. This release also includes a preview of walnut, the AliECS workflow administration and linting utility, which provides validation of AliECS workflow configuration files and conversion of DPL dumps into AliECS workflow and task templates.
Log output improvements:
- [common] Default InfoLogger role field to detected hostname
- [common] Pass correct username to IL
- [common] Ensure a default rolename is always passed to IL
- [common] Set IL level for all messages
- [core][executor] Update logger facility field
- [core] Always set system to FLP for controlled tasks
- [core] Override system for O2_ROLE/O2_SYSTEM env vars passed to tasks
- [core] Set less relevant messages to Trace severity
- [core] Push envId as partition in core
Short IDs:
- [core] Environment ID is now a 20-char xid vs. 36 for earlier UUID
- [core] Task ID is now a xid
- [core] Executor ID is now a xid
- [core] MesosCommand ID is now a xid
- [core] Use sony/sonyflake via indigo for short environment IDs
- [core] Use short UID for task ID and executor ID
- [coconut] Adapt task id table for short UID
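The 20 versus 36 character figures follow from the encodings: a 12-byte identifier in unpadded base32 takes 20 characters, while a 16-byte UUID in hyphenated hexadecimal takes 36. The sketch below reproduces only this length arithmetic with the Go standard library. It fills both identifiers with random bytes and does not reproduce the internal layout of a real xid or the generator used by AliECS; `shortID` and `uuidV4` are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"encoding/base32"
	"encoding/hex"
	"fmt"
	"strings"
)

// shortID encodes 12 random bytes as unpadded, lowercase base32hex.
// 96 bits at 5 bits per character gives 20 characters.
func shortID() (string, error) {
	raw := make([]byte, 12)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	enc := base32.HexEncoding.WithPadding(base32.NoPadding)
	return strings.ToLower(enc.EncodeToString(raw)), nil
}

// uuidV4 formats 16 random bytes as a version 4 UUID string:
// 32 hexadecimal digits plus 4 hyphens gives 36 characters.
func uuidV4() (string, error) {
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	raw[6] = raw[6]&0x0f | 0x40 // version 4
	raw[8] = raw[8]&0x3f | 0x80 // RFC 4122 variant
	h := hex.EncodeToString(raw)
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32], nil
}

func main() {
	s, err := shortID()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	u, err := uuidV4()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(s), s)
	fmt.Println(len(u), u)
}
```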
Miscellaneous improvements:
- [build] Ensure we build a true static binary
- [coconut] Fix documentation string for coconut repo default-revision
- [core] Remove obsolete locking in server.go
- [core] Clarify naming of some task slices/maps in scheduler.go
- [core] Adjust executor name string
- [executor] Ensure we always have a PID we can use to kill a task
- [executor] Workaround for os.user.GroupIds unavailable with !osusergo
New documentation formatter (Mkdocs):
- [build] Minor make doc/docs improvement
- [misc] Mkdocs stub
- [misc] Rebuild coconut documentation
- [misc] Split peanut docs from OCC
- [misc] Update documentation TOC
- [misc] Update deployment instructions
- [misc] Fix coconut env create documentation
- [misc] Fix template list documentation
- [misc] Update doc nav
- [misc] Update documentation
- [misc] Trash all Minimesos and DCOS stuff
- [misc] Update running and development instructions
- [misc] Update README
- [misc] Add FAQ page stub
- [misc] Begin AliECS Handbook with user-oriented documentation
- [misc] Point to ControlWorkflows readme in mkdocs.yml