RELEASE NOTES FOR SLURM VERSION 14.11
6 July 2014
IMPORTANT NOTE:
If using the slurmdbd (Slurm DataBase Daemon) you must update this first.
The 14.11 slurmdbd will work with Slurm daemons of version 2.6 and above.
You do not need to update all clusters at the same time, but it is very
important to update the slurmdbd first and have it running before updating
any other clusters making use of it. No real harm will come from updating
your systems before the slurmdbd, but they will not talk to each other
until you do. Also, at least for the first run of the slurmdbd, make sure
your my.cnf file sets innodb_buffer_pool_size to at least 64M. You can
accomplish this by adding the line
innodb_buffer_pool_size=64M
under the [mysqld] section of the my.cnf file and restarting mysqld.
This is needed when converting large tables over to the new database schema.
Slurm can be upgraded from version 2.6 or 14.03 to version 14.11 without loss of
jobs or other state information. Upgrading directly from an earlier version of
Slurm will result in loss of state information.
HIGHLIGHTS
==========
-- Added job array data structure and removed 64k array size restriction.
-- Added support for reserving CPUs and/or memory on a compute node for system
use.
-- Added support for allocation of generic resources by model type for
heterogeneous systems (e.g. request a Kepler GPU, a Tesla GPU, or a GPU of
any type).
-- Added support for non-consumable generic resources that are limited, but
can be shared between jobs.
-- Added a Route plugin to allow messages to be forwarded using switch
topology information.
RPMBUILD CHANGES
================
CONFIGURATION FILE CHANGES (see the appropriate man page for details)
=====================================================================
-- Modify etc/cgroup.release_common.example to specify the full path to the
scontrol command. Also find the cgroup mount point by reading the
cgroup.conf file.
-- Added SchedulerParameters options of bf_yield_interval and bf_yield_sleep
to control how frequently and for how long the backfill scheduler will
relinquish its locks (see the sample configurations after this list).
-- To support larger numbers of jobs when the StateSaveDirectory is on a
file system that supports a limited number of files in a directory, add a
subdirectory called "hash.#" based upon the last digit of the job ID.
-- Added GRES type (e.g. model name) and "non_consume" fields for resources
that are limited, but can be shared between jobs.
-- Modify AuthInfo configuration parameter to accept credential lifetime
option.
-- Added ChosLoc configuration parameter in slurm.conf (Chroot OS tool
location).
-- Added MemLimitEnforce configuration parameter in slurm.conf (Used to disable
enforcement of memory limits)
-- Added PriorityParameters configuration parameter in slurm.conf (String used
to hold configuration information for the PriorityType plugin).
-- Added RequeueExit and RequeueExitHold configuration parameter in slurm.conf
(Defines job exit codes which trigger a job being requeued and/or held).
-- Add SelectTypeParameters option of CR_PACK_NODES to pack a job's tasks
tightly on its allocated nodes rather than distributing them evenly across
the allocated nodes.
-- Added PriorityFlags option of Calculate_Running to continue recalculating
the priority of running jobs.
-- Add new node configuration parameters CoreSpecCount, CPUSpecList and
MemSpecLimit which support the reservation of resources for system use
with Linux cgroup.
-- Added AllowSpecResourcesUsage configuration parameter in slurm.conf. This
allows jobs to use specialized resources on nodes allocated to them if the
job designates --core-spec=0.
-- Add new SchedulerParameters option of build_queue_timeout to throttle how
much time can be consumed building the job queue for scheduling.
-- Added HealthCheckNodeState option of "cycle" to cycle through the compute
nodes over the course of HealthCheckInterval rather than running all at
the same time.
-- Added CpuFreqDef configuration parameter in slurm.conf to specify the
default CPU frequency and governor to be set at job end.
-- Add RoutePlugin with route/default and route/topology implementations to
allow messages to be forwarded through the switch network defined in
the topology.conf file for TopologyPlugin=topology/tree.
-- Add DebugFlags=Route to allow debugging of RoutePlugin.
-- Added SchedulerParameters options of bf_max_job_array_resv to control how
many tasks of a job array should have resources reserved for them.
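
The slurm.conf fragment below illustrates the new backfill and queue
management options described above. It is a minimal sketch; the values shown
are illustrative assumptions, not recommendations (the bf_yield_* and
build_queue_timeout values are expressed in microseconds):

    SchedulerType=sched/backfill
    SchedulerParameters=bf_yield_interval=2000000,bf_yield_sleep=500000,bf_max_job_array_resv=20,build_queue_timeout=2000000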
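
A hedged sketch of core specialization, reserving one core and 512 MB of
memory per node for system use; the node name and hardware sizes are
hypothetical:

    NodeName=tux[1-16] Sockets=2 CoresPerSocket=8 RealMemory=32768 CoreSpecCount=1 MemSpecLimit=512
    AllowSpecResourcesUsage=1

With AllowSpecResourcesUsage set, a job submitted with --core-spec=0 may use
the specialized resources on the nodes allocated to it.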
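
Typed and shareable generic resources might be configured as follows; the
GPU model names and device paths are assumptions for illustration only. In
slurm.conf:

    GresTypes=gpu
    NodeName=tux[1-16] Gres=gpu:kepler:2,gpu:tesla:2

and in gres.conf on the compute nodes:

    Name=gpu Type=kepler File=/dev/nvidia[0-1]
    Name=gpu Type=tesla  File=/dev/nvidia[2-3]

A job could then request a specific model with a line such as
"srun --gres=gpu:kepler:1 ..." or any GPU with "srun --gres=gpu:1 ...".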
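
Enabling message forwarding along the switch topology takes two slurm.conf
lines plus an optional debug flag, assuming topology.conf already describes
the tree:

    TopologyPlugin=topology/tree
    RoutePlugin=route/topology
    DebugFlags=Route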
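
Several of the remaining new parameters are simple scalar settings. A hedged
example with assumed values (the health check program path and the exit
codes are placeholders):

    MemLimitEnforce=no
    CpuFreqDef=Performance
    HealthCheckProgram=/usr/sbin/nhc
    HealthCheckInterval=300
    HealthCheckNodeState=CYCLE
    RequeueExit=100
    RequeueExitHold=101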
DBD CONFIGURATION FILE CHANGES (see "man slurmdbd.conf" for details)
====================================================================
-- Added DebugFlags
COMMAND CHANGES (see man pages for details)
===========================================
-- Improve qsub wrapper support for passing environment variables.
-- Modify sdiag to report Slurm RPC traffic by type, count and time consumed.
-- Enable display of nodes anticipated to be used for pending jobs by squeue,
sview or scontrol.
-- Modify squeue --start option to print the nodes expected to be used for a
pending job (in addition to expected start time, etc.).
-- Add srun --cpu-freq options to set the CPU governor (OnDemand, Performance,
PowerSave or UserSpace).
-- Added squeue -O/--Format option that makes all job and step fields available
for printing (see the examples after this list).
-- Add "CPUs" count to output of "scontrol show step".
-- Add job "reboot" option for Linux clusters. This invokes the configured
RebootProgram to reboot nodes allocated to a job before it begins execution.
-- Added support for job email triggers: TIME_LIMIT, TIME_LIMIT_90 (reached
90% of time limit), TIME_LIMIT_80 (reached 80% of time limit), and
TIME_LIMIT_50 (reached 50% of time limit). Applies to salloc, sbatch and
srun commands.
-- Added srun --export option to set/export specific environment variables.
-- Modified scontrol to print separate error messages for the different tasks
of a job array when they have different exit codes. Applies to job suspend
and resume operations.
-- Add node state string suffix of "$" to identify nodes in maintenance
reservation or scheduled for reboot. This applies to scontrol, sinfo,
and sview commands.
-- Enable scontrol to clear a node's scheduled reboot by setting its state
to "RESUME".
-- Added squeue -P/--priority option that can be used to display pending jobs
in the same order as used by the Slurm scheduler even if jobs are submitted
to multiple partitions (job is reported once per usable partition).
-- Add sbatch job array option to limit the number of simultaneously running
tasks from a job array (e.g. "--array=0-15%4").
-- Removed --cpu_bind from sbatch and salloc. It just seemed to cause
confusion and wasn't ever handled in the allocation. A user can now only
specify the option with srun.
-- Modify scontrol job operations to accept comma delimited list of job IDs.
Applies to job update, hold, release, suspend, resume, requeue, and
requeuehold operations.
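
Illustrative uses of the new and modified query commands; the job and node
names are hypothetical, and the -O field names should be checked against
squeue(1):

    $ squeue -O jobid,name,state,timeleft        # choose arbitrary output fields
    $ squeue -P                                  # pending jobs in scheduler priority order
    $ squeue --start -j 1234                     # expected start time/nodes of a pending job
    $ scontrol hold 1000,1001,1002               # comma-delimited job ID list
    $ scontrol update NodeName=tux1 State=RESUME # clear a scheduled reboot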
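
Corresponding submission-side examples, again as a sketch (job.sh, a.out and
the variable names are placeholders):

    $ sbatch --array=0-15%4 job.sh            # at most 4 array tasks run at once
    $ sbatch --mail-type=TIME_LIMIT_90 job.sh # mail when 90% of time limit is reached
    $ sbatch --reboot job.sh                  # reboot allocated nodes before the job runs
    $ srun --cpu-freq=Performance -n16 a.out  # request the Performance CPU governor
    $ srun --export=PATH,MYVAR=42 -n1 env     # export only the named variables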
OTHER CHANGES
=============
-- Add job "reboot" option for Linux clusters. This invokes the configured
RebootProgram to reboot nodes allocated to a job before it begins execution.
-- In the job_submit plugin: Remove all slurmctld locks prior to job_submit()
being called for improved performance. If any slurmctld data structures are
read or modified, add locks directly in the plugin.
API CHANGES
===========
Changed members of the following structs
========================================
Added the following struct definitions
======================================
-- Added the following fields to struct stats_info_response_msg:
rpc_type_size, rpc_type_id, rpc_type_cnt, rpc_type_time,
rpc_user_size, rpc_user_id, rpc_user_cnt, rpc_user_time.
-- Added the following fields to struct job_info:
reboot, sched_nodes
-- Added the following fields to struct node_info:
gres_drain and gres_used
core_spec_cnt, cpu_spec_list, mem_spec_limit
-- Added the following fields to struct slurm_ctl_conf:
chos_loc, mem_limit_enforce, priority_params
requeue_exit, requeue_exit_hold, route_plugin
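
As a minimal illustration of the new job_info fields, the hedged C sketch
below (assuming Slurm's C API headers and linking with -lslurm) prints the
scheduled nodes reported for each pending job; error handling is kept to a
minimum:

    #include <stdio.h>
    #include <slurm/slurm.h>

    int main(void)
    {
        job_info_msg_t *jobs = NULL;
        uint32_t i;

        /* Load all job records from slurmctld. */
        if (slurm_load_jobs((time_t) 0, &jobs, SHOW_ALL) != SLURM_SUCCESS) {
            slurm_perror("slurm_load_jobs");
            return 1;
        }
        for (i = 0; i < jobs->record_count; i++) {
            slurm_job_info_t *job = &jobs->job_array[i];
            /* sched_nodes (new in 14.11) lists the nodes the backfill
             * scheduler expects to allocate to a pending job. */
            if (job->sched_nodes)
                printf("job %u: expected nodes %s\n",
                       job->job_id, job->sched_nodes);
        }
        slurm_free_job_info_msg(jobs);
        return 0;
    }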
Changed the following enums and #defines
========================================
-- Added #define DEBUG_FLAG_ROUTE to list of debug flags.
Added the following API's
=========================
Change the following API's
===========================
DBD API Changes
===============
Changed members of the following structs
========================================
Added the following struct definitions
======================================
Added the following enums and #defines
========================================
Added the following API's
=========================