Changelog¶
Listed here are the changes between each release of SmartSim and SmartRedis.
SmartSim¶
0.4.2¶
Released on April 12, 2023
Description
This release of SmartSim focused on polishing and extending existing features already provided by SmartSim. Most notably, this release adds support for colocating models with an orchestrator over Unix domain sockets and for launching models as batch jobs.
Additionally, SmartSim has updated its toolchain to provide a better user experience. Notably, SmartSim can now be used with Python 3.10, Redis 7.0.5, and RedisAI 1.2.7. Furthermore, SmartSim now utilizes SmartRedis’s aggregation lists to streamline the use and extension of ML data loaders, making working with popular machine learning frameworks in SmartSim a breeze.
A full list of changes and detailed notes can be found below:
Add support for colocating an orchestrator over UDS
Add support for Python 3.10, deprecate support for Python 3.7 and RedisAI 1.2.3
Drop support for Ray
Update ML data loaders to make use of SmartRedis’s aggregation lists
Allow for models to be launched independently as batch jobs
Update Redis to version 7.0.5
Add support for RedisAI 1.2.7, PyTorch 1.11.0, TensorFlow 2.8.0, ONNXRuntime 1.11.1
Fix bug in colocated database entrypoint when loading PyTorch models
Fix test suite behavior with environment variables
Detailed Notes
Running some tests could result in some SmartSim-specific environment variables being set. Such environment variables are now reset after each test execution. Also, a warning about environment variable usage in Slurm was added to make the user aware when an environment variable will not be assigned the desired value with --export. (PR270)
The PyTorch and TensorFlow data loaders were updated to make use of aggregation lists. This breaks their API, but makes them easier to use. (PR264)
The support for Ray was dropped, as its most recent versions caused problems when deployed through SmartSim. We plan to release a separate add-on library to accomplish the same results. If you are interested in getting the Ray launch functionality back in your workflow, please get in touch with us! (PR263)
Update from Redis version 6.0.8 to 7.0.5. (PR258)
Adds support for Python 3.10 without the ONNX machine learning backend. Deprecates support for Python 3.7 as it will stop receiving security updates. Deprecates support for RedisAI 1.2.3. Updates the build process to correctly fetch supported dependencies. If a user attempts to build an unsupported dependency, an error message is shown highlighting the discrepancy. (PR256)
Models were given a batch_settings attribute. When launching a model through Experiment.start, the Experiment first checks for a non-nullish value at that attribute. If the check is satisfied, the Experiment attempts to wrap the underlying run command in a batch job, using the object referenced at Model.batch_settings as the batch settings for the job. If the check is not satisfied, the Model is launched in the traditional manner as a job step. (PR245)
Fix bug in colocated database entrypoint stemming from uninitialized variables. This bug affects PyTorch models being loaded into the database. (PR237)
The release of RedisAI 1.2.7 allows us to update support for recent versions of PyTorch, TensorFlow, and ONNX. (PR234)
Make installation of the correct Torch backend more reliable by following the installation instructions from PyTorch.
In addition to TCP, add UDS support for colocating an orchestrator with models. Methods Model.colocate_db_tcp and Model.colocate_db_uds were added to expose this functionality. The Model.colocate_db method remains and uses TCP for backward compatibility. (PR246)
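The batch_settings check described in the notes above (PR245) can be sketched as follows. This is a minimal illustration, not SmartSim's implementation: Model and BatchSettings below are hypothetical stand-ins rather than the real SmartSim classes, plan_launch is an invented helper, and "non-nullish" is assumed to mean "is not None".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BatchSettings:
    """Hypothetical stand-in for a batch settings object."""
    batch_cmd: str  # illustrative only, e.g. the scheduler's submit command


@dataclass
class Model:
    """Hypothetical stand-in for a model with an optional batch_settings."""
    name: str
    run_cmd: List[str]
    batch_settings: Optional[BatchSettings] = None


def plan_launch(model: Model) -> Tuple[str, Optional[str], List[str]]:
    """Decide how a model would be launched (invented helper).

    Returns (kind, batch_cmd, run_cmd), where kind is "batch_job" when
    batch_settings is set and "job_step" otherwise.
    """
    if model.batch_settings is not None:
        # Attribute is set: the run command would be wrapped in a batch job.
        return ("batch_job", model.batch_settings.batch_cmd, model.run_cmd)
    # Attribute is unset: the model is launched as a plain job step.
    return ("job_step", None, model.run_cmd)


plain = Model("sim", ["srun", "./sim"])
batched = Model("sim", ["srun", "./sim"], BatchSettings("sbatch"))
print(plan_launch(plain)[0])    # job_step
print(plan_launch(batched)[0])  # batch_job
```

The point of the sketch is only that the presence of the attribute selects the launch path; how the batch job is actually assembled is not covered by this changelog.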
0.4.1¶
Released on June 24, 2022
Description: This release of SmartSim introduces a new experimental feature to help make SmartSim workflows more portable: the ability to run simulation models in a container via Singularity. This feature has been tested on a small number of platforms and we encourage users to provide feedback on its use.
We have also made improvements in a variety of areas: new utilities to load scripts and machine learning models into the database directly from SmartSim driver scripts, and an install-time choice to use either KeyDB or Redis for the Orchestrator. The RunSettings API is now more consistent across subclasses. Another key focus of this release was to aid new SmartSim users by including more extensive tutorials and improving the documentation. The Docker image containing the SmartSim tutorials now also includes a tutorial on online training.
Launcher improvements
New methods for specifying RunSettings parameters (SmartSim-PR166) (SmartSim-PR170)
Better support for mpirun, mpiexec, and orterun as launchers (SmartSim-PR186)
Experimental: add support for running models via Singularity (SmartSim-PR204)
Documentation and tutorials
Tutorial updates (SmartSim-PR155) (SmartSim-PR203) (SmartSim-PR208)
Add SmartSim Zoo info to documentation (SmartSim-PR175)
New tutorial for demonstrating online training (SmartSim-PR176) (SmartSim-PR188)
General improvements and bug fixes
Set models and scripts at the driver level (SmartSim-PR185)
Optionally use KeyDB for the orchestrator (SmartSim-PR180)
Ability to specify system-level libraries (SmartSim-PR154) (SmartSim-PR182)
Fix the handling of LSF gpus_per_shard (SmartSim-PR164)
Fix error when re-running smart build (SmartSim-PR165)
Fix generator hanging when tagged configuration variables are missing (SmartSim-PR177)
Dependency updates
CMake version from 3.10 to 3.13 (SmartSim-PR152)
Update click to 8.0.2 (SmartSim-PR200)
0.4.0¶
Released on Feb 11, 2022
Description: In this release SmartSim continues to promote ease of use. To this end SmartSim has introduced new portability features that allow users to abstract away their targeted hardware, while providing even more compatibility with existing libraries.
A new feature, co-located orchestrator deployment, has been added. It provides scalable online inference capabilities that overcome previous performance limitations in separated orchestrator/application deployments. For more information on the advantages of co-located deployments, see the Orchestrator section of the SmartSim documentation.
The SmartSim build was significantly improved to increase customization of the build toolchain, and the smart command line interface was expanded.
Additional tweaks and upgrades have also been made to ensure an optimal experience. Here is a comprehensive list of changes made in SmartSim 0.4.0.
Orchestrator Enhancements:
Add Orchestrator Co-location (SmartSim-PR139)
Add Orchestrator configuration file edit methods (SmartSim-PR109)
Emphasize Driver Script Portability:
Add ability to create run settings through an experiment (SmartSim-PR110)
Add ability to create batch settings through an experiment (SmartSim-PR112)
Add automatic launcher detection to experiment portability functions (SmartSim-PR120)
Expand Machine Learning Library Support:
Data loaders for online training in Keras/TF and PyTorch (SmartSim-PR115) (SmartSim-PR140)
ML backend versions updated with expanded support for multiple versions (SmartSim-PR122)
Launch Ray internally using RunSettings (SmartSim-PR118)
Add Ray cluster setup and deployment to SmartSim (SmartSim-PR50)
Expand Launcher Setting Options:
Add ability to use base RunSettings on Slurm, PBS, or Cobalt launchers (SmartSim-PR90)
Add ability to use base RunSettings on LSF launcher (SmartSim-PR108)
Deprecations and Breaking Changes
Orchestrator classes combined into single implementation for portability (SmartSim-PR139)
smartsim.constants changed to smartsim.status (SmartSim-PR122)
smartsim.tf migrated to smartsim.ml.tf (SmartSim-PR115) (SmartSim-PR140)
TOML configuration option removed in favor of environment variable approach (SmartSim-PR122)
General Improvements and Bug Fixes:
Improve and extend parameter handling (SmartSim-PR107) (SmartSim-PR119)
Abstract away non-user facing implementation details (SmartSim-PR122)
Add various dimensions to the CI build matrix for SmartSim testing (SmartSim-PR130)
Add missing functions to LSFSettings API (SmartSim-PR113)
Add RedisAI checker for installed backends (SmartSim-PR137)
Remove heavy and unnecessary dependencies (SmartSim-PR116) (SmartSim-PR132)
Fix LSFLauncher and LSFOrchestrator (SmartSim-PR86)
Fix overly greedy Workload Manager Parsers (SmartSim-PR95)
Fix Slurm handling of comma-separated env vars (SmartSim-PR104)
Fix internal method calls (SmartSim-PR138)
Documentation Updates:
Updates to documentation build process (SmartSim-PR133) (SmartSim-PR143)
Updates to documentation content (SmartSim-PR96) (SmartSim-PR129) (SmartSim-PR136) (SmartSim-PR141)
Update SmartSim Examples (SmartSim-PR68) (SmartSim-PR100)
0.3.2¶
Released on August 10, 2021
Description:
Upgraded RedisAI backend to 1.2.3 (SmartSim-PR69)
PyTorch 1.7.1, TF 2.4.2, and ONNX 1.6-7 (SmartSim-PR69)
LSF launcher for IBM machines (SmartSim-PR62)
Improved code coverage by adding more unit tests (SmartSim-PR53)
Orchestrator methods to get address and check status (SmartSim-PR60)
Added Manifest object that tracks deployables in Experiments (SmartSim-PR61)
Bug fixes (SmartSim-PR52) (SmartSim-PR58) (SmartSim-PR67) (SmartSim-PR73)
Updated documentation and examples (SmartSim-PR51) (SmartSim-PR57) (SmartSim-PR71)
Improved IP address acquisition (SmartSim-PR72)
Binding database to network interfaces
0.3.1¶
Released on May 5, 2021
Description:
This release was dedicated to making the install process easier. SmartSim can be installed from PyPI now and the smart cli tool makes installing the machine learning runtimes much easier.
Pip install (SmartSim-PR42)
smart cli tool for ML backends (SmartSim-PR42)
Build Documentation for updated install (SmartSim-PR43)
Migrate from Jenkins to GitHub Actions CI (SmartSim-PR42)
Bug fix for setup.cfg (SmartSim-PR35)
0.3.0¶
Released on April 1, 2021
Description:
Initial 0.3.0 (first public) release of SmartSim
SmartRedis¶
0.4.0¶
Released on April 12, 2023
Description
This release provides a variety of features to improve usability and debugging of the SmartRedis library, notably including Unix domain socket support, logging, the ability to print a textual representation of a client or dataset, dataset inspection, documentation updates, fixes to the multi-GPU support, and much more:
Prepare 0.4.0 release
Disable codecov CI tests
Improved error message in to_string methods in C interface
Streamlined PyBind interface layer
Updated Python API documentation
Streamlined C interface layer
Improved performance of get, put, and copy dataset methods
Fix a bug which prevented multi-GPU model set in some cases
Streamline pipelined execution of tasks for backend database
Enhance code coverage to include all 4 languages supported by SmartRedis
Fix a bug which resulted in wrong key prefixing when retrieving aggregation lists in ensembles
Correct assorted API documentation errors
Improve documentation of exception handling in Redis server classes
Improve error handling for setting of scripts and models
Add support to inspect the dimensions of a tensor via get_tensor_dims()
Split dataset prefixing control from use_tensor_ensemble_prefix() to use_dataset_ensemble_prefix()
Update to the latest version of redis-plus-plus
Update to the latest version of PyBind
Change documentation theme to sphinx_book_theme and fix doc strings
Add print capability for Client and DataSet
Add support for inspection of tensors and metadata inside datasets
Add support for user-directed logging for Python clients, using Client, Dataset, or LogContext logging methods
Add support for user-directed logging for C and Fortran clients without a Client or Dataset context
Additional error reporting for connections to and commands run against Redis databases
Improved error reporting capabilities for Fortran clients
Python error messages from SmartRedis contain more information
Added logging functionality to the SmartRedis library
A bug related to thread pool initialization was fixed.
This version adds new functionality in the form of support for Unix Domain Sockets.
Fortran client can now be optionally built with the rest of the library
Initial support for dataset conversions, specifically Xarray.
Detailed Notes
Update docs and version numbers in preparation for version 0.4.0. Clean up duplicate marking of numpy dependency (PR321)
Remove codecov thresholds to avoid commits being marked as ‘failed’ due to coverage variance (PR317)
Corrected the error message in to_string methods in C interface to not overwrite the returned error message and to name the function (PR320)
Streamlined PyBind interface layer to reduce repetitive boilerplate code (PR315)
Updated Python API summary table to include new methods (PR313)
Streamlined C interface layer to reduce repetitive boilerplate code (PR312)
Leveraged Redis pipelining to improve performance of get, put, and copy dataset methods (PR311)
Redis::set_model_multigpu() will now upload the correct model to all GPUs (PR310)
RedisCluster::_run_pipeline() will no longer unconditionally apply a retry wait before returning (PR309)
Expand code coverage to all four languages and make the CI/CD more efficient (PR308)
An internal flag was set incorrectly, which resulted in wrong key prefixing when accessing (retrieving or querying) lists created in ensembles (PR306)
Corrected a variety of Doxygen errors and omissions in the API documentation (PR305)
Added throw documentation for exception handling in redis.h, redisserver.h, rediscluster.h (PR301)
Added error handling for a rare edge condition when setting scripts and models (PR300)
Added support to inspect the dimensions of a tensor via new get_tensor_dims() method (PR299)
The use_tensor_ensemble_prefix() API method no longer controls whether datasets are prefixed. A new API method, use_dataset_ensemble_prefix() now manages this. (PR298)
Updated from redis-plus-plus v1.3.2 to v1.3.5 (PR296)
Updated from PyBind v2.6.2 to v2.10.3 (PR295)
Change documentation theme to sphinx_book_theme to match SmartSim documentation theme and fix Python API doc string errors (PR294)
Added print capability for Client and DataSet to give detailed diagnostic information for debugging (PR293)
Added support for retrieval of names and types of tensors and metadata inside datasets (PR291)
Added support for user-directed logging for Python clients via {Client, Dataset, LogContext}.{log_data, log_warning, log_error} methods (PR289)
Added support for user-directed logging without a Client or Dataset context to C and Fortran clients via _string() methods (PR288)
Added logging to capture transient errors that arise in the _run() and _connect() methods of the Redis and RedisCluster classes (PR287)
Tweak direct testing of Redis and RedisCluster classes (PR286)
Resolve a disparity in the construction of Python client and database classes (PR285)
Fortran clients can now access error text and source location (PR284)
Add exception location information from CPP code to Python exceptions (PR283)
Added client activity and manual logging for developer use (PR281)
Fix thread pool error (PR280)
Update library linking instructions and update Fortran tester build process (PR277)
Added add_metadata_for_xarray and transform_to_xarray methods in DatasetConverter class for initial support with Xarray (PR262)
Change Dockerfile to use Ubuntu 20.04 LTS image (PR276)
Implemented support for Unix Domain Sockets, including refactoring of server address code, test cases, and check-in tests. (PR252)
A new make target make lib-with-fortran now compiles the Fortran client and dataset into its own library which applications can link against (PR245)
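The retry-wait fix noted above for RedisCluster::_run_pipeline() (PR309) follows a general pattern: wait only when another attempt will actually follow, and return immediately on success. The changelog does not show the SmartRedis code, so the following is a generic Python sketch of that pattern under stated assumptions; run_with_retries and all of its parameters are hypothetical names, not part of SmartRedis.

```python
import time
from typing import Callable


def run_with_retries(
    task: Callable[[], bool],
    max_attempts: int = 3,
    retry_wait_s: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run task until it reports success or attempts are exhausted.

    The wait is applied only between a failed attempt and the next one,
    never after a success and never after the final attempt.
    """
    for attempt in range(1, max_attempts + 1):
        if task():
            return True  # success: return immediately, no wait
        if attempt < max_attempts:
            sleep(retry_wait_s)  # wait only when another attempt follows
    return False


# Usage: record waits instead of sleeping to see when they occur.
waits = []
outcomes = iter([False, True])
print(run_with_retries(lambda: next(outcomes), sleep=waits.append))  # True
print(len(waits))  # 1
```

Passing the sleep function as a parameter is a design choice made here so the waiting behavior can be observed without slowing anything down.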
0.3.1¶
Released on June 24, 2022
Description
Version 0.3.1 adds new functionality in the form of DataSet aggregation lists for pipelined retrieval of data, convenient support for multiple GPUs, and the ability to delete scripts and models from the backend database. It also introduces multithreaded execution for certain tasks that span multiple shards of a clustered database, and it incorporates a variety of internal improvements that will enhance the library going forward.
Detailed Notes
Implemented DataSet aggregation lists in all client languages, for pipelined retrieval of data across clustered and non-clustered backend databases. (PR258) (PR257) (PR256) (PR248) New commands are:
append_to_list()
delete_list()
copy_list()
rename_list()
get_list_length()
poll_list_length()
poll_list_length_gte()
poll_list_length_lte()
get_datasets_from_list()
get_dataset_list_range()
use_list_ensemble_prefix()
Implemented multithreaded execution for parallel dataset list retrieval on clustered databases. The number of threads devoted to this purpose is controlled by the new environment variable SR_THREAD_COUNT. The value defaults to 4, but may be any positive integer or the special value zero, which will cause the SmartRedis runtime to allocate one thread for each available hardware context. (PR251) (PR246)
Augmented support for GPUs by implementing multi-GPU convenience functions for all client languages. (PR254) (PR250) (PR244) New commands are:
set_model_from_file_multigpu()
set_model_multigpu()
set_script_from_file_multigpu()
set_script_multigpu()
run_model_multigpu()
run_script_multigpu()
delete_model_multigpu()
delete_script_multigpu()
Added API calls for all clients to delete models and scripts from the backend database. (PR240) New commands are:
delete_script()
delete_model()
Updated the use of backend RedisAI API calls to discontinue use of deprecated methods for model selection (AI.MODELSET) and execution (AI.MODELRUN) in favor of current methods AI.MODELSTORE and AI.MODELEXECUTE, respectively. (PR234)
SmartRedis will no longer call the C runtime method srand() to ensure that it does not interfere with random number generation in client code. It now uses a separate instance of the C++ random number generator. (PR233)
Updated the way that the Fortran enum_kind type defined in the fortran_c_interop module is defined in order to better comply with Fortran standard and not interfere with GCC 6.3.0. (PR231)
Corrected the spelling of the word “command” in a few error message strings. (PR221)
SmartRedis now requires CMake version 3.13 or later in order to utilize the add_link_options CMake command. (PR217)
Updated and improved the documentation of the SmartRedis library. In particular, a new SmartRedis Integration Guide provides an introduction to using the SmartRedis library and integrating it with existing software. (PR261) (PR260) (PR259) (SSPR214)
Added clustered Redis testing to automated GitHub check-in testing. (PR239)
Updated the SmartRedis internal API for building commands for the backend database. (PR223) This change should not be visible to clients.
The SmartRedis example code is now validated through the automated GitHub check-in process. This will help ensure that the examples do not fall out of date. (PR220)
Added missing copyright statements to CMakeLists.txt and the SmartRedis examples. (PR219)
Updated the C++ test coverage to ensure that all test files are properly executed when running “make test”. (PR218)
Fixed an internal naming conflict between a local variable and a class member variable in the DataSet class. (PR215) This should not be visible to clients.
Updated the internal documentation of methods in SmartRedis C++ classes with the override keyword to improve compliance with the latest C++ standards. (PR214) This change should not be visible to clients.
Renamed variables internally to more cleanly differentiate between names that are given to clients for tensors, models, scripts, datasets, etc., and the keys that are used when storing them in the backend database. (PR213) This change should not be visible to clients.
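The thread-count rule described in the notes above (default of 4, any positive integer, or zero for one thread per available hardware context) can be illustrated with a short sketch. This is not the SmartRedis implementation: resolve_thread_count is a hypothetical helper, the variable name is assumed to be spelled SR_THREAD_COUNT, os.cpu_count() is used as a stand-in for "available hardware contexts", and the fallback to the default for negative or unparsable values is an assumption, since the changelog does not say how those are handled.

```python
import os
from typing import Mapping, Optional

DEFAULT_THREAD_COUNT = 4  # default stated in the changelog note


def resolve_thread_count(env: Optional[Mapping[str, str]] = None) -> int:
    """Resolve a worker thread count from SR_THREAD_COUNT (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("SR_THREAD_COUNT")
    if raw is None:
        return DEFAULT_THREAD_COUNT  # variable not set: use the default
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unparsable values fall back to the default.
        return DEFAULT_THREAD_COUNT
    if value == 0:
        # Special value zero: one thread per available hardware context.
        return os.cpu_count() or 1
    if value < 0:
        # Assumption: negative values fall back to the default.
        return DEFAULT_THREAD_COUNT
    return value


print(resolve_thread_count({}))                        # 4
print(resolve_thread_count({"SR_THREAD_COUNT": "8"}))  # 8
```

Taking the environment as a mapping argument keeps the helper easy to exercise without modifying the real process environment.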
0.3.0¶
Released on February 11, 2022
Description
Improve error handling across all SmartRedis clients (PR159) (PR191) (PR199) (PR205) (PR206)
Includes changes to C and Fortran function prototypes that are not backwards compatible
Includes changes to error class names and enum type names that are not backwards compatible
Add poll_dataset functionality to all SmartRedis clients (PR184)
Due to other breaking changes made in this release, applications using methods other than poll_dataset to check for the existence of a dataset should now use poll_dataset
Add environment variables to control client connection and command timeout behavior (PR194)
Add AI.INFO command to retrieve statistics on scripts and models via Python and C++ clients (PR197)
Create a Dockerfile for SmartRedis (PR180)
Update redis-plus-plus version to 1.3.2 (PR162)
Internal client performance and API improvements (PR138) (PR141) (PR163) (PR203)
Expose Redis FLUSHDB, CONFIG GET, CONFIG SET, and SAVE commands to the Python client (PR139) (PR160)
Extend inverse CRC16 prefixing to all hash slots (PR161)
Improve backend dataset representation to enable performance optimization (PR195)
Simplify SmartRedis build process (PR189)
Fix zero-length array transfer in Fortran convert_char_array_to_c (PR170)
Add continuous integration for all SmartRedis tests (PR165) (PR173) (PR177)
Update SmartRedis documentation and examples (PR202) (PR208) (PR210)
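Polling for a dataset, as poll_dataset does, generally means checking for existence at a fixed interval for a bounded number of tries. The sketch below illustrates that pattern only. It does not call SmartRedis: poll_until is a hypothetical helper, the parameter names poll_frequency_ms and num_tries are illustrative assumptions rather than a documented signature, and a plain dict stands in for the backend database.

```python
import time
from typing import Callable


def poll_until(exists: Callable[[], bool], poll_frequency_ms: int, num_tries: int) -> bool:
    """Check exists() up to num_tries times, pausing between checks.

    Returns True as soon as exists() is true, or False once every
    try has been used.
    """
    for attempt in range(num_tries):
        if exists():
            return True
        if attempt < num_tries - 1:
            # Pause only if another check will follow.
            time.sleep(poll_frequency_ms / 1000.0)
    return False


# Usage with a dict standing in for the database.
store = {}
print(poll_until(lambda: "my_dataset" in store, poll_frequency_ms=1, num_tries=3))  # False
store["my_dataset"] = object()
print(poll_until(lambda: "my_dataset" in store, poll_frequency_ms=1, num_tries=3))  # True
```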
0.2.0¶
Released on August 5, 2021
Description
Improved tensor memory management in the Python client (PR70)
Improved metadata serialization and removed protobuf dependency (PR61)
Added unit testing infrastructure for the C++ client (PR96)
Improve command execution fault handling (PR65) (PR97) (PR105)
Added copy, rename, and delete tensor and DataSet commands in the Python client (PR66)
Upgrade to RedisAI 1.2.3 (PR101)
Fortran and C interface improvements (PR93) (PR94) (PR95) (PR99)
Add Redis INFO command execution to the Python client (PR83)
Add Redis CLUSTER INFO command execution to the Python client (PR105)
0.1.1¶
Released on May 5, 2021
Description
0.1.0¶
Released on April 1, 2021
Description
Initial 0.1.0 release of SmartRedis