Changelog

Listed here are the changes between each release of SmartSim, SmartRedis and SmartDashboard.

SmartSim

0.8.0

Released on 27 September, 2024

Description

  • Add instructions for Frontier to set the MIOPEN cache

  • Refine Frontier documentation for proper use of miniforge3

  • Refactor the RedisAI build to allow more flexibility in versions and sources of ML backends

  • Add Dockerfiles with GPU support

  • Fine-grained build support for GPUs

  • Update Torch to 2.1.0, Tensorflow to 2.15.0

  • Better error messages in build process

  • Allow specifying Model and Ensemble parameters with number-like types (e.g. numpy types)

  • Pin watchdog to 4.x

  • Update codecov to 4.5.0

  • Remove build of Redis from setup.py

  • Mitigate dependency installation issues

  • Fix internal host name representation for Dragon backend

  • Make dependencies more discoverable in setup.py

  • Add hardware pinning capability when using dragon

  • Pin NumPy version to 1.x

  • New launcher support for SGE (and similar derivatives)

  • Fix test outputs being created in incorrect directory

  • Improve support for building SmartSim without ML backends

  • Update packaging dependency

  • Remove broken oss.redis.com URI blocking documentation generation

Detailed Notes

  • On Frontier, the MIOPEN cache may need to be set prior to using RedisAI in smart validate. The instructions for Frontier have been updated accordingly. (SmartSim-PR727)

  • On Frontier, the recommended way to activate conda environments is to go through source activate. This also means that conda init is not needed. The instructions for Frontier have been updated to reflect this. (SmartSim-PR719)

  • The RedisAIBuilder class was completely overhauled to allow users to express a wider range of support for hardware/software stacks. This will be extended to support ROCm, CUDA-11, and CUDA-12. (SmartSim-PR669)

  • Versions for each of these packages are no longer specified in an internal class. Instead a default set of JSON files specifies the sources and versions. Users can specify their own custom specifications at smart build time. (SmartSim-PR669)

  • Because all build configuration has been moved to static files and all backends are compiled during smart build, SmartSim can now be shipped as a pure python wheel. (SmartSim-PR728)

  • Two new Dockerfiles are now provided (one each for CUDA 11.8 and 12.1) that can be used to build a container to run the tutorials. No HPC support should be expected at this time. (SmartSim-PR669)

  • As a result of the previous change, SmartSim now requires C++17 and a minimum CUDA version of 11.8 in order to build Torch 2.1.0. (SmartSim-PR669)

  • Error messages were not being interpolated correctly. This has been addressed to provide more context when exposing error messages to users. (SmartSim-PR669)

  • The serializer would fail if a parameter for a Model or Ensemble was specified as a numpy dtype. The constructors for these classes now validate that the input is number-like and convert it to a string. (SmartSim-PR676)

  • Pin watchdog to 4.x because v5 introduces new types and requires updates to the type-checking (SmartSim-PR690)

  • Update codecov to 4.5.0 to mitigate GitHub action failure (SmartSim-PR657)

  • The builder module was included in setup.py to allow us to ship the main Redis binaries (not RedisAI) with installs from PyPI. To ease maintenance of this file and leave room for future complexity, this has been removed. The Redis binaries will thus be built by users during the smart build step.

  • Installation of mypy or dragon in separate build actions caused some dependencies (typing_extensions, numpy) to be upgraded, which caused runtime failures. The build actions were tweaked so that pip considers all optional dependencies during resolution. Additionally, the numpy version was capped on dragon installations. (SmartSim-PR653)

  • setup.py used to define dependencies in a way that was not amenable to code scanning tools. Direct dependencies now appear directly in the setup call and the definition of the SmartRedis version has been removed (SmartSim-PR635)

  • The separate definition of dependencies for the docs in requirements-doc.txt is now defined as an extra. (SmartSim-PR635)

  • The new major version release of Numpy is incompatible with modules compiled against Numpy 1.x. For both SmartSim and SmartRedis we request a 1.x version of numpy. This is needed in SmartSim because some of the downstream dependencies request NumPy (SmartSim-PR623)

  • SGE is now a supported launcher for SmartSim. Users can now define BatchSettings which will be monitored by the TaskManager. Additionally, if the MPI implementation was built with SGE support, Orchestrators can use mpirun without needing to specify the hosts (SmartSim-PR610)

  • Ensure outputs from tests are written to temporary tests/test_output directory

  • Fix an error that would prevent smart build from moving a successfully compiled RedisAI shared object to the install location expected by SmartSim if no ML backend installations were found. Previously, this would effectively require users to build and install an ML backend to use the SmartSim orchestrator even if it was not necessary for their workflow. Users can install SmartSim without ML backends by running smart build --no_tf --no_pt and the RedisAI shared object will now be placed in the expected location. (SmartSim-PR601)

  • Fix packaging failures due to deprecated pkg_resources. (SmartSim-PR598)
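The number-like parameter handling described in the notes above (SmartSim-PR676) can be illustrated with a minimal sketch. This is not SmartSim's implementation: the helper names `stringify_params` and `_to_str` are hypothetical, and the check relies on the standard library's `numbers.Number` abstract base class. NumPy scalar types register themselves with these abstract base classes, so they should pass the same check, although NumPy is not imported here.

```python
import numbers
from fractions import Fraction


def _to_str(value):
    """Convert a single parameter value to a string (hypothetical helper)."""
    if isinstance(value, str):
        return value
    if isinstance(value, numbers.Number):
        # Covers int, float, Fraction, and any type registered as a Number
        return str(value)
    raise TypeError(
        f"Expected a string or number-like value, got {type(value).__name__}"
    )


def stringify_params(params):
    """Return a new dict in which every value (or list item) is a string."""
    converted = {}
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            converted[name] = [_to_str(item) for item in value]
        else:
            converted[name] = _to_str(value)
    return converted


if __name__ == "__main__":
    print(stringify_params({"steps": 10, "dt": [0.5, Fraction(1, 4)]}))
```

Validating early, in the constructor, means a bad value fails at the point where the user supplied it rather than later during serialization.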

0.7.0

Released on 14 May, 2024

Description

  • Update tutorials and tutorial containers

  • Improve Dragon server shutdown

  • Add dragon runtime installer

  • Add launcher based on Dragon

  • Reuse Orchestrators within the testing suite to improve performance.

  • Fix building of documentation

  • Preview entities on experiment before start

  • Update authentication in release workflow

  • Auto-generate type-hints into documentation

  • Auto-post release PR to develop

  • Bump manifest.json to version 0.0.4

  • Fix symlinking batch ensemble and model bug

  • Fix noisy failing WLM test

  • Remove defensive regexp in .gitignore

  • Upgrade ubuntu to 22.04

  • Remove helper function init_default

  • Fix telemetry monitor logging errors for task history

  • Change default path for entities

  • Drop Python 3.8 support

  • Update watchdog dependency

  • Historical output files stored under .smartsim directory

  • Fix unfalsifiable test of SmartSim’s custom SIGINT signal handler

  • Add option to build Torch backend without the Intel Math Kernel Library

  • Fix ReadTheDocs build issue

  • Disallow uninitialized variable use

  • Promote device options to an Enum

  • Update telemetry monitor, add telemetry collectors

  • Add method to specify node features for a Slurm job

  • Colo Orchestrator setup now blocks application start until setup finished

  • Refactor areas of the code where mypy reports potential errors

  • Minor enhancements to test suite

  • ExecArgs handling correction

  • ReadTheDocs config file added and enabled on PRs

  • Enforce changelog updates

  • Fix Jupyter notebook math expressions

  • Remove deprecated SmartSim modules

  • SmartSim Documentation refactor

  • Promote SmartSim statuses to a dedicated type

  • Update the version of Redis from 7.0.4 to 7.2.4

  • Increase disk space in doc builder container

  • Update Experiment API typing

  • Prevent duplicate entity names

  • Fix publishing of development docs

Detailed Notes

  • The tutorials are up to date with the SmartSim and SmartRedis APIs. Additionally, the tutorial containers’ Dockerfiles are updated. (SmartSim-PR589)

  • The Dragon server will now terminate any process that is still running when an immediate shutdown request is sent. (SmartSim-PR582)

  • Add --dragon option to smart build. Install appropriate Dragon runtime from Dragon GitHub release assets. (SmartSim-PR580)

  • Add new launcher, based on Dragon. The new launcher is compatible with the Slurm and PBS schedulers and can be selected by specifying launcher="dragon" when creating an Experiment, or by using DragonRunSettings to launch a job. The Dragon launcher is at an early stage of development: early adopters are referred to the dedicated documentation section to learn more about it. (SmartSim-PR580)

  • Tests may now request a given configuration and will reconnect to the existing orchestrator instead of building up and tearing down a new one each test. (SmartSim-PR567)

  • Manually ensure that typing_extensions==4.6.1 in the Dockerfile used to build docs. This fixes the deploy_dev_docs GitHub action. (SmartSim-PR564)

  • Added preview functionality to Experiment, including preview of all entities, active infrastructure and client configuration. (SmartSim-PR525)

  • Replace the developer created token with the GH_TOKEN environment variable. (SmartSim-PR570)

  • Add extension to auto-generate function type-hints into documentation. (SmartSim-PR561)

  • Extend the GitHub release workflow to auto-generate a pull request from master into develop for each release. (SmartSim-PR566)

  • The manifest.json version needs to match the SmartDashboard version, which is 0.0.4 in the upcoming release. (SmartSim-PR563)

  • Properly symlinks batch ensembles and batch models. (SmartSim-PR547)

  • Remove defensive regexp in .gitignore and ensure tests write to test_output. (SmartSim-PR560)

  • After dropping support for Python 3.8, ubuntu needs to be upgraded. (SmartSim-PR558)

  • Remove helper function init_default and replace with traditional type narrowing. (SmartSim-PR545)

  • Ensure the telemetry monitor does not track a task_id for a managed task. (SmartSim-PR557)

  • The default path for an entity is now the path to the experiment / the entity name. create_database and create_ensemble now have path arguments. All path arguments are compatible with relative paths. Relative paths are relative to the CWD. (SmartSim-PR533)

  • Python 3.8 is reaching its end-of-life in October, 2024, so it will no longer continue to be supported. (SmartSim-PR544)

  • Update watchdog dependency from 3.x to 4.x, fix new type issues (SmartSim-PR540)

  • The dashboard needs to display historical logs, so log files are written out under the .smartsim directory and files under the experiment directory are symlinked to them. (SmartSim-PR532)

  • Add the options --torch_with_mkl/--no_torch_with_mkl to smart build to prevent Torch from trying to link in the Intel Math Kernel Library. This is needed because on machines that have the Intel compilers installed, Torch will unconditionally try to link in this library but fails because the linking flags are incorrect. (SmartSim-PR538)

  • Change typing_extensions and pydantic versions in readthedocs environment to enable docs build. (SmartSim-PR537)

  • Promote devices to a dedicated Enum type throughout the SmartSim code base. (SmartSim-PR527)

  • Update the telemetry monitor to enable retrieval of metrics on a scheduled interval. Switch basic experiment tracking telemetry to default to on. Add database metric collectors. Improve telemetry monitor logging. Create telemetry subpackage at smartsim._core.utils.telemetry. Refactor telemetry monitor entrypoint. (SmartSim-PR460)

  • Users can now specify node features for a Slurm job through SrunSettings.set_node_feature. The method accepts a string or list of strings. (SmartSim-PR529)

  • The request to the colocated entrypoints file within the shell script is now a blocking process. Once the Orchestrator is set up, it returns, which moves the process to the background and allows the application to start. This prevents the application from requesting an ML model or script that has not yet been uploaded to the Orchestrator. (SmartSim-PR522)

  • Add checks and tests to ensure SmartSim users cannot initialize run settings with a list of lists as the exe_args argument. (SmartSim-PR517)

  • Add readthedocs configuration file and enable readthedocs builds on pull requests. Additionally added robots.txt file generation when readthedocs environment detected. (SmartSim-PR512)

  • Add Github Actions workflow that checks if changelog is edited on pull requests into develop. (SmartSim-PR518)

  • Add the path to the MathJax.js file that Sphinx uses to render math expressions. (SmartSim-PR516)

  • Removed deprecated SmartSim modules: slurm and mpirunSettings. (SmartSim-PR514)

  • Implemented a new structure for the SmartSim documentation. Added example images and further detail on SmartSim components. (SmartSim-PR463)

  • Promote SmartSim statuses to a dedicated type named SmartSimStatus. (SmartSim-PR509)

  • Update Redis version to 7.2.4. This change fixes an issue in the Redis build scripts causing failures on Apple Silicon hosts. (SmartSim-PR507)

  • The container which builds the documentation for every merge to develop was failing due to a lack of space within the container. This was fixed by including an additional Github action that removes some unneeded software and files that come from the default Github Ubuntu container. (SmartSim-PR504)

  • Update the generic t.Any typehints in Experiment API. (SmartSim-PR501)

  • The CI will fail static analysis if common erroneous truthy checks are detected. (SmartSim-PR524)

  • Prevent the launch of duplicate named entities. Allow completed entities to run. (SmartSim-PR480)

  • The CI will fail static analysis if a local variable is used while potentially undefined. (SmartSim-PR521)

  • Remove previously deprecated behavior present in test suite on machines with Slurm and Open MPI. (SmartSim-PR520)

  • Experiments in the WLM tests are given explicit paths to prevent unexpected directory creation. Ensure databases are not left open on test suite failures. Update path to pickle file in tests/full_wlm/test_generic_orc_launch_batch.py::test_launch_cluster_orc_reconnect to conform with changes made in (SmartSim-PR533). (SmartSim-PR559)

  • When calling Experiment.start SmartSim would register a signal handler that would capture an interrupt signal (^C) to kill any jobs launched through its JobManager. This would replace the default (or user defined) signal handler. SmartSim will now attempt to kill any launched jobs before calling the previously registered signal handler. (SmartSim-PR535)
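The signal-handler chaining described in the last note (SmartSim-PR535) can be sketched as follows. This is an illustration under stated assumptions, not SmartSim's code: `make_chained_handler` and `install_sigint_cleanup` are hypothetical names, and `cleanup` stands in for whatever kills the launched jobs.

```python
import signal


def make_chained_handler(cleanup, previous):
    """Build a handler that runs `cleanup`, then defers to `previous`.

    `previous` is whatever signal.getsignal returned; it is only called
    when it is a Python callable (not SIG_DFL, SIG_IGN, or None).
    """

    def handler(signum, frame):
        cleanup()  # e.g. kill any jobs that were launched
        if callable(previous):
            previous(signum, frame)

    return handler


def install_sigint_cleanup(cleanup):
    """Register the chained handler for SIGINT (main thread only)."""
    previous = signal.getsignal(signal.SIGINT)
    handler = make_chained_handler(cleanup, previous)
    signal.signal(signal.SIGINT, handler)
    return handler


if __name__ == "__main__":
    install_sigint_cleanup(lambda: print("cleaning up launched jobs"))
    print("Press Ctrl-C to trigger the chained handler")
    signal.pause() if hasattr(signal, "pause") else input()
```

Because Python's default SIGINT handler is itself a callable, chaining to it still raises KeyboardInterrupt after the cleanup has run.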

0.6.2

Released on 16 February, 2024

Description

  • Patch SmartSim dependency version

Detailed Notes

  • A critical performance concern was identified and addressed in SmartRedis. A patch fix was deployed, and SmartSim was updated to ensure users do not inadvertently pull the unpatched version of SmartRedis. (SmartSim-PR493)

0.6.1

Released on 15 February, 2024

Description

  • Duplicate for DBModel/Script prevented

  • Update license to include 2024

  • Telemetry monitor is now active by default

  • Add support for Mac OSX on Apple Silicon

  • Remove Torch warnings during testing

  • Validate Slurm timing format

  • Expose Python Typehints

  • Fix test_logs to prevent generation of directory

  • Fix Python Typehint for colocated database settings

  • Python 3.11 Support

  • Quality of life smart validate improvements

  • Remove Cobalt support

  • Enrich logging through context variables

  • Upgrade Machine Learning dependencies

  • Override sphinx-tabs background color

  • Add concurrency group to test workflow

  • Fix index when installing torch through smart build

Detailed Notes

  • Modify the git clone for both Redis and RedisAI to set the line endings to unix-style line endings when using MacOS on ARM. (SmartSim-PR482)

  • Separate install instructions are now provided for Mac OSX on x64 vs ARM64 (SmartSim-PR479)

  • Prevent duplicate ML model and script names from being added to an Ensemble member if the names already exist. (SmartSim-PR475)

  • Updates "Copyright (c) 2021-2023" to "Copyright (c) 2021-2024" in all of the necessary files. (SmartSim-PR485)

  • Fix a bug that prevented the expected behavior when the SMARTSIM_LOG_LEVEL environment variable was set to developer. (SmartSim-PR473)

  • Sets the default value of the “enable telemetry” flag to on. Bumps the output manifest.json version number to match that of smartdashboard and pins a watchdog version to avoid build errors. (SmartSim-PR477)

  • Refactor logic of Manifest.has_db_objects to remove excess branching and improve readability/maintainability. (SmartSim-PR476)

  • SmartSim can now be built and used on platforms using Apple Silicon (ARM64). Currently, only the PyTorch backend is supported. Note that libtorch will be downloaded from a CrayLabs github repo. (SmartSim-PR465)

  • Tests that were saving Torch models were emitting warnings. These warnings were addressed by updating the model save test function. (SmartSim-PR472)

  • Validate the timing format when requesting a slurm allocation. (SmartSim-PR471)

  • Add and ship py.typed marker to expose inline type hints. Fix type errors related to SmartRedis. (SmartSim-PR468)

  • Fix the test_logs.py::test_context_leak test that was erroneously creating a directory named "some value" in SmartSim’s root directory. (SmartSim-PR467)

  • Add Python type hinting to colocated settings. (SmartSim-PR462)

  • Add github actions for running black and isort checks. (SmartSim-PR464)

  • Relax the required version of typing_extensions. (SmartSim-PR459)

  • Addition of Python 3.11 to SmartSim. (SmartSim-PR461)

  • Quality of life smart validate improvements, such as setting the CUDA_VISIBLE_DEVICES environment variable within smart validate prior to importing any ML deps to prevent false negatives on multi-GPU systems. Additionally, move SmartRedis logs from standard out to a dedicated log file in the validation temporary directory and suppress an sklearn deprecation warning by pinning a KMeans constructor argument. Lastly, move the TF test to last, as TF may reserve the GPUs it uses. (SmartSim-PR458)

  • Some actions in the current GitHub CI/CD workflows were outdated. They were replaced with the latest versions. (SmartSim-PR446)

  • As the Cobalt workload manager is not used on any system we are aware of, its support in SmartSim was terminated and classes such as CobaltLauncher have been removed. (SmartSim-PR448)

  • Experiment logs are written to a file that can be read by the dashboard. (SmartSim-PR452)

  • Updated SmartSim’s machine learning backends to PyTorch 2.0.1, Tensorflow 2.13.1, ONNX 1.14.1, and ONNX Runtime 1.16.1. As a result of this change, there is now an available ONNX wheel for use with Python 3.10, and wheels for all of SmartSim’s machine learning backends with Python 3.11. (SmartSim-PR451) (SmartSim-PR461)

  • The sphinx-tabs documentation extension uses a white background for the tabs component. A custom CSS for those components to inherit the overall theme color has been added. (SmartSim-PR453)

  • Add concurrency groups to GitHub’s CI/CD workflows, preventing multiple workflows from the same PR from being launched concurrently. (SmartSim-PR439)

  • Torch changed their preferred indexing when trying to install their provided wheels. Updated the pip install command within smart build to ensure that the appropriate packages can be found. (SmartSim-PR449)
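As an illustration of the Slurm timing-format validation mentioned in the notes above (SmartSim-PR471), here is a minimal sketch. The changelog does not state which format is enforced; this example assumes an hours:minutes:seconds string, and the function name `validate_walltime` is hypothetical.

```python
import re

# Assumed format: one or more hour digits, then two-digit minutes and seconds
_WALLTIME = re.compile(r"(\d+):([0-5]\d):([0-5]\d)")


def validate_walltime(value):
    """Return the walltime in seconds, or raise ValueError if malformed."""
    match = _WALLTIME.fullmatch(value)
    if match is None:
        raise ValueError(f"Invalid walltime {value!r}; expected HH:MM:SS")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    print(validate_walltime("01:30:00"))
```

Rejecting a malformed string before the allocation request is submitted gives the user an immediate, local error instead of a failure reported by the scheduler.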

0.6.0

Released on 18 December, 2023

Description

  • Conflicting directives in the SmartSim packaging instructions were fixed

  • sacct and sstat errors are now fatal for Slurm-based workflow executions

  • Added documentation section about ML features and TorchScript

  • Added TorchScript functions to Online Analysis tutorial

  • Added multi-DB example to documentation

  • Improved test stability on HPC systems

  • Added support for producing & consuming telemetry outputs

  • Split tests into groups for parallel execution in CI/CD pipeline

  • Change signature of Experiment.summary()

  • Expose first_device parameter for scripts, functions, models

  • Added support for MINBATCHTIMEOUT in model execution

  • Remove support for RedisAI 1.2.5, use RedisAI 1.2.7 commit

  • Add support for multiple databases

Detailed Notes

  • Several conflicting directives between the setup.py and the setup.cfg were fixed to mitigate warnings issued when building the pip wheel. (SmartSim-PR435)

  • When the Slurm functions sacct and sstat returned an error, it would be ignored and SmartSim’s state could become inconsistent. To prevent this, errors raised by sacct or sstat now result in an exception. (SmartSim-PR392)

  • A section named ML Features was added to documentation. It contains multiple examples of how ML models and functions can be added to and executed on the DB. TorchScript-based post-processing was added to the Online Analysis tutorial (SmartSim-PR411)

  • An example of how to use multiple Orchestrators concurrently was added to the documentation (SmartSim-PR409)

  • The test infrastructure was improved. Tests on HPC systems are now stable, and issues such as non-stopped Orchestrators or experiments created in the wrong paths have been fixed (SmartSim-PR381)

  • A telemetry monitor was added to check updates and produce events for SmartDashboard (SmartSim-PR426)

  • Split tests into group_a, group_b, slow_tests for parallel execution in CI/CD pipeline (SmartSim-PR417, SmartSim-PR424)

  • Rename the format argument of Experiment.summary() to style; this is an API break (SmartSim-PR391)

  • Added support for first_device parameter for scripts, functions, and models. This causes them to be loaded to the first num_devices beginning with first_device (SmartSim-PR394)

  • Added support for MINBATCHTIMEOUT in model execution, which caps the delay waiting for a minimum number of model execution operations to accumulate before executing them as a batch (SmartSim-PR387)

  • RedisAI 1.2.5 is no longer supported. The only supported RedisAI version is now 1.2.7. Since the officially released RedisAI 1.2.7 has a bug which breaks the build process on Mac OSX, it was decided to use commit 634916c from RedisAI’s GitHub repository, where this bug has been fixed. This applies to all operating systems. (SmartSim-PR383)

  • Add support for creation of multiple databases with unique identifiers. (SmartSim-PR342)
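The first_device behavior described in the notes above (SmartSim-PR394) amounts to selecting a contiguous range of device indices. A minimal sketch, assuming a "GPU:<index>" naming convention for devices and using a hypothetical helper name `device_names`:

```python
def device_names(device, num_devices=1, first_device=0):
    """List the devices an object would be loaded onto.

    Selects `num_devices` consecutive devices, beginning with index
    `first_device`. The "NAME:index" string format is an assumption
    made for this illustration.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    if first_device < 0:
        raise ValueError("first_device must be non-negative")
    prefix = device.upper()
    return [
        f"{prefix}:{index}"
        for index in range(first_device, first_device + num_devices)
    ]


if __name__ == "__main__":
    # Two devices starting at index 1
    print(device_names("gpu", num_devices=2, first_device=1))
```

With first_device left at 0, the selection reduces to the first num_devices devices.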

0.5.1

Released on 14 September, 2023

Description

  • Add typehints throughout the SmartSim codebase

  • Provide support for Slurm heterogeneous jobs

  • Provide better support for PalsMpiexecSettings

  • Allow for easier inspection of SmartSim entities

  • Log ignored error messages from sacct

  • Fix colocated db preparation bug when using JsrunSettings

  • Fix bug when users specify CPU and devices greater than 1

  • Fix bug when get_allocation called with reserved keywords

  • Enabled mypy in CI for better type safety

  • Mitigate additional suppressed pylint errors

  • Update linting support and apply to existing errors

  • Various improvements to the smart CLI

  • Various documentation improvements

  • Various test suite improvements

Detailed Notes

  • Add methods to allow users to inspect files attached to models and ensembles. (SmartSim-PR352)

  • Add a smart info target to provide rudimentary information about the SmartSim installation. (SmartSim-PR350)

  • Remove unnecessary generation producing unexpected directories in the test suite. (SmartSim-PR349)

  • Add support for heterogeneous jobs to SrunSettings by allowing users to set the --het-group parameter. (SmartSim-PR346)

  • Provide clearer guidelines on how to contribute to SmartSim. (SmartSim-PR344)

  • Integrate PalsMpiexecSettings into the Experiment factory methods when using the "pals" launcher. (SmartSim-PR343)

  • Create public properties where appropriate to mitigate protected-access errors. (SmartSim-PR341)

  • Fix a failure to execute _prep_colocated_db due to an incorrectly named attribute check. (SmartSim-PR339)

  • Enabled and mitigated mypy disallow_any_generics and warn_return_any. (SmartSim-PR338)

  • Add a smart validate target to provide a simple smoke test to assess a SmartSim build. (SmartSim-PR336, SmartSim-PR351)

  • Add typehints to smartsim._core.launcher.step.*. (SmartSim-PR334)

  • Log errors reported from slurm WLM when attempts to retrieve status fail. (SmartSim-PR331, SmartSim-PR332)

  • Fix incorrectly formatted positional arguments in log format strings. (SmartSim-PR330)

  • Ensure that launchers pass environment variables to unmanaged job steps. (SmartSim-PR329)

  • Add additional tests surrounding the RAI_PATH configuration environment variable. (SmartSim-PR328)

  • Remove unnecessary execution of unescaped shell commands. (SmartSim-PR327)

  • Add error if user calls get_allocation with reserved keywords in slurm get_allocation. (SmartSim-PR325)

  • Add error when user requests CPU with devices greater than 1 within add_ml_model and add_script. (SmartSim-PR324)

  • Update documentation surrounding ensemble key prefixing. (SmartSim-PR322)

  • Fix formatting of the Frontier site installation. (SmartSim-PR321)

  • Update pylint dependency, update .pylintrc, mitigate non-breaking issues, suppress api breaks. (SmartSim-PR311)

  • Refactor the smart CLI to use subparsers for better documentation and extension. (SmartSim-PR308)
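The subparser-based CLI structure mentioned in the last note can be sketched with the standard library's argparse. This is only an illustration of the pattern, not the actual smart CLI: the subcommand names come from this changelog, while the `--verbose` option and the help strings are hypothetical.

```python
import argparse


def build_parser():
    """Build a CLI in which each subcommand owns its own parser and help."""
    parser = argparse.ArgumentParser(
        prog="smart", description="Illustrative CLI using subparsers"
    )
    subparsers = parser.add_subparsers(dest="command", required=True)

    build = subparsers.add_parser("build", help="Build dependencies")
    # Hypothetical option, shown only to demonstrate per-command arguments
    build.add_argument("--verbose", action="store_true", help="Verbose output")

    subparsers.add_parser("validate", help="Run a smoke test of the install")
    subparsers.add_parser("info", help="Show installation information")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Selected command: {args.command}")
```

Each subcommand gets its own help text and argument namespace, so adding a new command does not require touching the others.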

0.5.0

Released on 6 July, 2023

Description

A full list of changes and detailed notes can be found below:

  • Update SmartRedis dependency to v0.4.1

  • Fix tests for db models and scripts

  • Fix add_ml_model() and add_script() documentation, tests, and code

  • Remove requirements.txt and other places where dependencies were defined

  • Replace limit_app_cpus with limit_db_cpus for co-located orchestrators

  • Remove wait time associated with Experiment launch summary

  • Update and rename Redis conf file

  • Migrate from redis-py-cluster to redis-py

  • Update full test suite to not require a TF wheel at test time

  • Update doc strings

  • Remove deprecated code

  • Relax the coloredlogs version

  • Update Fortran tutorials for SmartRedis

  • Add support for multiple network interface binding in Orchestrator and Colocated DBs

  • Add typehints and static analysis

Detailed Notes

  • Updates SmartRedis to the most current release (SmartSim-PR316)

  • Fixes and enhancements to documentation (SmartSim-PR317, SmartSim-PR314, SmartSim-PR287)

  • Various fixes and enhancements to the test suite (SmartSim-PR315, SmartSim-PR312, SmartSim-PR310, SmartSim-PR302, SmartSim-PR283)

  • Fix a defect in the tests related to database models and scripts that was causing key collisions when testing on workload managers (SmartSim-PR313)

  • Remove requirements.txt and other places where dependencies were defined. (SmartSim-PR307)

  • Fix defect where dictionaries used to create run settings can be changed unexpectedly due to copy-by-ref (SmartSim-PR305)

  • The underlying code for Model.add_ml_model() and Model.add_script() was fixed to correctly handle multi-GPU configurations. Tests were updated to run on non-local launchers. Documentation was updated and fixed. Also, the default testing interface has been changed to lo instead of ipogif. (SmartSim-PR304)

  • Typehints have been added. A makefile target, make check-mypy, executes static analysis with mypy. (SmartSim-PR295, SmartSim-PR301, SmartSim-PR303)

  • Replace limit_app_cpus with limit_db_cpus for co-located orchestrators. This resolves some incorrect behavior/assumptions about how the application would be pinned. Instead, users should directly specify the binding options in their application using the options appropriate for their launcher (SmartSim-PR306)

  • Simplify code in random_permutations parameter generation strategy (SmartSim-PR300)

  • Remove wait time associated with Experiment launch summary (SmartSim-PR298)

  • Update Redis conf file to conform with Redis v7.0.5 conf file (SmartSim-PR293)

  • Migrate from redis-py-cluster to redis-py for cluster status checks (SmartSim-PR292)

  • Update full test suite to no longer require a tensorflow wheel to be available at test time. (SmartSim-PR291)

  • Correct spelling of colocated in doc strings (SmartSim-PR290)

  • Deprecated launcher-specific orchestrators, constants, and ML utilities were removed. (SmartSim-PR289)

  • Relax the coloredlogs version to be greater than 10.0 (SmartSim-PR288)

  • Update the GitHub Actions runner image from macos-10.15 to macos-12. The former began deprecation in May 2022 and was finally removed in May 2023. (SmartSim-PR285)

  • The Fortran tutorials had not been fully updated to show how to handle return/error codes. These have now all been updated. (SmartSim-PR284)

  • Orchestrator and Colocated DB now accept a list of interfaces to bind to. The argument name is still interface for backward compatibility reasons. (SmartSim-PR281)

  • Typehints have been added to public APIs. A makefile target to execute static analysis with mypy is available: make check-mypy. (SmartSim-PR295)
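The copy-by-reference defect noted above for run settings dictionaries (SmartSim-PR305) is a common Python pitfall. The changelog does not say how it was fixed; one conventional remedy, shown here as a sketch with a hypothetical `RunSettingsSketch` class, is to copy caller-supplied dictionaries on construction.

```python
import copy


class RunSettingsSketch:
    """Hypothetical settings holder that owns private copies of its inputs."""

    def __init__(self, exe, run_args=None, env_vars=None):
        self.exe = exe
        # Deep copies ensure later mutation here never reaches the caller's
        # dictionaries, and that two instances never share state
        self.run_args = copy.deepcopy(run_args) if run_args else {}
        self.env_vars = copy.deepcopy(env_vars) if env_vars else {}


if __name__ == "__main__":
    shared = {"nodes": 1}
    first = RunSettingsSketch("echo", run_args=shared)
    second = RunSettingsSketch("echo", run_args=shared)
    first.run_args["nodes"] = 4
    print(shared, second.run_args)
```

Without the copies, both instances would hold a reference to the same dictionary, and changing one would silently change the other.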

0.4.2

Released on April 12, 2023

Description

This release of SmartSim focused on polishing and extending existing features already provided by SmartSim. Most notably, this release provides support to allow users to colocate their models with an orchestrator using Unix domain sockets and support for launching models as batch jobs.

Additionally, SmartSim has updated its tool chains to provide a better user experience. Notably, SmartSim can now be used with Python 3.10, Redis 7.0.5, and RedisAI 1.2.7. Furthermore, SmartSim now utilizes SmartRedis’s aggregation lists to streamline the use and extension of ML data loaders, making working with popular machine learning frameworks in SmartSim a breeze.

A full list of changes and detailed notes can be found below:

  • Add support for colocating an orchestrator over UDS

  • Add support for Python 3.10, deprecate support for Python 3.7 and RedisAI 1.2.3

  • Drop support for Ray

  • Update ML data loaders to make use of SmartRedis’s aggregation lists

  • Allow for models to be launched independently as batch jobs

  • Update Redis to version 7.0.5

  • Add support for RedisAI 1.2.7, PyTorch 1.11.0, Tensorflow 2.8.0, ONNXRuntime 1.11.1

  • Fix bug in colocated database entrypoint when loading PyTorch models

  • Fix test suite behavior with environment variables

Detailed Notes

  • Running some tests could result in some SmartSim-specific environment variables being set. Such environment variables are now reset after each test execution. Also, a warning for environment variable usage in Slurm was added, to make the user aware in case an environment variable will not be assigned the desired value with --export. (SmartSim-PR270)

  • The PyTorch and TensorFlow data loaders were updated to make use of aggregation lists. This breaks their API, but makes them easier to use. (SmartSim-PR264)

  • The support for Ray was dropped, as its most recent versions caused problems when deployed through SmartSim. We plan to release a separate add-on library to accomplish the same results. If you are interested in getting the Ray launch functionality back in your workflow, please get in touch with us! (SmartSim-PR263)

  • Update from Redis version 6.0.8 to 7.0.5. (SmartSim-PR258)

  • Adds support for Python 3.10 without the ONNX machine learning backend. Deprecates support for Python 3.7 as it will stop receiving security updates. Deprecates support for RedisAI 1.2.3. Update the build process to be able to correctly fetch supported dependencies. If a user attempts to build an unsupported dependency, an error message is shown highlighting the discrepancy. (SmartSim-PR256)

  • Models were given a [batch_settings]{.title-ref} attribute. When launching a model through [Experiment.start]{.title-ref} the [Experiment]{.title-ref} will first check for a non-nullish value at that attribute. If the check is satisfied, the [Experiment]{.title-ref} will attempt to wrap the underlying run command in a batch job using the object referenced at [Model.batch_settings]{.title-ref} as the batch settings for the job. If the check is not satisfied, the [Model]{.title-ref} is launched in the traditional manner as a job step. (SmartSim-PR245)

  • Fix bug in colocated database entrypoint stemming from uninitialized variables. This bug affects PyTorch models being loaded into the database. (SmartSim-PR237)

  • The release of RedisAI 1.2.7 allows us to update support for recent versions of PyTorch, Tensorflow, and ONNX (SmartSim-PR234)

  • Make installation of the correct Torch backend more reliable, following the instructions from PyTorch

  • In addition to TCP, add UDS support for colocating an orchestrator with models. Methods [Model.colocate_db_tcp]{.title-ref} and [Model.colocate_db_uds]{.title-ref} were added to expose this functionality. The [Model.colocate_db]{.title-ref} method remains and uses TCP for backward compatibility (SmartSim-PR246)
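
The batch-settings check described in the notes above can be sketched as follows. This is only an illustration of the decision logic, not SmartSim's implementation: `FakeModel` and `choose_launch_mode` are hypothetical names introduced here, and "non-nullish" is assumed to mean "not `None`".

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeModel:
    """Hypothetical stand-in for a SmartSim Model (not the real class)."""
    name: str
    run_settings: Any
    batch_settings: Optional[Any] = None


def choose_launch_mode(model: FakeModel) -> str:
    """Return how the model would be launched under the rule described above."""
    # Assumption: a "non-nullish" attribute is one that is not None.
    if model.batch_settings is not None:
        # The run command would be wrapped in a batch job.
        return "batch job"
    # Otherwise the model is launched the traditional way.
    return "job step"


# Placeholder settings objects; real code would use SmartSim settings classes.
step_model = FakeModel("sim", run_settings={"exe": "python"})
batch_model = FakeModel("sim_batch", run_settings={"exe": "python"},
                        batch_settings={"nodes": 1})

print(choose_launch_mode(step_model))
print(choose_launch_mode(batch_model))
```

In a real driver script the same effect is obtained simply by setting or omitting the model's batch settings; no explicit mode selection is required from the user.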

0.4.1#

Released on June 24, 2022

Description: This release of SmartSim introduces a new experimental feature to help make SmartSim workflows more portable: the ability to run simulation models in a container via Singularity. This feature has been tested on a small number of platforms and we encourage users to provide feedback on its use.

We have also made improvements in a variety of areas: new utilities to load scripts and machine learning models into the database directly from SmartSim driver scripts and install-time choice to use either [KeyDB]{.title-ref} or [Redis]{.title-ref} for the Orchestrator. The [RunSettings]{.title-ref} API is now more consistent across subclasses. Another key focus of this release was to aid new SmartSim users by including more extensive tutorials and improving the documentation. The docker image containing the SmartSim tutorials now also includes a tutorial on online training.

Launcher improvements

  • New methods for specifying [RunSettings]{.title-ref} parameters (SmartSim-PR166) (SmartSim-PR170)

  • Better support for [mpirun]{.title-ref}, [mpiexec]{.title-ref}, and [orterun]{.title-ref} as launchers (SmartSim-PR186)

  • Experimental: add support for running models via Singularity (SmartSim-PR204)

Documentation and tutorials

General improvements and bug fixes

Dependency updates

0.4.0#

Released on Feb 11, 2022

Description: In this release SmartSim continues to promote ease of use. To this end SmartSim has introduced new portability features that allow users to abstract away their targeted hardware, while providing even more compatibility with existing libraries.

A new feature, co-located orchestrator deployments, has been added. It provides scalable online inference capabilities that overcome previous performance limitations in separated orchestrator/application deployments. For more information on advantages of co-located deployments, see the Orchestrator section of the SmartSim documentation.

The SmartSim build was significantly improved to increase customization of the build toolchain, and the smart command line interface was expanded.

Additional tweaks and upgrades have also been made to ensure an optimal experience. Here is a comprehensive list of changes made in SmartSim 0.4.0.

Orchestrator Enhancements:

Emphasize Driver Script Portability:

  • Add ability to create run settings through an experiment (SmartSim-PR110)

  • Add ability to create batch settings through an experiment (SmartSim-PR112)

  • Add automatic launcher detection to experiment portability functions (SmartSim-PR120)

Expand Machine Learning Library Support:

Expand Launcher Setting Options:

  • Add ability to use base RunSettings on Slurm or PBS launchers (SmartSim-PR90)

  • Add ability to use base RunSettings on LSF launcher (SmartSim-PR108)

Deprecations and Breaking Changes

General Improvements and Bug Fixes:

Documentation Updates:

0.3.2#

Released on August 10, 2021

Description:

0.3.1#

Released on May 5, 2021

Description: This release was dedicated to making the install process easier. SmartSim can be installed from PyPI now and the smart cli tool makes installing the machine learning runtimes much easier.

0.3.0#

Released on April 1, 2021

Description:

  • initial 0.3.0 (first public) release of SmartSim


SmartRedis#

0.6.1#

Released on 27 September, 2024

Description

  • Fix a memory leak in the Fortran Dataset implementation

Detailed Notes

  • The dataset object, if used in a loop, would leave memory dangling. To alleviate this, a final procedure has been implemented. Fortran compilers, however, are notoriously bad at detecting when an object goes out of scope and destroying it automatically. We thus also provide an explicit destructor procedure. (PR514)

0.6.0#

Released on 25 September, 2024

Description

  • Fix instructions for including SmartRedis as an ExternalProject in CMake-based projects

  • Include algorithm import in rediscluster for gcc-14 and updated github artifact version

  • Touch-up outdated information in README.md

  • Update codecov to v4.5.0 for github actions

  • Remove broken oss.redis.com URLs from documentation

  • Add option to allow SmartRedis Fortran library to retain the path to the main client library

  • Update examples and tests to use find_package(smartredis)

  • Generate config files necessary to allow CMake projects to add SmartRedis via find_package

  • Allow users to specify install location of SmartRedis libraries

  • Streamline compilation of SmartRedis dependencies

  • Pin NumPy version to 1.x

Detailed Notes

  • Instructions for including SmartRedis as a CMake ExternalProject had a couple of missing closing parentheses and a typo in the definition of the libsmartredis-fortran block (PR503)

  • Include algorithm import in rediscluster.h to satisfy gcc-14 compilation error. (PR505)

  • Update github actions to upload-artifact@v3 and download-artifact@v3 (PR505) (PR511) (PR512)

  • Update links to install documentation and remove outdated version numbers in the README.md (PR501)

  • Update codecov to v4.5.0 for github actions (PR502)

  • As part of this cleanup, some behaviors of how the libraries were named have been removed. The testing suite now distinguishes between various build types (e.g. Debug, Coverage, etc.) by specifying the CMAKE_INSTALL_PREFIX instead of appending it as part of the name of the library itself. (PR497)

  • The SmartRedis Fortran library now by default will retain the path to the SmartRedis C/C++ library. This should avoid occasional problems where users were getting “library not found” errors if they had moved libraries post-installation (PR497)

  • All the examples and tests now use the find_package functionality to setup linking flags (PR497)

  • The install process now generates package configuration files for the C/C++ SmartRedis library and the Fortran SmartRedis library. Users can use the find_package() command in their CMakeLists.txt to setup the linking and include flags automatically (PR497)

  • The CMakeLists.txt for SmartRedis now includes the install commands which allow users to specify the specific install prefix to install the SmartRedis libraries, header files, and Fortran .mod files (PR497)

  • hiredis, redis++, and pybind are now retrieved and installed in CMakeLists.txt instead of in the Makefile. This decouples the user-facing side of SmartRedis from the Makefile, which now can be used purely as a convenient interface to compile SmartRedis with various options and coordinate testing (PR497)

  • The new major version release of Numpy is incompatible with modules compiled against Numpy 1.x. For both SmartSim and SmartRedis we request a 1.x version of numpy. This is needed in SmartSim because some of the downstream dependencies request NumPy. (PR498)

  • Ensure errors raised from client include details

0.5.3#

Released on 14 May 2024

Description

  • Improve client error logging

  • Fix pylint regression error

  • Fix build wheel error

  • Fix header styling issue

  • Correct changelog indentation

  • Automate the creation of release notes

  • Auto-post release PR to develop from master

  • Upgrade ubuntu to 22.04 and gcc to 11

  • Drop Python 3.8 support

  • Fix C++ cosmetic defects leading to compiler warnings

  • Re-enable SR_PEDANTIC for the Makefile targets

  • Enforce changelog updates

  • Removed unused TensorBase constructor parameter

  • Remove unused parameter in internal redis cluster method

  • Enforce matching TensorType for DataSet::unpack_tensor()

  • Update CI for Intel suite

  • Add socket time out environment variable

  • Fix inconsistency in C-API ConfigOptions is_configured() parameters

  • Bump redis dep to 7.2.4

  • Fix widths field for list-table in install documentation

  • Remove a vestigial requirements.txt file

Detailed Notes

  • Ensure errors raised from client include details (PR485)

  • Pin pylint to fix regression error (PR492)

  • Add cstdint import to fix ubuntu with gcc wheel build (PR491)

  • Incorrect lineup of the changelog page index. This fixes the header sizes to avoid this issue. (PR489)

  • After converting from rst to md, readthedocs began throwing indentation errors in old release info. This fixes the styling. (PR488)

  • Add a configuration file to the root of .github/ to configure the generated release notes. (PR487)

  • Add to github release workflow to auto generate a pull request from master into develop for release. (PR486)

  • After dropping support for Python 3.8, ubuntu and gcc need to be upgraded. (PR484)

  • Python 3.8 is reaching its end-of-life in October, 2024, so it will no longer continue to be supported. (PR482)

  • Fixes some mainly cosmetic defects in the C++ client that were leading to warnings when pedantic compiler flags were enabled (PR476)

  • Re-enable SR_PEDANTIC for the [test-lib]{.title-ref} and [test-lib-with-fortran]{.title-ref} Makefile targets (PR476)

  • Add Github Actions workflow that checks if changelog is edited on pull requests into develop. (PR480)

  • The TensorBase constructor SRMemoryLayout parameter was removed because it was not used. It is not needed as a member variable because all Tensor<T> objects store internal representations in contiguous memory. (PR479)

  • Client::unpack_tensor() enforces that the user-provided TensorType matches the known tensor type. Now DataSet::unpack_tensor() enforces the same condition. (PR478)

  • Removes an unused parameter in the RedisCluster::_get_model_script_db() method. (PR477)

  • Version numbers changed for the Intel Compiler chain, which led to the C and C++ compilers not being available. Now, the entirety of the Base and HPC kits are installed to ensure consistent versions. (PR475)

  • Add the socket timeout parameter as a user-configurable option via environment variables. (PR474)

  • Fix an inconsistency in the C-API ConfigOptions is_configured() parameter names. (PR471)

  • Fix an issue where incorrect compiler flags are defined and result in build failures due to the redis_fstat macro. (PR470)

  • Fix wrong widths value which was preventing table from displaying. (PR468)

  • The requirements.txt file is unused, therefore removing. (PR462)

0.5.2#

Released on February 16, 2024

Description

  • Fixed bug which was sending tensors to the database twice (Python Client)

Detailed Notes

  • A previous bug fix for the Python client which addressed a problem when sending numpy views inadvertently kept the original put_tensor call in place. This essentially doubles the cost of the operation. (PR464)

0.5.1#

Released on February 15, 2024

Description

  • Fix bug when sending an array view

  • Add concurrency groups for Github Action testing

  • Update license to include 2024

  • Increase build space for Github Actions

  • Update README python versions

  • Expose Typehints

  • Update supported python versions [Add 3.11, remove 3.7]

  • Tweak the build system to enable building SmartRedis with Nvidia’s NVHPC toolchain

  • Improvements/upgrades to the container used for Github actions

  • Code updates to avoid compiler warnings

  • Added developer documentation on how to run a single test case and eliminated duplicative environment variables

  • Resolve a linting issue with pybind-to-python error propagation

  • Use mutable fields to enable Dataset get methods that store memory to be marked const

Detailed Notes

  • Detect whether the tensor the user is sending is a view and if so, make an explicit copy. (PR453)

  • Add support for concurrency groups in the [run_tests]{.title-ref} workflow. (PR456)

  • Update license to include 2024. (PR454)

  • Add new Github Action that removes unneeded packages and resizes the root disk space. (PR455)

  • Update developer documentation to reflect newly supported versions of Python (PR450) (PR452)

  • Add and ship [py.typed]{.title-ref} marker to expose inline type hints (PR451)

  • Deprecate support for Python 3.7 by removing from the allowed Python versions (PR450)

  • Update Python package dependencies to add support for Python 3.11 (PR450)

  • Change the order of arguments in our Makefile to ensure that all dependencies are compiled with GCC (PR448)

  • Add new user-configurable parameters DEP_CC, DEP_CXX to control which compiler is used to build dependencies (PR448)

  • Ameliorate some compiler warnings that were flagged in GCC 12 (unreachable code blocks, signed/unsigned mismatches) (PR448)

  • CI/CD: Bump the container version used in Github Actions to Ubuntu 22.04 to be able to start testing GCC 12 (PR448)

  • CI/CD: Bump the versions of GCC used in testing to the currently maintained versions (PR448)

  • CI/CD: Add NVHPC to the testing matrix (PR448)

  • CI/CD: Test the shared/static compilations and examples with all compilers (PR448)

  • CI/CD: Compile Redis and RedisAI and use those versions in testing instead of extracting from a container (PR448)

  • CI/CD: Bump the version of Redis used in testing to 7.0.5, the same version as we use with SmartSim (PR448)

  • CI/CD: Pin the Torch version to 1.11.0, the same as supported in SmartSim (PR448)

  • Added developer documentation on how to run a single test case with the new test/build system and eliminated use of SMARTREDIS_TEST_DEVICE and SMARTREDIS_TEST_CLUSTER environment variables (PR445)

  • Resolve a linting issue with pybind-to-python error propagation by changing import format and narrowing the lookup of pybind error names to the error module (PR444)

  • Use mutable fields to enable Dataset get methods that store memory to be marked const (PR443)

0.5.0#

Released on December 18, 2023

Description

  • Unpin the Intel Fortran compiler in CI/CD

  • Added a missing space in an error message

  • Improved consistency of namespace declarations for C++ pybind interface

  • Improved const correctness of C++ Client

  • Improved const correctness of C++ Dataset

  • Updated documentation

  • Added test cases for all Client construction parameter combinations

  • Centralized dependency tracking to setup.cfg

  • Improved robustness of Python client construction

  • Updated Client and Dataset documentation

  • Expanded list of allowed characters in the SSDB address

  • Added coverage to SmartRedis Python API functions

  • Improved responsiveness of library when attempting connection to missing backend database

  • Moved testing of examples to on-commit testing in CI/CD pipeline

  • Added name retrieval function to the DataSet object

  • Updated RedisAI version used in post-commit check-in testing in Github pipeline

  • Allow strings in Python interface for Client.run_script, Client.run_script_multiGPU

  • Improved support for model execution batching

  • Added support for model chunking

  • Updated the third-party RedisAI component

  • Updated the third-party lcov component

  • Added link to contributing guidelines

  • Added support for multiple backend databases via a new Client constructor that accepts a ConfigOptions object

Detailed Notes

  • Unpin the Intel Fortran compiler in CI/CD. This requires running the compiler setup script twice, once for Fortran and once for other languages, since they’re on different releases (PR436)

  • Added a missing space in an error message (PR435)

  • Made the declaration of the py namespace in py*.h consistently outside the SmartRedis namespace declaration (PR434)

  • Fields in several C++ API methods are now properly marked as const (PR430)

  • The Dataset add_tensor method is now const correct, as are all the internal methods it calls (PR427)

  • Some broken links in the documentation were fixed, and the instructions to run the tests were updated (PR423)

  • Added test cases for all Client construction parameter combinations (PR422)

  • Merged dependency lists from requirements.txt and requirements-dev.txt into setup.cfg to have only one set of dependencies going forward (PR420)

  • Improved robustness of Python client construction by adding detection of invalid kwargs (PR419), (PR421)

  • Updated the Client and Dataset API documentation to clarify which interacts with the backend db (PR416)

  • The SSDB address can now include ‘-’ and ‘_’ as special characters in the name. This gives users more options for naming the UDS socket file (PR415)

  • Added tests to increase Python code coverage

  • Employed a Redis++ ConnectionsObject in the connection process to establish a TCP timeout of 100ms during connection attempts (PR413)

  • Moved testing of examples to on-commit testing in CI/CD pipeline (PR412)

  • Added a function to the DataSet class and added a test

  • Updated RedisAI version used in post-commit check-in testing in Github pipeline to a version that supports fetch of model chunking size (PR408)

  • Allow users to pass single keys for the inputs and outputs parameters as a string for Python run_script and run_script_multigpu

  • Exposed access to the Redis.AI MINBATCHTIMEOUT parameter, which limits the delay in model execution when trying to accumulate multiple executions in a batch (PR406)

  • Models will now be automatically chunked when sent to/received from the backend database. This allows use of models greater than 511MB in size. (PR404)

  • Updated from RedisAI v1.2.3 (test target)/v1.2.4 and v1.2.5 (CI/CD pipeline) to v1.2.7 (PR402)

  • Updated lcov from version 1.15 to 2.0 (PR396)

  • Create CONTRIBUTIONS.md file that points to the contribution guideline for both SmartSim and SmartRedis (PR395)

  • Migrated to ConfigOptions-based Client construction, adding multiple database support (PR353)
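
The idea behind the model chunking noted above can be illustrated with a short sketch. SmartRedis performs this internally; the helpers `chunk_bytes` and `reassemble` below are hypothetical names, and the 1024-byte chunk size is an arbitrary value chosen for the example rather than the size SmartRedis or RedisAI actually uses.

```python
from typing import Iterator, List


def chunk_bytes(blob: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of blob, each at most chunk_size bytes long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(blob), chunk_size):
        yield blob[start:start + chunk_size]


def reassemble(chunks: List[bytes]) -> bytes:
    """Concatenate chunks in order to recover the original payload."""
    return b"".join(chunks)


# A small stand-in for a serialized model (2560 bytes).
model_blob = bytes(range(256)) * 10

# Illustrative chunk size only; the real size is not taken from this changelog.
chunks = list(chunk_bytes(model_blob, chunk_size=1024))

print([len(c) for c in chunks])
print(reassemble(chunks) == model_blob)
```

Splitting a payload this way keeps each transfer below a per-message size limit while leaving the reassembled model byte-for-byte identical.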

0.4.2#

Released on September 13, 2023

Description

  • Reduced number of suppressed lint errors

  • Expanded documentation of aggregation lists

  • Updated third-party software dependencies to current versions

  • Updated post-merge tests in CI/CD to work with new test system

  • Enabled static builds of SmartRedis

  • Improve robustness of test runs

  • Fixed installation link

  • Updated supported languages documentation

  • Removed obsolete files

  • Added pylint to CI/CD pipeline and mitigate existing errors

  • Improved clustered redis initialization

Detailed Notes

  • Refactor factory for ConfigOptions to avoid using protected member outside an instance (PR393)

  • Added a new advanced topics documentation page with a section on aggregation lists (PR390)

  • Updated pybind (2.10.3 => 2.11.1), hiredis (1.1.0 => 1.2.0), and redis++ (1.3.5 => 1.3.10) dependencies to current versions (PR389)

  • Post-merge tests in CI/CD have been updated to interface cleanly with the new test system that was deployed in the previous release (PR388)

  • Static builds of SmartRedis can now work with Linux platforms. Fortran is tested with GNU, PGI, Intel compilers (PR386)

  • Preserve the shell output of test runs while making sure that server shutdown happens unconditionally (PR381)

  • Fix incorrect link to installation documentation (PR380)

  • Update language support matrix in documentation to reflect updates from the last release (PR379)

  • Fix typo causing startup failure in utility script for unit tests (PR378)

  • Update pylint configuration and version, mitigate most errors, execute in CI/CD pipeline (PR371, PR382)

  • Deleted obsolete build and testing files that are no longer needed with the new build and test system (PR366)

  • Reuse existing redis connection when mapping the Redis cluster (PR364)

0.4.1#

Released on July 5, 2023

Description

This release revamps the build and test systems for SmartRedis as well as improving compatibility with different Fortran compilers and laying the groundwork for future support for interacting with multiple concurrent backend databases:

  • Documentation improvements

  • Improved compatibility of type hints with third-party software

  • Added type hints to the Python interface layer

  • Add support for Python 3.10

  • Updated setup.py to work with the new build system

  • Remove unneeded method from Python SRObject class

  • Fixed a memory leak in the C layer

  • Revamp SmartRedis test system

  • Remove debug output in pybind layer

  • Update Hiredis version to 1.1.0

  • Enable parallel build for the SmartRedis examples

  • Experimental support for Nvidia toolchain

  • Major revamp of build and test systems for SmartRedis

  • Refactor Fortran methods to return default logical kind

  • Update CI/CD tests to use a modern version of MacOS

  • Fix the spelling of the Dataset destructor’s C interface (now DeallocateDataSet)

  • Update Redis++ version to 1.3.8

  • Refactor third-party software dependency installation

  • Add pip-install target to Makefile to automate this process going forward (note: this was later removed)

  • Added infrastructure for multiDB support

Detailed Notes

  • Assorted updates and clarifications to the documentation (PR367)

  • Turn [ParamSpec]{.title-ref} usage into forward references to not require [typing-extensions]{.title-ref} at runtime (PR365)

  • Added type hints to the Python interface layer (PR361)

  • List Python 3.10 support and loosen PyTorch requirement to allow for versions that support Python 3.10 (PR360)

  • Streamlined setup.py to simplify Python install (PR359)

  • Remove from_pybind() from Python SRObject class as it’s not needed and didn’t work properly anyway (PR358)

  • Fixed memory leaked from the C layer when calling get_string_option() (PR357)

  • Major revamp to simplify use of SmartRedis test system, automating most test processes (PR356)

  • Remove debug output in pybind layer associated with put_dataset (PR352)

  • Updated to the latest version of Hiredis (1.1.0) (PR351)

  • Enable parallel build for the SmartRedis examples by moving utility Fortran code into a small static library (PR349)

  • For the NVidia toolchain only: Replaces the assumed rank feature of F2018 used in the Fortran client with assumed shape arrays, making it possible to compile SmartRedis with the Nvidia toolchain. (PR346)

  • Rework the build and test system to improve maintainability of the library. There have been several significant changes, including that Python and Fortran clients are no longer built by default and that there are Make variables that customize the build process. Please review the build documentation and make help to see all that has changed. (PR341)

  • Many Fortran routines were returning logical kind = c_bool, which turns out not to be the default logical kind of most Fortran compilers. These have now been refactored so that users need not import [iso_c_binding]{.title-ref} in their own applications (PR340)

  • Update MacOS version in CI/CD tests from 10.15 to 12.0 (PR339)

  • Correct the spelling of the C DataSet destruction interface from DeallocateeDataSet to DeallocateDataSet (PR338)

  • Updated the version of Redis++ to v1.3.8 to pull in a change that ensures the redis++.pc file properly points to the generated libraries (PR334)

  • Third-party software dependency installation is now handled in the Makefile instead of separate scripts

  • New pip-install target in Makefile will be a dependency of the lib target going forward so that users don’t have to manually pip install SmartRedis in the future (PR330)

  • Added ConfigOptions class and API, which will form the backbone of multiDB support (PR303)

0.4.0#

Released on April 12, 2023

Description

This release provides a variety of features to improve usability and debugging of the SmartRedis library, notably including Unix domain socket support, logging, the ability to print a textual representation of a string or dataset, dataset inspection, documentation updates, fixes to the multi-GPU support, and much more:

  • Prepare 0.4.0 release

  • Disable codecov CI tests

  • Improved error message in to_string methods in C interface

  • Streamlined PyBind interface layer

  • Updated Python API documentation

  • Streamlined C interface layer

  • Improved performance of get, put, and copy dataset methods

  • Fix a bug which prevented multi-GPU model set in some cases

  • Streamline pipelined execution of tasks for backend database

  • Enhance code coverage to include all 4 languages supported by SmartRedis

  • Fix a bug which resulted in wrong key prefixing when retrieving aggregation lists in ensembles

  • Correct assorted API documentation errors and omissions

  • Improve documentation of exception handling in Redis server classes

  • Improve error handling for setting of scripts and models

  • Add support to inspect the dimensions of a tensor via get_tensor_dims()

  • Split dataset prefixing control from use_tensor_ensemble_prefix() to use_dataset_ensemble_prefix()

  • Update to the latest version of redis-plus-plus

  • Update to the latest version of PyBind

  • Change documentation theme to sphinx_book_theme and fix doc strings

  • Add print capability for Client and DataSet

  • Add support for inspection of tensors and metadata inside datasets

  • Add support for user-directed logging for Python clients, using Client, Dataset, or LogContext logging methods

  • Add support for user-directed logging for C and Fortran clients without a Client or Dataset context

  • Additional error reporting for connections to and commands run against Redis databases

  • Improved error reporting capabilities for Fortran clients

  • Python error messages from SmartRedis contain more information

  • Added logging functionality to the SmartRedis library

  • A bug related to thread pool initialization was fixed.

  • This version adds new functionality in the form of support for Unix Domain Sockets.

  • Fortran client can now be optionally built with the rest of the library

  • Initial support for dataset conversions, specifically Xarray.

Detailed Notes

  • Update docs and version numbers in preparation for version 0.4.0. Clean up duplicate marking of numpy dependency (PR321)

  • Remove codecov thresholds to avoid commits being marked as ‘failed’ due to coverage variance (PR317)

  • Corrected the error message in to_string methods in C interface to not overwrite the returned error message and to name the function (PR320)

  • Streamlined PyBind interface layer to reduce repetitive boilerplate code (PR315)

  • Updated Python API summary table to include new methods (PR313)

  • Streamlined C interface layer to reduce repetitive boilerplate code (PR312)

  • Leveraged Redis pipelining to improve performance of get, put, and copy dataset methods (PR311)

  • Redis::set_model_multigpu() will now upload the correct model to all GPUs (PR310)

  • RedisCluster::_run_pipeline() will no longer unconditionally apply a retry wait before returning (PR309)

  • Expand code coverage to all four languages and make the CI/CD more efficient (PR308)

  • An internal flag was set incorrectly, which resulted in wrong key prefixing when accessing (retrieving or querying) lists created in ensembles (PR306)

  • Corrected a variety of Doxygen errors and omissions in the API documentation (PR305)

  • Added throw documentation for exception handling in redis.h, redisserver.h, rediscluster.h (PR301)

  • Added error handling for a rare edge condition when setting scripts and models (PR300)

  • Added support to inspect the dimensions of a tensor via new get_tensor_dims() method (PR299)

  • The use_tensor_ensemble_prefix() API method no longer controls whether datasets are prefixed. A new API method, use_dataset_ensemble_prefix() now manages this. (PR298)

  • Updated from redis-plus-plus v1.3.2 to v1.3.5 (PR296)

  • Updated from PyBind v2.6.2 to v2.10.3 (PR295)

  • Change documentation theme to sphinx_book_theme to match SmartSim documentation theme and fix Python API doc string errors (PR294)

  • Added print capability for Client and DataSet to give detailed diagnostic information for debugging (PR293)

  • Added support for retrieval of names and types of tensors and metadata inside datasets (PR291)

  • Added support for user-directed logging for Python clients via {Client, Dataset, LogContext}.{log_data, log_warning, log_error} methods (PR289)

  • Added support for user-directed logging without a Client or Dataset context to C and Fortran clients via _string() methods (PR288)

  • Added logging to capture transient errors that arise in the _run() and _connect() methods of the Redis and RedisCluster classes (PR287)

  • Tweak direct testing of Redis and RedisCluster classes (PR286)

  • Resolve a disparity in the construction of Python client and database classes (PR285)

  • Fortran clients can now access error text and source location (PR284)

  • Add exception location information from CPP code to Python exceptions (PR283)

  • Added client activity and manual logging for developer use (PR281)

  • Fix thread pool error (PR280)

  • Update library linking instructions and update Fortran tester build process (PR277)

  • Added [add_metadata_for_xarray]{.title-ref} and [transform_to_xarray]{.title-ref} methods in [DatasetConverter]{.title-ref} class for initial support with Xarray (PR262)

  • Change Dockerfile to use Ubuntu 20.04 LTS image (PR276)

  • Implemented support for Unix Domain Sockets, including refactoring of server address code, test cases, and check-in tests. (PR252)

  • A new make target [make lib-with-fortran]{.title-ref} now compiles the Fortran client and dataset into its own library which applications can link against (PR245)

0.3.1#

Released on June 24, 2022

Description

Version 0.3.1 adds new functionality in the form of DataSet aggregation lists for pipelined retrieval of data, convenient support for multiple GPUs, and the ability to delete scripts and models from the backend database. It also introduces multithreaded execution for certain tasks that span multiple shards of a clustered database, and it incorporates a variety of internal improvements that will enhance the library going forward.

Detailed Notes

  • Implemented DataSet aggregation lists in all client languages, for pipelined retrieval of data across clustered and non-clustered backend databases. (PR258) (PR257) (PR256) (PR248) New commands are:

    • append_to_list()

    • delete_list()

    • copy_list()

    • rename_list()

    • get_list_length()

    • poll_list_length()

    • poll_list_length_gte()

    • poll_list_length_lte()

    • get_datasets_from_list()

    • get_dataset_list_range()

    • use_list_ensemble_prefix()

  • Implemented multithreaded execution for parallel dataset list retrieval on clustered databases. The number of threads devoted to this purpose is controlled by the new environment variable SR_THREAD_COUNT. The value defaults to 4, but may be any positive integer or the special value zero, which causes the SmartRedis runtime to allocate one thread for each available hardware context. (PR251) (PR246)

  • Augmented support for GPUs by implementing multi-GPU convenience functions for all client languages. (PR254) (PR250) (PR244) New commands are:

    • set_model_from_file_multigpu()

    • set_model_multigpu()

    • set_script_from_file_multigpu()

    • set_script_multigpu()

    • run_model_multigpu()

    • run_script_multigpu()

    • delete_model_multigpu()

    • delete_script_multigpu()

  • Added API calls for all clients to delete models and scripts from the backend database. (PR240) New commands are:

    • delete_script()

    • delete_model()

  • Updated the use of backend RedisAI API calls to discontinue use of deprecated methods for model selection (AI.MODELSET) and execution (AI.MODELRUN) in favor of current methods AI.MODELSTORE and AI.MODELEXECUTE, respectively. (PR234)

  • SmartRedis will no longer call the C runtime method srand() to ensure that it does not interfere with random number generation in client code. It now uses a separate instance of the C++ random number generator. (PR233)

  • Changed how the Fortran enum_kind type in the fortran_c_interop module is defined in order to better comply with the Fortran standard and not interfere with GCC 6.3.0. (PR231)

  • Corrected the spelling of the word “command” in a few error message strings. (PR221)

  • SmartRedis now requires CMake version 3.13 or later in order to utilize the add_link_options CMake command. (PR217)

  • Updated and improved the documentation of the SmartRedis library. In particular, a new SmartRedis Integration Guide provides an introduction to using the SmartRedis library and integrating it with existing software. (PR261) (PR260) (PR259) (SSPR214)

  • Added clustered Redis testing to automated GitHub check-in testing. (PR239)

  • Updated the SmartRedis internal API for building commands for the backend database. (PR223) This change should not be visible to clients.

  • The SmartRedis example code is now validated through the automated GitHub check-in process. This will help ensure that the examples do not fall out of date. (PR220)

  • Added missing copyright statements to CMakeLists.txt and the SmartRedis examples. (PR219)

  • Updated the C++ test coverage to ensure that all test files are properly executed when running “make test”. (PR218)

  • Fixed an internal naming conflict between a local variable and a class member variable in the DataSet class. (PR215) This should not be visible to clients.

  • Updated the internal documentation of methods in SmartRedis C++ classes with the override keyword to improve compliance with the latest C++ standards. (PR214) This change should not be visible to clients.

  • Renamed variables internally to more cleanly differentiate between names that are given to clients for tensors, models, scripts, datasets, etc., and the keys that are used when storing them in the backend database. (PR213) This change should not be visible to clients.
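The thread-count rule in the multithreading note above (default of 4, any positive integer, or zero meaning one thread per available hardware context) can be illustrated with a minimal sketch. The helper name resolve_thread_count is hypothetical and is not part of SmartRedis. Rejecting negative or non-numeric values is an assumption, since the notes do not say how the library treats them, and os.cpu_count() is only a stand-in for "available hardware contexts".

```python
import os

# Default stated in the release note above.
DEFAULT_THREAD_COUNT = 4


def resolve_thread_count(environ=None):
    """Hypothetical helper that mirrors the documented SR_THREAD_COUNT rule.

    This is an illustration only, not the SmartRedis implementation.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SR_THREAD_COUNT")

    # Unset or empty: fall back to the documented default.
    if raw is None or raw.strip() == "":
        return DEFAULT_THREAD_COUNT

    # Assumption: invalid input is rejected; the notes do not specify this.
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"SR_THREAD_COUNT must be an integer, got {raw!r}")
    if value < 0:
        raise ValueError("SR_THREAD_COUNT must be zero or a positive integer")

    # Zero is the special value: one thread per hardware context.
    # os.cpu_count() stands in for that here and may return None.
    if value == 0:
        return os.cpu_count() or 1

    return value


if __name__ == "__main__":
    print(resolve_thread_count({}))
    print(resolve_thread_count({"SR_THREAD_COUNT": "8"}))
    print(resolve_thread_count({"SR_THREAD_COUNT": "0"}))
```

Passing a plain dictionary instead of the process environment keeps the sketch easy to exercise without changing any real settings.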

0.3.0#

Released on February 11, 2022

Description

  • Improve error handling across all SmartRedis clients (PR159) (PR191) (PR199) (PR205) (PR206). Includes changes to C and Fortran function prototypes that are not backwards compatible. Also includes changes to error class names and enum type names that are not backwards compatible.

  • Add poll_dataset functionality to all SmartRedis clients (PR184). Due to other breaking changes made in this release, applications using methods other than poll_dataset to check for the existence of a dataset should now use poll_dataset.

  • Add environment variables to control client connection and command timeout behavior (PR194)

  • Add AI.INFO command to retrieve statistics on scripts and models via Python and C++ clients (PR197)

  • Create a Dockerfile for SmartRedis (PR180)

  • Update redis-plus-plus version to 1.3.2 (PR162)

  • Internal client performance and API improvements (PR138) (PR141) (PR163) (PR203)

  • Expose Redis FLUSHDB, CONFIG GET, CONFIG SET, and SAVE commands to the Python client (PR139) (PR160)

  • Extend inverse CRC16 prefixing to all hash slots (PR161)

  • Improve backend dataset representation to enable performance optimization (PR195)

  • Simplify SmartRedis build process (PR189)

  • Fix zero-length array transfer in Fortran convert_char_array_to_c (PR170)

  • Add continuous integration for all SmartRedis tests (PR165) (PR173) (PR177)

  • Update SmartRedis docstrings (PR200) (PR207)

  • Update SmartRedis documentation and examples (PR202) (PR208) (PR210)
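For background on the hash-slot item above (PR161): a Redis cluster assigns each key to one of 16384 hash slots using CRC16 (XMODEM variant) of the key, or of the text inside the first non-empty {...} hash tag if one is present. The sketch below assumes that "inverse CRC16 prefixing" refers to finding a key prefix that lands in a chosen slot. It uses a plain brute-force search purely for illustration, which is a swapped-in technique and not the method used by SmartRedis. The function names are hypothetical.

```python
NUM_SLOTS = 16384  # number of hash slots in a Redis cluster


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Return the cluster hash slot for a key, honoring hash tags."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag replaces the full key for hashing.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % NUM_SLOTS


def find_prefix_for_slot(slot: int) -> str:
    """Brute-force a hash-tag prefix that maps to the requested slot.

    Illustration only: a counter is tried as a hexadecimal tag until
    one hashes to the target slot.
    """
    if not 0 <= slot < NUM_SLOTS:
        raise ValueError("slot must be in the range 0..16383")
    candidate = 0
    while True:
        tag = format(candidate, "x")
        if crc16_xmodem(tag.encode()) % NUM_SLOTS == slot:
            return "{" + tag + "}"
        candidate += 1


if __name__ == "__main__":
    prefix = find_prefix_for_slot(42)
    # Any key that starts with this prefix hashes to slot 42.
    print(prefix, hash_slot(prefix + "my_tensor"))
```

Because only the text inside the hash tag is hashed, every key that begins with the returned prefix is stored in the same slot, regardless of the rest of the key name.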

0.2.0#

Released on August 5, 2021

Description

  • Improved tensor memory management in the Python client (PR70)

  • Improved metadata serialization and removed protobuf dependency (PR61)

  • Added unit testing infrastructure for the C++ client (PR96)

  • Improve command execution fault handling (PR65) (PR97) (PR105)

  • Bug fixes (PR52) (PR72) (PR76) (PR84)

  • Added copy, rename, and delete tensor and DataSet commands in the Python client (PR66)

  • Upgrade to RedisAI 1.2.3 (PR101)

  • Fortran and C interface improvements (PR93) (PR94) (PR95) (PR99)

  • Add Redis INFO command execution to the Python client (PR83)

  • Add Redis CLUSTER INFO command execution to the Python client (PR105)

0.1.1#

Released on May 5, 2021

Description

  • Compiled client library build and install update to remove environment variables (PR47)

  • Pip install for Python client (PR45)

0.1.0#

Released on April 1, 2021

Description

  • Initial 0.1.0 release of SmartRedis


SmartDashboard#

0.0.4#

Released on 14 May 2024

Description

0.0.3#

Released on 15 February 2024

Description

0.0.2#

Released on 14 December 2023

Description

  • The initial release of SmartDashboard includes capabilities for viewing experiment entity properties and statuses.