Sources create Loaders for Servable Versions. The Loaders are then sent as Aspired Versions to the Dynamic Manager, which loads them and serves them to client requests.
The Loader contains whatever metadata it needs to load the Servable.
The Source uses a callback function to notify the manager of the Aspired Version.
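The hand-off from Source to manager can be sketched as follows. This is a minimal toy model in Python, not the TensorFlow Serving API: the `Loader` and `Source` classes, the `set_aspired_versions_callback` and `announce` methods, and the example paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Loader:
    """Hypothetical stand-in: holds only the metadata needed to load one version."""
    servable_name: str
    version: int
    path: str  # pointer to the model data; nothing is loaded at this point


# Callback signature assumed for this sketch: (servable name, loaders for aspired versions).
AspiredVersionsCallback = Callable[[str, List[Loader]], None]


class Source:
    """Hypothetical Source that reports aspired versions through a callback."""

    def __init__(self) -> None:
        self._callback: Optional[AspiredVersionsCallback] = None

    def set_aspired_versions_callback(self, callback: AspiredVersionsCallback) -> None:
        # The manager registers itself here so the Source can notify it later.
        self._callback = callback

    def announce(self, servable_name: str, versions_to_paths: Dict[int, str]) -> None:
        if self._callback is None:
            raise RuntimeError("no aspired-versions callback registered")
        # One Loader per version, ordered by version number.
        loaders = [
            Loader(servable_name, version, path)
            for version, path in sorted(versions_to_paths.items())
        ]
        self._callback(servable_name, loaders)
```

In this sketch the Source never loads anything itself; it only describes what should be loaded and leaves the decision to whoever registered the callback.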
The manager applies the configured Version Policy to determine the next action to take, such as loading the new version or unloading a previously loaded one.
If the manager determines that it’s safe, it gives the Loader the required resources and tells the Loader to load the new version.
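The policy decision and the safety check described above might look like the following toy sketch. Everything here is an assumption made for illustration: the `Manager` class, the `pick_next_to_load` policy (prefer the highest version not yet loaded), and the use of a single memory budget in megabytes as the only resource.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Loader:
    """Hypothetical loader that knows its version and its memory requirement."""
    version: int
    required_mb: int
    loaded: bool = False

    def load(self) -> None:
        # A real loader would construct the servable here.
        self.loaded = True


def pick_next_to_load(aspired: List[Loader], loaded: Dict[int, Loader]) -> Optional[Loader]:
    """Toy version policy: choose the highest aspired version that is not loaded yet."""
    pending = [loader for loader in aspired if loader.version not in loaded]
    return max(pending, key=lambda loader: loader.version) if pending else None


class Manager:
    """Hypothetical manager with a single memory budget."""

    def __init__(self, available_mb: int) -> None:
        self.available_mb = available_mb
        self.loaded: Dict[int, Loader] = {}

    def set_aspired_versions(self, aspired: List[Loader]) -> Optional[int]:
        candidate = pick_next_to_load(aspired, self.loaded)
        if candidate is None:
            return None  # nothing new to load
        if candidate.required_mb > self.available_mb:
            return None  # not safe: the load would exceed the budget
        # Safe: reserve the resources, then tell the loader to load.
        self.available_mb -= candidate.required_mb
        candidate.load()
        self.loaded[candidate.version] = candidate
        return candidate.version
```

The sketch separates the policy (which version to act on) from the safety check (whether resources allow it), mirroring the two steps in the text. Unloading is left out to keep the example short.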
Clients ask the manager for the Servable, either specifying a version explicitly or just requesting the latest version. The manager returns a handle for the Servable.
For example, when a new version of a model arrives, the Dynamic Manager applies the Version Policy and decides to load the new version. If there is enough memory, the Dynamic Manager tells the Loader to load it, and the Loader instantiates the TensorFlow graph with the new weights.
A client requests a handle to the latest version of the model, and the Dynamic Manager returns a handle to the new version of the Servable.
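Handle lookup, with either an explicit version or the latest one, can be illustrated with a small sketch. The `ServableHandle` and `Manager` classes and the `mark_loaded` and `get_handle` methods are hypothetical names for this example only, and strings stand in for loaded graphs.

```python
from typing import Dict, Optional


class ServableHandle:
    """Hypothetical handle pairing a version number with its loaded servable."""

    def __init__(self, version: int, servable: object) -> None:
        self.version = version
        self.servable = servable


class Manager:
    """Hypothetical manager that only tracks loaded versions of one servable."""

    def __init__(self) -> None:
        self._loaded: Dict[int, object] = {}

    def mark_loaded(self, version: int, servable: object) -> None:
        self._loaded[version] = servable

    def get_handle(self, version: Optional[int] = None) -> ServableHandle:
        if not self._loaded:
            raise LookupError("no versions are loaded")
        if version is None:
            # No explicit version requested: fall back to the latest loaded one.
            version = max(self._loaded)
        if version not in self._loaded:
            raise LookupError(f"version {version} is not loaded")
        return ServableHandle(version, self._loaded[version])


manager = Manager()
manager.mark_loaded(1, "graph-v1")
manager.mark_loaded(2, "graph-v2")
latest = manager.get_handle()      # latest loaded version
pinned = manager.get_handle(1)     # explicitly requested version
```

Because the client only holds a handle, a newly loaded version becomes visible to "latest" requests as soon as the manager records it, without any change on the client side.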