Architecture of TensorFlow
The TensorFlow runtime is a cross-platform library, and its system architecture is what makes the combination of scale and flexibility possible. This section assumes basic familiarity with TensorFlow programming concepts such as the computation graph, operations, and sessions.
Some terms must be understood before the TensorFlow Serving architecture makes sense. The terms are Servables, Servable Versions, Servable Streams, Models, Loaders, Sources, Managers, and Core. Each term and its role in the architecture is described below.
Understanding this architecture also helps when reading and modifying the core TensorFlow code.
1. TensorFlow Servable
Servables are the central abstraction in TensorFlow Serving. They are the underlying objects that clients use to perform computation, such as a lookup or an inference.
The size and granularity of a servable is flexible. A single servable may consist of anything from a single shard of a lookup table, to a single model, to a tuple of inference models. Servables can be of any type and interface, which enables flexibility and future improvements such as:
- Streaming results
- Asynchronous modes of operation
- Experimental APIs
2. Servable Versions
TensorFlow Serving can handle one or more versions of a servable over the lifetime of a single server instance. This allows new algorithm configurations, weights, and other data to be loaded over time. Versions also allow more than one version of a servable to be loaded concurrently, which supports gradual roll-out and experimentation.
3. Servable Streams
A servable stream is the sequence of versions of a servable, sorted by increasing version number.
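To make the ordering concrete, here is a minimal Python sketch of a servable stream. The `ServableStream` class and its method names are hypothetical and exist only for illustration; TensorFlow Serving itself is implemented in C++ and does not expose this class.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServableStream:
    """Hypothetical model of a servable stream (not a TensorFlow Serving class)."""
    name: str
    versions: List[int] = field(default_factory=list)

    def add_version(self, version: int) -> None:
        # Ignore duplicates, then keep the sequence sorted by version number.
        if version not in self.versions:
            self.versions.append(version)
            self.versions.sort()

    def latest(self) -> int:
        # The newest version is the last element of the sorted sequence.
        if not self.versions:
            raise LookupError("stream has no versions yet")
        return self.versions[-1]


stream = ServableStream("embedding_table")
for v in (3, 1, 2):          # versions may be discovered in any order
    stream.add_version(v)
print(stream.versions)       # [1, 2, 3]
print(stream.latest())       # 3
```

Keeping the list sorted on insert means "the latest version" is always a constant-time lookup at the end of the sequence.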
4. TensorFlow Models
TensorFlow Serving represents a model as one or more servables. A machine-learned model may include one or more algorithms (including learned weights) and lookup or embedding tables. A servable can also correspond to a fraction of a model; for example, a large lookup table can be sharded across many TensorFlow Serving instances.
5. TensorFlow Loaders
Loaders manage a servable's life cycle. The Loader API enables common infrastructure that is independent of the specific learning algorithm, data, or product use case involved. In particular, loaders standardize the APIs for loading and unloading a servable.
6. Sources in TensorFlow Architecture
In simple terms, sources are modules that find and provide servables. Each source provides zero or more servable streams. For each servable stream, a source supplies one loader instance for each version it makes available to be loaded.
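The relationship between a source, its servable stream, and the loaders it supplies can be sketched in Python as follows. Everything here is an assumption made for illustration: `InMemorySource`, `set_aspired_versions_callback`, and `poll` are hypothetical names, and a "loader" is reduced to a zero-argument callable that returns the servable.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: an aspired version is a (version number, loader) pair.
AspiredVersion = Tuple[int, Callable[[], dict]]
AspiredCallback = Callable[[str, List[AspiredVersion]], None]


class InMemorySource:
    """Hypothetical source that discovers versions in a dict, not a file system."""

    def __init__(self, name: str, versions: Dict[int, dict]):
        self._name = name
        self._versions = versions
        self._callback: Optional[AspiredCallback] = None

    def set_aspired_versions_callback(self, callback: AspiredCallback) -> None:
        self._callback = callback

    def poll(self) -> None:
        if self._callback is None:
            raise RuntimeError("no callback registered")
        # One loader per available version, in increasing version order.
        aspired = [
            (version, (lambda data=data: dict(data)))
            for version, data in sorted(self._versions.items())
        ]
        self._callback(self._name, aspired)


received = []
source = InMemorySource("lookup", {2: {"a": 20}, 1: {"a": 10}})
source.set_aspired_versions_callback(
    lambda name, versions: received.append((name, versions))
)
source.poll()

name, versions = received[0]
print(name, [v for v, _ in versions])   # lookup [1, 2]
print(versions[1][1]())                 # {'a': 20}
```

The source never loads anything itself; it only hands over loaders, leaving the decision of when to load to whoever registered the callback.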
7. TensorFlow Managers
TensorFlow managers handle the full lifecycle of servables, including:
- Loading Servables
- Serving Servables
- Unloading Servables
Managers listen to sources and track all versions. The manager tries to fulfill the sources' requests, but it can refuse to load an aspired version, for example when the required resources are not available.
Managers may also postpone an "unload." For example, a manager can wait to unload a version until a newer version finishes loading, based on a policy that guarantees at least one version is loaded at all times.
Managers provide a simple, narrow interface, GetServableHandle(), for clients to access the loaded servable instances.
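The postponed unload described above can be illustrated with a small Python sketch. `SimpleManager` and its methods are hypothetical; only the idea of a `GetServableHandle()`-style accessor comes from the text, and the policy shown (load the new version first, then unload older ones) is one assumed way to keep a version available at all times.

```python
class SimpleManager:
    """Hypothetical manager that keeps at least one version loaded at all times."""

    def __init__(self):
        self._loaded = {}   # version -> servable
        self.events = []    # record of actions, for inspection

    def set_aspired_version(self, version, load_fn):
        # Load the new version first ...
        self._loaded[version] = load_fn()
        self.events.append(f"load {version}")
        # ... and only then unload older versions (the postponed unload).
        for old in sorted(v for v in self._loaded if v != version):
            del self._loaded[old]
            self.events.append(f"unload {old}")

    def get_servable_handle(self, version=None):
        # Without an explicit version, return the newest loaded one.
        if not self._loaded:
            raise LookupError("no version is loaded")
        if version is None:
            version = max(self._loaded)
        if version not in self._loaded:
            raise LookupError(f"version {version} is not loaded")
        return self._loaded[version]


manager = SimpleManager()
manager.set_aspired_version(1, lambda: {"weights": "v1"})
manager.set_aspired_version(2, lambda: {"weights": "v2"})
print(manager.events)                 # ['load 1', 'load 2', 'unload 1']
print(manager.get_servable_handle())  # {'weights': 'v2'}
```

The event log shows that version 1 is only unloaded after version 2 has finished loading, so a client asking for a handle in between would still be served.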
8. TensorFlow Core
TensorFlow Serving Core manages the following aspects of servables:
- Lifecycle
- Metrics
TensorFlow Serving Core treats servables and loaders as opaque objects.
9. Life of a Servable
In broad terms, the life of a servable in the TensorFlow Serving architecture proceeds as follows:
- Sources create Loaders for servable versions. The Loaders are then sent as aspired versions to the Manager, which loads them and serves them to client requests.
- The Loader contains whatever metadata it needs to load the servable.
- The Source uses a callback to notify the Manager of the aspired version.
- The Manager applies the configured version policy to determine the next action to take, which may be to unload a previously loaded version or to load the new version.
- If the Manager determines that the new version should be loaded, it gives the Loader the resources it needs and tells the Loader to load the new version.
- Clients ask the Manager for the servable, either specifying a version explicitly or requesting the latest version. The Manager returns a handle for the servable.
- For example, when a newer version arrives, the dynamic Manager applies the version policy and decides to load it.
- The dynamic Manager tells the Loader that there is enough memory to load the new version.
- A client requests a handle for the latest version of the model, and the dynamic Manager returns a handle to the new version of the servable.
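The steps above can be traced end to end with a short Python sketch. It is a simplified model under stated assumptions, not TensorFlow Serving's implementation: the `Loader`, `DynamicManager`, and `source` names are hypothetical, and the version policy is reduced to a single memory-budget check.

```python
class Loader:
    """Hypothetical loader: carries the metadata needed to load one version."""

    def __init__(self, version, weights, memory_needed):
        self.version = version
        self.weights = weights
        self.memory_needed = memory_needed
        self.servable = None

    def load(self):
        self.servable = {"version": self.version, "weights": self.weights}


class DynamicManager:
    """Hypothetical manager with an assumed memory-based version policy."""

    def __init__(self, memory_budget):
        self.memory_budget = memory_budget
        self.loaded = {}  # version -> Loader

    def on_aspired_version(self, loader):
        # Assumed policy: load only if the new version fits in the budget.
        used = sum(ld.memory_needed for ld in self.loaded.values())
        if used + loader.memory_needed > self.memory_budget:
            return False          # the manager refuses the aspired version
        loader.load()             # the manager tells the loader to load
        self.loaded[loader.version] = loader
        return True

    def get_servable_handle(self, version=None):
        # Clients name a version explicitly or ask for the latest one.
        version = max(self.loaded) if version is None else version
        return self.loaded[version].servable


def source(callback, versions):
    # The source creates one loader per version and reports it via the callback.
    return [callback(Loader(v, w, m)) for v, w, m in versions]


manager = DynamicManager(memory_budget=100)
results = source(
    manager.on_aspired_version,
    [(1, "w1", 40), (2, "w2", 50), (3, "w3", 30)],
)
print(results)                                    # [True, True, False]
print(manager.get_servable_handle())              # {'version': 2, 'weights': 'w2'}
print(manager.get_servable_handle(1)["weights"])  # w1
```

Version 3 is refused because 40 + 50 + 30 exceeds the budget of 100, so the latest handle a client receives is still version 2.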
10. TensorFlow Loaders
Loaders are the extension point for adding algorithm and data backends. TensorFlow is one such algorithm backend. For example, you would implement a new loader to load, provide access to, and unload an instance of a new type of servable machine-learned model.
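As an illustration of what implementing a new loader involves, the sketch below defines a hypothetical `Loader` interface in Python with the three responsibilities named above (load, provide access, unload), plus one concrete loader for a lookup-table servable. The real Loader API in TensorFlow Serving is a C++ class; the names and signatures here are assumptions chosen for readability.

```python
from abc import ABC, abstractmethod


class Loader(ABC):
    """Hypothetical loader interface: one loader per servable version."""

    @abstractmethod
    def load(self) -> None:
        """Bring the servable into memory."""

    @abstractmethod
    def unload(self) -> None:
        """Release the servable's resources."""

    @abstractmethod
    def servable(self):
        """Give access to the loaded servable."""


class LookupTableLoader(Loader):
    """Loader for a new servable type: a plain in-memory lookup table."""

    def __init__(self, entries):
        self._entries = entries
        self._table = None

    def load(self) -> None:
        self._table = dict(self._entries)

    def unload(self) -> None:
        self._table = None

    def servable(self):
        if self._table is None:
            raise RuntimeError("servable is not loaded")
        return self._table


loader = LookupTableLoader({"cat": 0, "dog": 1})
loader.load()
print(loader.servable()["dog"])  # 1
loader.unload()
```

Because the rest of the system only calls `load`, `unload`, and `servable`, a different backend can be added by writing another subclass without touching the code that manages versions.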
11. Batcher in TensorFlow Architecture
Batching multiple requests into a single request can significantly reduce the cost of performing inference, especially in the presence of hardware accelerators such as GPUs. TensorFlow Serving includes a request batching widget that lets clients easily batch their type-specific inferences across requests into batch requests that algorithm systems can process more efficiently.
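The idea behind batching can be shown with a minimal Python sketch. This is not TensorFlow Serving's batching library; `run_batched`, `max_batch_size`, and the toy `double_all` model are hypothetical names used only to show that several requests can share one model invocation.

```python
def run_batched(requests, model_fn, max_batch_size):
    """Group individual requests into batches and call the model once per batch."""
    results = []
    for start in range(0, len(requests), max_batch_size):
        batch = requests[start:start + max_batch_size]
        outputs = model_fn(batch)   # one model invocation per batch
        results.extend(outputs)     # outputs stay in request order
    return results


calls = []  # sizes of the batches the model actually received


def double_all(batch):
    # Stand-in for an inference call that processes a whole batch at once.
    calls.append(len(batch))
    return [2 * x for x in batch]


results = run_batched([1, 2, 3, 4, 5], double_all, max_batch_size=2)
print(results)  # [2, 4, 6, 8, 10]
print(calls)    # [2, 2, 1]
```

Five requests are served with three model invocations instead of five; on an accelerator, where each invocation has a fixed overhead, that reduction is where the savings come from.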