Patent attributes
Implementations include:
- receiving, by an application programming interface (API) server of a plurality of API servers, a prediction request from a client system, each of the plurality of API servers including a stateless server;
- selecting, by the API server, a model server from a plurality of model servers based on the prediction request, each of the plurality of model servers including a stateful server;
- calling, by the API server, the model server to execute inference using an ML model loaded to memory of the model server;
- receiving, by the API server, an inference result from the ML model; and
- sending, by the API server, the inference result to the client system.
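The flow described above can be sketched in code. This is a minimal illustration, not the patented implementation: all class and method names (`ApiServer`, `ModelServer`, `select_model_server`, the hash-based routing rule, and the stand-in model) are assumptions introduced here for clarity.

```python
# Hypothetical sketch of the described flow: a stateless API server
# routes a prediction request to one of several stateful model servers,
# each of which holds an ML model resident in memory.
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PredictionRequest:
    model_name: str
    features: List[float]


class ModelServer:
    """Stateful server: keeps an ML model loaded in memory across requests."""

    def __init__(self, name: str, model: Callable[[List[float]], float]):
        self.name = name
        self.model = model  # stand-in for a real model kept in memory

    def infer(self, request: PredictionRequest) -> float:
        # Execute inference with the in-memory model.
        return self.model(request.features)


class ApiServer:
    """Stateless server: holds no per-request state between calls."""

    def __init__(self, model_servers: List[ModelServer]):
        self.model_servers = model_servers

    def select_model_server(self, request: PredictionRequest) -> ModelServer:
        # One possible selection rule (an assumption): deterministic
        # routing based on a hash of the requested model name.
        digest = hashlib.sha256(request.model_name.encode()).digest()
        return self.model_servers[digest[0] % len(self.model_servers)]

    def handle(self, request: PredictionRequest) -> float:
        # Select a model server, call it to execute inference,
        # and return the inference result to the client.
        server = self.select_model_server(request)
        return server.infer(request)


# Usage: three stateful model servers behind one stateless API server.
servers = [ModelServer(f"model-server-{i}", sum) for i in range(3)]
api = ApiServer(servers)
result = api.handle(PredictionRequest("ranker", [1.0, 2.0, 3.0]))
print(result)  # 6.0
```

Because the API server is stateless, any instance in the fleet can serve any request; only the routing rule must be shared so repeated requests for the same model reach a server that already has that model in memory.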