.Net: Hybrid Compute - Models #10098

Open
markwallace-microsoft opened this issue Jan 7, 2025 · 0 comments
Assignees
Labels
ai connector (Anything related to AI connectors), Build (Features planned for next Build conference), .NET (Issue or Pull requests regarding .NET code), SK-H2-Planning (Issues tagged with this label are listed in SK H2 Planning loop)

Comments


markwallace-microsoft commented Jan 7, 2025

Implement hybrid model orchestration within Semantic Kernel to leverage both local and cloud models. The system should default to local models for inference where available and fall back seamlessly to cloud models when they are not. Additionally, it should support local memory storage and retrieval, using cloud-based solutions as a fallback or for additional backup. This hybrid strategy should be abstracted within Semantic Kernel, enabling developers to specify preferences and priorities without managing the underlying complexity. This should build on top of the capabilities we already have.
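For illustration, the developer-facing preferences could be as small as an options type like the one below. This is a hypothetical sketch: none of these type, property, or service-key names exist in Semantic Kernel today; it only shows the kind of priority and fallback settings the abstraction would need to expose.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: illustrative names only, not an existing Semantic Kernel API.
public enum ComputeTarget { Local, Cloud }

public sealed class HybridModelOptions
{
    // Targets in priority order; the first healthy one handles the request.
    public IList<ComputeTarget> Priority { get; init; } =
        new List<ComputeTarget> { ComputeTarget.Local, ComputeTarget.Cloud };

    // Service keys under which the local and cloud chat services are registered.
    public string LocalServiceKey { get; init; } = "local";
    public string CloudServiceKey { get; init; } = "cloud";

    // Latency budget after which a local call is abandoned in favour of the cloud.
    public TimeSpan LocalTimeout { get; init; } = TimeSpan.FromSeconds(5);
}
```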

Scenarios

  • As a developer, I want my Semantic Kernel application to utilize local models for inference to achieve low-latency responses while falling back to cloud models when local models are unavailable or insufficient.

Requirements

Model Orchestration Layer:

  • Create a model orchestration layer within the Semantic Kernel capable of routing requests to either local or cloud models based on availability and priority settings.
  • Develop a configuration file where users can specify local and cloud model endpoints and prioritize their usage.

Inference Abstraction:

  • Abstract model inference calls such that the application can make a single call, and the underlying architecture decides whether to use local or cloud resources.
  • Support dynamic switching between local and cloud models based on real-time performance monitoring (e.g., latency, throughput); see the sketch after this list.
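As a rough sketch of what this orchestration layer could look like, below is a hypothetical IChatCompletionService decorator that tries a local service first and falls back to a cloud service when the local call fails or exceeds a latency budget. The class name, constructor shape, and fallback policy are assumptions for illustration; only IChatCompletionService and the types it uses come from Semantic Kernel.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical decorator: routes to the local chat service first and falls back
// to the cloud service when the local call fails or exceeds a latency budget.
public sealed class HybridChatCompletionService : IChatCompletionService
{
    private readonly IChatCompletionService _local;
    private readonly IChatCompletionService _cloud;
    private readonly TimeSpan _localTimeout;

    public HybridChatCompletionService(
        IChatCompletionService local,
        IChatCompletionService cloud,
        TimeSpan? localTimeout = null)
    {
        _local = local;
        _cloud = cloud;
        _localTimeout = localTimeout ?? TimeSpan.FromSeconds(5);
    }

    public IReadOnlyDictionary<string, object?> Attributes => _local.Attributes;

    public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default)
    {
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        timeoutCts.CancelAfter(_localTimeout);
        try
        {
            // Local first: low latency and no data egress when the model is available.
            return await _local.GetChatMessageContentsAsync(chatHistory, executionSettings, kernel, timeoutCts.Token);
        }
        catch (Exception) when (!cancellationToken.IsCancellationRequested)
        {
            // Local model unavailable, failed, or too slow: fall back to the cloud.
            return await _cloud.GetChatMessageContentsAsync(chatHistory, executionSettings, kernel, cancellationToken);
        }
    }

    public IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default)
        // Streaming is kept simple in this sketch: route straight to the cloud service.
        => _cloud.GetStreamingChatMessageContentsAsync(chatHistory, executionSettings, kernel, cancellationToken);
}
```

Registering such a decorator in place of a single chat completion service would let existing prompt and function-calling code keep making one call while the routing decision stays internal; the per-call timeout here stands in for the richer latency/throughput monitoring described in the last requirement.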
markwallace-microsoft added the triage, .NET, ai connector, and Build labels and removed the triage label on Jan 7, 2025
markwallace-microsoft moved this to Backlog: Planned in Semantic Kernel on Jan 7, 2025
github-actions bot changed the title from "Hybrid Compute" to ".Net: Hybrid Compute" on Jan 7, 2025
markwallace-microsoft added the SK-H2-Planning label on Jan 13, 2025
markwallace-microsoft changed the title from ".Net: Hybrid Compute" to ".Net: Hybrid Compute - Models" on Jan 21, 2025
SergeyMenshykh moved this from Backlog: Planned to Sprint: In Progress in Semantic Kernel on Jan 24, 2025
Projects
Status: Sprint: In Progress
Development

No branches or pull requests

2 participants