-
Notifications
You must be signed in to change notification settings - Fork 6.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Advanced Paste > Paste with AI] Custom Model / Endpoint Selection #32960
Comments
It would be nice to have local models too. |
@minzdrav This would be enabled by my proposed change - Ollama provides partial support for the OpenAI API schema, so you'd be able to point the plugin at your local model |
In particular, supporting an Azure OpenAI endpoint would be great first implementation. It would be even better if the Azure implementation supported Managed Identities so we don't end up with the unmanageable mess of API key distribution and rotation. |
supporting Groq would be nice too |
IMPORTANT |
bump... |
bump |
Has anyone started working on this item? |
To my knowledge, no The basics should be pretty easy to implement though! All you'd need to do to allow for a different api-compatible host and model is add two text fields to the settings page (model, URL) and link them in exactly the same way that the chatgpt token field is currently linked into the app (as far as I know, they are just additional inputs into the same function in the associated library) Obviously making it "Microsoft-quality" will require more work on documentation and integration - see the points @htcfreek has raised in this thread for examples of these I'd be happy to take a look, but I won't be able to for at least a week so you may be better placed than me. |
@nathancartlidge , @tjtanaa
Are you referring to my comment regarding the Group Policies above. |
Yeah, that's what I was referring to! It's a great addition, but also the kind of thing I'd completely overlook when building this sort of feature :) I hadn't seen that thread before, thanks for bringing it up - from a cursory reading it does look like their work could currently be independent from this, as it seems to be exclusively non-ai features - however, I agree that it could make sense to combine them for the sake of reduced development overheads. |
Thank you very much for the suggestions @nathancartlidge . I have a prototype version which leads me to think there are some changes that I am thinking of making. I would be great if I could get some inputs. I am planning to target local LLM Usecase on PC without dedicated GPU. (In most cases, there is only enough resources to host one model at a time).
Other feature improvements would be adding some common usecases as quick access on menu, such as
Moreover, I also saw that there is a branch I am new to |
Google Gemini 1.5 Flash (Sep) is the fastest AI and has a free API. It would also be nice to add some AI actions by default or functionality like in Writing Tools: Where you select a text, you choose or write a task for the AI and then it replaces the selected text with the processed AI text automatically. (Instead of copying and choosing the option to paste the LLM-processed text). You can also translate text prompting 'in Spanish' (and whatever you want). |
+1 on this. Quickest and minimal effort change since the team is probably busy is just let the user override the api root url. Single settings entry + some concat + maybe a bit more robust error handling since endpoint isnt fixed now. Then if the user wants to redirect the request to a local llm server or a middleman script that proxies the request to a different model or whatever they can. Over time though, what everyone else above said. As a temporary solution, I found software like Fiddler Classic can be used to manually redirect calls to openai's api to anywhere you want. I'm not good with regex and I rushed it but this works: Runs just fine on a local gemma-2 9B IQ4 XS on LM Studio Server, but I found some other models may have issues complying with the instruction to not write too much garbage. Llama in particular just kept trying to write me python to complete the task after it already completed the task. |
I also need this feature very much. My network does not allow me to use OpenAI services, and I often have to work offline. I think we can add API address and model name options so that I can use models from Olama and other service providers. |
Hi, I just want to chime in. Cool feature that I unfortunately cannot use right now... Having OpenAI-Endpoints as the only option, is IMHO a red flag for many commercial users in Europe due to GDPR-concerns (especially coupling sth to copy & paste). I also wonder if such bundling is not rather anti-competitive in the end. Similar to the availabilities of different models in GitHub Co-Pilot, it would seem good practice to offer different endpoints. |
I also noticed that Advanced Paste is incompatible with some alternative endpoint formats. I wrote a script for mitmproxy, with the help of LLM. Maybe it can solve this issue. Start command:
|
+1 for ollama support on localhost/IP on local network. |
This works surprisingly well. I ended up adding rules manually in the FiddlerScript. Thanks! |
Trying to get this to run without success. Can you share your Fiddler script? I keep getting 400 responses from ollama :( |
Are there some news on integration of Advanced paste with Microsoft Copilot? |
If they want to stay with openai which is understandable I'd at least appreciate if we can get a model selection for the currently available openai models. For example I'd prefer using gpt-4o or gpt-4o turbo instead of gpr3.5. That'd already help a lot! |
Has there been any progress on this process? |
+1 for this request any improvements ? |
+1 for this request, any improvements? |
1 similar comment
+1 for this request, any improvements? |
+1 for this request, any improvements? |
Description of the new feature / enhancement
It should be possible to configure the model used (currently fixed as
gpt3.5-turbo
) and endpoint (currently fixed as OpenAI's) to arbitrary valuesScenario when this would be used?
Sending requests to an alternative AI endpoint (eg a local model, internal company hosted models, alternative ai providers), or ensuring higher-quality conversions (eg by pointing requests at gpt-4o)
Supporting information
Microsoft's documentation appears to suggest that the underlying library used for AI completions supports other libraries, it just needs to be provided with an endpoint.
The currently used model is a hardcoded string in this repository
The text was updated successfully, but these errors were encountered: