PoC: InferenceClient is also a McpClient #1351

Merged
merged 37 commits into from
Apr 25, 2025

Conversation

@julien-c
Member

julien-c commented Apr 11, 2025

Required reading

https://modelcontextprotocol.io/quickstart/client

TL;DR: MCP is a standard API to expose sets of Tools that can be hooked to LLMs
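
To make the "Tools that can be hooked to LLMs" part concrete, here is a minimal sketch of how a tool listed by an MCP server could be mapped to the OpenAI-style `tools` array that chat-completion APIs accept. The type and function names (`McpTool`, `ChatCompletionTool`, `toChatCompletionTools`) are illustrative assumptions, not this PR's actual code; only the field names (`name`, `description`, `inputSchema`) follow the MCP tool listing.

```typescript
// Shape of a tool as an MCP server lists it (name, optional description, JSON Schema input).
interface McpTool {
	name: string;
	description?: string;
	inputSchema: { type: "object"; properties?: Record<string, unknown>; required?: string[] };
}

// OpenAI-style tool definition, as sent in the `tools` field of a chat-completion request.
interface ChatCompletionTool {
	type: "function";
	function: { name: string; description?: string; parameters: McpTool["inputSchema"] };
}

// Hypothetical helper: convert an MCP tool listing into chat-completion tool definitions.
function toChatCompletionTools(tools: McpTool[]): ChatCompletionTool[] {
	return tools.map(
		(tool): ChatCompletionTool => ({
			type: "function",
			function: { name: tool.name, description: tool.description, parameters: tool.inputSchema },
		})
	);
}
```

The MCP input schema is already JSON Schema, so in this sketch it is passed through unchanged as the function's `parameters`.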

Summary of how to use this

You can either use McpClient, or you can run an example Agent directly:

Tiny Agent

We now have a tiny Agent (a while loop, really) in this PR, built on top of the MCP Client. You can run it like this:

cd packages/mcp-client
pnpm run agent
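
For readers wondering what "a while loop, really" might look like, here is a rough sketch under stated assumptions. It is not the Agent code from this PR: `Llm` and `ToolRunner` are hypothetical stand-ins for one chat-completion call and one MCP tool call, and `maxTurns` is an assumed safety cap.

```typescript
interface ToolCall {
	id: string;
	name: string;
	arguments: Record<string, unknown>;
}
interface UserMessage { role: "user"; content: string }
interface AssistantMessage { role: "assistant"; content: string; tool_calls?: ToolCall[] }
interface ToolMessage { role: "tool"; tool_call_id: string; name: string; content: string }
type Message = UserMessage | AssistantMessage | ToolMessage;

// Hypothetical stand-ins: one LLM call, and one tool execution against an MCP server.
type Llm = (messages: Message[]) => Promise<AssistantMessage>;
type ToolRunner = (call: ToolCall) => Promise<string>;

// Loop: ask the model, run any tools it requested, feed results back, repeat.
async function runAgent(llm: Llm, runTool: ToolRunner, prompt: string, maxTurns = 10): Promise<Message[]> {
	const messages: Message[] = [{ role: "user", content: prompt }];
	for (let turn = 0; turn < maxTurns; turn++) {
		const reply = await llm(messages);
		messages.push(reply);
		// No tool calls means the model has produced its final answer.
		if (!reply.tool_calls || reply.tool_calls.length === 0) break;
		for (const call of reply.tool_calls) {
			const content = await runTool(call);
			messages.push({ role: "tool", tool_call_id: call.id, name: call.name, content });
		}
	}
	return messages;
}
```

The turn cap is a design choice of this sketch: without it, a model that keeps requesting tools would loop forever.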

McpClient i.e. the underlying class

import { homedir } from "node:os";
import { join } from "node:path";

// McpClient is the class added in this PR (packages/mcp-client)
const client = new McpClient({
	provider: "together",
	model: "Qwen/Qwen2.5-72B-Instruct",
	apiKey: process.env.HF_TOKEN,
});

await client.addMcpServer({
	// Filesystem "official" mcp-server with access to your Desktop
	command: "npx",
	args: ["-y", "@modelcontextprotocol/server-filesystem", join(homedir(), "Desktop")],
});

Variant where we call a custom, local MCP server

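await client.addMcpServer({
	// Object form, matching the call above; the `env` key is assumed to follow the MCP SDK's stdio server parameters
	command: "node",
	args: ["--disable-warning=ExperimentalWarning", join(homedir(), "Desktop/hf-mcp/index.ts")],
	env: {
		HF_TOKEN: process.env.HF_TOKEN,
	},
});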

const response = await client.processQuery(`
	find an app that generates 3D models from text,
	and also get the best paper about transformers
`);

Where to find the MCP Server used here as an example

https://gist.github.com/julien-c/0500ba922e1b38f2dc30447fb81f7dc6

(Note that you can replace it with any MCP Server, from this doc for instance: https://modelcontextprotocol.io/examples)

Python version

The Python version will be implemented in huggingface_hub in this PR: huggingface/huggingface_hub#2986

Contributions are welcome!

@grll left a comment

A few of my comments on huggingface/huggingface_hub#2986 are also relevant here. Probably the major one is that I think you could add SSE support quite easily.

Also very minor but first time I see typescript with 4 indents, but why not.

@coyotte508
Member

Also very minor but first time I see typescript with 4 indents, but why not.

It uses tabs, you can set your tab width to whatever you want :) (2, 4 or 8)

@grll

grll commented Apr 17, 2025

Ah the good old tab vs space conundrum then :)

@julien-c
Member Author

At this point we should let LLMs decide between tabs and spaces once and for all

@grll

grll commented Apr 17, 2025

At this point we should let LLMs decide between tabs and spaces once and for all

[Screenshot: 2025-04-17-134859_886x333_scrot]

Claude has spoken

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@julien-c
Member Author

Ok this is ready for review!

Note, there's now a small Agent.ts class, and an example agent you can run with:

cd packages/mcp-client
pnpm run agent

@Wauplin
Contributor

Wauplin commented Apr 24, 2025

Not sure I am using this the expected way but I had to tweak a bit to run hf-mcp.ts:

  1. Added some dev dependencies to the project
	"devDependencies": {
		"@huggingface/hub": "workspace:^",
		"ts-node": "^10.9.2",
		"typescript": "^5.8.3",
		"zod": "^3.24.3"
	}
  2. Fixed a compile error in hf-mcp.ts (see https://gist.github.com/julien-c/0500ba922e1b38f2dc30447fb81f7dc6?permalink_comment_id=5553466#gistcomment-5553466)
  3. Added "hf-mcp": "npx ts-node hf-mcp.ts" as official script to package.json (this is only for easier testing for me)
  4. Updated in cli.ts to run the command:
		command: "pnpm",
		args: ["run", "hf-mcp"],
  5. (also had to hardcode the authorized path for filesystem MCP server-filesystem)

Then, with everything set up, I ran pnpm agent with the prompt:

List packages implemented under ~/projects/huggingface.js and make a nice looking HTML page listing them shortly. No need to provide further information per package. Once you have that, publish this HTML page as a static Space named "list-huggingface.js-packages"

(yup, not very inspired here)

=> resulting in https://huggingface.co/spaces/Wauplin/list-huggingface.js-packages


Some random feedback:

  • Once, the model context got too big (600k tokens in a single query). I requested something like "what are the projects" and it listed all files recursively and gave this to the model... Maybe have something to truncate in such a case? (For now it crashes the pnpm agent command.)
  • I feel the output of the agent/cli is too bloated. It's hard to follow what's interesting and what's not. Nothing much we can do though
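
One possible shape for the truncation idea above, as a minimal sketch only: cap each tool result before it is appended to the conversation. `truncateToolOutput` and its `maxChars` default are hypothetical, and a character count is just a crude proxy for a token budget.

```typescript
// Hypothetical helper: cap a tool result so one huge listing cannot blow up the model context.
// `maxChars` is an assumed budget; a real implementation might count tokens instead.
function truncateToolOutput(output: string, maxChars = 20_000): string {
	if (output.length <= maxChars) return output;
	const omitted = output.length - maxChars;
	// Keep the head of the output and tell the model that the rest was dropped.
	return `${output.slice(0, maxChars)}\n[... truncated ${omitted} characters ...]`;
}
```

Telling the model that content was dropped (rather than cutting silently) gives it a chance to issue a narrower follow-up tool call.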

@julien-c
Member Author

@Wauplin I should have updated the PR description because hf-mcp.ts (an example MCP server) is quite orthogonal to the PR. I'll update the PR desc. right now.

Your agent's Generated Space looks cool though 😍 cc @gary149:

[image]

Comment on lines +16 to +21
export interface ChatCompletionInputMessageTool extends ChatCompletionInputMessage {
	role: "tool";
	tool_call_id: string;
	content: string;
	name?: string;
}
Member Author

missing type in our spec probably @hanouticelina

Contributor

Member Author

well:

  • it's not in our generated types (inference.ts)
  • and it's weird to me to use an output message type as an input message

wdyt?

Contributor

ok ok you're right, I'd rather define another type indeed, but I'd keep this type defined here instead of putting it in our specs if that's okay for you

Member Author

yes that works
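
As an illustration of the outcome agreed on in this thread (a dedicated input type for tool results, defined locally rather than in the spec), here is a small sketch. `InputMessage` is a simplified stand-in for the generated `ChatCompletionInputMessage`, and `isToolMessage` is a hypothetical helper; neither is taken from the PR.

```typescript
// Simplified local stand-in for the generated ChatCompletionInputMessage type (an assumption).
interface InputMessage {
	role: string;
	content?: string;
	name?: string;
}

// Mirrors the interface under review: the input message that carries one tool call's result.
interface InputMessageTool extends InputMessage {
	role: "tool";
	tool_call_id: string;
	content: string;
	name?: string;
}

// Hypothetical type guard: narrow a generic input message to the tool variant.
function isToolMessage(message: InputMessage): message is InputMessageTool {
	return message.role === "tool" && typeof (message as InputMessageTool).tool_call_id === "string";
}
```

Keeping the tool variant as its own input type means the compiler enforces `tool_call_id` and `content` exactly where a tool result is built.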

@julien-c changed the title from "PoC: InferenceClient is also a MCPClient" to "PoC: InferenceClient is also a McpClient" on Apr 24, 2025
role: "tool",
tool_call_id: toolCall.id,
content: "",
name: toolName,
Contributor

do we need to include the tool name here?

Member Author

maybe not for the LLM inference, but i'm using it in my agent's outer loop (in cli.ts) and i think it can't hurt anyways
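
To illustrate why keeping `name` on the tool message can help outside of inference, here is a hedged sketch of how an outer loop might label tool results for display. `formatToolResult` is a hypothetical helper and not the code in cli.ts.

```typescript
// Same fields as the tool message discussed above, redeclared so the sketch is self-contained.
interface ToolResultMessage {
	role: "tool";
	tool_call_id: string;
	content: string;
	name?: string;
}

// Hypothetical display helper: label each tool result with the tool that produced it.
function formatToolResult(message: ToolResultMessage): string {
	const label = message.name ?? "unknown tool";
	return `Tool[${label}] ${message.tool_call_id}\n${message.content}`;
}
```

Without `name`, the loop would have to look the tool up again by matching `tool_call_id` against the assistant's earlier tool calls.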

@julien-c merged commit 0e2cf94 into main on Apr 25, 2025
6 checks passed
@julien-c deleted the mcp-client branch on April 25, 2025 10:07
7 participants