The Problem: LLMs Can’t Access Real-Time Data

Large language models (LLMs) are powerful, but they’ve always had a fundamental limitation: they can only work with knowledge they were trained on. Ask an LLM a simple question like “What time is it?” and you’ll get a response like:

I don’t have real-time data access, so I can’t provide the current time. However, if you need to know the time, you can check your device or use an external service.

This limitation has created a strong desire to extend LLM functionality, allowing them to access real-time information and interact with external systems.

The Vision: A 24/7 AI Assistant

I’ve been fascinated by the potential of function calling, particularly when combined with locally running LLMs. My interest grew when I discovered the CrewAI framework’s function calling capabilities, though I encountered challenges getting it to work consistently with local LLMs (as discussed in this thread and [this thread](https://github.com/crewAIInc/crewAI-tools/issues/7)).

My ultimate vision is an LLM running continuously on my home server, with access to:

  • Personal notes and documents
  • Home Assistant automation
  • Music playback control
  • Weather monitoring
  • Real-time data sources

This setup would require the LLM to access information and systems beyond its training data.

Function Calling: A First Attempt at a Solution

The initial solution to this limitation was function calling: giving LLMs access to external systems through defined functions. Here’s how it works:

  1. Configure your LLM with function definitions (functionality, inputs, outputs)
  2. The LLM decides which functions to call during processing
  3. Function results are incorporated into the LLM’s responses

For example, with a datetime function that accepts a timezone parameter, the conversation might go:

User: “What time is it in EST?”

LLM: calls datetime function with user’s timezone

Response: “It is 9:12 PM in the Eastern timezone”
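The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's actual API: the get_datetime tool, its fixed-offset table, and the JSON shape of the model's tool call are all hypothetical, and the model's decision is simulated with a hard-coded string.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical fixed UTC offsets (no daylight saving); a real tool would
# use a proper timezone database instead.
UTC_OFFSETS = {"UTC": 0, "EST": -5, "PST": -8}


def get_datetime(tz: str, now_utc: Optional[datetime] = None) -> str:
    """Return the time in the named zone as an ISO 8601 string."""
    if tz not in UTC_OFFSETS:
        raise ValueError(f"unknown timezone: {tz}")
    now_utc = now_utc or datetime.now(timezone.utc)
    zone = timezone(timedelta(hours=UTC_OFFSETS[tz]), tz)
    return now_utc.astimezone(zone).isoformat()


# Step 1: the functions the model is told about.
TOOLS = {"get_datetime": get_datetime}


def run_tool_call(raw_call: str) -> str:
    """Step 3: execute the call the model asked for and return the result."""
    call = json.loads(raw_call)
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])


# Step 2 (simulated): the model decides to call get_datetime for "EST".
simulated_model_output = '{"name": "get_datetime", "arguments": {"tz": "EST"}}'
print(run_tool_call(simulated_model_output))
```

In a real setup the returned string would be handed back to the model, which would then phrase the final answer for the user.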

The Challenge: Inconsistent Implementation

The main issue with function calling was that every LLM implemented it differently. OpenAI and Llama both support function calling, but each does so through its own specific prompting format. An example configuration for looking up flight information could look like:

[
   {
      "type":"function",
      "function":{
         "name":"get_flight_info",
         "description":"Get flight information between two locations",
         "parameters":{
            "type":"object",
            "properties":{
               "loc_origin":{
                  "type":"string",
                  "description":"The departure airport, e.g. DUS"
               },
               "loc_destination":{
                  "type":"string",
                  "description":"The destination airport, e.g. HAM"
               }
            },
            "required":[
               "loc_origin",
               "loc_destination"
            ],
            "additionalProperties":false
         }
      }
   }
]
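To make the role of such a definition concrete, here is a rough Python sketch that checks a model-produced set of arguments against it. The check_arguments helper is hypothetical and covers only required fields, unexpected fields, and string types; a real application would more likely use a full JSON Schema validator. Placing "required" inside "parameters" is an assumption that follows the usual JSON Schema convention.

```python
from typing import Any

# The flight-lookup tool definition, written as a Python dict.
GET_FLIGHT_INFO = {
    "type": "function",
    "function": {
        "name": "get_flight_info",
        "description": "Get flight information between two locations",
        "parameters": {
            "type": "object",
            "properties": {
                "loc_origin": {
                    "type": "string",
                    "description": "The departure airport, e.g. DUS",
                },
                "loc_destination": {
                    "type": "string",
                    "description": "The destination airport, e.g. HAM",
                },
            },
            "required": ["loc_origin", "loc_destination"],
            "additionalProperties": False,
        },
    },
}


def check_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the arguments look valid."""
    schema: dict[str, Any] = tool["function"]["parameters"]
    problems = []
    # Every required argument must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Every supplied argument must be known and of the declared type.
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"argument {name} should be a string")
    return problems


print(check_arguments(GET_FLIGHT_INFO, {"loc_origin": "DUS"}))
```

Validating before dispatching matters because the model produces the arguments as free text, so nothing guarantees they match the definition.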

If you specifically want to learn more about function calling with OpenAI or Llama, see their linked documentation.

The main takeaway here is that other LLMs would implement function calling in a totally different way. This made tooling built around the feature fragile, and as a result the feature was often overlooked.

Enter Model Context Protocol (MCP): Anthropic’s Open Protocol Solution

Anthropic, the creators of the LLM Claude, saw this problem and decided to design an open protocol and release it to the world. They called this the Model Context Protocol (MCP). Here’s a convenient link to the official documentation.

Key Benefits of MCP:

  1. Universal Compatibility: Works with any LLM, not just Anthropic’s models
  2. Multi-Function Support: Can call multiple functions in a single prompt
  3. Function Chaining: Output from one function can feed into another
  4. Standardization: One protocol for all LLMs, including potential integration with ChatGPT

Let’s look at how MCP works:

MCP Clients

MCP Clients are the tools that you, the human, interact with. They maintain a 1:1 connection with MCP servers. An example of an MCP Client would be a UI chat program or an autocomplete tool powered by an LLM.

You can read about a few clients and their features here.

MCP Servers

MCP Servers are what expose additional functionality and information to your LLMs. Each MCP Server offers one or more functions to each LLM. An example of an MCP Server would be one that provides your LLM with the current date and time.
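Under the hood, MCP clients and servers exchange JSON-RPC 2.0 messages: the specification defines a tools/list method for discovering a server's functions and a tools/call method for invoking one. The sketch below only builds and parses such messages in memory, with no transport or handshake, to show their rough shape. The tool name get_current_time and its timezone argument are hypothetical; a real server's tool names come from its tools/list response.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def make_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def list_tools_request() -> str:
    """Ask the server which tools it offers."""
    return make_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> str:
    """Ask the server to run one tool with the given arguments."""
    return make_request("tools/call", {"name": name, "arguments": arguments})


def extract_text(response: str) -> str:
    """Join the text parts of a tools/call response."""
    message = json.loads(response)
    if "error" in message:
        raise RuntimeError(message["error"]["message"])
    parts = message["result"]["content"]
    return "\n".join(p["text"] for p in parts if p.get("type") == "text")


print(call_tool_request("get_current_time", {"timezone": "EST"}))
```

Because every server speaks this same message format, a client written once can talk to any of them, which is the standardization benefit described earlier.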

Leveraging MCP With Claude

If you want to use MCP with Claude, you must do so through the Claude for Desktop application:

  1. Download and install Claude for Desktop
  2. Log in with your Anthropic account
  3. Open the config via the hamburger menu at the top: File > Settings > Developer > Edit Config. Please note that this is separate from the “settings” page near the bottom left corner, where your logged-in user is indicated. This will open the folder where your Claude Desktop configuration file lives.
    • On Windows this will be at %AppData%\Claude\claude_desktop_config.json
    • On Mac this will be at ~/Library/Application Support/Claude/claude_desktop_config.json
    • On Linux this will be at ~/.config/Claude/claude_desktop_config.json
  4. Add your configuration. An example configuration file would look like:
{
   "mcpServers": {
      "MCP-timeserver": {
         "command": "uvx",
         "args": ["MCP-timeserver"]
      }
   }
}
  5. Kill the Claude Desktop task using Task Manager.
  6. Re-open Claude Desktop.
  7. Verify that you see a “hammer” icon in the bottom right corner, indicating that you have an MCP server configured appropriately. It should look like this:

  8. At this point you can interact with your LLM and, based on your prompt, the LLM may interact with your MCP servers. You’ll be prompted for permission each time. It will look like this:
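
Returning to the configuration file from step 4: the file is JSON, and strict JSON parsers reject small slips such as a trailing comma, which is an easy way to end up without the hammer icon. As a rough sanity check (a sketch, not an official tool), the following parses a config string with Python's json module and lists the servers it defines. The list_mcp_servers helper is hypothetical, and it assumes each server entry has a "command" and optional "args" as in the example configuration.

```python
import json


def list_mcp_servers(config_text: str) -> dict:
    """Parse config text and map each server name to its full command line."""
    # json.loads raises json.JSONDecodeError on invalid JSON (e.g. trailing commas).
    config = json.loads(config_text)
    summary = {}
    for name, spec in config.get("mcpServers", {}).items():
        if "command" not in spec:
            raise ValueError(f"server {name!r} has no 'command'")
        summary[name] = [spec["command"], *spec.get("args", [])]
    return summary


EXAMPLE = """
{
   "mcpServers": {
      "MCP-timeserver": {
         "command": "uvx",
         "args": ["MCP-timeserver"]
      }
   }
}
"""

print(list_mcp_servers(EXAMPLE))
```

To check a real file, read its contents from the path for your platform listed in step 3 and pass the text to the same function.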

Selecting A Local LLM

The LLM space develops rapidly, with new models coming out frequently. I recommend looking at the Berkeley Function-Calling Leaderboard to decide which model you’ll use for function calling.

At the time of writing this, I tend to use llama3.2 or qwen2.5.

Experimenting With A MCP Server

This approach is somewhat limited compared to the Claude Desktop approach. The latter lets you easily define many tools and lets the LLM choose which one to use, whereas this approach only lets you define one tool at a time. This may change in the future as tooling like Ollama and Open Web UI develops further, or once the mcp-cli tool we’ll use has this PR merged.

  1. Have Ollama download your model of choice: $ ollama pull [model name], where the model name can be any model that supports function calling and that your hardware can run. I’d generally recommend experimenting with, say, llama3.2 or qwen2.5:7b.
  2. If you’re running Ollama on a different machine than the one where you’ll be running the MCP servers, configure Ollama to allow connections from other machines. On Windows we’d run $ set OLLAMA_HOST=0.0.0.0, while on Mac/Linux we’d run $ export OLLAMA_HOST=0.0.0.0.
  3. Start Ollama, making your LLM ready to be interacted with.
  4. Leverage a tool like mcp-cli to interact with your LLM and MCP servers. Once you have this tool downloaded, simply run: $ uv run mcp-cli --server sqlite --provider ollama --model [model name]. This command assumes your Ollama instance is running on the same machine you’re running the command on. If you run your LLM on a separate machine, modify the command to look like: $ OLLAMA_HOST=[IP of Ollama server] uv run mcp-cli --server sqlite --provider ollama --model [model name].
  5. Inside the MCP Command-Line Tool prompt, begin a chat session: $ chat.
  6. At this point you can interact with your Ollama LLM, and have it leverage its configured tool.
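
If you launch mcp-cli often, the local and remote variants of the command can be wrapped in a small helper. The following is only a sketch: build_mcp_cli_command is a hypothetical function, the flags are copied from the commands above, and the IP address in the usage line is a placeholder. It builds the command and environment without running anything.

```python
import os
import shlex
from typing import Optional


def build_mcp_cli_command(model: str, server: str = "sqlite",
                          ollama_host: Optional[str] = None):
    """Return (argv, env) for launching mcp-cli against an Ollama model."""
    argv = ["uv", "run", "mcp-cli",
            "--server", server,
            "--provider", "ollama",
            "--model", model]
    env = dict(os.environ)  # start from the current environment
    if ollama_host:
        # Only needed when Ollama runs on a different machine.
        env["OLLAMA_HOST"] = ollama_host
    return argv, env


# Placeholder IP; substitute the address of your own Ollama server.
argv, env = build_mcp_cli_command("llama3.2", ollama_host="192.168.1.50")
print(shlex.join(argv))
# To actually launch it: subprocess.run(argv, env=env, check=True)
```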

Next Steps

One of the very next things I plan to do is experiment with other MCP Clients that are not Claude’s Desktop application. Looking at the following chart from the official model context protocol documentation:

We can see that only the Claude Desktop App and Continue have full support for all MCP features. I’d like to experiment with Continue to leverage some MCP Servers with a locally running LLM via, say, Ollama. While I’m not personally stoked that Continue is primarily a VSCode extension, I’m willing to live with that for now while other tooling catches up.

An alternative to being stuck with Continue that I’ll explore is “bridging” between Ollama and MCP servers. This would allow an LLM to access MCP servers without being restricted to VSCode. See these repositories that I may play with and write about in a future blog post:

Additionally, I’m very interested in writing my own custom MCP server. Fortunately, Anthropic provides SDKs in TypeScript, Python, and Kotlin to make writing your own MCP servers easier. Community-made frameworks that build on these SDKs and do much of the heavy lifting for you are also beginning to pop up. For example, this fastmcp framework in TypeScript looks incredibly promising.

Conclusion

Overall, I believe function calling, and Anthropic’s MCP in particular, to be the natural next step for LLMs. I fortunately stumbled upon this early enough, and have had the pleasure of experimenting with these approaches from the very beginning.