AutoGen Plugin: LLM

The FlotorchAutogenLLM provides an AutoGen-compatible interface for accessing language models through FloTorch Gateway. It implements AutoGen’s ChatCompletionClient interface, enabling seamless integration with AutoGen’s agent framework while leveraging FloTorch’s managed model infrastructure. It handles complexities such as message conversion, tool call processing, structured output generation, and usage accounting.

Before using FlotorchAutogenLLM, ensure you have completed the general prerequisites outlined in the AutoGen Plugin Overview, including installation and environment configuration.

Configure your LLM instance with the following parameters:

FlotorchAutogenLLM(
    model_id: str,   # Model identifier from FloTorch Console (required)
    api_key: str,    # FloTorch API key for authentication (required)
    base_url: str    # FloTorch Gateway endpoint URL (required)
)

Parameter Details:

  • model_id - The unique identifier of the model configured in FloTorch Console
  • api_key - Authentication key for accessing FloTorch Gateway (can be set via environment variable)
  • base_url - The FloTorch Gateway endpoint URL (can be set via environment variable)
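
For example, credentials can be supplied from environment variables rather than hard-coded strings. A minimal sketch; the variable names below are illustrative, not names the plugin reads automatically:

import os

from flotorch.autogen.llm import FlotorchAutogenLLM

# FLOTORCH_API_KEY and FLOTORCH_BASE_URL are example variable names,
# not variables the plugin looks up on its own.
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key=os.environ["FLOTORCH_API_KEY"],
    base_url=os.environ.get("FLOTORCH_BASE_URL", "https://gateway.flotorch.cloud"),
)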

FlotorchAutogenLLM fully implements AutoGen’s ChatCompletionClient interface (a direct-call sketch follows this list):

  • Message Conversion - Seamlessly converts AutoGen messages to FloTorch format
  • Tool Support - Handles tool calls and function bindings
  • Structured Output - Supports structured JSON output when json_output is provided
  • Usage Accounting - Tracks token usage and provides usage statistics
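
Because the client implements ChatCompletionClient, it can also be called directly with AutoGen message types. A minimal sketch, with placeholder model ID and credentials:

import asyncio

from autogen_core.models import SystemMessage, UserMessage
from flotorch.autogen.llm import FlotorchAutogenLLM

llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud",
)

async def main() -> None:
    # AutoGen messages are converted to FloTorch format internally.
    result = await llm.create([
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the capital of France?", source="user"),
    ])
    print(result.content)
    print(result.usage)  # prompt and completion token counts

asyncio.run(main())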

It provides comprehensive response handling (see the sketch after this list):

  • Content Extraction - Extracts text content from model responses
  • Function Calls - Processes function calls and tool invocations
  • Finish Reasons - Handles various completion states (stop, length, tool_calls)
  • Streaming Support - Supports streaming responses via create_stream method
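
A sketch of how a caller might branch on the CreateResult returned by create, assuming AutoGen’s standard finish_reason values (the gateway’s tool_calls state surfaces as function_calls in AutoGen):

from autogen_core import FunctionCall
from autogen_core.models import CreateResult

def handle_result(result: CreateResult) -> None:
    # Assumption: tool use is reported as finish_reason "function_calls",
    # AutoGen's name for the gateway's "tool_calls" state.
    if result.finish_reason == "function_calls":
        for call in result.content:  # a list of FunctionCall objects
            assert isinstance(call, FunctionCall)
            print(call.name, call.arguments)
    else:
        # For "stop" or "length", content is the generated text.
        print(result.finish_reason, result.content)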

It integrates directly with FloTorch Gateway (an error-handling sketch follows this list):

  • OpenAI-Compatible API - Uses FloTorch Gateway /api/openai/v1/chat/completions endpoint
  • Model Registry - Works with models configured in FloTorch Model Registry
  • Authentication - Handles API key authentication automatically
  • Error Handling - Provides robust error handling for network and API issues
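
The exact exception types raised on network or API failures are not documented here, so a conservative pattern is to catch broadly and retry. A sketch using a hypothetical ask_with_retry helper, not part of the plugin:

import asyncio

from autogen_core.models import ChatCompletionClient, UserMessage

async def ask_with_retry(llm: ChatCompletionClient, prompt: str, attempts: int = 3) -> str:
    # ask_with_retry is an illustrative helper, not a plugin API.
    for attempt in range(1, attempts + 1):
        try:
            result = await llm.create([UserMessage(content=prompt, source="user")])
            return str(result.content)
        except Exception:  # exact exception types are gateway-specific
            if attempt == attempts:
                raise
            await asyncio.sleep(2 ** attempt)  # simple exponential backoff
    raise RuntimeError("unreachable")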

It supports structured output generation (sketched after this list):

  • JSON Output - Supports structured JSON output when json_output is provided
  • Automatic Activation - Automatically enables structured output when tools are absent or tool results are present
  • Schema Validation - Validates output against provided schemas
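
For example, passing a schema as json_output requests JSON that conforms to it. A minimal sketch, assuming a recent AutoGen version where json_output accepts a Pydantic BaseModel subclass:

import asyncio

from autogen_core.models import UserMessage
from flotorch.autogen.llm import FlotorchAutogenLLM
from pydantic import BaseModel

class CityInfo(BaseModel):
    name: str
    country: str
    population: int

llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud",
)

async def main() -> None:
    result = await llm.create(
        [UserMessage(content="Describe Tokyo as structured data.", source="user")],
        json_output=CityInfo,  # assumption: a Pydantic model as the output schema
    )
    # The response is validated against the CityInfo schema.
    print(CityInfo.model_validate_json(str(result.content)))

asyncio.run(main())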

Basic usage with an AutoGen agent:

from autogen_agentchat.agents import AssistantAgent
from flotorch.autogen.llm import FlotorchAutogenLLM

# Initialize FloTorch LLM
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud"
)

# Use with AutoGen agents (agent names must be valid Python identifiers)
agent = AssistantAgent(
    name="my_agent",
    model_client=llm,
    system_message="You are a helpful assistant."
)

Tool integration with an AutoGen agent:

from autogen_agentchat.agents import AssistantAgent
from flotorch.autogen.llm import FlotorchAutogenLLM

# Initialize FloTorch LLM
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud"
)

# Define tools
def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"

tools = [get_weather]

# Create agent with tools
agent = AssistantAgent(
    name="weather_agent",
    model_client=llm,
    tools=tools,
    system_message="You are a weather assistant."
)

Streaming responses via create_stream:

import asyncio

from autogen_core.models import UserMessage
from flotorch.autogen.llm import FlotorchAutogenLLM

# Initialize FloTorch LLM
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud"
)

# Stream responses (ChatCompletionClient expects AutoGen message objects)
async def main() -> None:
    messages = [UserMessage(content="Tell me a story", source="user")]
    async for chunk in llm.create_stream(messages):
        if isinstance(chunk, str):
            print(chunk, end="", flush=True)
        else:
            # Final chunk is a CreateResult with usage statistics
            print(f"\nUsage: {chunk.usage}")

asyncio.run(main())

Best practices:

  1. Environment Variables - Use environment variables for credentials to enhance security
  2. Model Selection - Choose models based on your task requirements and performance needs
  3. Error Handling - Implement proper error handling for production environments
  4. Tool Integration - Define tools with clear descriptions and proper error handling
  5. Structured Output - Use structured output when you need predictable response formats
  6. Streaming - Use streaming for long-running conversations to improve responsiveness