# AutoGen Plugin: LLM
`FlotorchAutogenLLM` provides an AutoGen-compatible interface for accessing language models through FloTorch Gateway. It implements AutoGen's `ChatCompletionClient` interface, enabling seamless integration with AutoGen's agent framework while leveraging FloTorch's managed model infrastructure. It handles complexities such as message conversion, tool call processing, structured output generation, and usage accounting.
## Prerequisites

Before using `FlotorchAutogenLLM`, ensure you have completed the general prerequisites outlined in the AutoGen Plugin Overview, including installation and environment configuration.
## Configuration

### Parameters

Configure your LLM instance with the following parameters:
```python
FlotorchAutogenLLM(
    model_id: str,   # Model identifier from FloTorch Console (required)
    api_key: str,    # FloTorch API key for authentication (required)
    base_url: str    # FloTorch Gateway endpoint URL (required)
)
```

**Parameter Details:**

- `model_id` - The unique identifier of the model configured in FloTorch Console
- `api_key` - Authentication key for accessing FloTorch Gateway (can be set via an environment variable; see the example below)
- `base_url` - The FloTorch Gateway endpoint URL (can be set via an environment variable)
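For example, credentials can be read from the environment instead of being hard-coded. The variable names `FLOTORCH_API_KEY` and `FLOTORCH_BASE_URL` below are illustrative, not a documented convention:

```python
import os

from flotorch.autogen.llm import FlotorchAutogenLLM

# FLOTORCH_API_KEY / FLOTORCH_BASE_URL are example variable names --
# use whatever names your deployment standardizes on.
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key=os.environ["FLOTORCH_API_KEY"],
    base_url=os.environ.get("FLOTORCH_BASE_URL", "https://gateway.flotorch.cloud"),
)
```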
## Features

### ChatCompletionClient Interface

Fully implements AutoGen's `ChatCompletionClient` interface (a minimal direct call is sketched after the list):
- **Message Conversion** - Seamlessly converts AutoGen messages to FloTorch format
- **Tool Support** - Handles tool calls and function bindings
- **Structured Output** - Supports structured JSON output when `json_output` is provided
- **Usage Accounting** - Tracks token usage and provides usage statistics
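As a minimal sketch of a direct call through this interface, assuming `autogen_core`'s standard message types and `CreateResult` return shape:

```python
import asyncio

from autogen_core.models import SystemMessage, UserMessage
from flotorch.autogen.llm import FlotorchAutogenLLM

llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud",
)

async def main() -> None:
    # The client converts these AutoGen message objects to FloTorch's
    # wire format before calling the Gateway.
    result = await llm.create([
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what FloTorch Gateway does.", source="user"),
    ])
    print(result.content)  # completion text
    print(result.usage)    # token accounting (prompt + completion)

asyncio.run(main())
```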
### Response Processing

Provides comprehensive response handling (a result-handling sketch follows the list):
- **Content Extraction** - Extracts text content from model responses
- **Function Calls** - Processes function calls and tool invocations
- **Finish Reasons** - Handles the various completion states (`stop`, `length`, `tool_calls`)
- **Streaming Support** - Supports streaming responses via the `create_stream` method
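A caller can branch on the shape of the returned content; this sketch assumes the `CreateResult` and `FunctionCall` types from `autogen_core`:

```python
from autogen_core import FunctionCall
from autogen_core.models import CreateResult

def handle_result(result: CreateResult) -> None:
    if isinstance(result.content, str):
        # Plain text completion (e.g. finish reason "stop" or "length")
        print(result.content)
    else:
        # A list of FunctionCall objects for the agent runtime to execute
        for call in result.content:
            assert isinstance(call, FunctionCall)
            print(f"tool requested: {call.name}({call.arguments})")
    print(f"finish_reason={result.finish_reason}, usage={result.usage}")
```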
### Gateway Integration

Seamlessly integrates with FloTorch Gateway (a retry sketch follows the list):
- **OpenAI-Compatible API** - Uses the FloTorch Gateway `/api/openai/v1/chat/completions` endpoint
- **Model Registry** - Works with models configured in the FloTorch Model Registry
- **Authentication** - Handles API key authentication automatically
- **Error Handling** - Provides robust error handling for network and API issues
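Since the plugin does not document a specific exception hierarchy, a production caller might wrap calls in a broad retry. The helper below is an illustrative sketch, not part of the FloTorch API:

```python
import asyncio

from autogen_core.models import UserMessage

async def ask_with_retry(llm, prompt: str, attempts: int = 3) -> str:
    # Illustrative retry wrapper: catches broadly because no specific
    # exception types are documented, and backs off exponentially.
    for attempt in range(attempts):
        try:
            result = await llm.create([UserMessage(content=prompt, source="user")])
            return result.content if isinstance(result.content, str) else str(result.content)
        except Exception:  # network or API error surfaced by the client
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(2 ** attempt)
    raise RuntimeError("unreachable")
```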
### Structured Output Support

Enables structured output generation (see the sketch after the list):
- **JSON Output** - Supports structured JSON output when `json_output` is provided
- **Automatic Activation** - Automatically enables structured output when tools are absent or tool results are present
- **Schema Validation** - Validates output against provided schemas
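One way to exercise this, assuming a recent AutoGen version in which `json_output` accepts a Pydantic model class; the `WeatherReport` schema is purely illustrative:

```python
import asyncio

from autogen_core.models import UserMessage
from flotorch.autogen.llm import FlotorchAutogenLLM
from pydantic import BaseModel

class WeatherReport(BaseModel):
    # Illustrative schema -- any Pydantic model can serve here.
    location: str
    temperature_f: float
    conditions: str

llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud",
)

async def main() -> None:
    result = await llm.create(
        [UserMessage(content="Report the weather in Paris as JSON.", source="user")],
        json_output=WeatherReport,  # recent AutoGen versions accept a Pydantic type here
    )
    report = WeatherReport.model_validate_json(result.content)
    print(report.location, report.temperature_f, report.conditions)

asyncio.run(main())
```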
## Usage Example

### Basic LLM Usage
```python
from flotorch.autogen.llm import FlotorchAutogenLLM

# Initialize FloTorch LLM
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud",
)

# Use with AutoGen agents
from autogen_agentchat.agents import AssistantAgent

agent = AssistantAgent(
    name="my_agent",  # AgentChat requires names to be valid Python identifiers
    model_client=llm,
    system_message="You are a helpful assistant.",
)
```
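To drive the agent, AgentChat's asynchronous `run()` entry point can be used; a brief sketch:

```python
import asyncio

async def main() -> None:
    # run() drives the agent to completion and returns a TaskResult;
    # the last message holds the assistant's reply.
    result = await agent.run(task="What can you help me with?")
    print(result.messages[-1].content)

asyncio.run(main())
```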
Section titled “LLM with Tools”from flotorch.autogen.llm import FlotorchAutogenLLM
# Initialize FloTorch LLMllm = FlotorchAutogenLLM( model_id="your-model-id", api_key="your_api_key", base_url="https://gateway.flotorch.cloud")
# Define toolsdef get_weather(location: str) -> str: """Get weather for a location.""" return f"Weather in {location}: Sunny, 72°F"
tools = [get_weather]
# Create agent with toolsagent = AssistantAgent( name="weather-agent", model_client=llm, tools=tools, system_message="You are a weather assistant.")Streaming Responses
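AgentChat wraps plain callables like `get_weather` in a `FunctionTool` automatically, deriving the JSON schema from the docstring and type hints. The explicit equivalent, sketched with `autogen_core`'s tool wrapper:

```python
from autogen_core.tools import FunctionTool

# Explicit wrapping: the docstring and type hints become the JSON schema
# the model sees when deciding whether to call the tool.
weather_tool = FunctionTool(get_weather, description="Get weather for a location.")

agent = AssistantAgent(
    name="weather_agent",
    model_client=llm,
    tools=[weather_tool],
    system_message="You are a weather assistant.",
)
```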
### Streaming Responses

```python
import asyncio

from autogen_core.models import UserMessage
from flotorch.autogen.llm import FlotorchAutogenLLM

# Initialize FloTorch LLM
llm = FlotorchAutogenLLM(
    model_id="your-model-id",
    api_key="your_api_key",
    base_url="https://gateway.flotorch.cloud",
)

# Stream responses
async def main() -> None:
    messages = [UserMessage(content="Tell me a story", source="user")]
    async for chunk in llm.create_stream(messages):
        if isinstance(chunk, str):
            print(chunk, end="", flush=True)
        else:
            # Final CreateResult
            print(f"\nUsage: {chunk.usage}")

asyncio.run(main())
```

## Best Practices
- **Environment Variables** - Use environment variables for credentials to enhance security
- **Model Selection** - Choose models appropriate to your task requirements and performance needs
- **Error Handling** - Implement proper error handling for production environments
- **Tool Integration** - Define tools with clear descriptions and proper error handling
- **Structured Output** - Use structured output when you need predictable response formats
- **Streaming** - Use streaming for long-running conversations to improve user experience
## Next Steps

- **Agent Configuration** - Learn how to integrate LLMs with agents
- **Memory Integration** - Add memory capabilities to your LLM-powered agents
- **Session Management** - Implement persistent conversations