OmniLLM

plexusone/omnillm

GLM (Zhipu AI)

Overview

Models: GLM-5, GLM-4.7, GLM-4.6, GLM-4.5 series
Features: Chat completions, streaming, OpenAI-compatible API, thinking modes

Configuration

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameGLM, APIKey: "your-glm-api-key"},
    },
})

Available Models

| Model | Context Window | Description |
|---|---|---|
| `glm-5` | 200K | Flagship MoE model (744B/40B active), forced thinking |
| `glm-5-code` | 200K | Code-specialized GLM-5 variant |
| `glm-4.7` | 200K | Premium model with Interleaved Thinking |
| `glm-4.7-flashx` | 200K | High-speed paid version with priority GPU |
| `glm-4.7-flash` | 200K | Free SOTA model with hybrid thinking |
| `glm-4.6` | 200K | Balanced model with auto-thinking |
| `glm-4.5` | 128K | Unified reasoning/coding/agent model |
| `glm-4.5-flash` | 128K | Free model with function calling |

OpenAI Compatibility

GLM exposes an OpenAI-compatible API endpoint, so standard request parameters such as `Temperature` and `MaxTokens` can be set as usual:

response, err := client.CreateChatCompletion(ctx, &omnillm.ChatCompletionRequest{
    Model: "glm-4.7-flash",
    Messages: []omnillm.Message{
        {Role: omnillm.RoleUser, Content: "Hello!"},
    },
    Temperature: &[]float64{0.7}[0],
    MaxTokens:   &[]int{1000}[0],
})
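The `&[]float64{0.7}[0]` expression in the request above is a way to take the address of a literal, since `Temperature` and `MaxTokens` are pointer fields. A small generic helper can make this easier to read. The `ptr` function below is a hypothetical convenience (not part of OmniLLM) and needs Go 1.18 or later.

```go
package main

import "fmt"

// ptr is a hypothetical helper (not part of OmniLLM). It returns a pointer
// to a copy of v, which is handy for optional pointer-typed request fields.
func ptr[T any](v T) *T {
	return &v
}

func main() {
	temperature := ptr(0.7) // inferred as *float64
	maxTokens := ptr(1000)  // inferred as *int
	fmt.Println(*temperature, *maxTokens)
}
```

With such a helper, the request fields could be written as `Temperature: ptr(0.7)` and `MaxTokens: ptr(1000)`.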

Streaming

// messages is a []omnillm.Message, built as in the OpenAI Compatibility example.
stream, err := client.CreateChatCompletionStream(ctx, &omnillm.ChatCompletionRequest{
    Model: "glm-4.7-flash",
    Messages: messages,
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()

for {
    chunk, err := stream.Recv()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    // Guard against chunks that carry no choices before indexing.
    if len(chunk.Choices) > 0 {
        fmt.Print(chunk.Choices[0].Delta.Content)
    }
}
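The loop above prints each delta as it arrives. When the complete reply is needed as a single string, the same `Recv` / `io.EOF` pattern can accumulate the fragments instead. The sketch below uses a hypothetical `deltaReceiver` interface and a `fakeStream` type as stand-ins so that it runs without a network call. Neither is an OmniLLM type; with the real stream, the appended value would be `chunk.Choices[0].Delta.Content`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// deltaReceiver is a hypothetical stand-in for a chat completion stream.
// It yields one text fragment per call and io.EOF when the stream ends.
type deltaReceiver interface {
	Recv() (string, error)
}

// collect drains the stream and returns the concatenated text.
// On a non-EOF error it returns the text received so far plus the error.
func collect(s deltaReceiver) (string, error) {
	var sb strings.Builder
	for {
		delta, err := s.Recv()
		if errors.Is(err, io.EOF) {
			return sb.String(), nil
		}
		if err != nil {
			return sb.String(), err
		}
		sb.WriteString(delta)
	}
}

// fakeStream replays fixed fragments in order, then reports io.EOF.
type fakeStream struct {
	parts []string
}

func (f *fakeStream) Recv() (string, error) {
	if len(f.parts) == 0 {
		return "", io.EOF
	}
	part := f.parts[0]
	f.parts = f.parts[1:]
	return part, nil
}

func main() {
	text, err := collect(&fakeStream{parts: []string{"Hel", "lo", "!"}})
	if err != nil {
		fmt.Println("stream error:", err)
		return
	}
	fmt.Println(text)
}
```

Returning the partial text together with the error lets the caller decide whether a truncated reply is still useful, which `log.Fatal` does not allow.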

API Documentation