Before we dive into testing Kimi K1.5, let's first understand what it is.
What is Kimi K1.5?
Kimi K1.5 is a multimodal AI model, meaning it can process both text and images. Unlike DeepSeek R1, it is not open-source; however, the developers have published its technical report, and the model is freely available on their platform with no rate limits.
This model launched on the same day as R1 and claims impressive performance across various benchmarks. It is particularly noted for its short and long chain-of-thought reasoning, with its short-CoT results outperforming models like GPT-4o and Claude 3.5 Sonnet on AIME, MATH 500, and LiveCodeBench. Its long-thinking mode reportedly matches OpenAI's o1 across multiple modalities, including math (AIME), visual reasoning (MathVista), and competitive programming (Codeforces).
Training and Capabilities
- Long Context Scaling: Supports up to 128K token context, improving performance with longer inputs.
- Reinforcement Learning: Uses an advanced optimization method called online mirror descent.
- Multimodal Processing: Jointly trained on both text and vision data, allowing it to analyze images and text together.
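The technical report does not spell out its full objective here, but generic online mirror descent (shown below as a standard textbook sketch, not Kimi's exact loss) updates the parameters by balancing the current gradient against a divergence penalty that keeps each step close to the previous iterate:

```latex
w_{t+1} = \arg\min_{w} \; \eta \, \langle \nabla f_t(w_t), \, w \rangle + D_{\psi}(w, \, w_t)
```

Here $f_t$ is the loss at step $t$, $\eta$ is the step size, and $D_{\psi}$ is the Bregman divergence induced by a convex mirror map $\psi$; choosing $\psi(w) = \tfrac{1}{2}\|w\|^2$ recovers plain gradient descent.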
Kimi K1.5 is available in two versions on their platform:
- Base Kimi Model – A simpler version without chain-of-thought reasoning.
- Long Thinking Mode – A more advanced version that can process complex reasoning tasks.
For this test, I used the Long Thinking Mode to evaluate its capabilities.
Secret API Access for Free
A few days after DeepSeek's release, another Chinese model, Kimi K1.5 Long Thinking, gained attention. It is said to perform similarly to OpenAI's o1 at a fraction of the cost. Interestingly, there is a way to access Kimi K1.5's API for free for an extended period.
How to Get the Free API Key
The Kimi K1.5 team has released a Google Form where users can apply for free API access. Here’s what you need to do:
- Fill in your first name, last name, email address, job title, and a link to your social profile.
- Mention your research interests and usage scenario (e.g., running an AI content YouTube channel).
- State your expected API usage (5-10 requests initially).
- Approximate token usage (mention 10,000 for safety).
- Estimated duration of API access (suggest 1-2 months).
- Your country.
- Submit the form.
After submission, the team will verify your application. Within a week, you should receive an email from Moonshot AI, the parent company of Kimi K1.5, granting API access. The email contains a free API key with a 20 million token quota—which is huge for research and development!
How to Use the Free API Key
Once you receive your API key, here’s how you can start using it:
```python
from openai import OpenAI

# Point the standard OpenAI client at Moonshot AI's endpoint.
# (The legacy `openai.api_base` style was removed in openai >= 1.0.)
client = OpenAI(
    api_key="YOUR_API_KEY_HERE",
    base_url="https://api.moonshot.ai/v1",
)

# Ask a question and stream the answer as it is generated.
response = client.chat.completions.create(
    model="kimi-k1.5-preview",
    messages=[
        {
            "role": "user",
            "content": "Find the hypotenuse of a right triangle with legs 3 cm and 4 cm.",
        }
    ],
    stream=True,
)

for chunk in response:
    # Each streamed chunk carries an incremental piece of the reply.
    print(chunk.choices[0].delta.content or "", end="")
```
This setup is very similar to DeepSeek’s API, so if you have used that before, transitioning will be seamless. The model processes the query and provides answers efficiently.
Testing the Model
I ran Kimi K1.5 through 13 different tasks, covering logic, math, language processing, and coding. Below are the results:
| Task | Expected Answer | Result |
|---|---|---|
| Name a country ending in -lia and its capital | Australia, Canberra | ✅ Pass |
| Number that rhymes with a tall plant | Three | ✅ Pass |
| Haiku where second letters spell simple | Custom haiku | ✅ Pass |
| English adjective of Latin origin (11 letters, vowels in order) | Transparent | ✅ Pass |
| Correcting overstated count (48 people, 20% over) | 40 | ✅ Pass |
| Apple counting word problem | 2 | ✅ Pass |
| Counting Sally’s sisters logically | 1 | ✅ Pass |
| Hexagon diagonal calculation | 73.9 | ✅ Pass |
| HTML page with a confetti button | Functional | ✅ Pass |
| Playable synth keyboard (HTML, CSS, JS) | Functional | ❌ Fail |
| Generate SVG of a butterfly | Correct shape | ❌ Fail |
| 3D moving circle in HTML, CSS, JS | Functional | ✅ Pass |
| Game of Life in Python (Terminal) | Functional | ✅ Pass |
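The arithmetic behind the "overstated count" task is easy to sanity-check yourself (this is my own illustration, not the model's output): if 48 is 20% above the true number, then dividing by 1.2 recovers it.

```python
# A reported headcount of 48 is stated to be 20% above the true value:
# reported = true * 1.2, so the true count is reported / 1.2.
reported = 48
true_count = reported / 1.2
print(true_count)  # 40.0
```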
Analysis: Strengths & Weaknesses
Strengths
- Logical & Math Problems: It handled logic puzzles and math calculations with ease.
- Long Chain-of-Thought Reasoning: The model successfully tackled complex, multi-step problems.
- General Knowledge & Language Tasks: It produced accurate results for trivia and creative writing tasks.
- Web Development Tasks: It successfully created a functional HTML page with interactivity.
Weaknesses
- Coding Performance: It struggled with advanced programming tasks. Unlike DeepSeek R1, which excels at coding, Kimi K1.5 had difficulty with a playable synth keyboard and generating a correct butterfly SVG.
- Token Repetition: Occasionally, it repeated words and tokens, which affected fluency.
- Limited Uniqueness: While it’s a solid model, it doesn’t offer anything groundbreaking beyond what’s available with models like DeepSeek R1.
Final Verdict
Kimi K1.5 is a good multimodal model with solid reasoning capabilities, but it doesn’t bring anything significantly new to the table. While it outperforms certain models in reasoning tasks, its lack of open-source availability and struggles with coding tasks make it less appealing compared to models like DeepSeek R1. Additionally, since its API is not yet widely available, its practical use remains limited.
That said, with free access to 20 million tokens, it is a fantastic opportunity for those who need a high-performance reasoning model at no cost.