mirror of
https://github.com/vxcontrol/pentagi.git
synced 2026-07-01 23:04:46 -04:00
feat(qwen): enhance agent configurations with thinking control parameters
Updated the Qwen agent configuration to include `extra_body` parameters for thinking control across various models. Added `enable_thinking` and `preserve_thinking` options for reasoning agents, while utility agents have `enable_thinking` set to false. Adjusted the Qwen client initialization to support these new configurations. Updated test report to reflect changes in success rates and latencies.
This commit is contained in:
@@ -1,8 +1,32 @@
|
||||
# Qwen (Alibaba Cloud DashScope) agent configuration.
|
||||
#
|
||||
# Strategy: qwen3.7-max (flagship) for critical reasoning, qwen3.6-plus (mid-tier
|
||||
# multimodal Plus) for orchestration, qwen3.5-flash for cheap utility, qwen3-coder-*
|
||||
# for code-specialized tasks (installer/coder).
|
||||
#
|
||||
# CRITICAL thinking control via extra_body (DashScope-specific, not OpenAI standard):
|
||||
# - enable_thinking=false: utility agents (simple, simple_json, reflector, searcher,
|
||||
# enricher) MUST set this explicitly. Qwen3.5/3.6/3.7 hybrid models have thinking
|
||||
# ENABLED by default. Without disabling, reasoning_content leaks into content and
|
||||
# corrupts short deterministic outputs (e.g. docker image selector returning the
|
||||
# full chain-of-thought instead of just "vxcontrol/kali-linux").
|
||||
# - enable_thinking=true: reasoning agents (primary, assistant, generator, refiner,
|
||||
# adviser, pentester) explicitly enable thinking. Redundant with hybrid defaults
|
||||
# but defensive against future provider changes.
|
||||
# - preserve_thinking=true: keeps reasoning_content from previous assistant turns
|
||||
# in subsequent requests. Supported ONLY by qwen3.7-max and qwen3.6-plus families.
|
||||
# Required for agent loops with tool calls to preserve reasoning continuity.
|
||||
# Works together with WithPreserveReasoningContent() in qwen.go.
|
||||
# - qwen3-coder-* (coder, installer) are NOT hybrid thinking models — no thinking
|
||||
# control needed.
|
||||
|
||||
simple:
|
||||
model: "qwen3.5-flash"
|
||||
temperature: 0.6
|
||||
n: 1
|
||||
max_tokens: 8192
|
||||
extra_body:
|
||||
enable_thinking: false
|
||||
price:
|
||||
input: 0.1
|
||||
output: 0.4
|
||||
@@ -14,6 +38,8 @@ simple_json:
|
||||
n: 1
|
||||
max_tokens: 4096
|
||||
json: true
|
||||
extra_body:
|
||||
enable_thinking: false
|
||||
price:
|
||||
input: 0.1
|
||||
output: 0.4
|
||||
@@ -24,6 +50,9 @@ primary_agent:
|
||||
temperature: 1.0
|
||||
n: 1
|
||||
max_tokens: 16384
|
||||
extra_body:
|
||||
enable_thinking: true
|
||||
preserve_thinking: true
|
||||
price:
|
||||
input: 0.5
|
||||
output: 3.0
|
||||
@@ -34,6 +63,9 @@ assistant:
|
||||
temperature: 1.0
|
||||
n: 1
|
||||
max_tokens: 16384
|
||||
extra_body:
|
||||
enable_thinking: true
|
||||
preserve_thinking: true
|
||||
price:
|
||||
input: 0.5
|
||||
output: 3.0
|
||||
@@ -44,6 +76,9 @@ generator:
|
||||
temperature: 1.0
|
||||
n: 1
|
||||
max_tokens: 32768
|
||||
extra_body:
|
||||
enable_thinking: true
|
||||
preserve_thinking: true
|
||||
price:
|
||||
input: 2.5
|
||||
output: 7.5
|
||||
@@ -54,6 +89,9 @@ refiner:
|
||||
temperature: 1.0
|
||||
n: 1
|
||||
max_tokens: 20480
|
||||
extra_body:
|
||||
enable_thinking: true
|
||||
preserve_thinking: true
|
||||
price:
|
||||
input: 2.5
|
||||
output: 7.5
|
||||
@@ -64,6 +102,9 @@ adviser:
|
||||
temperature: 1.0
|
||||
n: 1
|
||||
max_tokens: 8192
|
||||
extra_body:
|
||||
enable_thinking: true
|
||||
preserve_thinking: true
|
||||
price:
|
||||
input: 2.5
|
||||
output: 7.5
|
||||
@@ -74,6 +115,8 @@ reflector:
|
||||
temperature: 0.7
|
||||
n: 1
|
||||
max_tokens: 4096
|
||||
extra_body:
|
||||
enable_thinking: false
|
||||
price:
|
||||
input: 0.1
|
||||
output: 0.4
|
||||
@@ -84,6 +127,8 @@ searcher:
|
||||
temperature: 0.7
|
||||
n: 1
|
||||
max_tokens: 4096
|
||||
extra_body:
|
||||
enable_thinking: false
|
||||
price:
|
||||
input: 0.1
|
||||
output: 0.4
|
||||
@@ -94,6 +139,8 @@ enricher:
|
||||
temperature: 0.7
|
||||
n: 1
|
||||
max_tokens: 4096
|
||||
extra_body:
|
||||
enable_thinking: false
|
||||
price:
|
||||
input: 0.1
|
||||
output: 0.4
|
||||
@@ -124,6 +171,9 @@ pentester:
|
||||
temperature: 0.8
|
||||
n: 1
|
||||
max_tokens: 16384
|
||||
extra_body:
|
||||
enable_thinking: true
|
||||
preserve_thinking: true
|
||||
price:
|
||||
input: 0.5
|
||||
output: 3.0
|
||||
|
||||
@@ -83,11 +83,22 @@ func New(
|
||||
return nil, err
|
||||
}
|
||||
|
||||
// Alibaba Cloud DashScope OpenAI-compatible API. Thinking is controlled via
|
||||
// extra_body.enable_thinking (true/false) in the per-agent config — this is a
|
||||
// DashScope-specific parameter, not OpenAI standard. Qwen3.5/3.6/3.7 hybrid
|
||||
// models have thinking ENABLED by default, so utility agents must explicitly
|
||||
// set enable_thinking=false or reasoning_content will be returned inline as
|
||||
// part of content (corrupting outputs that expect short deterministic answers
|
||||
// like docker image selection or descriptors).
|
||||
// WithPreserveReasoningContent() is required for multi-turn with tool calls
|
||||
// when preserve_thinking=true is set in extra_body for qwen3.7-max/qwen3.6-plus
|
||||
// (other Qwen3 models do not support preserve_thinking).
|
||||
client, err := openai.New(
|
||||
openai.WithToken(cfg.QwenAPIKey),
|
||||
openai.WithModel(QwenAgentModel),
|
||||
openai.WithBaseURL(cfg.QwenServerURL),
|
||||
openai.WithHTTPClient(httpClient),
|
||||
openai.WithPreserveReasoningContent(),
|
||||
)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
|
||||
+313
-313
@@ -1,27 +1,27 @@
|
||||
# LLM Agent Testing Report
|
||||
|
||||
Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
Generated: Thu, 28 May 2026 16:06:51 UTC
|
||||
|
||||
## Overall Results
|
||||
|
||||
| Agent | Model | Reasoning | Success Rate | Average Latency |
|
||||
|-------|-------|-----------|--------------|-----------------|
|
||||
| simple | qwen3.5-flash | true | 23/23 (100.00%) | 3.014s |
|
||||
| simple_json | qwen3.5-flash | true | 5/5 (100.00%) | 7.088s |
|
||||
| primary_agent | qwen3.6-plus | true | 23/23 (100.00%) | 5.366s |
|
||||
| assistant | qwen3.6-plus | true | 23/23 (100.00%) | 5.758s |
|
||||
| generator | qwen3.7-max | true | 23/23 (100.00%) | 3.473s |
|
||||
| refiner | qwen3.7-max | true | 23/23 (100.00%) | 3.352s |
|
||||
| adviser | qwen3.7-max | true | 22/23 (95.65%) | 2.941s |
|
||||
| reflector | qwen3.5-flash | true | 23/23 (100.00%) | 3.377s |
|
||||
| searcher | qwen3.5-flash | true | 23/23 (100.00%) | 4.025s |
|
||||
| enricher | qwen3.5-flash | true | 23/23 (100.00%) | 2.857s |
|
||||
| coder | qwen3-coder-plus | true | 23/23 (100.00%) | 1.556s |
|
||||
| installer | qwen3-coder-flash | true | 20/23 (86.96%) | 1.060s |
|
||||
| pentester | qwen3.6-plus | true | 23/23 (100.00%) | 5.504s |
|
||||
| simple | qwen3.5-flash | false | 23/23 (100.00%) | 1.194s |
|
||||
| simple_json | qwen3.5-flash | false | 4/5 (80.00%) | 0.945s |
|
||||
| primary_agent | qwen3.6-plus | true | 23/23 (100.00%) | 6.079s |
|
||||
| assistant | qwen3.6-plus | true | 23/23 (100.00%) | 5.512s |
|
||||
| generator | qwen3.7-max | true | 23/23 (100.00%) | 5.172s |
|
||||
| refiner | qwen3.7-max | true | 23/23 (100.00%) | 5.455s |
|
||||
| adviser | qwen3.7-max | true | 23/23 (100.00%) | 4.750s |
|
||||
| reflector | qwen3.5-flash | true | 23/23 (100.00%) | 1.222s |
|
||||
| searcher | qwen3.5-flash | true | 23/23 (100.00%) | 1.083s |
|
||||
| enricher | qwen3.5-flash | true | 23/23 (100.00%) | 0.972s |
|
||||
| coder | qwen3-coder-plus | true | 23/23 (100.00%) | 1.702s |
|
||||
| installer | qwen3-coder-flash | true | 23/23 (100.00%) | 1.124s |
|
||||
| pentester | qwen3.6-plus | true | 23/23 (100.00%) | 9.207s |
|
||||
|
||||
**Total**: 277/281 (98.58%) successful tests
|
||||
**Overall average latency**: 3.587s
|
||||
**Total**: 280/281 (99.64%) successful tests
|
||||
**Overall average latency**: 3.575s
|
||||
|
||||
## Detailed Results
|
||||
|
||||
@@ -31,38 +31,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 2.746s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 2.043s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 3.729s | |
|
||||
| Math Calculation | ✅ Pass | 1.686s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.282s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 1.606s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 2.200s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.928s | |
|
||||
| Simple Math | ✅ Pass | 1.515s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 1.075s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 1.066s | |
|
||||
| Math Calculation | ✅ Pass | 1.122s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.252s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 0.603s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 0.569s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.800s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 1.163s | |
|
||||
| Search Query Function | ✅ Pass | 1.264s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.232s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.884s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 2.803s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.750s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.951s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.953s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.255s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 7.005s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 19.557s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 2.703s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 5.521s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 4.520s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.533s | |
|
||||
| JSON Response Function | ✅ Pass | 1.795s | |
|
||||
| Search Query Function | ✅ Pass | 1.148s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.347s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.834s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 0.655s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 1.019s | |
|
||||
| Function Response Memory Test | ✅ Pass | 1.023s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.142s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 0.996s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 2.497s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 1.849s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 0.552s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 1.971s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 1.461s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.152s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 3.014s
|
||||
**Average latency**: 1.194s
|
||||
|
||||
---
|
||||
|
||||
@@ -72,15 +72,15 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Person Information JSON | ✅ Pass | 4.643s | |
|
||||
| User Profile JSON | ✅ Pass | 5.759s | |
|
||||
| Streaming Person Information JSON Streaming | ✅ Pass | 6.514s | |
|
||||
| Project Information JSON | ✅ Pass | 7.046s | |
|
||||
| Vulnerability Report Memory Test | ✅ Pass | 11.475s | |
|
||||
| Project Information JSON | ✅ Pass | 0.751s | |
|
||||
| User Profile JSON | ✅ Pass | 0.767s | |
|
||||
| Person Information JSON | ✅ Pass | 1.137s | |
|
||||
| Vulnerability Report Memory Test | ❌ Fail | 1.318s | got map\[string\]interface \{\}\{"$schema":"http://json\-schema\.org/draft\-07/schema\#", "properties":map\[string\]interface \{\}\{"open\_ports":\... |
|
||||
| Streaming Person Information JSON Streaming | ✅ Pass | 0.747s | |
|
||||
|
||||
**Summary**: 5/5 (100.00%) successful tests
|
||||
**Summary**: 4/5 (80.00%) successful tests
|
||||
|
||||
**Average latency**: 7.088s
|
||||
**Average latency**: 0.945s
|
||||
|
||||
---
|
||||
|
||||
@@ -90,38 +90,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 5.300s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 3.870s | |
|
||||
| Math Calculation | ✅ Pass | 2.999s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 12.409s | |
|
||||
| Basic Echo Function | ✅ Pass | 3.121s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 4.737s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 4.527s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 4.412s | |
|
||||
| Simple Math | ✅ Pass | 5.447s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 5.747s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 6.848s | |
|
||||
| Math Calculation | ✅ Pass | 3.912s | |
|
||||
| Basic Echo Function | ✅ Pass | 2.791s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 4.355s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 5.205s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 2.995s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 2.506s | |
|
||||
| Search Query Function | ✅ Pass | 2.580s | |
|
||||
| Ask Advice Function | ✅ Pass | 3.013s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.157s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.461s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 4.159s | |
|
||||
| Function Response Memory Test | ✅ Pass | 6.988s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 5.212s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 3.572s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 11.000s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 11.996s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 5.396s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 11.164s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 4.543s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.287s | |
|
||||
| JSON Response Function | ✅ Pass | 4.066s | |
|
||||
| Search Query Function | ✅ Pass | 2.232s | |
|
||||
| Ask Advice Function | ✅ Pass | 2.860s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.094s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.240s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 2.635s | |
|
||||
| Function Response Memory Test | ✅ Pass | 9.775s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 7.318s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 4.524s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 13.685s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 12.372s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 11.677s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 12.500s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 9.014s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.509s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 5.366s
|
||||
**Average latency**: 6.079s
|
||||
|
||||
---
|
||||
|
||||
@@ -131,38 +131,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 5.339s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 5.104s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 5.229s | |
|
||||
| Math Calculation | ✅ Pass | 4.509s | |
|
||||
| Basic Echo Function | ✅ Pass | 3.068s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 4.267s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 4.656s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 2.860s | |
|
||||
| Simple Math | ✅ Pass | 4.896s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 4.750s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 5.467s | |
|
||||
| Math Calculation | ✅ Pass | 3.192s | |
|
||||
| Basic Echo Function | ✅ Pass | 2.627s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 3.657s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 4.401s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 5.018s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 3.200s | |
|
||||
| Search Query Function | ✅ Pass | 1.780s | |
|
||||
| Ask Advice Function | ✅ Pass | 2.841s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.020s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 5.116s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 3.908s | |
|
||||
| Function Response Memory Test | ✅ Pass | 4.534s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 8.970s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 7.041s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 11.914s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 17.492s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 5.510s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 9.591s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 2.796s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 10.669s | |
|
||||
| JSON Response Function | ✅ Pass | 2.385s | |
|
||||
| Search Query Function | ✅ Pass | 2.864s | |
|
||||
| Ask Advice Function | ✅ Pass | 3.599s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.866s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.909s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 3.381s | |
|
||||
| Function Response Memory Test | ✅ Pass | 4.146s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 5.778s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 5.842s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 9.568s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 6.860s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 18.097s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 10.065s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 9.724s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.665s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 5.758s
|
||||
**Average latency**: 5.512s
|
||||
|
||||
---
|
||||
|
||||
@@ -172,38 +172,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 3.289s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 2.103s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 4.641s | |
|
||||
| Math Calculation | ✅ Pass | 1.319s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.852s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 1.998s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 3.591s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.928s | |
|
||||
| Simple Math | ✅ Pass | 5.569s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 4.279s | |
|
||||
| Math Calculation | ✅ Pass | 2.508s | |
|
||||
| Basic Echo Function | ✅ Pass | 2.450s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 13.152s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 3.691s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 4.566s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 2.392s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 1.388s | |
|
||||
| Search Query Function | ✅ Pass | 2.131s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.544s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.977s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.750s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 3.913s | |
|
||||
| Function Response Memory Test | ✅ Pass | 4.594s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 3.141s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 3.417s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 3.566s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 15.497s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 4.163s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 2.677s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 4.448s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.928s | |
|
||||
| JSON Response Function | ✅ Pass | 2.320s | |
|
||||
| Search Query Function | ✅ Pass | 3.027s | |
|
||||
| Ask Advice Function | ✅ Pass | 4.213s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.036s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 7.051s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 3.966s | |
|
||||
| Function Response Memory Test | ✅ Pass | 7.543s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 3.481s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 4.203s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 6.768s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 4.673s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 18.743s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 5.278s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 3.018s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 4.024s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 3.473s
|
||||
**Average latency**: 5.172s
|
||||
|
||||
---
|
||||
|
||||
@@ -213,38 +213,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 4.088s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 3.518s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 4.139s | |
|
||||
| Math Calculation | ✅ Pass | 1.941s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.601s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 2.624s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 3.881s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 2.028s | |
|
||||
| Simple Math | ✅ Pass | 4.724s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 4.732s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 8.230s | |
|
||||
| Math Calculation | ✅ Pass | 2.684s | |
|
||||
| Basic Echo Function | ✅ Pass | 2.505s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 5.229s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 6.093s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.982s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 2.901s | |
|
||||
| Search Query Function | ✅ Pass | 2.251s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.898s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.260s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 2.202s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 2.319s | |
|
||||
| Function Response Memory Test | ✅ Pass | 4.750s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.590s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 3.847s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 4.130s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 12.315s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 4.225s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 2.234s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 2.434s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 2.917s | |
|
||||
| JSON Response Function | ✅ Pass | 2.442s | |
|
||||
| Search Query Function | ✅ Pass | 2.913s | |
|
||||
| Ask Advice Function | ✅ Pass | 3.594s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 3.382s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.152s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 3.718s | |
|
||||
| Function Response Memory Test | ✅ Pass | 5.162s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 4.590s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 4.049s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 5.504s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 5.041s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 20.343s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 10.747s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 10.091s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.551s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 3.352s
|
||||
**Average latency**: 5.455s
|
||||
|
||||
---
|
||||
|
||||
@@ -254,38 +254,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 3.094s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 3.196s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 3.210s | |
|
||||
| Math Calculation | ✅ Pass | 2.542s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.431s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 2.476s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 2.746s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.526s | |
|
||||
| Simple Math | ✅ Pass | 3.761s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 6.663s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 8.046s | |
|
||||
| Math Calculation | ✅ Pass | 2.987s | |
|
||||
| Basic Echo Function | ✅ Pass | 2.306s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 3.074s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 2.033s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 7.517s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 2.530s | |
|
||||
| Search Query Function | ✅ Pass | 1.782s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.862s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.915s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 3.090s | |
|
||||
| Function Argument Memory Test | ❌ Fail | 3.063s | expected text 'Go programming language' not found |
|
||||
| Function Response Memory Test | ✅ Pass | 4.037s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.505s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 3.618s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 2.493s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 6.921s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 3.418s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 4.994s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 1.890s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.285s | |
|
||||
| JSON Response Function | ✅ Pass | 1.833s | |
|
||||
| Search Query Function | ✅ Pass | 2.622s | |
|
||||
| Ask Advice Function | ✅ Pass | 3.735s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.859s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.274s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 5.488s | |
|
||||
| Function Response Memory Test | ✅ Pass | 7.424s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.711s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 4.780s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 6.058s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 13.545s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 3.451s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 5.348s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 6.648s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.077s | |
|
||||
|
||||
**Summary**: 22/23 (95.65%) successful tests
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 2.941s
|
||||
**Average latency**: 4.750s
|
||||
|
||||
---
|
||||
|
||||
@@ -295,38 +295,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 3.446s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 2.472s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 3.109s | |
|
||||
| Math Calculation | ✅ Pass | 2.244s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.191s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 2.377s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 2.817s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.952s | |
|
||||
| Simple Math | ✅ Pass | 1.664s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 0.999s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 0.706s | |
|
||||
| Math Calculation | ✅ Pass | 0.808s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.256s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 0.597s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 0.636s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.226s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 1.376s | |
|
||||
| Search Query Function | ✅ Pass | 1.034s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.177s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.399s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 2.150s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.884s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.900s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.927s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.201s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 8.294s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 2.431s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 24.680s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 4.850s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 4.195s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.546s | |
|
||||
| JSON Response Function | ✅ Pass | 0.959s | |
|
||||
| Search Query Function | ✅ Pass | 1.138s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.422s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.120s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 0.691s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.528s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.553s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.283s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.122s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 2.639s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 2.019s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 1.036s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 1.655s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 3.147s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 0.902s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 3.377s
|
||||
**Average latency**: 1.222s
|
||||
|
||||
---
|
||||
|
||||
@@ -336,38 +336,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 3.959s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 1.535s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 3.085s | |
|
||||
| Math Calculation | ✅ Pass | 1.883s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.491s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 1.364s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 1.694s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.338s | |
|
||||
| Simple Math | ✅ Pass | 1.541s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 0.541s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 0.660s | |
|
||||
| Math Calculation | ✅ Pass | 0.556s | |
|
||||
| Basic Echo Function | ✅ Pass | 0.701s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 0.537s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 1.012s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.886s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 1.277s | |
|
||||
| Search Query Function | ✅ Pass | 1.249s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.653s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.925s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 3.426s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 1.552s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.090s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.792s | |
|
||||
| Function Response Memory Test | ✅ Pass | 22.405s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 6.872s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 2.512s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 19.206s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 4.480s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 4.976s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.793s | |
|
||||
| JSON Response Function | ✅ Pass | 0.744s | |
|
||||
| Search Query Function | ✅ Pass | 1.350s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.986s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.726s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 1.089s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.558s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.620s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.284s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 0.607s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 2.167s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 1.673s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 1.089s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 1.859s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 1.590s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.127s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 4.025s
|
||||
**Average latency**: 1.083s
|
||||
|
||||
---
|
||||
|
||||
@@ -377,38 +377,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 1.626s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 2.623s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 2.794s | |
|
||||
| Math Calculation | ✅ Pass | 1.657s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.406s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 2.086s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 2.390s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.279s | |
|
||||
| Simple Math | ✅ Pass | 0.582s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 1.107s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 0.693s | |
|
||||
| Math Calculation | ✅ Pass | 0.516s | |
|
||||
| Basic Echo Function | ✅ Pass | 0.700s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 0.523s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 0.544s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.772s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 1.314s | |
|
||||
| Search Query Function | ✅ Pass | 1.160s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.645s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.948s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 3.003s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 2.254s | |
|
||||
| Function Response Memory Test | ✅ Pass | 4.521s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.225s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.494s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 5.008s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 11.717s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 4.587s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 4.199s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 4.499s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.261s | |
|
||||
| JSON Response Function | ✅ Pass | 0.830s | |
|
||||
| Search Query Function | ✅ Pass | 1.114s | |
|
||||
| Ask Advice Function | ✅ Pass | 0.989s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.729s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 0.571s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.820s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.683s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.762s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.029s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 2.050s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 1.990s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 0.651s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 1.320s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 1.317s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.060s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 2.857s
|
||||
**Average latency**: 0.972s
|
||||
|
||||
---
|
||||
|
||||
@@ -418,38 +418,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 0.873s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 0.824s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 0.953s | |
|
||||
| Math Calculation | ✅ Pass | 0.842s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.514s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 1.386s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 1.346s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 1.009s | |
|
||||
| Simple Math | ✅ Pass | 1.889s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 1.323s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 0.900s | |
|
||||
| Math Calculation | ✅ Pass | 1.026s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.016s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 1.920s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 1.298s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.987s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 1.687s | |
|
||||
| Search Query Function | ✅ Pass | 0.994s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.286s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.041s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 0.905s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.845s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.806s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 2.240s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 0.862s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 4.261s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 3.398s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 0.869s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 3.691s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 2.796s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.349s | |
|
||||
| JSON Response Function | ✅ Pass | 1.585s | |
|
||||
| Search Query Function | ✅ Pass | 1.567s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.272s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 0.972s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 1.434s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.764s | |
|
||||
| Function Response Memory Test | ✅ Pass | 1.068s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.771s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 0.781s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 3.191s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 3.556s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 0.965s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 5.216s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 3.193s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 1.441s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 1.556s
|
||||
**Average latency**: 1.702s
|
||||
|
||||
---
|
||||
|
||||
@@ -459,38 +459,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 1.007s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 0.566s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 1.100s | |
|
||||
| Math Calculation | ✅ Pass | 1.011s | |
|
||||
| Basic Echo Function | ❌ Fail | 0.727s | no tool calls found, expected at least 1 |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 0.542s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 0.560s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.774s | |
|
||||
| Simple Math | ✅ Pass | 0.558s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 0.527s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 0.651s | |
|
||||
| Math Calculation | ✅ Pass | 0.965s | |
|
||||
| Basic Echo Function | ✅ Pass | 1.248s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 0.919s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 0.591s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 0.761s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 0.830s | |
|
||||
| Search Query Function | ❌ Fail | 1.222s | no tool calls found, expected at least 1 |
|
||||
| Ask Advice Function | ✅ Pass | 0.877s | |
|
||||
| Streaming Search Query Function Streaming | ❌ Fail | 1.146s | no tool calls found, expected at least 1 |
|
||||
| Basic Context Memory Test | ✅ Pass | 0.973s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.613s | |
|
||||
| Function Response Memory Test | ✅ Pass | 0.583s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.826s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 0.625s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 1.940s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 3.151s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 0.588s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 1.175s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 1.550s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 0.973s | |
|
||||
| JSON Response Function | ✅ Pass | 0.869s | |
|
||||
| Search Query Function | ✅ Pass | 0.856s | |
|
||||
| Ask Advice Function | ✅ Pass | 1.283s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 1.259s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 0.981s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 0.649s | |
|
||||
| Function Response Memory Test | ✅ Pass | 1.007s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 1.940s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 1.070s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 2.822s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 2.937s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 0.620s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 1.331s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 1.018s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 0.972s | |
|
||||
|
||||
**Summary**: 20/23 (86.96%) successful tests
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 1.060s
|
||||
**Average latency**: 1.124s
|
||||
|
||||
---
|
||||
|
||||
@@ -500,38 +500,38 @@ Generated: Wed, 27 May 2026 23:05:00 UTC
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| Simple Math | ✅ Pass | 4.527s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 4.599s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 5.743s | |
|
||||
| Math Calculation | ✅ Pass | 3.509s | |
|
||||
| Basic Echo Function | ✅ Pass | 4.851s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 3.707s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 4.781s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 5.300s | |
|
||||
| Simple Math | ✅ Pass | 3.761s | |
|
||||
| Text Transform Uppercase | ✅ Pass | 4.980s | |
|
||||
| Count from 1 to 5 | ✅ Pass | 6.066s | |
|
||||
| Math Calculation | ✅ Pass | 5.125s | |
|
||||
| Basic Echo Function | ✅ Pass | 4.145s | |
|
||||
| Streaming Simple Math Streaming | ✅ Pass | 4.829s | |
|
||||
| Streaming Count from 1 to 3 Streaming | ✅ Pass | 5.837s | |
|
||||
| Streaming Basic Echo Function Streaming | ✅ Pass | 3.264s | |
|
||||
|
||||
#### Advanced Tests
|
||||
|
||||
| Test | Result | Latency | Error |
|
||||
|------|--------|---------|-------|
|
||||
| JSON Response Function | ✅ Pass | 2.456s | |
|
||||
| Search Query Function | ✅ Pass | 3.989s | |
|
||||
| Ask Advice Function | ✅ Pass | 3.255s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.430s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.780s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 5.338s | |
|
||||
| Function Response Memory Test | ✅ Pass | 2.911s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 5.094s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 6.163s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 12.475s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 11.406s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 6.216s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 11.490s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 8.229s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 3.336s | |
|
||||
| JSON Response Function | ✅ Pass | 2.894s | |
|
||||
| Search Query Function | ✅ Pass | 3.922s | |
|
||||
| Ask Advice Function | ✅ Pass | 3.311s | |
|
||||
| Streaming Search Query Function Streaming | ✅ Pass | 2.728s | |
|
||||
| Basic Context Memory Test | ✅ Pass | 4.075s | |
|
||||
| Function Argument Memory Test | ✅ Pass | 5.545s | |
|
||||
| Function Response Memory Test | ✅ Pass | 10.448s | |
|
||||
| Penetration Testing Memory with Tool Call | ✅ Pass | 5.993s | |
|
||||
| Cybersecurity Workflow Memory Test | ✅ Pass | 3.695s | |
|
||||
| Penetration Testing Methodology | ✅ Pass | 9.881s | |
|
||||
| Vulnerability Assessment Tools | ✅ Pass | 10.935s | |
|
||||
| Penetration Testing Framework | ✅ Pass | 12.705s | |
|
||||
| Web Application Security Scanner | ✅ Pass | 8.109s | |
|
||||
| Penetration Testing Tool Selection | ✅ Pass | 2.886s | |
|
||||
| SQL Injection Attack Type | ✅ Pass | 86.622s | |
|
||||
|
||||
**Summary**: 23/23 (100.00%) successful tests
|
||||
|
||||
**Average latency**: 5.504s
|
||||
**Average latency**: 9.207s
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user