README.md (9 additions & 8 deletions)

@@ -38,7 +38,7 @@ vCache is the first semantic prompt cache that guarantees user-defined error rat
> vCache uses OpenAI by default for both LLM inference and embedding generation, but you can configure any other inference setup.

-## 🚀 Quick Install
+## Quick Install

Install vCache in editable mode:
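The quoted note above says the OpenAI defaults can be swapped for any other inference setup. As a generic, hypothetical sketch of the two pluggable pieces such a setup needs, an inference callable and an embedding callable, independent of vCache's actual configuration API (which this diff does not show):

```python
# Generic sketch of the two pluggable components a semantic cache needs.
# The Protocol names and the stub implementations are illustrative placeholders,
# not part of vCache's API; they only show the shape of a custom backend.
from typing import Protocol

class InferenceEngine(Protocol):
    def generate(self, prompt: str) -> str: ...

class EmbeddingEngine(Protocol):
    def embed(self, text: str) -> list[float]: ...

class EchoLLM:
    """Stand-in for any non-OpenAI LLM backend."""
    def generate(self, prompt: str) -> str:
        return f"(stub response to: {prompt})"

class HashEmbedder:
    """Stand-in for any non-OpenAI embedding backend."""
    def embed(self, text: str) -> list[float]:
        return [float((hash(text) >> i) & 0xFF) for i in range(0, 64, 8)]
```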
@@ -66,10 +66,10 @@ print(response)
```

-## 🎬 How vCache Works
+## How vCache Works

vCache intelligently detects when a new prompt is semantically equivalent to a cached one, and adapts its decision boundaries based on your accuracy requirements.
-This lets it return cached model responses for semantically similar prompts—not just exact matches—reducing both inference latency and cost without sacrificing correctness.
+This lets it return cached model responses for semantically similar prompts (not just exact matches) reducing both inference latency and cost without sacrificing correctness.
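To make the idea in the changed paragraph concrete, here is a deliberately simplified sketch of a semantic cache lookup. It uses one fixed cosine-similarity threshold, whereas vCache's actual mechanism adapts its decision boundary to meet the user's accuracy requirement; the `embed` stub and the 0.9 cutoff are illustrative assumptions.

```python
# Simplified illustration of semantic cache lookup with a fixed similarity
# threshold. vCache itself adapts the decision boundary per cached prompt to
# meet an accuracy target; embed() and the 0.9 cutoff here are placeholders.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding-model call; returns a unit-norm vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)
    return v / np.linalg.norm(v)

cache: list[tuple[np.ndarray, str]] = []  # (prompt embedding, cached LLM response)

def get_or_none(prompt: str, threshold: float = 0.9) -> str | None:
    """Return a cached response if some stored prompt is similar enough."""
    q = embed(prompt)
    for emb, response in cache:
        if float(q @ emb) >= threshold:  # cosine similarity (vectors are unit-norm)
            return response              # semantic cache hit
    return None                          # miss: caller queries the LLM and stores the result
```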

benchmarks/ReadMe.md (6 additions & 5 deletions)

@@ -20,7 +20,7 @@ This directory provides the official benchmarking tools for evaluating the perfo
-## ⚙️ Installation
+## Installation

To enable benchmarking capabilities, install vCache with the `benchmarks` extras from the project root:
@@ -29,7 +29,7 @@ pip install -e .[benchmarks]
```

-## 🚀 Running Benchmarks
+## Running Benchmarks

Run the main benchmarking script from the project root:
@@ -40,7 +40,7 @@ python benchmarks/benchmark.py
The script will automatically download the required datasets from Hugging Face based on the configurations in `RUN_COMBINATIONS`.

-## ⚙️ Custom Configuration
+## Custom Configuration

The primary configuration is done by modifying the global variables in the `benchmarks/benchmark.py` script. This script is designed to benchmark the performance of vCache against several baselines by evaluating cache hit rates, accuracy, latency, and other metrics.
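As a purely hypothetical illustration of that kind of global configuration (the real structure of `RUN_COMBINATIONS` lives in `benchmarks/benchmark.py` and is not shown in this diff; the tuple layout and model names below are assumptions), a run combination might pair a dataset with an embedding model and an LLM:

```python
# Hypothetical sketch: the tuple layout and the embedding/LLM model names are
# assumptions, not the actual contents of benchmarks/benchmark.py. The dataset
# names match the Hugging Face datasets listed in the Datasets section below.
RUN_COMBINATIONS = [
    # (Hugging Face dataset, embedding model, LLM)
    ("vCache/SemBenchmarkLmArena", "text-embedding-3-small", "gpt-4o-mini"),
    ("vCache/SemBenchmarkSearchQueries", "text-embedding-3-small", "gpt-4o-mini"),
]
```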
@@ -64,7 +64,7 @@ Refer to the docstring in `benchmarks/benchmark.py` for more details on other co
-## 📁 Datasets
+## Datasets

### vCache Datasets
@@ -73,6 +73,7 @@ The official benchmark datasets are hosted on Hugging Face and will be downloade
+- **`vCache/SemBenchmarkCombo`** (combines SemBenchmarkLmArena with SemBenchmarkSearchQueries with no-cache-hit scenarios): [Dataset ↗](https://huggingface.co/datasets/vCache/SemBenchmarkCombo)
### Custom Datasets
@@ -120,7 +121,7 @@ You can benchmark vCache on your own datasets. The script supports `.csv` and `.
```

-## 📦 Output
+## Output

Benchmark results are saved to the `benchmarks/results/` directory, organized by dataset, embedding model, and LLM. For each run, the output includes:
- **JSON files** containing raw data on cache hits, misses, latency, accuracy metrics, and internal vCache statistics.
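A minimal way to peek at those result files might look like the sketch below; it only assumes that the JSON files mentioned above exist under `benchmarks/results/`, since the exact output schema is not part of this diff.

```python
# Minimal sketch for inspecting benchmark result files. Only assumes JSON files
# exist under benchmarks/results/; the keys inside them are not specified in
# this diff, so we just list whatever each file contains at the top level.
import json
from pathlib import Path

for path in sorted(Path("benchmarks/results").rglob("*.json")):
    with path.open() as f:
        data = json.load(f)
    summary = list(data)[:5] if isinstance(data, dict) else f"{len(data)} records"
    print(path.name, "->", summary)  # first few top-level keys, or record count
```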