Commit

update prec-rec-f1
anonymous4ACL24 committed Apr 28, 2024
1 parent f0471b7 commit df122bf
Showing 3 changed files with 141 additions and 70 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -3,7 +3,7 @@
The source code of VulLibGen is included in `VulLibGen-code` (details in `VulLibGen-code/ReadMe.md`).
The dataset of VulLibGen is included in `VulLibGen-dataset` (details in `VulLibGen-dataset/ReadMe.md`).

The comparison results between VulLibGen and our baselines on Precision, Recall, F1 is included in `prec-rec-f1.csv`.
The comparison results between VulLibGen and our baselines on Precision, Recall, F1 is included in `prec-rec-f1.csv` and `prec-rec-f1.md`.

The status of our submitted <vulnerability, package> pairs is included in `submit-packages-appendix.csv`, and as of today, 38 of them are merged.

103 changes: 69 additions & 34 deletions prec-rec-f1.csv
@@ -1,34 +1,69 @@
Precision,Language,,,,,
,Java,JS,Python,Go,,Average
Chronos,0.516,0.447,0.55,0.71,,0.55575
VulLibMiner,0.669,0.742,0.825,0.647,,0.72075
ChatGPT,0.758,0.732,0.915,0.646,,0.76275
GPT4,0.783,0.768,0.868,0.712,,0.78275
LLaMa-7B,0.71,0.773,0.924,0.716,,0.78075
LLaMa-13B,0.72,0.765,0.904,0.775,,0.791
Vicuna-7B,0.697,0.768,0.929,0.782,,0.794
Vicuna-13B,0.71,0.773,0.935,0.804,,0.8055
,,,,,,
,,,,,,
Recall,Language,,,,,
,Java,JS,Python,Go,,Average
Chronos,0.4,0.412,0.286,0.605,,0.42575
VulLibMiner,0.52,0.709,0.499,0.544,,0.568
ChatGPT,0.573,0.698,0.58,0.544,,0.59875
GPT4,0.596,0.714,0.542,0.58,,0.608
LLaMa-7B,0.548,0.734,0.593,0.625,,0.625
LLaMa-13B,0.554,0.73,0.601,0.656,,0.63525
Vicuna-7B,0.543,0.731,0.601,0.66,,0.63375
Vicuna-13B,0.552,0.736,0.621,0.688,,0.64925
,,,,,,
,,,,,,
F1,Language,,,,,
,Java,JS,Python,Go,,Average
Chronos,0.451,0.429,0.376,0.653,,0.482
VulLibMiner,0.585,0.725,0.622,0.591,,0.635
ChatGPT,0.653,0.715,0.71,0.591,,0.671
GPT4,0.677,0.74,0.667,0.639,,0.684
LLaMa-7B,0.619,0.753,0.722,0.667,,0.694
LLaMa-13B,0.626,0.747,0.722,0.711,,0.705
Vicuna-7B,0.61,0.749,0.73,0.716,,0.705
Vicuna-13B,0.621,0.754,0.746,0.741,,0.719
Precision,Language,,,,Average
,Java,JS,Python,Go,
k=1,,,,,
Chronos,0.516,0.447,0.55,0.71,0.55575
VulLibMiner,0.669,0.742,0.825,0.647,0.72075
GPT4,0.783,0.768,0.868,0.712,0.78275
Vicuna-13B,0.71,0.776,0.935,0.804,0.80625
,,,,,
,,,,,
k=2,,,,,
Chronos,0.673,0.598,0.554,0.767,0.648
VulLibMiner,0.695,0.725,0.789,0.669,0.7195
GPT4,0.762,0.752,0.851,0.692,0.76425
Vicuna-13B,0.722,0.779,0.858,0.747,0.7765
,,,,,
,,,,,
k=3,,,,,
Chronos,0.741,0.648,0.569,0.781,0.68475
VulLibMiner,0.724,0.723,0.782,0.715,0.736
GPT4,0.758,0.749,0.754,0.678,0.73475
Vicuna-13B,0.743,0.776,0.841,0.785,0.78625
,,,,,
,,,,,
,,,,,
Recall,Language,,,,
,Java,JS,Python,Go,Average
k=1,,,,,
Chronos,0.4,0.412,0.286,0.605,0.42575
VulLibMiner,0.52,0.709,0.499,0.544,0.568
GPT4,0.596,0.714,0.542,0.58,0.608
Vicuna-13B,0.552,0.736,0.621,0.688,0.64925
,,,,,
,,,,,
k=2,,,,,
Chronos,0.623,0.591,0.346,0.778,0.5845
VulLibMiner,0.647,0.719,0.561,0.653,0.645
GPT4,0.705,0.744,0.585,0.675,0.67725
Vicuna-13B,0.669,0.771,0.622,0.733,0.69875
,,,,,
,,,,,
k=3,,,,,
Chronos,0.722,0.645,0.392,0.605,0.591
VulLibMiner,0.705,0.72,0.603,0.713,0.68525
GPT4,0.737,0.745,0.597,0.675,0.6885
Vicuna-13B,0.72,0.773,0.657,0.782,0.733
,,,,,
,,,,,
,,,,,
F1,Language,,,,
,Java,JS,Python,Go,
k=1,,,,,
Chronos,0.451,0.429,0.376,0.653,0.482
VulLibMiner,0.585,0.725,0.622,0.591,0.635
GPT4,0.677,0.74,0.667,0.639,0.684
Vicuna-13B,0.621,0.755,0.746,0.741,0.719
,,,,,
,,,,,
k=2,,,,,
Chronos,0.647,0.594,0.426,0.772,0.615
VulLibMiner,0.67,0.722,0.656,0.661,0.68
GPT4,0.732,0.748,0.693,0.683,0.718
Vicuna-13B,0.694,0.775,0.721,0.74,0.736
,,,,,
,,,,,
k=3,,,,,
Chronos,0.731,0.646,0.464,0.682,0.634
VulLibMiner,0.714,0.721,0.681,0.714,0.71
GPT4,0.747,0.747,0.666,0.676,0.711
Vicuna-13B,0.731,0.774,0.738,0.783,0.759
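
The new `prec-rec-f1.csv` layout groups rows into blocks: a metric header row, a language header row, a `k=N` row, then one row per approach. A minimal sketch of flattening that layout into records is given here, using a few Precision rows copied from the diff above. The function name `parse_results` and the record keys are hypothetical and not part of the repository. The final check assumes the last column is the plain mean of the four language scores, which holds for the Precision rows sampled here.

```python
import csv
import io

# A few rows copied from the new prec-rec-f1.csv layout shown above.
SAMPLE = """\
Precision,Language,,,,Average
,Java,JS,Python,Go,
k=1,,,,,
Chronos,0.516,0.447,0.55,0.71,0.55575
Vicuna-13B,0.71,0.776,0.935,0.804,0.80625
,,,,,
k=2,,,,,
Chronos,0.673,0.598,0.554,0.767,0.648
"""

METRICS = {"Precision", "Recall", "F1"}
LANGUAGES = ["Java", "JS", "Python", "Go"]


def parse_results(text):
    """Flatten the block-structured CSV into one dict per (metric, k, approach).

    Hypothetical helper for illustration; not part of the repository.
    """
    records = []
    metric, k = None, None
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0]:
            continue  # skip blank separator rows and the language header row
        label = row[0]
        if label in METRICS:
            metric = label  # start of a new metric block
        elif label.startswith("k="):
            k = int(label[2:])  # start of a new top-k group
        else:
            scores = [float(cell) for cell in row[1:5]]
            records.append({
                "metric": metric,
                "k": k,
                "approach": label,
                "scores": dict(zip(LANGUAGES, scores)),
                "average": float(row[5]),
            })
    return records


records = parse_results(SAMPLE)
for rec in records:
    mean = sum(rec["scores"].values()) / len(LANGUAGES)
    # Assumption: the last column is the unweighted mean over the four languages.
    assert abs(mean - rec["average"]) < 1e-6, rec
```

The same mean check also passes for the Recall rows, but not for the F1 rows, whose last column appears to be derived differently.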
106 changes: 71 additions & 35 deletions prec-rec-f1.md
@@ -1,38 +1,74 @@
| Precision | Language | | | | Average |
| ----------- | --------- | --------- | --------- | --------- | ----------- |
| | Java | JS | Python | Go | |
| k=1 | | | | | |
| Chronos | 0.516 | 0.447 | 0.55 | 0.71 | 0.55575 |
| VulLibMiner | 0.669 | 0.742 | 0.825 | 0.647 | 0.72075 |
| GPT4 | **0.783** | 0.768 | 0.868 | 0.712 | 0.78275 |
| Vicuna-13B | 0.71 | **0.776** | **0.935** | **0.804** | **0.80625** |
| | | | | | |
| | | | | | |
| k=2 | | | | | |
| Chronos | 0.673 | 0.598 | 0.554 | 0.767 | 0.648 |
| VulLibMiner | 0.695 | 0.725 | 0.789 | 0.669 | 0.7195 |
| GPT4 | **0.762** | 0.752 | 0.851 | 0.692 | 0.76425 |
| Vicuna-13B | 0.722 | **0.779** | **0.858** | **0.747** | **0.7765** |
| | | | | | |
| | | | | | |
| k=3 | | | | | |
| Chronos | 0.741 | 0.648 | 0.569 | 0.781 | 0.68475 |
| VulLibMiner | 0.724 | 0.723 | 0.782 | 0.715 | 0.736 |
| GPT4 | **0.758** | 0.749 | 0.754 | 0.678 | 0.73475 |
| Vicuna-13B | 0.743 | **0.776** | **0.841** | **0.785** | **0.78625** |


| Precision@1 | Language | | | | |
| --------------------------- | --------- | --------- | --------- | --------- | --------- |
| | Java | JS | Python | Go | Average |
| Chronos | 0.516 | 0.447 | 0.55 | 0.71 | 0.556 |
| **Chen 2023 (VulLibMiner)** | **0.669** | **0.742** | **0.825** | **0.647** | **0.721** |
| ChatGPT | 0.758 | 0.732 | 0.915 | 0.646 | 0.763 |
| GPT4 | 0.783 | 0.768 | 0.868 | 0.712 | 0.783 |
| LLaMa-7B | 0.71 | 0.773 | 0.924 | 0.716 | 0.781 |
| LLaMa-13B | 0.72 | 0.765 | 0.904 | 0.775 | 0.791 |
| Vicuna-7B | 0.697 | 0.768 | 0.929 | 0.782 | 0.794 |
| Vicuna-13B | **0.71** | **0.773** | **0.935** | **0.804** | **0.806** |

| **Recall@1** | **Language** | | | | |
| --------------------------- | ------------ | --------- | --------- | --------- | --------- |
| Model | Java | JS | Python | Go | Average |
| Chronos | 0.4 | 0.412 | 0.286 | 0.605 | 0.426 |
| **Chen 2023 (VulLibMiner)** | **0.520** | **0.709** | **0.499** | **0.544** | **0.568** |
| ChatGPT | 0.573 | 0.698 | 0.580 | 0.544 | 0.599 |
| GPT4 | 0.596 | 0.714 | 0.542 | 0.580 | 0.608 |
| LLaMa-7B | 0.548 | 0.734 | 0.593 | 0.625 | 0.625 |
| LLaMa-13B | 0.554 | 0.73 | 0.601 | 0.656 | 0.635 |
| Vicuna-7B | 0.543 | 0.731 | 0.601 | 0.66 | 0.634 |
| **Vicuna-13B** | **0.552** | **0.736** | **0.621** | **0.688** | **0.649** |

| **F1@1** | **Language** | | | | |
| --------------------------- | ------------ | --------- | --------- | --------- | --------- |
| Model | Java | JS | Python | Go | Average |
| Chronos | 0.451 | 0.429 | 0.376 | 0.653 | 0.482 |
| **Chen 2023 (VulLibMiner)** | **0.585** | **0.725** | **0.622** | **0.591** | **0.635** |
| ChatGPT | 0.653 | 0.715 | 0.710 | 0.591 | 0.671 |
| GPT4 | 0.677 | 0.740 | 0.667 | 0.639 | 0.684 |
| LLaMa-7B | 0.619 | 0.753 | 0.722 | 0.667 | 0.694 |
| LLaMa-13B | 0.626 | 0.747 | 0.722 | 0.711 | 0.705 |
| Vicuna-7B | 0.610 | 0.749 | 0.730 | 0.716 | 0.705 |
| **Vicuna-13B** | **0.621** | **0.754** | **0.746** | **0.741** | **0.719** |

| Recall | Language | | | | |
| ----------- | --------- | --------- | --------- | --------- | ----------- |
| | Java | JS | Python | Go | Average |
| k=1 | | | | | |
| Chronos | 0.4 | 0.412 | 0.286 | 0.605 | 0.42575 |
| VulLibMiner | 0.52 | 0.709 | 0.499 | 0.544 | 0.568 |
| GPT4 | **0.596** | 0.714 | 0.542 | 0.58 | 0.608 |
| Vicuna-13B | 0.552 | **0.736** | **0.621** | **0.688** | **0.64925** |
| | | | | | |
| | | | | | |
| k=2 | | | | | |
| Chronos | 0.623 | 0.591 | 0.346 | **0.778** | 0.5845 |
| VulLibMiner | 0.647 | 0.719 | 0.561 | 0.653 | 0.645 |
| GPT4 | **0.705** | 0.744 | 0.585 | 0.675 | 0.67725 |
| Vicuna-13B | 0.669 | **0.771** | **0.622** | 0.733 | **0.69875** |
| | | | | | |
| | | | | | |
| k=3 | | | | | |
| Chronos | 0.722 | 0.645 | 0.392 | 0.605 | 0.591 |
| VulLibMiner | 0.705 | 0.72 | 0.603 | 0.713 | 0.68525 |
| GPT4 | **0.737** | 0.745 | 0.597 | 0.675 | 0.6885 |
| Vicuna-13B | 0.72 | **0.773** | **0.657** | **0.782** | **0.733** |





| F1 | Language | | | | |
| ----------- | --------- | --------- | --------- | --------- | --------- |
| | Java | JS | Python | Go | |
| k=1 | | | | | |
| Chronos | 0.451 | 0.429 | 0.376 | 0.653 | 0.482 |
| VulLibMiner | 0.585 | 0.725 | 0.622 | 0.591 | 0.635 |
| GPT4 | **0.677** | 0.74 | 0.667 | 0.639 | 0.684 |
| Vicuna-13B | 0.621 | **0.755** | **0.746** | **0.741** | **0.719** |
| | | | | | |
| | | | | | |
| k=2 | | | | | |
| Chronos | 0.647 | 0.594 | 0.426 | **0.772** | 0.615 |
| VulLibMiner | 0.67 | 0.722 | 0.656 | 0.661 | 0.68 |
| GPT4 | **0.732** | 0.748 | 0.693 | 0.683 | 0.718 |
| Vicuna-13B | 0.694 | **0.775** | **0.721** | 0.74 | **0.736** |
| | | | | | |
| | | | | | |
| k=3 | | | | | |
| Chronos | 0.731 | 0.646 | 0.464 | 0.682 | 0.634 |
| VulLibMiner | 0.714 | 0.721 | 0.681 | 0.714 | 0.71 |
| GPT4 | **0.747** | 0.747 | 0.666 | 0.676 | 0.711 |
| Vicuna-13B | 0.731 | **0.774** | **0.738** | **0.783** | **0.759** |
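
The F1 values in these tables are consistent with the usual harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes the Chronos k=1 row from the Precision and Recall values copied from the tables above. The helper name `f1` is hypothetical. The treatment of the last column is an inference from the numbers, not something the commit states: the F1 average seems to be the F1 of the averaged precision and averaged recall, rather than the mean of the per-language F1 scores.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Chronos, k=1, copied from the Precision, Recall and F1 tables above.
precision = {"Java": 0.516, "JS": 0.447, "Python": 0.55, "Go": 0.71}
recall = {"Java": 0.4, "JS": 0.412, "Python": 0.286, "Go": 0.605}
reported_f1 = {"Java": 0.451, "JS": 0.429, "Python": 0.376, "Go": 0.653}
reported_f1_average = 0.482

# Per-language F1, rounded to three decimals like the tables.
per_language = {
    lang: round(f1(precision[lang], recall[lang]), 3) for lang in precision
}

# Assumption inferred from the data: the last F1 column is computed from
# the averaged precision and averaged recall.
avg_precision = sum(precision.values()) / len(precision)
avg_recall = sum(recall.values()) / len(recall)
average_f1 = round(f1(avg_precision, avg_recall), 3)

# For comparison: the plain mean of the reported per-language F1 scores.
mean_of_f1 = round(sum(reported_f1.values()) / len(reported_f1), 3)

print(per_language)
print(average_f1, mean_of_f1)
```

For this row, the mean of the per-language F1 scores differs from the reported average, while the F1 of the averaged precision and recall matches it.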
