CBMC May Struggle with sscanf and strcmp Handling in Symbolic Execution? #8602
Comments
Perhaps this sounds naive, but is it possible that, in theory, CBMC is not guaranteed to always give a valid counterexample that triggers the failure in the original program (assuming that failure is reachable)? As a model checker, CBMC should only offer a SAT/SMT solution for an over-approximated model of the safety property of the original program, which might not be a real counterexample. Yet, since CBMC is also a symbolic executor, shouldn't its counterexample be correct? Sorry for my lack of understanding. |
Hi, if the format string actually contains a format specifier, you get the expected result:
#include <assert.h>
#include <stdio.h>
int main() {
char str[] = "hello world";
char out[20];
if(sscanf(str, "hello %s", out) == 1 ) {
assert(0);
}
return 0;
}
|
For the strcmp case:
void main_strcmp() {
char buffer[MAX_BUFFER_SIZE]; // Symbolic input buffer
// Check if input exactly matches "Hello" using strcmp
if (strcmp(buffer, "Hello") == 0) {
assert(0); // Force CBMC to find an input that exactly matches "Hello"
}
}
|
You should also make sure the |
Good point, Remi! I was using a previously used unwind number and forgot to update it. I'll let you know the results. Your comments on Meanwhile, could you confirm the following: Is the counterexample produced by CBMC guaranteed to trigger the assertion under analysis, or could it be a spurious one, similar to cases in CEGAR? |
The symbolic execution engine of CBMC does not over-approximate on its own, but some models of C standard library functions may over-approximate. The source code of the CPROVER model of the C standard library is available under |
Hi Remi and CBMC developers. I have double-checked my usage of
Command Line:
cbmc test_ssncf.c --function harness_sscanf --no-standard-checks --no-built-in-assertions --unwind 600 --unwinding-assertions --trace
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>
#define MAX_BUFFER_SIZE 10
void harness_sscanf() {
char buffer[MAX_BUFFER_SIZE];
char dummy[100];
__CPROVER_assume(buffer[MAX_BUFFER_SIZE - 1] == '\0');
// Check if input starts with "Hello" followed by at least something such as Hellooo
if(sscanf(buffer, "Hello%s", dummy) == 1) {
assert(0);
}
}
Unexpected Counterexample:
CBMC produces the following counterexample:
This does not seem to match the expected conditions for triggering the assertion failure.
Additional Output:
Would appreciate any insights! Maybe this is because CBMC does not model |
This is a case where the model of the standard library function is more abstract than a real implementation. sscanf is modeled like this in the file src/ansi-c/library/stdio.c:
int sscanf(const char *restrict s, const char *restrict format, ...)
{
__CPROVER_HIDE:;
va_list list;
va_start(list, format);
int result=vsscanf(s, format, list);
va_end(list);
return result;
}
And:
int vsscanf(const char *restrict s, const char *restrict format, va_list arg)
{
__CPROVER_HIDE:;
int result = __VERIFIER_nondet_int();
(void)*s;
(void)*format;
# if(defined(__aarch64__) || defined(_M_ARM64)) && !defined(__APPLE__)
while((__CPROVER_size_t)__CPROVER_POINTER_OFFSET(arg.__stack) <
__CPROVER_OBJECT_SIZE(arg.__stack))
{
void *a = va_arg(arg, void *);
__CPROVER_havoc_object(a);
}
# else
while((__CPROVER_size_t)__CPROVER_POINTER_OFFSET(arg) <
__CPROVER_OBJECT_SIZE(arg))
{
void *a = va_arg(arg, void *);
__CPROVER_havoc_object(a);
}
# endif
return result;
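}
Alternatively, you could undef sscanf and use your own concrete version of sscanf:
#include <ctype.h>  // isspace, isdigit
#include <limits.h> // INT_MAX
#include <stdarg.h> // va_list, va_start, va_arg, va_end
#undef sscanf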
int sscanf(const char *str, const char *format, ...)
{
va_list args;
va_start(args, format);
int matched = 0; // Count of successfully matched items
const char *s = str;
const char *f = format;
while(*f && *s)
{
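// Skip whitespace in both strings
while(isspace(*f))
f++;
while(isspace(*s))
s++;
// Stop if skipping whitespace reached the end of either string,
// otherwise the literal match below could step past a terminator
if(!*f || !*s)
break;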
if(*f == '%')
{
f++; // Move past '%'
// Handle width specification
int width = 0;
while(isdigit(*f))
{
width = width * 10 + (*f - '0');
f++;
}
if(width == 0)
width = INT_MAX;
switch(*f)
{
case 'd':
{ // Integer
int *target = va_arg(args, int *);
int value = 0;
int sign = 1;
// Handle sign
if(*s == '-')
{
sign = -1;
s++;
}
else if(*s == '+')
{
s++;
}
// Parse digits
int digits_read = 0;
while(isdigit(*s) && digits_read < width)
{
value = value * 10 + (*s - '0');
s++;
digits_read++;
}
if(digits_read > 0)
{
*target = value * sign;
matched++;
}
else
{
goto end;
}
break;
}
case 's':
{ // String
char *target = va_arg(args, char *);
int chars_read = 0;
while(!isspace(*s) && *s && chars_read < width)
{
*target++ = *s++;
chars_read++;
}
*target = '\0';
if(chars_read > 0)
{
matched++;
}
break;
}
case 'c':
{ // Single character
char *target = va_arg(args, char *);
if(*s)
{
*target = *s++;
matched++;
}
break;
}
case 'x':
{ // Hexadecimal
int *target = va_arg(args, int *);
int value = 0;
int digits_read = 0;
while(digits_read < width)
{
if(isdigit(*s))
{
value = value * 16 + (*s - '0');
}
else if(*s >= 'a' && *s <= 'f')
{
value = value * 16 + (*s - 'a' + 10);
}
else if(*s >= 'A' && *s <= 'F')
{
value = value * 16 + (*s - 'A' + 10);
}
else
{
break;
}
s++;
digits_read++;
}
if(digits_read > 0)
{
*target = value;
matched++;
}
break;
}
case '[':
{ // Character set
char *target = va_arg(args, char *);
f++; // Move past '['
// Build character set
char charset[256] = {0};
int invert = 0;
if(*f == '^')
{
invert = 1;
f++;
}
while(*f && *f != ']')
{
charset[(unsigned char)*f] = 1;
f++;
}
// Read matching characters
int chars_read = 0;
while(*s && chars_read < width)
{
if(charset[(unsigned char)*s] ^ invert)
{
*target++ = *s++;
chars_read++;
}
else
{
break;
}
}
*target = '\0';
if(chars_read > 0)
{
matched++;
}
break;
}
}
f++;
}
else
{
// Match literal character
if(*f == *s)
{
f++;
s++;
}
else
{
goto end;
}
}
}
end:
va_end(args);
return matched;
}
void main_sscanf()
{
char buffer[MAX_BUFFER_SIZE];
char dummy[MAX_BUFFER_SIZE];
// Check if input starts with "Hello" followed by at least something such as Hellooo
if(sscanf(buffer, "He%s", dummy) == 1)
{
assert(0);
}
}
It runs and gives results, but because of all the complexity of the concrete sscanf, you'll have to stick with very short strings and be really patient with the SAT solver.
|
Dear CBMC Team,
Thank you for your incredible work on CBMC, which has been invaluable for our verification tasks. I am writing to kindly inquire if there might be a limitation or bug in CBMC 6.4.1 on macOS that prevents it from correctly handling sscanf and strcmp with symbolic strings. I’ve encountered unexpected counterexamples in my tests, and I’d greatly appreciate your guidance or confirmation if this is a known issue.
Test Program
Below is a minimal test program I’ve created to investigate this behavior:
Commands and Counterexamples
I ran the following commands on macOS ARM64 using CBMC 6.4.1:
For harness_sscanf:
Counterexample Produced:
For harness_strcmp:
Counterexample Produced:
For harness_simple_check:
Counterexample Produced:
Expected Behavior
For the first two harnesses, I expected CBMC to produce a counterexample where buffer = "Hello\0..." (e.g., {'H', 'e', 'l', 'l', 'o', 0, 0, ..., 0}). This would indicate that CBMC successfully found an input that:
harness_sscanf, starts with "Hello" (so sscanf(buffer, "Hello") == 1).
harness_strcmp, exactly matches "Hello" (so strcmp(buffer, "Hello") == 0).
Environment
In sum, this issue looks like a limitation or bug in CBMC 6.4.1 on macOS ARM64 that affects its handling of sscanf and strcmp with symbolic strings. The correct behavior of harness_simple_check suggests CBMC handles direct character comparisons well, but sscanf and strcmp struggle with constraint propagation or symbolic string modeling.
Thank you very much for your time and support!