Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Remove pointer/pointee conflation from models of "pure" functions #18556

Merged
42 changes: 32 additions & 10 deletions cpp/ql/lib/semmle/code/cpp/models/implementations/Pure.qll
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import semmle.code.cpp.models.interfaces.ArrayFunction
import semmle.code.cpp.models.interfaces.Taint
import semmle.code.cpp.models.interfaces.DataFlow
import semmle.code.cpp.models.interfaces.Alias
import semmle.code.cpp.models.interfaces.SideEffect

Expand All @@ -8,7 +9,7 @@ import semmle.code.cpp.models.interfaces.SideEffect
* guaranteed to be side-effect free.
*/
private class PureStrFunction extends AliasFunction, ArrayFunction, TaintFunction,
SideEffectFunction
SideEffectFunction, DataFlowFunction
{
PureStrFunction() {
this.hasGlobalOrStdOrBslName([
Expand All @@ -26,22 +27,43 @@ private class PureStrFunction extends AliasFunction, ArrayFunction, TaintFunctio
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a helper predicate for the locale argument check?

Suggested change
/** Holds if `i` is a locale parameter that does not carry taint. */
private predicate isLocaleParameter(ParameterIndex i) {
this.getName().matches("%\\_l") and i + 1 = this.getNumberOfParameters()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Fixed in 7792839

override predicate hasTaintFlow(FunctionInput input, FunctionOutput output) {
exists(ParameterIndex i |
exists(ParameterIndex i | exists(this.getParameter(i)) |
// For these functions we add taint flow according to the following rules:
// 1. If the parameter is of a pointer type then there is taint from the
// indirection of the parameter. Otherwise, there is taint from the
// parameter.
// 2. If the return value is of a pointer type then there is taint to the
// indirection of the return. Otherwise, there is taint to the return.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to move this comment up before the exists given that it concerns both the stuff inside the exists and also the conjunct after it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Fixed in 67e3b69

(
input.isParameter(i) and
exists(this.getParameter(i))
or
input.isParameterDeref(i) and
this.getParameter(i).getUnspecifiedType() instanceof PointerType
if this.getParameter(i).getUnspecifiedType() instanceof PointerType
then input.isParameterDeref(i)
else input.isParameter(i)
) and
// Functions that end with _l also take a locale argument (always as the last argument),
// and we don't want taint from those arguments.
(not this.getName().matches("%\\_l") or exists(this.getParameter(i + 1)))
) and
(
output.isReturnValueDeref() and
this.getUnspecifiedType() instanceof PointerType
or
if this.getUnspecifiedType() instanceof PointerType
then output.isReturnValueDeref()
else output.isReturnValue()
)
or
// If there is taint flow from *input to *output then there is also taint
// flow from input to output.
this.hasTaintFlow(input.getIndirectionInput(), output.getIndirectionOutput()) and
// no need to add taint-flow if we already have dataflow
MathiasVP marked this conversation as resolved.
Show resolved Hide resolved
not this.hasDataFlow(input, output)
}

override predicate hasDataFlow(FunctionInput input, FunctionOutput output) {
exists(int i |
input.isParameter(i) and
// see the comment in `hasTaintFlow` for an explanation
(not this.getName().matches("%\\_l") or exists(this.getParameter(i + 1))) and
// These functions always return the same pointer as they are given
this.hasGlobalOrStdOrBslName([strrev(), strlwr(), strupr()]) and
this.getParameter(i).getUnspecifiedType() instanceof PointerType and
output.isReturnValue()
)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7741,6 +7741,32 @@ WARNING: module 'TaintTracking' has been deprecated and may be removed in future
| taint.cpp:809:8:809:9 | p2 | taint.cpp:809:7:809:9 | * ... | TAINT |
| taint.cpp:811:12:811:28 | call to SysAllocStringLen | taint.cpp:812:8:812:9 | p3 | |
| taint.cpp:812:8:812:9 | p3 | taint.cpp:812:7:812:9 | * ... | TAINT |
| taint.cpp:817:42:817:46 | p_out | taint.cpp:817:42:817:46 | p_out | |
| taint.cpp:817:42:817:46 | p_out | taint.cpp:819:4:819:8 | p_out | |
| taint.cpp:817:62:817:65 | p_in | taint.cpp:817:62:817:65 | p_in | |
| taint.cpp:817:62:817:65 | p_in | taint.cpp:818:20:818:23 | p_in | |
| taint.cpp:818:19:818:23 | * ... | taint.cpp:819:19:819:19 | q | |
| taint.cpp:818:20:818:23 | p_in | taint.cpp:818:19:818:23 | * ... | TAINT |
| taint.cpp:819:3:819:8 | * ... [post update] | taint.cpp:817:42:817:46 | p_out | |
| taint.cpp:819:3:819:8 | * ... [post update] | taint.cpp:819:4:819:8 | p_out [inner post update] | |
| taint.cpp:819:3:819:25 | ... = ... | taint.cpp:819:3:819:8 | * ... [post update] | |
| taint.cpp:819:4:819:8 | p_out | taint.cpp:819:3:819:8 | * ... | TAINT |
| taint.cpp:819:12:819:17 | call to strchr | taint.cpp:819:3:819:25 | ... = ... | |
| taint.cpp:819:19:819:19 | q | taint.cpp:819:12:819:17 | call to strchr | TAINT |
| taint.cpp:819:22:819:24 | 47 | taint.cpp:819:12:819:17 | call to strchr | TAINT |
| taint.cpp:822:33:822:35 | out | taint.cpp:822:33:822:35 | out | |
| taint.cpp:822:33:822:35 | out | taint.cpp:823:27:823:29 | out | |
| taint.cpp:822:50:822:51 | in | taint.cpp:822:50:822:51 | in | |
| taint.cpp:822:50:822:51 | in | taint.cpp:823:33:823:34 | in | |
| taint.cpp:823:26:823:29 | ref arg & ... | taint.cpp:822:33:822:35 | out | |
| taint.cpp:823:26:823:29 | ref arg & ... | taint.cpp:823:27:823:29 | out [inner post update] | |
| taint.cpp:823:27:823:29 | out | taint.cpp:823:26:823:29 | & ... | |
| taint.cpp:823:32:823:34 | ref arg & ... | taint.cpp:822:50:822:51 | in | |
| taint.cpp:823:32:823:34 | ref arg & ... | taint.cpp:823:33:823:34 | in [inner post update] | |
| taint.cpp:823:33:823:34 | in | taint.cpp:823:32:823:34 | & ... | |
| taint.cpp:827:20:827:34 | call to indirect_source | taint.cpp:829:23:829:24 | in | |
| taint.cpp:828:15:828:17 | out | taint.cpp:829:18:829:20 | out | |
| taint.cpp:828:15:828:17 | out | taint.cpp:830:8:830:10 | out | |
| vector.cpp:16:43:16:49 | source1 | vector.cpp:17:26:17:32 | source1 | |
| vector.cpp:16:43:16:49 | source1 | vector.cpp:31:38:31:44 | source1 | |
| vector.cpp:17:21:17:33 | call to vector | vector.cpp:19:14:19:14 | v | |
Expand Down
18 changes: 18 additions & 0 deletions cpp/ql/test/library-tests/dataflow/taint-tests/taint.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -810,4 +810,22 @@ void test_sysalloc() {

auto p3 = SysAllocStringLen((LPOLESTR)indirect_source(), 10);
sink(*p3); // $ ir MISSING: ast
}

char* strchr(const char*, int);

void write_to_const_ptr_ptr(const char **p_out, const char **p_in) {
const char* q = *p_in;
*p_out = strchr(q, '/');
}

void take_const_ptr(const char *out, const char *in) {
write_to_const_ptr_ptr(&out, &in);
MathiasVP marked this conversation as resolved.
Show resolved Hide resolved
}

void test_write_to_const_ptr_ptr() {
const char* in = indirect_source();
const char* out;
take_const_ptr(out, in);
sink(out); // $ SPURIOUS: ast
}
Original file line number Diff line number Diff line change
Expand Up @@ -626,6 +626,11 @@ signatureMatches
| taint.cpp:725:10:725:15 | strtol | (XCHAR *,const XCHAR *,int) | CSimpleStringT | CopyCharsOverlapped | 2 |
| taint.cpp:727:6:727:16 | test_strtol | (char *) | CStringT | CStringT | 0 |
| taint.cpp:785:6:785:15 | fopen_test | (char *) | CStringT | CStringT | 0 |
| taint.cpp:815:7:815:12 | strchr | (LPCOLESTR,int) | CComBSTR | Append | 1 |
| taint.cpp:815:7:815:12 | strchr | (char,int) | CStringT | CStringT | 1 |
| taint.cpp:815:7:815:12 | strchr | (const XCHAR *,int) | CStringT | CStringT | 1 |
| taint.cpp:815:7:815:12 | strchr | (const YCHAR *,int) | CStringT | CStringT | 1 |
| taint.cpp:815:7:815:12 | strchr | (wchar_t,int) | CStringT | CStringT | 1 |
| vector.cpp:333:6:333:35 | vector_iterator_assign_wrapper | (LPCOLESTR,int) | CComBSTR | Append | 1 |
| vector.cpp:333:6:333:35 | vector_iterator_assign_wrapper | (char,int) | CStringT | CStringT | 1 |
| vector.cpp:333:6:333:35 | vector_iterator_assign_wrapper | (const XCHAR *,int) | CStringT | CStringT | 1 |
Expand Down Expand Up @@ -2029,6 +2034,12 @@ getParameterTypeName
| taint.cpp:802:6:802:22 | SysAllocStringLen | 0 | const OLECHAR * |
| taint.cpp:802:6:802:22 | SysAllocStringLen | 0 | const wchar_t * |
| taint.cpp:802:6:802:22 | SysAllocStringLen | 1 | unsigned int |
| taint.cpp:815:7:815:12 | strchr | 0 | const char * |
| taint.cpp:815:7:815:12 | strchr | 1 | int |
| taint.cpp:817:6:817:27 | write_to_const_ptr_ptr | 0 | const char ** |
| taint.cpp:817:6:817:27 | write_to_const_ptr_ptr | 1 | const char ** |
| taint.cpp:822:6:822:19 | take_const_ptr | 0 | const char * |
| taint.cpp:822:6:822:19 | take_const_ptr | 1 | const char * |
| vector.cpp:13:6:13:9 | sink | 0 | int |
| vector.cpp:14:27:14:30 | sink | 0 | vector> & |
| vector.cpp:14:27:14:30 | sink | 0 | vector> & |
Expand Down
Loading