-
Notifications
You must be signed in to change notification settings - Fork 273
Document symex instructions #7497
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document symex instructions #7497
Conversation
The `goto_modelt` is the main data structure that CBMC (and the other tools) use to | ||
represent GOTO-IR (the `GOTO` Intermediate Representation). | ||
|
||
A `goto_modelt` is effectively a product type, consisting of: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as on the previous draft; I would suggest using "pair" or "tuple" as a more C++-ish way of saying this.
|
||
A `goto_modelt` is effectively a product type, consisting of: | ||
|
||
* A list of GOTO functions (pseudocode: `type goto_functionst = list<goto_functiont>`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really a list...
A `goto_functiont` is also a product type. It's designed to represent a function | ||
at the IR level, and effectively it's the following ADT (in pseudocode): | ||
|
||
```js |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use C++-style pseudocode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And / or a reference to the code.
Another important data structure that needs to be discussed at this point is | ||
`source_locationt`. | ||
|
||
This is an `irept`. `irep`s are the central data structure that model most entities inside |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
irept
is pretty critical and deserves space of its own, not just under source_locationt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
||
## A (very) short introduction to Symex | ||
|
||
Symex is, at its core, a GOTO-program interpreter. While Symex is interpreting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'an interpreter BUT using symbolic values, not literal ones. For example... Note that this produces a formula which describes all possible outputs rather than a single output value.'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Changed
} | ||
``` | ||
|
||
If we ask CBMC to print the GOTO-program instructions with `--show-goto-functions`, then part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be in the goto-functions
documentation rather than just in symex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume (hope) it is. Maybe here we should have a cross reference to that part of the goto-functions documentation.
^SIGNAL=0$ | ||
-- | ||
-- | ||
The test is a minimized version of code generated from s2n-quic using Kani. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More tests are always good but why is this in a documentation PR? I guess I have missed something...
src/goto-symex/symex_assign.cpp
Outdated
@@ -215,14 +215,6 @@ void symex_assignt::assign_non_struct_symbol( | |||
assignment.rhs, | |||
target, | |||
symex_config.allow_pointer_unsoundness); | |||
// Erase the composite symbol from our working state. Note that we need to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There could well be a good justification for this removal of code but please could they not be in a documentation PR?
Hi @martin-cs , this was branched of at an earlier point from the same branch as the other documentation PR (#7470), and it hasn't tracked the changes that addressed the comments on the other PR, so it appears like it has the same problems, but it's just being stale. This means that the earlier commits here are just adding noise, so I will focus on addressing everything on the other PR, get it merged, and then rebase this one on top of |
@@ -64,9 +71,10 @@ To be documented. | |||
|
|||
To be documented. | |||
|
|||
## Goto functions → BMC → Counterexample (trace) ## | |||
## Goto functions → BMC → Counterexample (trace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? Doesn't seem to match the other lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this as a drive-by clean up. As far as I know headings in markdown are only prepended by a ##
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a second visit, it appears that doxygen does support any number of #
after the title, but that's invalid markdown.
If this change offends in any capacity however, it's easy to be reverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this file has that issue multiple times. Can you please clean up all occurrences? (src/goto-instrument/README.md also suffers from this problem. Maybe do one commit that does the cleanup for all of these?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean fair enough. I was thinking maybe there should be a separate PR that does this clean up. Or, at least, maybe consistently clean them all up in this file as there are many line this (as Michael as noticed). You want to leave the file in a consistent state so that newcomers will follow the example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will make trailing #
consistent by removing them in the files touched by this PR.
Another important data structure that needs to be discussed at this point is | ||
`source_locationt`. | ||
|
||
This is an `irept`. `irep`s are the central data structure that model most entities inside |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
purposes: | ||
|
||
```c | ||
enum goto_program_instruction_typet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throw in some "..."s here to emphasize that this isn't the full set.
enum goto_program_instruction_typet | ||
{ | ||
GOTO = 1, // branch, possibly guarded | ||
ASSUME = 2, // non-failing guarded self loop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is perhaps not the worlds most obvious description of an assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for this is that it's a copy from goto_program_instruction_typet
where the comment of ASSUME
states what was written.
I changed it to ASSUME
to avoid confusion.
```c | ||
int main(int argc, char **argv) | ||
{ | ||
int arry[] = {0, 1, 2, 3}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It this point you may as well just say "array", or go the whole way and call it "a". "arry" offers neither brevity nor clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to a
and improved assertion comment
} | ||
``` | ||
|
||
If we ask CBMC to print the GOTO-program instructions with `--show-goto-functions`, then part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume (hope) it is. Maybe here we should have a cross reference to that part of the goto-functions documentation.
dispatching functions are usually in their own file, designated by the instruction's | ||
name. As an example, you can find the code for the function goto_symext::symex_goto | ||
in [symex_goto.cpp](../../src/goto-symex/symex_goto.cpp) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A description of what symex produces, why, and what it means could use a longer explanation.
@@ -74,8 +74,8 @@ void multi_path_symex_only_checkert::generate_equation() | |||
{ | |||
const auto symex_start = std::chrono::steady_clock::now(); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR seems like it should be a couple of PRs about different things. This doesn't feel like it belongs in a documentation PR.
1ce9b87
to
1eb26aa
Compare
Codecov ReportBase: 78.49% // Head: 78.49% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## develop #7497 +/- ##
========================================
Coverage 78.49% 78.49%
========================================
Files 1663 1663
Lines 191339 191339
========================================
Hits 150185 150185
Misses 41154 41154
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
@@ -64,9 +71,10 @@ To be documented. | |||
|
|||
To be documented. | |||
|
|||
## Goto functions → BMC → Counterexample (trace) ## | |||
## Goto functions → BMC → Counterexample (trace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this file has that issue multiple times. Can you please clean up all occurrences? (src/goto-instrument/README.md also suffers from this problem. Maybe do one commit that does the cleanup for all of these?)
Symex is, at its core, a GOTO-program interpreter. While Symex is interpreting | ||
the program, it's also building a list of SSA steps that form part of the equation | ||
that is to be sent to the solver. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to refer the reader to src/goto-symex/README.md
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, added
SET RETURN VALUE side_effect statement="nondet" is_nondet_nullable="1" | ||
// 5 file /tmp/example.c line 5 function main | ||
END_FUNCTION | ||
``` | ||
|
||
In the above excerpt, the capitalised words at the beginning each instruction are the | ||
correspondent instruction types (`DECL`, `ASSIGN`, etc). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the above really belongs to goto programs, not so much to symex, does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but in this file we are describing the Instruction type that is exactly what we are showing.
In my opinion it is useful to show how the goto-program looks like.
@martin-cs Can you please have a look and confirm your concerns have been addressed? |
@martin-cs Can you please act on this soon? It's been a week since the last request for review and more than that since various fixes were pushed. Note that we plan to force merge this in about 24 hours unless there is some new specific concern. We can make further changes in follow up tickets if necessary, but we'd like to merge this and progress ASAP. |
Add documentation for the
goto-symex
andgoto-program-instructions
.