-
Notifications
You must be signed in to change notification settings - Fork 170
Auto-saving and loading modules at compile time #1065
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
czgdp1807
wants to merge
9
commits into
lcompilers:main
Choose a base branch
from
czgdp1807:mod02
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
2df870f
done
czgdp1807 b23b40d
done
czgdp1807 4552c22
done
czgdp1807 945f90a
done
czgdp1807 0fae2c6
done
czgdp1807 1761077
upd
czgdp1807 4219afc
Move stat to utils.h
Thirumalai-Shaktivel a03e2a3
WIP
Thirumalai-Shaktivel 2bc1503
Update reference tests
Shaikh-Ubaid File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -95,7 +95,7 @@ class ASRTreeVisitor : | |
{ | ||
public: | ||
bool show_intrinsic_modules; | ||
|
||
std::string get_str() { | ||
return s; | ||
} | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
(TranslationUnit (SymbolTable 1 {_lpython_main_program: (Function (SymbolTable 96 {}) _lpython_main_program [f] [] [(SubroutineCall 1 f () [] ())] () Source Public Implementation () .false. .false. .false. .false. .false. [] [] .false.), f: (Function (SymbolTable 2 {list: (ExternalSymbol 2 list 4 list lpython_builtin [] list Private), s: (Variable 2 s Local () () Default (Character 1 -2 () []) Source Public Required .false.), x: (Variable 2 x Local () () Default (List (Character 1 -2 () [])) Source Public Required .false.), y: (Variable 2 y Local () () Default (List (Character 1 -2 () [])) Source Public Required .false.)}) f [list list list] [] [(= (Var 2 s) (StringConstant "lpython" (Character 1 7 () [])) ()) (= (Var 2 x) (FunctionCall 2 list () [((Var 2 s))] (List (Character 1 -2 () [])) () ()) ()) (= (Var 2 y) (ListConstant [(StringConstant "a" (Character 1 1 () [])) (StringConstant "b" (Character 1 1 () [])) (StringConstant "c" (Character 1 1 () []))] (List (Character 1 1 () []))) ()) (= (Var 2 x) (FunctionCall 2 list () [((Var 2 y))] (List (Character 1 -2 () [])) () ()) ()) (= (Var 2 x) (FunctionCall 2 list () [((StringConstant "lpython" (Character 1 7 () [])))] (List (Character 1 -2 () [])) (ListConstant [(StringConstant "l" (Character 1 1 () [])) (StringConstant "p" (Character 1 1 () [])) (StringConstant "y" (Character 1 1 () [])) (StringConstant "t" (Character 1 1 () [])) (StringConstant "h" (Character 1 1 () [])) (StringConstant "o" (Character 1 1 () [])) (StringConstant "n" (Character 1 1 () []))] (List (Character 1 1 () []))) ()) ())] () Source Public Implementation () .false. .false. .false. .false. .false. [] [] .false.), lpython_builtin: (IntrinsicModule lpython_builtin), main_program: (Program (SymbolTable 95 {}) main_program [] [(SubroutineCall 1 _lpython_main_program () [] ())])}) []) | ||
(TranslationUnit (SymbolTable 1 {_lpython_main_program: (Function (SymbolTable 91 {}) _lpython_main_program [f] [] [(SubroutineCall 1 f () [] ())] () Source Public Implementation () .false. .false. .false. .false. .false. [] [] .false.), f: (Function (SymbolTable 2 {list: (ExternalSymbol 2 list 4 list lpython_builtin [] list Private), s: (Variable 2 s Local () () Default (Character 1 -2 () []) Source Public Required .false.), x: (Variable 2 x Local () () Default (List (Character 1 -2 () [])) Source Public Required .false.), y: (Variable 2 y Local () () Default (List (Character 1 -2 () [])) Source Public Required .false.)}) f [list list list] [] [(= (Var 2 s) (StringConstant "lpython" (Character 1 7 () [])) ()) (= (Var 2 x) (FunctionCall 2 list () [((Var 2 s))] (List (Character 1 -2 () [])) () ()) ()) (= (Var 2 y) (ListConstant [(StringConstant "a" (Character 1 1 () [])) (StringConstant "b" (Character 1 1 () [])) (StringConstant "c" (Character 1 1 () []))] (List (Character 1 1 () []))) ()) (= (Var 2 x) (FunctionCall 2 list () [((Var 2 y))] (List (Character 1 -2 () [])) () ()) ()) (= (Var 2 x) (FunctionCall 2 list () [((StringConstant "lpython" (Character 1 7 () [])))] (List (Character 1 -2 () [])) (ListConstant [(StringConstant "l" (Character 1 1 () [])) (StringConstant "p" (Character 1 1 () [])) (StringConstant "y" (Character 1 1 () [])) (StringConstant "t" (Character 1 1 () [])) (StringConstant "h" (Character 1 1 () [])) (StringConstant "o" (Character 1 1 () [])) (StringConstant "n" (Character 1 1 () []))] (List (Character 1 1 () []))) ()) ())] () Source Public Implementation () .false. .false. .false. .false. .false. [] [] .false.), lpython_builtin: (IntrinsicModule lpython_builtin), main_program: (Program (SymbolTable 90 {}) main_program [] [(SubroutineCall 1 _lpython_main_program () [] ())])}) []) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same approach as LFortran i.e., https://github.com/lfortran/lfortran/blob/f07bc66f885d4bc365be140b23a2b62f7e205d85/src/bin/lfortran.cpp#L609-L649 but still this issue persists. The reason I think is because we keep using the same process to compile the modules as well as the main program. However LFortran launches a different process to compile modules and a different one to compile the main program. If we want to follow the auto-compile, load and save approach in LPython then probably we shouldn't compare Symtab ID for exact matches while comparing the reference tests and the ASR output. @certik Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you can add an internal option to not use pre-compiled modules (i.e., ignore pyc files) but compile every-time while generating reference tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes (see in 8317729 which is an experimental commit) where I launched a separate lpython process to compile modules and save them in pyc files. And it worked on my mac. Between consecutive clean builds reference tests matched. So I think the issue is not the one in #992 (comment). The real issue is modules getting compiled with other programs because of which the module's symbol tables are affected. If modules are compiled by an isolated
lpython
process then I don't see any such thing happening.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests pass with 8317729 (an experimental commit). So clearly the issue is what I explained in my above comment. Now
system
is not a safe way to compile modules in an isolatedlpython
process. Some better ways include,popen
- See the comparison betweensystem
andpopen
in https://stackoverflow.com/a/8538345.spawn
- https://stackoverflow.com/a/55490499I would say encapsulate the logic to safely compile a module in a API inside LPython and then call it everywhere. Current approach in
main
is very non-deterministic so we need to change it anyways.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not be using any kind of isolated process. We simply can always compile using a new class for compilation. We just have to handle the symbol table ID correctly, for example as a local variable, etc.
Furthermore, we should use exactly the same code with LFortran, not maintain two separate codes to load/save ASR to mod files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code and logic is exactly the same. The difference between LFortran and LPython is explained below,
Basically LFortran compiles modules according to the instructions in the build scripts (CMakeLists.txt). However LPython compiles modules on the fly. So a module compiled with one program will give different ASR when compiled along with another program. This is not the issue in LFortran because there we compile the modules, save them in
.mod
files before compiling the main program. In fact that's why such handling is not present in LFortran I think because LFortran doesn't compile modules on the fly.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So let's think what happens when we compile modules on the fly. Consider three files
a.py
,b.py
andmodule.py
.a.py
andb.py
importmodule.py
.Case 1
We execute,
lpython a.py
first and thenlpython b.py
. Since we compile modules on the fly during execution oflpython a.py
, when we will reach the following call to compilemodule.py
,lpython/src/lpython/semantics/python_ast_to_asr.cpp
Lines 243 to 244 in d4e2f74
some symbol tables would have been created already by then. Say
n
symbol tables were created. So the ID of first symbol table while compilingmodule.py
will ben
(symbol table counter starts from0
and its global for a singlelpython
process). The same thing will be stored inmodule.pyc
. So minimum symbol table ID inmodule.pyc
will ben
.Now once the
module.pyc
is generated we will continue compilinga.py
and its symbol table IDs will start fromn + number_of_symtabs_in_module
.Case 2
After cleaning our repository, we execute
lpython b.py
first and thenlpython a.py
. Now again we will reach the following call,lpython/src/lpython/semantics/python_ast_to_asr.cpp
Lines 243 to 244 in d4e2f74
However this time (since we are compiling b.py) say
m
symbol tables were created till the above call. So now minimum symbol table ID inmodule.pyc
will bem
.Now once the
module.pyc
is generated we will continue compilingb.py
and its symbol table IDs will start fromm + number_of_symtabs_in_module
.So you can see how compiling on the fly produces different ASRs for
a.py
andb.py
because after compilingmodule.py
on the fly,SymbolTable::counter
continues from where it left after completingmodule.py
. This affects the program's ASR as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the symbol table ID will differ based on who compiled it. This is not a problem, as the deserialization renumbers the Symbol IDs already. LFortran also has to (and does) it.