Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull Request- Syntax Plugin Infrastructure. #1

Merged
merged 5 commits into from
Aug 17, 2020

Conversation

tpops
Copy link

@tpops tpops commented Jul 7, 2020

Expansion of plugin infrastructure to improve flexibility for Syntax Handler Developers.

  1. "Forget the function definition that triggered the plugin and insert the replacement tokens after it.
  2. Syntax handler developers have access to the original name of the function using the Declarator object in the Get Replacement function call.
  3. Modified the PluginSyntax example project to work with the new syntax handler plugin.
  4. Modified the PluginSyntax test case so it tests for the new syntax handler plugin.

@amccaskey
Copy link

amccaskey commented Jul 8, 2020

@tpops If I understand this correctly, this PR will take

clang::syntax(NAME) void foo() {
 ...
} 

and produce

void __foo() { // old, not used anymore
// empty
}
... REWRITTEN OUTPUT STREAM ...

Let me know if I'm wrong about this.

If thats the case, then this PR should work for our quantum computing use-cases / feature updates @hfinkel

@tpops
Copy link
Author

tpops commented Jul 8, 2020

Yes exactly

You can also get the original function declaration with the preprocessor and declarator objects.

auto DeclCharRange = Lexer::getAsCharRange (D.getSourceRange(), 
                    PP.getSourceManager(), PP.getLangOpts());
auto DeclText= Lexer::getSourceText (DeclCharRange,PP.getSourceManager(), 
                        PP.getLangOpts());
 OS << DeclText << "\n { \n } \n";
    `

@amccaskey
Copy link

👍

Copy link
Owner

@hfinkel hfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! It's clearly important that we support adding additional declarations/definitions after the processed function. I'm not convinced, however, that this is the best approach. It violates the principle of least surprise for the plugin to replace a function with a particular valid definition with another with some incompatible definition, and that's exactly what this allows. Plus, having the empty function around is not clearly desirable (and might generate warnings?).

Instead, I'd prefer that GetReplacement simply returned two streams, one for the replacement function body, and a second to be placed after the processed function. Is there some reason that approach is problematic or suboptimal?

ReplacementOS << "\n}\n";
// provide token stream to the plugin and get back replacement text
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow LLVM's coding conventions here: Comments should be complete sentences, start with a capital letter, end with appropriate punctuation, etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

ReplacementOS.flush();

//Now we change the identifier name in the declarator to forget
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (plus there should be a space between the // and the text.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, done.

@@ -1196,9 +1196,20 @@ void Parser::ProcessPluginSyntax(ParsingDeclarator &D) {
llvm::raw_string_ostream ReplacementOS(Replacement);

ReplacementOS << "\n{\n";
SHI->second->GetReplacement(PP, D, Toks, ReplacementOS);
ReplacementOS << "\n}\n";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the function does not return void? Don't we get a warning that the function doesn't return an appropriate value?

@tpops
Copy link
Author

tpops commented Jul 13, 2020

The idea of having two streams is a great idea. However, the following would be a problem:

  1. The function been transformed would not have access to struct declarations made just after by the second stream.
// The function declaration is already parsed at this point.
[[clang::syntax(plugin)]] void doStuff(){
  // Streamed tokens replaced here would not have access to declarations 
  // such as function declarations and structures. 
  
  // You might want to have forward declaration for functions that would be 
  // generated by the second stream. However, forward declarations for structs / classes
  // defined by the second token stream would not work.
}

// Second stream tokens placed here
  1. The current problem the compiler 'return' warnings could be solved by placing __builtin_unreachable(); in the forgotten function.

Least surprise is also a valid point.

@amccaskey
Copy link

amccaskey commented Jul 13, 2020

I too am a fan of 2 streams. Is it possible to provide a third stream for function declarations that go before the transformed function, which we could then implement later in the second replacement stream after the closing function r_brace? Our use case has the same problem that @tpops just mentioned.

@tpops
Copy link
Author

tpops commented Jul 13, 2020

There actually exists

void AddToPredefines(llvm::raw_string_ostream &OS) override;

where you could add predefines, and predefined structs could be added here. However, this function is used by the plugin infrastructure before the GetReplacement is called. So there is no such thing as generating declarations after scanning the content of the transformed function.

My concerns is only for struct declarations that might want to be generated by scanning the tokens in the transformed function.

Function declarations can be solved easily by doing forward declaration

// The function declaration is already parsed at this point.
[[clang::syntax(plugin)]] void doStuff(){
  // Streamed tokens replaced here would not have access do declarations 
  // such as function declarations and structures. 
  
  // Forward declaration for doFoo
  void doFoo();
  doFoo();
}

// Second stream tokens placed here
void doFoo(){
  printf("doing foo");
}

@amccaskey
Copy link

Awesome, thanks! That's exactly what I would need. So yea, no need for anything separate just for function declarations.

@tpops
Copy link
Author

tpops commented Jul 13, 2020

@hfinkel I had a conversation with the rest of the AD group and here are the few key take away:

"I still do believe that function should be forgotten. The user input might not be sufficient in many ways. not creating a warning is a different issue we can address in various ways, including suppressing warnings for [[syntax(..)]] functions all together." - Johannes.

In complex implementations , code provided by the user might be used to generate structures and functions, this would be very tricky to work with for your two stream approach. In the case of our plugin implementation, we generate code from tensor operations provided by the user, the two stream approach would not be a favorable approach in this case.

@tpops
Copy link
Author

tpops commented Jul 14, 2020

Awesome, thanks! That's exactly what I would need. So yea, no need for anything separate just for function declarations.

You welcome. Glad I could be of help

@hfinkel
Copy link
Owner

hfinkel commented Jul 30, 2020

I apologize for the delay in getting back to this. I recommend that we move forward with this, but we should do the following:

  1. Add __builtin_unreachable to the original definition. This should take care of any warnings, and in addition, is a semantically-accurate description of what's going on.
  2. Provide a utility function, wrapping what you put in the review above, to make it easy for the plugin to recreate the original definition. We should update the example plugin to use that utility function.

Any objections?

@tpops
Copy link
Author

tpops commented Aug 1, 2020

No Objections. Do you suggest, I create a new pull request for the new code changes or I go ahead and modify this request.

Would like to know which you would prefer @hfinkel

@hfinkel
Copy link
Owner

hfinkel commented Aug 1, 2020

No Objections. Do you suggest, I create a new pull request for the new code changes or I go ahead and modify this request.

Would like to know which you would prefer @hfinkel

I have no preference. Thanks!

@tpops
Copy link
Author

tpops commented Aug 3, 2020

Sounds good. Just to reiterate the updated commits on current pull request:

  1. Forgets the function definition that triggered the plugin and insert the replacement tokens after it.
  2. Adds __builtin_unreachable to the original definition.
  3. Provides a utility function, getDeclText(Preprocessor &PP,Declarator &D); .
  4. Updates the example to use getDeclText.

@hfinkel hfinkel merged commit 9044b79 into hfinkel:csplugin Aug 17, 2020
acastanedam pushed a commit to acastanedam/llvm-project-csp that referenced this pull request Dec 19, 2022
- Decouple TSCs from trace items
- Turn TSCs into events just like CPUs. The new name is HW clock tick, wich could be reused by other vendors.
- Add a GetWallTime that returns the wall time that the trace plug-in can infer for each trace item.
- For intel pt, we are doing the following interpolation: if an instruction takes less than 1 TSC, we use that duration, otherwise, we assume the instruction took 1 TSC. This helps us avoid having to handle context switches, changes to kernel, idle times, decoding errors, etc. We are just trying to show some approximation and not the real data. For the real data, TSCs are the way to go. Besides that, we are making sure that no two trace items will give the same interpolation value. Finally, we are using as time 0 the time at which tracing started.

Sample output:

```
(lldb) r
Process 750047 launched: '/home/wallace/a.out' (x86_64)
Process 750047 stopped
* thread hfinkel#1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000402479 a.out`main at main.cpp:29:20
   26   };
   27
   28   int main() {
-> 29     std::vector<int> vvv;
   30     for (int i = 0; i < 100; i++)
   31       vvv.push_back(i);
   32
(lldb) process trace start -s 64kb -t --per-cpu
(lldb) b 60
Breakpoint 2: where = a.out`main + 1689 at main.cpp:60:23, address = 0x0000000000402afe
(lldb) c
Process 750047 resuming
Process 750047 stopped
* thread hfinkel#1, name = 'a.out', stop reason = breakpoint 2.1
    frame #0: 0x0000000000402afe a.out`main at main.cpp:60:23
   57     map<int, int> m;
   58     m[3] = 4;
   59
-> 60     map<string, string> m2;
   61     m2["5"] = "6";
   62
   63     std::vector<std::string> vs = {"2", "3"};
(lldb) thread trace dump instructions -t -f -e thread hfinkel#1: tid = 750047
    0: [379567.000 ns] (event) HW clock tick [48599428476224707]
    1: [379569.000 ns] (event) CPU core changed [new CPU=2]
    2: [390487.000 ns] (event) HW clock tick [48599428476246495]
    3: [1602508.000 ns] (event) HW clock tick [48599428478664855]
    4: [1662745.000 ns] (event) HW clock tick [48599428478785046]
  libc.so.6`malloc
    5: [1662746.995 ns] 0x00007ffff7176660    endbr64
    6: [1662748.991 ns] 0x00007ffff7176664    movq   0x32387d(%rip), %rax      ;  + 408
    7: [1662750.986 ns] 0x00007ffff717666b    pushq  %r12
    8: [1662752.981 ns] 0x00007ffff717666d    pushq  %rbp
    9: [1662754.977 ns] 0x00007ffff717666e    pushq  %rbx
    10: [1662756.972 ns] 0x00007ffff717666f    movq   (%rax), %rax
    11: [1662758.967 ns] 0x00007ffff7176672    testq  %rax, %rax
    12: [1662760.963 ns] 0x00007ffff7176675    jne    0x9c7e0                   ; <+384>
    13: [1662762.958 ns] 0x00007ffff717667b    leaq   0x17(%rdi), %rax
    14: [1662764.953 ns] 0x00007ffff717667f    cmpq   $0x1f, %rax
    15: [1662766.949 ns] 0x00007ffff7176683    ja     0x9c730                   ; <+208>
    16: [1662768.944 ns] 0x00007ffff7176730    andq   $-0x10, %rax
    17: [1662770.939 ns] 0x00007ffff7176734    cmpq   $-0x41, %rax
    18: [1662772.935 ns] 0x00007ffff7176738    seta   %dl
    19: [1662774.930 ns] 0x00007ffff717673b    jmp    0x9c690                   ; <+48>
    20: [1662776.925 ns] 0x00007ffff7176690    cmpq   %rdi, %rax
    21: [1662778.921 ns] 0x00007ffff7176693    jb     0x9c7b0                   ; <+336>
    22: [1662780.916 ns] 0x00007ffff7176699    testb  %dl, %dl
    23: [1662782.911 ns] 0x00007ffff717669b    jne    0x9c7b0                   ; <+336>
    24: [1662784.906 ns] 0x00007ffff71766a1    movq   0x3236c0(%rip), %r12      ;  + 24
(lldb) thread trace dump instructions -t -f -e -J -c 4
[
  {
    "id": 0,
    "timestamp_ns": "379567.000000",
    "event": "HW clock tick",
    "hwClock": 48599428476224707
  },
  {
    "id": 1,
    "timestamp_ns": "379569.000000",
    "event": "CPU core changed",
    "cpuId": 2
  },
  {
    "id": 2,
    "timestamp_ns": "390487.000000",
    "event": "HW clock tick",
    "hwClock": 48599428476246495
  },
  {
    "id": 3,
    "timestamp_ns": "1602508.000000",
    "event": "HW clock tick",
    "hwClock": 48599428478664855
  },
  {
    "id": 4,
    "timestamp_ns": "1662745.000000",
    "event": "HW clock tick",
    "hwClock": 48599428478785046
  },
  {
    "id": 5,
    "timestamp_ns": "1662746.995324",
    "loadAddress": "0x7ffff7176660",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "endbr64"
  },
  {
    "id": 6,
    "timestamp_ns": "1662748.990648",
    "loadAddress": "0x7ffff7176664",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "movq"
  },
  {
    "id": 7,
    "timestamp_ns": "1662750.985972",
    "loadAddress": "0x7ffff717666b",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "pushq"
  },
  {
    "id": 8,
    "timestamp_ns": "1662752.981296",
    "loadAddress": "0x7ffff717666d",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "pushq"
  }
]
```

Differential Revision: https://reviews.llvm.org/D130054
acastanedam pushed a commit to acastanedam/llvm-project-csp that referenced this pull request Dec 19, 2022
Refactor the string conversion of the `lldb::InstructionControlFlowKind` enum out
of `Instruction::Dump` to enable reuse of this logic by the
JSON TraceDumper (to be implemented in separate diff).

Will coordinate the landing of this change with D130320 since there will be a minor merge conflict between
these changes.

Test Plan:
Run unittests
```
> ninja check-lldb
[4/5] Running lldb unit test suite

Testing Time: 10.13s
  Passed: 1084
```

Verify '-k' flag's output
```
(lldb) thread trace dump instructions -k
thread hfinkel#1: tid = 1375377
  libstdc++.so.6`std::ostream::flush() + 43
    7048: 0x00007ffff7b54dab    return      retq
    7047: 0x00007ffff7b54daa    other       popq   %rbx
    7046: 0x00007ffff7b54da7    other       movq   %rbx, %rax
    7045: 0x00007ffff7b54da5    cond jump   je     0x11adb0                  ; <+48>
    7044: 0x00007ffff7b54da2    other       cmpl   $-0x1, %eax
  libc.so.6`_IO_fflush + 249
    7043: 0x00007ffff7161729    return      retq
    7042: 0x00007ffff7161728    other       popq   %rbp
    7041: 0x00007ffff7161727    other       popq   %rbx
    7040: 0x00007ffff7161725    other       movl   %edx, %eax
    7039: 0x00007ffff7161721    other       addq   $0x8, %rsp
    7038: 0x00007ffff7161709    cond jump   je     0x87721                   ; <+241>
    7037: 0x00007ffff7161707    other       decl   (%rsi)
    7036: 0x00007ffff71616fe    cond jump   je     0x87707                   ; <+215>
    7035: 0x00007ffff71616f7    other       cmpl   $0x0, 0x33de92(%rip)      ; __libc_multiple_threads
    7034: 0x00007ffff71616ef    other       movq   $0x0, 0x8(%rsi)
    7033: 0x00007ffff71616ed    cond jump   jne    0x87721                   ; <+241>
    7032: 0x00007ffff71616e9    other       subl   $0x1, 0x4(%rsi)
    7031: 0x00007ffff71616e2    other       movq   0x88(%rbx), %rsi
    7030: 0x00007ffff71616e0    cond jump   jne    0x87721                   ; <+241>
    7029: 0x00007ffff71616da    other       testl  $0x8000, (%rbx)           ; imm = 0x8000
```

Differential Revision: https://reviews.llvm.org/D130580
acastanedam pushed a commit to acastanedam/llvm-project-csp that referenced this pull request Dec 19, 2022
This reverts commit 5fb4134.

This patch is causing crashes when building llvm-test-suite when
optimizing for CPUs with AVX512.

Reproducer crashing with llc:

    target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-apple-macosx"

    define i32 @test(<32 x i32> %0) #0 {
    entry:
      %1 = mul <32 x i32> %0, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
      %2 = tail call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> %1)
      ret i32 %2
    }

    ; Function Attrs: nocallback nofree nosync nounwind readnone willreturn
    declare i32 @llvm.vector.reduce.add.v32i32(<32 x i32>) hfinkel#1

    attributes #0 = { "min-legal-vector-width"="0" "target-cpu"="skylake-avx512" }
    attributes hfinkel#1 = { nocallback nofree nosync nounwind readnone willreturn }

(cherry picked from commit f912bab111add1275fcaf5f24e4d3654127972d0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants