how to reproduce your figure 1? #4

Open
vietvo89 opened this issue Jan 7, 2025 · 0 comments

vietvo89 commented Jan 7, 2025

Hi, I tried to reproduce the intuition shown in Figure 1. I followed your code and the paper to understand what Figure 1 visualizes and how to reproduce it. The caption mentions activation entropy, but that term is not explicitly defined in the paper. In Section 3.4, the paper mentions "visualization of activation" and "in-context activation", so I assume the activation score defined in Equation 2 is what the caption calls activation entropy or in-context activation. That is the tensor activation_all_layers_score in the code below. Then, for each layer and each token, I select the value at the position corresponding to the given token. For instance, if idx is the index of the token Rome, the value I plot for layer i and token j is activation_all_layers_score[i, idx, j]. However, the results I get are close to yours but not exactly the same.

One more important thing I noticed is that the values from Equation 2 should be at most 1 because of the softmax, yet some of the values shown in Figure 1 appear to be higher than 1. So it is not completely clear how Figure 1 was produced. Could you share the exact steps needed to reproduce it?
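
For reference, here is the quick check behind that claim (a minimal sketch, independent of the repo; the random logits are just an illustration): every entry of a softmax output lies in (0, 1) and each row sums to 1, so raw Equation 2 scores cannot exceed 1.

import torch
import torch.nn.functional as F

# Toy check: softmax values are strictly between 0 and 1 and each row sums to 1,
# so anything above 1 in Figure 1 cannot be a raw softmax score.
logits = torch.randn(4, 10)                              # arbitrary example logits
probs = F.softmax(logits, dim=1)
print(probs.max().item() <= 1.0)                         # True
print(torch.allclose(probs.sum(dim=1), torch.ones(4)))   # True

Here is the code I used: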

input_text1 = "Fabrizio Spada passed away in "#+"\n"
input_text2 = "Rome."
# input_text2 = "Manila."

kwargs = dict(repetition_penalty=args.repetition_penalty,
             alpha=args.alpha,info_layer=args.info_layer,decoding_strategy=args.decoding_strategy)
return_adjust_scores = False

with torch.no_grad():
    input_text = input_text1 + input_text2
    input_ids = llm.tokenizer(input_text, return_tensors="pt").input_ids.to(device)
    prefix_ids = llm.tokenizer(input_text1, return_tensors="pt").input_ids.to(device)
    continue_ids = input_ids[0, prefix_ids.shape[-1]:]
    premature_layer_dist = {l:0 for l in candidate_premature_layers}
    picked_logits = []
    result_dict = {}
    premature_layers = []
    dict_outputs, outputs = llm.model(
                    input_ids=input_ids,
                    return_dict=True,
                    output_attentions=False,
                    output_hidden_states=False,
                    early_exit_layers=candidate_premature_layers + [mature_layer],
                    info_layer=args.info_layer,
                )

    info_all_layer_probs=[]
    for i in early_exit_layers:
          info_all_layer_probs.append(F.softmax(torch.t(dict_outputs[i][-1,:, :]), dim=1))

    activation_all_layers_score = torch.stack(info_all_layer_probs,dim=0)
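
For completeness, this is how I then read out the values described above and plot them (a minimal sketch; the idx lookup, token labels, and matplotlib plotting are my own additions for illustration, not part of the repo):

import matplotlib.pyplot as plt

# Hypothetical helper step: idx is the vocabulary id of the token I inspect
# (e.g. " Rome"); if the word splits into several sub-tokens, I take the first one.
idx = llm.tokenizer(" Rome", add_special_tokens=False).input_ids[0]

# For every early-exit layer and every input position, take the score at
# vocabulary index idx, i.e. activation_all_layers_score[i, idx, j].
scores = activation_all_layers_score[:, idx, :]          # [num_layers, num_positions]

# Heatmap: layers on the y-axis, input tokens on the x-axis.
tokens = llm.tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
plt.figure(figsize=(8, 4))
plt.imshow(scores.float().cpu().numpy(), aspect="auto", cmap="viridis")
plt.yticks(range(len(early_exit_layers)), early_exit_layers)
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.colorbar(label="activation score (Eq. 2)")
plt.tight_layout()
plt.show()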

Results:

[Four attached screenshots of my reproduced results]
