Specify English when using the distil model. It only helps for English #76

Open. Wants to merge 1 commit into base-sha/b5aef52ea90a084f466d6afdea514a17b291da1b.

Conversation

sourcery-ai-experiments-bot

The distil model only helps for English (and that's what I'm using), so let's make sure whisper knows to use English overall. If using the large model in general though, it will auto-detect the language as it has previously.
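
For context, a rough sketch of the setup this change targets, as a hypothetical reconstruction (model IDs, variable names, and options are assumptions based on the diff reviewed below, not the exact ttt.py code):

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSpeechSeq2Seq,
    AutoProcessor,
    pipeline,
)

# Hypothetical reconstruction; the real ttt.py may differ in names and options.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "openai/whisper-large-v3"
assistant_model_id = "distil-whisper/distil-large-v2"

model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, torch_dtype=torch_dtype)
assistant_model = AutoModelForCausalLM.from_pretrained(
    assistant_model_id, torch_dtype=torch_dtype
)
processor = AutoProcessor.from_pretrained(model_id)

asr = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    device=device,
    # The distil assistant only helps for English, so pin the language here;
    # running whisper-large-v3 without the assistant keeps auto-detection.
    generate_kwargs={"assistant_model": assistant_model, "language": "english"},
)
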
@sourcery-ai-experiments-bot (Author)

This is a benchmark review for experiment review_of_reviews_20240508.
Run ID: review_of_reviews_20240508/benchmark_2024-05-08T00-14-14_v1-16-0-252-g323437959.

This pull request was cloned from https://github.com/jquagga/ttt/pull/107. (Note: the URL is not a link to avoid triggering a notification on the original pull request.)

Experiment configuration
review_config:
  # User configuration for the review
  # - benchmark - use the user config from the benchmark reviews
  # - <value> - use the value directly
  user_config:
    enable_ai_review: true
    enable_rule_comments: false

    enable_complexity_comments: benchmark
    enable_docstring_comments: benchmark
    enable_security_comments: benchmark
    enable_tests_comments: benchmark
    enable_comment_suggestions: benchmark
    enable_functionality_review: benchmark

    enable_approvals: true

  ai_review_config:
    # The model responses to use for the experiment
    # - benchmark - use the model responses from the benchmark reviews
    # - llm - call the language model to generate responses
    model_responses:
      comments_model: benchmark
      comment_validation_model: benchmark
      comment_suggestion_model: benchmark
      complexity_model: benchmark
      docstrings_model: benchmark
      functionality_model: benchmark
      security_model: benchmark
      tests_model: benchmark

# The pull request dataset to run the experiment on
pull_request_dataset:
- https://github.com/nbhirud/system_update/pull/31
- https://github.com/nbhirud/system_update/pull/34
- https://github.com/suttacentral/suttacentral/pull/3164
- https://github.com/0xdade/sephiroth/pull/79
- https://github.com/Fenigor/align-game/pull/3
- https://github.com/NathanVaughn/blog.nathanv.me/pull/200
- https://github.com/jquagga/ttt/pull/107
- https://github.com/Stagietechs/sketchup-ruby-api-tutorials/pull/1
- https://github.com/jabesq-org/pyatmo/pull/497
- https://github.com/UnitapApp/unitap-backend/pull/440
- https://github.com/UnitapApp/unitap-backend/pull/441
- https://github.com/UnitapApp/unitap-backend/pull/442
- https://github.com/gdsfactory/gdsfactory/pull/2725
- https://github.com/gdsfactory/kfactory/pull/305
- https://github.com/gdsfactory/kfactory/pull/309
- https://github.com/kloudlite/operator/pull/185
- https://github.com/kloudlite/api/pull/317
- https://github.com/nuxeo/nuxeo-drive/pull/4850
- https://github.com/albumentations-team/albumentations/pull/1711
- https://github.com/avelino/awesome-go/pull/5303
- https://github.com/Cristofer543/Cristofer543.github.io/pull/1
- https://github.com/W-zrd/unishare_mobile/pull/14
- https://github.com/2lambda123/DPDK-dpdk/pull/3
- https://github.com/2lambda123/DPDK-dpdk/pull/1
- https://github.com/2lambda123/wayveai-mile/pull/1
- https://github.com/2lambda123/wayveai-mile/pull/3
- https://github.com/erxes/erxes/pull/5188
- https://github.com/Patrick-Ehimen/the-wild-oasis/pull/1
- https://github.com/Patrick-Ehimen/the-wild-oasis/pull/2
- https://github.com/manoelhc/test-actions/pull/41
- https://github.com/StartupOS/verinfast/pull/365
- https://github.com/CypherGuy/PantryPal/pull/1
- https://github.com/DevCycleHQ/devcycle-docs/pull/666
- https://github.com/DevCycleHQ/cli/pull/388
- https://github.com/DevCycleHQ/js-sdks/pull/841
- https://github.com/allthingslinux/tux/pull/207
- https://github.com/simnova/ownercommunity/pull/109
- https://github.com/simnova/ownercommunity/pull/111
- https://github.com/simnova/ownercommunity/pull/112
- https://github.com/FrancisKOUAHO/Plumera/pull/24
- https://github.com/FrancisKOUAHO/Plumera/pull/25
- https://github.com/dashmug/glue-utils/pull/32
- https://github.com/dashmug/glue-utils/pull/34
review_comment_labels:
- label: correct
  question: Is this comment correct?
- label: helpful
  question: Is this comment helpful?
- label: comment-type
  question: Is the comment type correct?
- label: comment-area
  question: Is the comment area correct?
- label: llm-test
  question: |
    What type of LLM test could this comment become?
    - 👍 - this comment is really good/important and we should always make it
    - 👎 - this comment is really bad and we should never make it
    - no reaction - don't turn this comment into an LLM test

# Benchmark reviews generated by running
#   python -m scripts.experiment benchmark <experiment_name>
benchmark_reviews: []

@SourceryAI left a comment

Hey @sourcery-ai-experiments-bot - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 1 issue found
  • 🟢 Complexity: all looks good

LangSmith trace

Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

model_id = os.environ.get("TTT_TRANSFORMERS_MODEL_ID", "openai/whisper-large-v3")
print(f"We are using {torch_dtype} on {device} with {model_id}")

suggestion (code_refinement): Consider logging instead of printing for better production environment compatibility.

Using logging instead of print statements can help in managing output verbosity and is generally more flexible for different environments, such as production or development.

Suggested change
-print(f"We are using {torch_dtype} on {device} with {model_id}")
+import logging
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+logger.info(f"We are using {torch_dtype} on {device} with {model_id}")
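
As a side note, logging also supports lazy %-style formatting, which defers building the message until a record is actually emitted. A minimal sketch of that variant (the variable values below are placeholders, not taken from ttt.py):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Placeholder values standing in for the variables defined earlier in the script.
torch_dtype = "torch.float16"
device = "cuda:0"
model_id = "openai/whisper-large-v3"

# With lazy formatting, the string is only built if the record passes the
# logger's level filter.
logger.info("We are using %s on %s with %s", torch_dtype, device, model_id)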

Is this comment correct?

Is this comment helpful?

Is the comment type correct?

Is the comment area correct?

What type of LLM test could this comment become?

  • 👍 - this comment is really good/important and we should always make it
  • 👎 - this comment is really bad and we should never make it
  • no reaction - don't turn this comment into an LLM test

Consistency?

@@ -75,7 +77,7 @@
     tokenizer=processor.tokenizer,
     feature_extractor=processor.feature_extractor,
     max_new_tokens=128,
-    generate_kwargs={"assistant_model": assistant_model},
+    generate_kwargs={"assistant_model": assistant_model, "language": "english"},

suggestion (testing): Missing test for the new 'language' parameter in generate_kwargs.

Adding the 'language' parameter to generate_kwargs changes generation behavior when the assistant model is used. It would be beneficial to add a test case that verifies the parameter is passed through correctly and has the expected effect.
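
A minimal sketch of what such a test could look like, assuming ttt.py is importable as a module and exposes a pipeline-construction helper (create_pipeline is a hypothetical name; the real module layout may differ):

from unittest.mock import patch

import ttt  # assumes the script can be imported as a module


@patch("ttt.pipeline")
def test_generate_kwargs_pin_english(mock_pipeline):
    # create_pipeline is a hypothetical helper standing in for however
    # ttt.py actually builds its transformers ASR pipeline.
    ttt.create_pipeline()
    _, kwargs = mock_pipeline.call_args
    assert kwargs["generate_kwargs"]["language"] == "english"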

Is this comment correct?

Is this comment helpful?

Is the comment type correct?

Is the comment area correct?

What type of LLM test could this comment become?

  • 👍 - this comment is really good/important and we should always make it
  • 👎 - this comment is really bad and we should never make it
  • no reaction - don't turn this comment into an LLM test
