enh: create a graphic or visual that helps us visualize % translated for each section of the guide by language #493

Open
1 of 5 tasks
lwasser opened this issue May 19, 2025 · 10 comments
Assignees
Labels
help wanted We welcome a contributor to work on this issue! thank you in advance! sprintable

Comments

@lwasser
Member

lwasser commented May 19, 2025

One of the benefits of using a tool like Crowdin is the bar plots that quickly help a user understand how far along translations are in a specific language.

We can generate a graphic like this using Python and post it in our translation file and README file for a quick overview of the translation status. Below is a somewhat ugly, messy version of this using Babel. But it starts to get at what I'm thinking about!

If someone is interested in a Python project, you could work on this and:

  • Turn it into a runnable script with a main() function.
  • Fix the plots to make them look nicer using perhaps a different plotting tool
  • Create subplots one for each language
  • Save the plot in the repository directory so we can then add it to our README and other files.
  • Document everything!

from pathlib import Path
from babel.messages import pofile
import matplotlib.pyplot as plt

# Path to your locales directory
# Go up one directory and locate the locales folder (I'm thinking this will live in a /scripts directory at the root of the repo)
BASE_DIR = Path(__file__).resolve().parent.parent
LOCALES_DIR = BASE_DIR / "locales"
print(LOCALES_DIR)

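def calculate_translation_percentage(po_path):
    with open(po_path, "r", encoding="utf-8") as f:
        catalog = pofile.read_po(f)
    # len(catalog) excludes the header entry (empty msgid), but iterating the
    # catalog yields it, so skip it here to avoid overcounting by one.
    total = len(catalog)
    translated = sum(1 for message in catalog if message.id and message.string)
    percent = (translated / total * 100) if total > 0 else 0
    return round(percent, 1)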

def get_translation_progress(locales_dir):
    progress = {}
    for lang_dir in locales_dir.iterdir():
        # skip os stuff like .DS_Store
        if not lang_dir.is_dir():
            continue

        lang = lang_dir.name
        lc_messages_dir = lang_dir / "LC_MESSAGES"
        if not lc_messages_dir.exists():
            continue

        po_files = lc_messages_dir.glob("*.po")

        for po_file in po_files:
            percent = calculate_translation_percentage(po_file)
            key = f"{lang}/{po_file.stem}"
            progress[key] = percent

    return progress


# Get progress data and plot
progress = get_translation_progress(LOCALES_DIR)
print(progress)

langs = list(progress.keys())
percents = list(progress.values())
print(percents)

plt.figure(figsize=(10, 6))
plt.barh(langs, percents)
plt.xlabel("Translation %")
plt.title("Translation progress by language")
plt.xlim(0, 100)
plt.grid(axis="x", linestyle="--", alpha=0.5)
plt.tight_layout()
plt.show()

The follow-up issue associated with this would be to add CI to run this automatically each week. I think that can be a sub-issue that we can make once this issue is complete!
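
Three items from the checklist above (one subplot per language, a nicer layout, saving the figure to the repository) could be sketched roughly as follows. This is only an illustration, not the script that was eventually merged: `group_by_language`, `plot_by_language`, and the output path are hypothetical names, and the input is assumed to be the `{"lang/file": percent}` dictionary returned by `get_translation_progress` in the snippet above.

```python
from collections import defaultdict
from pathlib import Path


def group_by_language(progress):
    """Turn {"es/index": 50.0} into {"es": {"index": 50.0}}."""
    grouped = defaultdict(dict)
    for key, percent in progress.items():
        # Keys follow the "lang/po-file-stem" pattern built in the snippet above.
        lang, _, section = key.partition("/")
        grouped[lang][section] = percent
    return dict(grouped)


def plot_by_language(grouped, out_path):
    """Draw one horizontal bar subplot per language and save the figure."""
    if not grouped:
        raise ValueError("no translation data to plot")

    # Imported lazily so the grouping helper works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    langs = sorted(grouped)
    fig, axes = plt.subplots(
        nrows=len(langs), ncols=1, figsize=(10, 3 * len(langs)), squeeze=False
    )
    for ax, lang in zip(axes[:, 0], langs):
        sections = sorted(grouped[lang])
        ax.barh(sections, [grouped[lang][s] for s in sections])
        ax.set_xlim(0, 100)
        ax.set_title(lang)
        ax.set_xlabel("Translation %")
        ax.grid(axis="x", linestyle="--", alpha=0.5)
    fig.tight_layout()

    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(out_path, dpi=150)
    plt.close(fig)
    return out_path
```

A `main()` function could then call `get_translation_progress(LOCALES_DIR)`, pass the result through `group_by_language`, and hand it to `plot_by_language` with whatever image path the README ends up referencing.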

@lwasser lwasser added help wanted We welcome a contributor to work on this issue! thank you in advance! sprintable labels May 19, 2025
@lwasser lwasser moved this to python programming in pyOpenSci Help Wanted Project Board May 19, 2025
@lwasser lwasser changed the title enh: create a graphic or visual that helps us visuablize % tranlated for each section of the guide by language enh: create a graphic or visual that helps us visualize % translated for each section of the guide by language May 19, 2025
@RobPasMue
Contributor

Looking into this one as part of PyCon US sprints -- @lwasser @flpm

@lwasser
Member Author

lwasser commented May 19, 2025

Added you to the issue @RobPasMue !! Thank you!

@RobPasMue
Contributor

RobPasMue commented May 19, 2025

Awesome thanks - I will break this down into:

@sneakers-the-rat
Contributor

I just wanna make sure people know this is already something sphinx does and we don't need to write this ourselves!!

https://sphinx-intl.readthedocs.io/en/master/refs.html#sphinx-intl-stat

https://www.sphinx-doc.org/en/master/usage/advanced/intl.html#translation-progress-and-statistics

@flpm
Member

flpm commented May 19, 2025

I knew about intl-stat, and we talked about it in the sprint, but since the goal was to use the data to do some sort of chart or viz summary showing the status and areas that need help, I think it is cleaner to extract from the PO files directly instead of parsing the output of sphinx-intl.

I did not know about the second link! If I understand correctly, it is meant to highlight, inside the translation pages, the text that is not translated yet. I imagine that this is what you were thinking for the CSS that shows those parts in a different way when we were discussing that other PR, right? I think that is definitely still worth investigating, but maybe that should be a separate issue.
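
To illustrate what "extracting from the PO files directly" involves, here is a dependency-free sketch that tallies translated, fuzzy, and untranslated entries from PO text. It is deliberately simplified and is not how Babel or sphinx-intl parse catalogs: it ignores plural forms and `msgctxt`, and `count_po_entries` and `SAMPLE_PO` are hypothetical names with made-up sample content. A real script would more likely rely on Babel's `read_po`, as in the snippet at the top of this issue.

```python
import ast

# Made-up sample catalog: a header plus one translated, one fuzzy,
# and one untranslated entry.
SAMPLE_PO = r'''
msgid ""
msgstr ""
"Language: es\n"

msgid "Hello"
msgstr "Hola"

#, fuzzy
msgid "Bye"
msgstr "Adios"

msgid "Thanks"
msgstr ""
'''


def count_po_entries(po_text):
    """Return (translated, fuzzy, untranslated) for simple PO text."""
    entries = []
    current = None       # entry currently being read
    field = None         # "msgid" or "msgstr": target of continuation lines
    fuzzy_next = False   # set when a "#, fuzzy" flag precedes an entry

    for raw in po_text.splitlines():
        line = raw.strip()
        if not line:
            field = None
        elif line.startswith("#"):
            # Flag comments look like "#, fuzzy"; other comments are skipped.
            if line.startswith("#,") and "fuzzy" in line:
                fuzzy_next = True
        elif line.startswith("msgid "):
            current = {"msgid": "", "msgstr": "", "fuzzy": fuzzy_next}
            entries.append(current)
            fuzzy_next = False
            field = "msgid"
            # PO strings use C-style escapes, close enough to Python literals.
            current[field] += ast.literal_eval(line[len("msgid "):])
        elif line.startswith("msgstr ") and current is not None:
            field = "msgstr"
            current[field] += ast.literal_eval(line[len("msgstr "):])
        elif line.startswith('"') and current is not None and field:
            current[field] += ast.literal_eval(line)
        else:
            field = None  # unsupported keyword such as msgid_plural

    translated = fuzzy = untranslated = 0
    for entry in entries:
        if entry["msgid"] == "":
            continue  # the header entry is metadata, not a message
        if entry["fuzzy"]:
            fuzzy += 1
        elif entry["msgstr"]:
            translated += 1
        else:
            untranslated += 1
    return translated, fuzzy, untranslated
```

Counting fuzzy entries separately is a design choice in this sketch: they have a translation string but still need review, so treating them as finished would overstate progress.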

@sneakers-the-rat
Contributor

totally - use whatever is useful, just wanted to make sure ppl were aware because the docs are (ironically) a little hard to navigate :)

@lwasser
Member Author

lwasser commented May 19, 2025

I just merged the PR that has the script and associated JSON data that we can use to create a plot of translation status by section. The ability to highlight untranslated text is super interesting!! Thank you for those resources @sneakers-the-rat !!!

@RobPasMue
Contributor

Now that #495 has been merged I will move on to the visualization of the data =) I'll try to open a PR shortly! Any ideas are welcome.

@RobPasMue
Contributor

Displaying stats in our docs is now in a PR =) - see #511 for more details!

@lwasser
Member Author

lwasser commented May 29, 2025

We appreciate you @RobPasMue thank you for all of the work on this!!

Projects
Status: python programming

4 participants