Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend \includescore #58

Merged
merged 6 commits into from
Feb 27, 2015
Merged

Extend \includescore #58

merged 6 commits into from
Feb 27, 2015

Conversation

eschwab
Copy link
Contributor

@eschwab eschwab commented Feb 24, 2015

Continuation of #53
Behavior Change: \includescore#1 now performs the following checks:

-- If #1 is a gabc file and there already is an -auto.gtex file
then gregoriotex will check the api version of the gtex file for
compatibility and recompile the gabc file if necessary.

-- If #1 is a gtex file then gregoriotex will:

  • check for a gtex file and if none exists then compile one from
    a gabc file if that exists.
  • check the api version of the gtex file for compatibility and
    recompile the gabc file if necessary.
  • compare the timestamp of the gtex file against the gabc file
    and recompile if the gtex is older.

Note: #1 should either end with .gabc or .gtex

If someone could test, I'm not 100% sure of this. I don't think I tested all the possible use cases of \includescore.

Behavior Change: \includescore#1 now performs the following checks:

  -- If #1 is a `gabc` file and there already is an `-auto.gtex` file
     then gregoriotex will check the api version of the gtex file for
     compatibility and recompile the `gabc` file if necessary.

  -- If #1 is a `gtex` file then gregoriotex will:

     -- check for a `gtex` file and if none exists then compile one from
        a `gabc` file if that exists.

     --	check the api version of the `gtex` file for compatibility and
        recompile the `gabc` file if necessary.

     -- compare the timestamp of the `gtex` file against the `gabc` file
        and recompile if the `gtex` is older.

Note: #1 should either end with .gabc or .gtex
@eroux
Copy link
Contributor

eroux commented Feb 24, 2015

Your solution will work and we can merge it for now, but I fear it's not the most efficient: in case of a .gtex file with the correct version (which we can consider the most common case), it opens it twice: once to check the version, then once to really include the score, which is quite inefficient (file input/output is quite costy in most computers). On short documents, it might get unnoticed, but on big documents (Gregorio is used on documents of over 1000 pages) it will certainly make a difference. I see two solutions for this:

  • first, just include the score, and if the api version doesn't match, call \endinput and recompile the score
  • second, when you open the file in lua to check the version, you can just read the whole file, and pass it to tex with tex.print, but I'm not sure it will be as efficient (though more simple)

What do you think?

@eschwab
Copy link
Contributor Author

eschwab commented Feb 24, 2015

Concerning point two: I did a quick test. As the code is in this commit my test file runs in

0.52s user, 0.06s system, 98% cpu, 0.587 total

With your suggestion to read the whole file and pass it to tex:

0.36s user, 0.04s system, 85% cpu, 0.471 total

Quite a big difference for such a small test file. I need to look closer at your first solution to see when it would be best to check api version.

if gtex_timestamp < gabc_timestamp then
gregoriotex.compile_gabc(gabc_file, gtex_file)
else
log("using the file %s without recompiling, as %s hasn't changed since last compilation.", gtex_file, gabc_file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log message isn't entirely true, because you won't know if you're actually using the file until you've checked the api version. Perhaps delete everything up to the comma?

@rpspringuel
Copy link
Contributor

Brainstorm thought. Applications often have a "Version Number" property. Could this sort of thing be applied here to prevent us having to open the file to check its API version? If Gregorio could set a file property to the API version number, then the lua script wouldn't need to open the file.

@eroux
Copy link
Contributor

eroux commented Feb 24, 2015

@rpspringuel do you mean some sort of file metadata ? It doesn't seem possible here... but the idea is good, here are two possibilities to go in your direction:

  • put the API versions of the files in the .aux (or .gaux, or whatever) file, that's usually how LaTeX usually deals with this kind of things (but a mechanism as the current one would still be needed for the first compilation)
  • put the API version directly in the file name if it's auto-generated... simpler!

What do you think?

@eschwab
Copy link
Contributor Author

eschwab commented Feb 24, 2015

I reworked the api test in lua. The meta-data idea is interesting. Or we could let \gregoriotexapiversion do the test, as it has been, but then pass control back to lua to recompile the gabc. This would require passing the file name to gregoriotexapiversion, I think.

@eroux Putting the API in the filename, simple. But then do we require all gtex files to be autogen?

@eroux
Copy link
Contributor

eroux commented Feb 24, 2015

@eschwab I was originally thinking about your first solution, but maybe the new ideas are a bit cleaner... I admit I have to dig a little more in the changes you did to have a clearer view of what's happening in the different cases... but the philosophy is: keep old documents compiling (maybe less efficiently, with a warning), and make new documents compile in the best way (and autogenerating everything seems a good solution)...

@eschwab
Copy link
Contributor Author

eschwab commented Feb 24, 2015

My goal was to reduce the i/o to once per score either with reading the entire file in lua then passing it to tex with tex.write or just passing the file on with \input %s.

@rpspringuel
Copy link
Contributor

@eroux, I was referring to file metadata. I don't know how, or even if, C can write metadata to a file, but it seems that if we're worried about processing overhead, then we should avoid reading the file until we know it's good if at all possible. If metadata per se isn't possible, then replacing the "auto" in the file name with the API version is a decent substitute.
@eschwab's changes manage to read the file in only once, but they still have to read in a bad file in order to figure out that it is bad.
(This page) [http://www.codeproject.com/Articles/31274/Change-File-Date-Attributes] seems to show an example of manipulating file metadata using C. It doesn't manipulate a version number property, but because the API version number is a date, we could hijack the file creation date.

@rpspringuel
Copy link
Contributor

Eureka! (I was in the shower when I thought of this, so it's appropriate.) The API version number is a date. Check the gtex file modification date against it. If the modification date is before the date in the API version number then it's out of date and needs to be recompiled. No need to manipulate any metadata that isn't already being set and checked.

@eschwab
Copy link
Contributor Author

eschwab commented Feb 24, 2015

Humm, I don't want to hijack the file creation date. That sounds like trouble. But to check fhe date against the API version, brilliant. I'll look into that.

@eschwab
Copy link
Contributor Author

eschwab commented Feb 24, 2015

That is a lot cleaner now. Using the file modification date works nicely.

@eroux
Copy link
Contributor

eroux commented Feb 25, 2015

Hijacking the creation date is just terrible, this means that people (like me) who sort their file list (in their file explorer) by date will have everything broken... Comparing with the creation date works in many cases, but there still need to be a mechanism for files having wrong dates: it's perfectly possible for someone to use an old version of gregorio in a recent date... But I don't think we should worry too much for these optimizations, as the double I/O operations will take place only the first time the document is compiled against a new version, not every time, I don't think it's a big problem if one compilation every year is a bit slower than usual... What do you think?

@eschwab
Copy link
Contributor Author

eschwab commented Feb 25, 2015

In the case of a bad creation date that lua thinks is fine there is still a backup check in \gregoriotexapiversion that will catch the mismatch. Granted, no autoregen for those cases, but if a user is rolling back to a previous API they should expect slight difficulties. And I'm ok with an occasional hang up. It would probably be best to plan for the usual use case of a user upgrading the API rather than be too concerned with downgrade issues.

@eroux
Copy link
Contributor

eroux commented Feb 25, 2015

To be honest, I believe eschwab@fd2966c is quite a regression... it will speed up a little when upgrading Gregorio, but will prevent autogeneration in the following (very common) case:

  • someone has a Gregorio of date1, all his files are with date1 API version
  • we release Gregorio at date 2
  • a few month later, at date 3 > date 2, the person upgrades Gregorio
  • when compiling old files, lua will look at the file creation, which will certainly be between date2 and date3, so it will say "this is > date2, this must be the correct API", while it's not
  • compilation will fail, and user will have to recompile all his files by hand

while in this common case, the state before the commit was ok for him, even though it took a bit longer to upgrade to a recent Gregorio for people upgrading Gregorio everyday (which is not at all the majority of users)... What do you think? Or is there something I didn't understand?

@rpspringuel
Copy link
Contributor

@eroux has a good point. While checking the modification date will catch use cases where someone is resurrecting an old project, it won't catch use cases where the upgrade wasn't on the API release date.

It seems that unless we can find a way to attach the API version number itself to the file metadata, then we're going to have to look inside the file to see what it is. In that instance, eschwab@ 0a08835 was a pretty good go.

Guess my "Eureka" wasn't much of one.

@eschwab
Copy link
Contributor Author

eschwab commented Feb 25, 2015

Here is where I think we are at:

  • Checking API against the file creation date will not work for users who do not upgrade Gregorio with each commit to master. This is the majority of the users.
  • Making the file creation date = API version is just bad for filesystem management.
  • I/O is costly and reducing it to once per good gtex file is best. Reading a gtex file to learn that it is out of date is a waste of resources.

Now a solution I see is to change the syntax of \includescore to expect a call like \includescore{antiphon} (note no file extension). Pass that filename to lua. Search for a file antiphon-20150220.gtex, compare that date string to the API version and either \input antiphon-20150220.gtex or pass the filename to compile_gabc and change the shell command to

"gregorio -o %s-%s.gtex %s.gabc", file_name, lnternalversion, file_name

Then \input. The date string in the filename would be written by Gregorio which would pull it from the GREGORIOTEX_API_VERSION macro.

It may then be a good idea to change the default output filename of gregorio from antiphon to antiphon-APIVERSION.

How does all of this effect the compiling of old projects? I don't think there should be a problem as long as the user still has the gabc files in the same dir.

Thoughts?

@eroux
Copy link
Contributor

eroux commented Feb 25, 2015

I agree, thought there should still be a way to include a .tex file directly (without automatic compilation, like current behaviour) for users with specific needs... But on the overall yes, I think it will improve user experience!

@eschwab
Copy link
Contributor Author

eschwab commented Feb 25, 2015

Well if a user does \includescore{antiphon.gtex} we could simply move directly to \input antiphon.gtex and skip all the tests.

@rpspringuel
Copy link
Contributor

I realize this isn't how gregoriotex currently behaves, but what's wrong with just asking users to use \input themselves when they want that behavior?

@eroux
Copy link
Contributor

eroux commented Feb 25, 2015

Juste in case we need to wrap input into something else... It never harms to have a bit of abstraction...

@rpspringuel
Copy link
Contributor

Perhaps, then a optional argument that forces inclusion? I.e. \includescore[f]{antiphon.gtex}

@eroux
Copy link
Contributor

eroux commented Feb 25, 2015

Why not, though it's not really easy to do in plain TeX (but it should be feasible)

@rpspringuel
Copy link
Contributor

\def\includescore{\@ifnextchar[\gre@includescorewithoption\gre@includescore}
\def\gre@includescorewithoption[#1]{#2}{...}
\def\gre@includescore#1{...}

Edit: Opps. For got an @.

NEW BEHAVIOR: The behavior of `\includescore` has changed. It now has an
optional argument.

When called without the optional argument
i.e. `\includescore{antiphon}`:

  -- Gregoriotex will now check for the existence of
     antiphon-APIVERSION.gtex. If that file does not exist it will be
     compiled from antiphon.gabc.

  -- If antiphon-APIVERSION.gtex exists, the APIVERSION will be checked
     against the current gregoriotexapiversion. If there is a mismatch,
     antiphon.gabc will be recompiled.

  -- antiphon-APIVERSION.gtex will be checked against the modification
     time of antiphon.gabc. The gabc will be recompiled if newer than
     the gtex.

  -- APIVERSION is a number automatically added by gregoriotex.

When called *with* the optional argument
i.e. `\includescore[f]{antiphon.gtex}`:

  -- The file will be immediately passed to TeX. No checks will be made
     until the one in `\gregoriotexapiversion`.

  -- The value of the optional argument can be anything. `[f]` is
     recommended, signifying `Forced`.
Changed due to the refactoring of `\includescore`.
@eschwab
Copy link
Contributor Author

eschwab commented Feb 26, 2015

Refactor \includescore.

NEW BEHAVIOR: The behavior of \includescore has changed. It now has an
optional argument.

When called without the optional argument
i.e. \includescore{antiphon}:

-- Gregoriotex will now check for the existence of
antiphon-APIVERSION.gtex. If that file does not exist it will be
compiled from antiphon.gabc.

-- If antiphon-APIVERSION.gtex exists, the APIVERSION will be checked
against the current gregoriotexapiversion. If there is a mismatch,
antiphon.gabc will be recompiled.

-- antiphon-APIVERSION.gtex will be checked against the modification
time of antiphon.gabc. The gabc will be recompiled if newer than
the gtex.

-- APIVERSION is a number automatically added by gregoriotex.

When called with the optional argument
i.e. \includescore[f]{antiphon.gtex}:

-- The file will be immediately passed to TeX. No checks will be made
until the one in \gregoriotexapiversion.

-- The value of the optional argument can be anything. [f] is recommended, signifying Forced.

The only issue I see is that eventually there may be a little collection of outdated antiphon-APIVERSION.gtex files. Also, version bump. Perhaps not the right version, but until we hammer that issue out, lets do this for now.

@eroux
Copy link
Contributor

eroux commented Feb 26, 2015

Thank you very much! About the outdated files, you're right, I think they should be removed... Otherwise it seems fine to me!

@eschwab
Copy link
Contributor Author

eschwab commented Feb 26, 2015

Right now I can't think of a good way to cleanup the outdated files, in lua atleast.

@eroux
Copy link
Contributor

eroux commented Feb 26, 2015

I might be a bit costy, but when the file with the correct api version is not found, you could remove all antiphon-*.gtex, there are ways to do this in lualibs... but it can be kept for later

@eschwab
Copy link
Contributor Author

eschwab commented Feb 26, 2015

os.remove(string.format("%s-*.gtex", file_root))

If I'm thinking correctly, this should do it.

@eroux
Copy link
Contributor

eroux commented Feb 26, 2015

If it works, then it's perfect!

@eschwab
Copy link
Contributor Author

eschwab commented Feb 26, 2015

From what my tests, os.remove cannot use globs. I found that this works:

os.execute("rm "..file_root.."\\-[0-9]*.gtex")

but I am wondering if that is portable between *NIX, Mac, Windows.

@rpspringuel
Copy link
Contributor

That won't be portable to Windows because Windows doesn't have an rm command. To get something portable you need to avoid os.execute (which just passes its argument to the shell) and use something like os.remove (which translates the intention of the command into something OS appropriate).

To get past the wildcard problem, you might try bringing the list of files in the directory into lua, using lua to identify which filed in the list need to be deleted, and then issuing separate os.remove commands for each.

@eroux
Copy link
Contributor

eroux commented Feb 26, 2015

You'll find some help in here for this, you can \RequirePackage{lualibs} to use it.

@eschwab
Copy link
Contributor Author

eschwab commented Feb 26, 2015

I received an unexpected assignment. I will not have time to tackle this last issue for a week or two.

@eroux
Copy link
Contributor

eroux commented Feb 27, 2015

No problem, thank you very much for your work, I'm merging the current state, even if it's not 100% finished.

eroux added a commit that referenced this pull request Feb 27, 2015
@eroux eroux merged commit cc0432d into gregorio-project:master Feb 27, 2015
@eschwab
Copy link
Contributor Author

eschwab commented Feb 27, 2015

Thank you both for the ideas and critiques.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants