Handle one-html-file books a bit better #39

stuartlangridge · 2014-03-08T17:48:39Z

Beru doesn't deal all that well with epubs which have the whole book content in one massive HTML file rather than a number of small files. Now, obviously, a well-put-together epub doesn't do that, but I clearly have a fair number of non-well-put-together epubs. The book reader will essentially hang for seconds at a time, especially when switching back to Beru from another app, or after rotating the phone. (Note: Beru itself is not hung: the toolbar shows up fine.)

rschroll · 2014-03-08T18:39:12Z

I suspect the problem isn't with Beru itself, but with the Monocle
library we're using to lay out the epub. Beru is just responsible for
unzipping and serving the HTML file, and this is done in C++, so it's
unlikely to be slow. Monocle is responsible for laying out the view,
and this happens in JavaScript. The fact that this slowness happens on
rotation, which should only involve Monocle code, supports this
suspicion.

Unfortunately, this means that the fix lies within Monocle, and I don't
know that code very well. I'll leave this bug open, but don't expect a
quick solution. It may be worth opening a bug with Monocle about this.
If you do so, please link it here. Or, if you can send me a
problematic epub, I'll open a Monocle bug.

stuartlangridge · 2014-03-08T18:42:20Z

I agree almost completely with you, but I think that Monocle won't get any faster because it's just dealing with a really big file. My thought was that Beru could unzip the file and then, if it's large (Calibre splits files up into 260KB chunks by default if they're larger than that), Beru breaks it up into bits as if it were actually produced that way. But... that might be hard if you have to also rewrite-on-the-fly the spine stuff; the impression I had from Monocle was that you can tell it "here is a list of files" any place where it wants one file, but I might very well be wrong because I hardly know that code at all :)
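The "if it's large" check could be done by looking at the uncompressed size of each HTML entry in the epub's zip container. The following is a minimal Python sketch for illustration only, not Beru code (Beru does its unzipping in C++). The function name `oversized_html_entries` and the list of file extensions are assumptions, and the 260 KB limit simply mirrors the Calibre default mentioned above.

```python
import zipfile

# Assumption: treat these extensions as book content documents.
HTML_EXTENSIONS = (".html", ".xhtml", ".htm")

def oversized_html_entries(epub, limit=260 * 1024):
    """Return (name, uncompressed_size) for HTML entries larger than `limit`.

    `epub` may be a file path or a file-like object, as accepted by
    zipfile.ZipFile.
    """
    with zipfile.ZipFile(epub) as zf:
        return [
            (info.filename, info.file_size)
            for info in zf.infolist()
            if info.filename.lower().endswith(HTML_EXTENSIONS)
            and info.file_size > limit
        ]
```

A book with no oversized entries would be served unchanged; only the files returned here would be candidates for splitting.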

rschroll · 2014-03-08T19:05:34Z

I thought of that briefly, since it does get around the
Monocle-needs-to-parse-the-big-file issue. Rewriting the spine
wouldn't be too much of a problem. The two issues I fear are:

1. Figuring out where to split the file. Monocle starts a new page for
   each new HTML file, so you don't want to do this in the middle of a
   line. But you could do this just before a header and probably be okay.
2. Re-targeting internal links. We'd have to find all anchors, figure
   out what new file they'll end up in, and adjust all links to point to
   that new file. Not impossible, but it leaves a lot of places to screw
   up.
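The two steps above can be sketched roughly as follows. This is an illustrative Python sketch under loud assumptions: it uses crude regular expressions where a real reformatter should use a proper HTML parser, it only looks at `id` attributes (not legacy `name` anchors), it operates on the body markup only, and the names `split_before_headings`, `anchor_map`, and the `partNNN.html` naming scheme are all hypothetical.

```python
import re

# Crude patterns, for illustration only; a real tool should parse the HTML.
HEADING = re.compile(r"<h[1-6][\s>]", re.IGNORECASE)
ANCHOR = re.compile(r"""\sid\s*=\s*["']([^"']+)["']""", re.IGNORECASE)

def split_before_headings(body, max_size):
    """Greedily split `body`, cutting only just before a heading tag.

    A chunk can still exceed `max_size` if it contains no heading to cut at.
    """
    chunks = []
    start = 0
    prev_cut = None  # latest heading position after `start`
    positions = [m.start() for m in HEADING.finditer(body)] + [len(body)]
    for pos in positions:
        if pos - start > max_size and prev_cut is not None:
            chunks.append(body[start:prev_cut])
            start = prev_cut
        prev_cut = pos if pos > start else None
    chunks.append(body[start:])
    return chunks

def chunk_name(index):
    """Hypothetical naming scheme for the generated files."""
    return f"part{index:03d}.html"

def anchor_map(chunks):
    """Map each anchor id to the name of the chunk it ended up in."""
    mapping = {}
    for index, chunk in enumerate(chunks):
        for match in ANCHOR.finditer(chunk):
            # Keep the first occurrence if an id is (incorrectly) repeated.
            mapping.setdefault(match.group(1), chunk_name(index))
    return mapping
```

Joining the chunks back together reproduces the original markup, so nothing is lost by the split; the anchor map is what the link re-targeting step would consume.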

Frankly, I wonder if a better solution is a dedicated epub reformatter
that takes care of this. Potentially, this could ship with Beru and be
run the first time an epub is opened.

stuartlangridge · 2014-03-08T19:07:43Z

Hm. I hadn't thought about links. Darn. That is a problem.

A dedicated reformatter would be ideal, wouldn't it? The only one I know of
is in Calibre, though, and that (while state-of-the-art) is in Python and
therefore no use.

I'll have a poke around and see if I can find anything, although I'm sure
you'll do the same!



rschroll · 2014-03-08T19:59:05Z

Actually, it may not be as bad as I feared. We don't actually need to
rewrite all links in all HTML files. Instead, we could adjust where
those links lead later, either client side or server side. We could
detect clicks on links with JavaScript and adjust their targets before
the request is sent out, though this would require passing a remapping
from the server to the client. Or we could wait for the requests to
come in and serve the correct part of the file, though this would
require us to get the anchors, which aren't passed to the server, as
far as I know.
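Whichever side does the adjusting, the core of it is a lookup from anchor to new file. The following is a minimal Python sketch of that lookup, for illustration only; the function name `remap_link`, its parameters, and the `partNNN.html` file names are assumptions, and it assumes it is only called for links that appear inside the split file or that have already been resolved relative to it. On the client the same logic would have to live in JavaScript, since URL fragments are not included in the request the server sees.

```python
from urllib.parse import urldefrag

def remap_link(href, original_file, anchor_to_chunk, first_chunk):
    """Rewrite a link into the original big file so it targets the right chunk.

    `anchor_to_chunk` maps anchor ids to generated file names (hypothetical);
    `first_chunk` is used when the link has no fragment or an unknown one.
    """
    path, fragment = urldefrag(href)
    if path not in ("", original_file):
        return href  # points at some other file; leave it alone
    target = anchor_to_chunk.get(fragment, first_chunk)
    return f"{target}#{fragment}" if fragment else target
```

For example, with a mapping of `{"ch2": "part001.html"}`, a link to `book.html#ch2` would become `part001.html#ch2`, while links to other files pass through unchanged.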

rschroll added the enhancement label Jun 27, 2014
