Discussion:
Any Existing Way to Create LaTeX Source, HTML, and E-Reader Output From the Same Source Material
David T. Ashley
2017-07-03 02:54:39 UTC
I'm working on a technical book (table of contents, equations, tables, cross-references, index, etc.), and I'd like to produce all of the following from the same source material:

a)LaTeX source code.

b)HTML output (so the book can be viewed in a browser as a set of web pages).

c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).

Has anyone developed any tools that go in this direction? I can develop my own tools, but I would like to see what exists already first.

Thanks sincerely.
Robert Heller
2017-07-03 12:47:14 UTC
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
Doxygen can do both of the above, although Doxygen is geared toward software
documentation, particularly API documentation.
Post by David T. Ashley
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
Has anyone developed any tools that go in this direction? I can develop my own tools, but I would like to see what exists already first.
Thanks sincerely.
--
Robert Heller -- 978-544-6933
Deepwoods Software -- Custom Software Services
http://www.deepsoft.com/ -- Linux Administration Services
***@deepsoft.com -- Webhosting Services
Dr Engelbert Buxbaum
2017-07-03 12:51:37 UTC
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
You can use TeX4ht (https://tug.org/applications/tex4ht/mn.html) to get
HTML from LaTeX source. One of the formats used by E-readers is pdf, and
pdf is the standard output for modern TeX-compilers.

In addition, there is LaTeXML (http://dlmf.nist.gov/LaTeXML/), which you
can use to produce HTML/XML and possibly epub from a LaTeX source, but I
have no experience with that.
--
DIN EN ISO 9241-13: 9.5.3 Error messages should convey what is wrong,
what corrective actions can be taken, and the cause of the error.
j***@gmail.com
2017-07-03 18:16:00 UTC
The user wants from a single file
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output

I suggest you use http://www.sphinx-doc.org/, which was created to support https://docs.python.org/3/, which can be downloaded as PDF, HTML, Plain Text and EPUB (see https://docs.python.org/3/download.html). The documentation hosting site https://readthedocs.org/ supports Sphinx.

Please let us know what you decide (and why, if you dare).
--
Jonathan
David T. Ashley
2017-07-09 20:10:22 UTC
Post by j***@gmail.com
The user wants from a single file
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output
I suggest you use http://www.sphinx-doc.org/, which was created to support https://docs.python.org/3/, which can be downloaded as PDF, HTML, Plain Text and EPUB (see https://docs.python.org/3/download.html). The documentation hosting site https://readthedocs.org/ supports Sphinx.
Please let us know what you decide (and why, if you dare).
I will roll my own solution, for sure. It will end up having these characteristics:

a)The input will be a regular language (similar in spirit to HTML or to LaTeX). (I mean "regular language" in the computer science finite automata sense.)

b)I will NOT use XML. XML is too verbose and not friendly to humans (or at least not humans who value their time or consider carpal tunnel a risk).

c)I will probably handle equations by linking together LaTeX source and graphics. One obvious way to do this is to normalize the whitespace in the equation source, then key off the SHA256 of the normalized string. Another obvious way is just unique naming.
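
A minimal sketch of the hash-keyed idea in item c), assuming Python. The function names and the "eq-<digest>.png" naming scheme are hypothetical, chosen only for illustration. Note that "normalizing" here just collapses runs of whitespace; two sources that differ in some other way (say, "a b" versus "ab") still get different keys even if TeX would typeset them identically.

```python
import hashlib


def normalize_equation(source: str) -> str:
    """Collapse every run of whitespace to one space and trim both ends."""
    return " ".join(source.split())


def equation_key(source: str) -> str:
    """Return the SHA-256 hex digest of the whitespace-normalized source."""
    normalized = normalize_equation(source)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def equation_image_name(source: str, extension: str = "png") -> str:
    """Hypothetical naming scheme for the rendered graphic: eq-<digest>.<ext>."""
    return f"eq-{equation_key(source)}.{extension}"


# The same equation typed with different spacing and line breaks
# maps to the same key, so it can share one rendered image.
first = r"E = m c^2"
second = "E  =  m\n   c^2  "
print(equation_key(first) == equation_key(second))
print(equation_image_name(first))
```

The converter would render each distinct key once and let every output format look the graphic up by name.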

But I will roll my own because I can get exactly what I want that way.
David T. Ashley
2017-07-09 20:04:33 UTC
Post by Dr Engelbert Buxbaum
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
You can use TeX4ht (https://tug.org/applications/tex4ht/mn.html) to get
HTML from LaTeX source. One of the formats used by E-readers is pdf, and
pdf is the standard output for modern TeX-compilers.
In addition, there is LaTeXML (http://dlmf.nist.gov/LaTeXML/), which you
can use to produce HTML/XML and possibly epub from a LaTeX source, but I
have no experience with that.
Thanks.

I was aware of PDF capability (and there are a few ways to do that), but PDF is awkward on e-readers with smaller screens (because of the fixed layout).

Tablets are getting bigger and cheaper, though, so maybe my concerns are nearly obsolete. I just bought a Lenovo Tab 3 Plus, 10.1-inch. You can get the low-end model for under $200. It has a full 1080p screen, which means you can read technical papers as PDF pretty comfortably. I'm surprised at that price point for a big screen with good resolution.
Eduardo M KALINOWSKI
2017-07-03 22:23:53 UTC
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
Has anyone developed any tools that go in this direction? I can develop my own tools, but I would like to see what exists already first.
pandoc (http://pandoc.org/) might do what you need. I haven't used it,
though.
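
For reference, a sketch of how pandoc could be driven for the three targets from one Markdown source, assuming Python as the glue. The file names book.md and book.* and the helper name are assumptions. The helper only builds the command lines and does not run them, so it says nothing about how well pandoc copes with a full technical book. Note also that the HTML command yields one standalone page, not a set of pages.

```python
import shlex
from typing import List


def pandoc_commands(source: str, stem: str) -> List[List[str]]:
    """Build (but do not run) one pandoc command line per output format.

    pandoc infers the output format from the file extension.
    """
    return [
        # LaTeX source, as a complete document rather than a fragment
        ["pandoc", source, "--standalone", "--output", f"{stem}.tex"],
        # Single HTML page with a table of contents; math rendered by MathJax
        ["pandoc", source, "--standalone", "--toc", "--mathjax",
         "--output", f"{stem}.html"],
        # EPUB for e-readers
        ["pandoc", source, "--toc", "--output", f"{stem}.epub"],
    ]


for argv in pandoc_commands("book.md", "book"):
    print(shlex.join(argv))
```

Each list can be handed to subprocess.run once pandoc is installed.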
Peter Flynn
2017-07-03 22:48:00 UTC
Post by David T. Ashley
I'm working on a technical book (table of contents, equations,
tables, cross-references, index, etc.), and I'd like to produce all
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because
other tools are involved, and there is more than one way to do it).
Has anyone developed any tools that go in this direction? I can
develop my own tools, but I would like to see what exists already
first.
Yes, there are extensive toolchains available. IMNSHO BY FAR the most
reliable is to author the document in XML, using (for example) the
DocBook schema, because you can generate everything else from that,
using transformations written in XSLT2. There is a fully-worked (but
very simple) example of XML-to-LaTeX towards the end of _Formatting
Information_ at
http://latex.silmaril.ie/formattinginformation/tolatex.html#xml2latex
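
To give a flavour of the XML-to-LaTeX step, here is a toy sketch. It is not the XSLT2 approach described above: it swaps in Python's standard xml.etree.ElementTree, and it handles only a handful of DocBook-style element names (article, section, title, para, emphasis) without namespaces. The sample document and all function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample using a few DocBook-style element names (no namespace).
SAMPLE = """<article>
  <title>Single-source publishing</title>
  <section>
    <title>Why XML</title>
    <para>One source, <emphasis>many</emphasis> outputs.</para>
  </section>
</article>"""

# Characters that need escaping in ordinary LaTeX text.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape(text):
    """Escape LaTeX special characters one character at a time."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in (text or ""))


def inline(element):
    """Render mixed content: text, child elements, and the text after each."""
    parts = [escape(element.text)]
    for child in element:
        if child.tag == "emphasis":
            parts.append(r"\emph{" + inline(child) + "}")
        else:
            parts.append(inline(child))  # unknown inline element: text only
        parts.append(escape(child.tail))
    return "".join(parts)


def to_latex(xml_source):
    """Convert the tiny article structure above into a LaTeX document."""
    root = ET.fromstring(xml_source)
    lines = [
        r"\documentclass{article}",
        r"\title{" + inline(root.find("title")) + "}",
        r"\begin{document}",
        r"\maketitle",
    ]
    for section in root.findall("section"):
        lines.append(r"\section{" + inline(section.find("title")) + "}")
        for para in section.findall("para"):
            lines.append(inline(para))
            lines.append("")  # blank line ends the paragraph
    lines.append(r"\end{document}")
    return "\n".join(lines)


print(to_latex(SAMPLE))
```

A second function walking the same tree could emit HTML, which is the point of keeping one structured source.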

I've written four books, dozens of articles, all my documentation, and
several web sites in XML this way, and generated LaTeX to get PDF, HTML
for the web, and EPUB3 to get an ebook. There will be screams from the
gallery that XML is "too hard" but it's not: it's actually simpler than
LaTeX, just different.

The only problem is that XML editors are mostly not aimed at WRITING, so
they lack many of the features and facilities that you expect from a
writing tool like a wordprocessor. They are improving, slooooooowly, but
it's still an effort, which is a pity: it could easily be so much simpler.

But I consider that a very small penalty for the ability to maintain a
single source document from which I can produce lots of different
formats. You also need to take into account that I'm heavily biased:
I've been doing this for decades, so it's second-nature. I'm also a
dyed-in-the-wool Emacs user, so all the tools I need are immediately
available either within Emacs or one click away. Everything else (DTD,
RNG, XSLT2, LaTeX, epubcheck, browsers, XML tools) is available for
download.

There is a learning curve but it's well documented and there is plenty
of help online. <plug class="shameless">There is also the XML Summer
School in Oxford in mid-September where you can meet people who do this
stuff, including me. http://xmlsummerschool.com/</plug>

///Peter
--
Formatting Information: http://latex.silmaril.ie/
XML FAQ: http://xml.silmaril.ie/
Axel Berger
2017-07-04 06:13:35 UTC
Post by Peter Flynn
There will be screams from the
XML as such is not hard imho, rather primitive in fact, but extremely
verbose. As far as I can see it's nothing but a huge mess of tags with
strict rules about opening and closing. Whether the current choice of
tags is in any way structured, logical and useful seems mostly up to you
or to whoever defined the current set in use.

Its one plus is that it is eminently machine readable, which helps with
the current task of translation. Both LaTeX to HTML and HTML to LaTeX
often run into the problem of missing or implied end tags which makes
scripting those tasks hard and error prone.

To my mind it all hinges on the availability of a well-defined and well
thought-out set of predefined tags to use; XML as such is just a
formalism. And as for writing, XML's verbosity tends to make source code
very hard to read, proofread and correct. Readable source was LaTeX 2.09's
biggest selling point, retained to a slightly lesser degree in current
LaTeX, and it is what makes Markdown so attractive for all the simpler
cases where it suffices.

N.B: For a case of extremely bad XML, take a look at my small sample in
http://berger-odenthal.de/upload/docx.zip
--
/¯\ No | Dipl.-Ing. F. Axel Berger Tel: +49/ 221/ 7771 8067
\ / HTML | Roald-Amundsen-Straße 2a Fax: +49/ 221/ 7771 8069
 X in | D-50829 Köln-Ossendorf http://berger-odenthal.de
/ \ Mail | -- No unannounced, large, binary attachments, please! --
Peter Flynn
2017-07-04 20:57:07 UTC
Post by Axel Berger
Post by Peter Flynn
There will be screams from the
XML as such is not hard imho, rather primitive in fact, but extremely
verbose. As far as I can see it's nothing but a huge mess of tags
If it's allowed to get out of hand, yes. It can actually be quite
minimalist.
Post by Axel Berger
with strict rules about opening and closing.
That's essential for writing robust programs that process it, as
you correctly noted.
Post by Axel Berger
Whether the current choice of
tags is in any way structured, logical and useful seems mostly up to you
or to whoever defined the current set in use.
The latter, I hope. There really isn't any need to define your own these
days unless you need something very complex or unusual.
Post by Axel Berger
To my mind it all hinges on the availability of a well-defined and well
thought out set of predefined tags to use
There is a good choice for most purposes.
Post by Axel Berger
formalism. And as for writing, XML's verbosity tends to make source code
very hard to read, proofread and correct.
Right. Which is why everyone wants WYSIWYG. An XML editor will take care
of the syntax and behaviour, but the things writers need to do in
actually writing are usually not as well catered-for as in (say) Word or
LaTeX.
Post by Axel Berger
N.B: For a case of extremely bad XML, take a look at my small sample in
http://berger-odenthal.de/upload/docx.zip
:-)

///Peter
Manuel Collado
2017-07-04 08:57:42 UTC
Post by Peter Flynn
...
The only problem is that XML editors are mostly not aimed at WRITING, so
they lack many of the features and facilities that you expect from a
writing tool like a wordprocessor. They are improving, slooooooowly, but
it's still an effort, which is a pity: it could easily be so much simpler.
There is a wordprocessor-like XML validating editor. Please see:

http://www.xmlmind.com/xmleditor/

"XMLmind XML Editor is a strictly validating, near WYSIWYG, DocBook
editor, DITA editor, MathML editor, XHTML editor, XML editor."

This product has existed for more than a decade.

HTH.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
m***@gmail.com
2017-07-04 20:58:20 UTC
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
You can use make4ht [1]. It is based on tex4ht, but it adds some useful features, such as more human-friendly CLI options and Lua-based build files, which can be used to call external applications like bibtex or makeindex and to post-process the generated files.
Post by David T. Ashley
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
Try tex4ebook [2]. It supports the same features as make4ht, but it can generate Epub and Epub3 ebooks directly.

Both of these tools are included in common TeX distributions.

[1] https://www.ctan.org/pkg/make4ht?lang=en
[2] https://www.ctan.org/pkg/tex4ebook?lang=en
Leon van Dommelen
2017-07-08 03:59:29 UTC
Post by David T. Ashley
I'm working on a technical book (table of contents, equations, tables,
cross-references, index, etc.), and I'd like to produce all of the
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because other
tools are involved, and there is more than one way to do it).
Has anyone developed any tools that go in this direction? I can develop
my own tools, but I would like to see what exists already first.
I have used the "standard" (actually, outdated, but the new one works the
same) LaTeX to pdf conversion and a custom version of latex2html for my,
so far, 1600 page online book "Quantum Mechanics for Engineers",

http://www.eng.famu.fsu.edu/~dommelen/quantum/

It does all the usual stuff correctly (as far as I know): auto-generated,
hot-linked table of contents, index, bibliography, etc., in both the HTML
and PDF versions. The HTML version has some links to GIF movies. But
e-readers (which I do not have and know nothing about) may have some
difficulties with the PDF; see the comments on the above link for
example. And in the HTML version the mathematics is converted to images,
which has esthetic and functional limitations. (But then, so have all
other solutions I have seen so far. LaTeX2HTML will put the LaTeX source
of the equation in the ALT attribute, which is really good, though I do not
know of anything actually using it for, say, those with limited vision.)
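
As an illustration of the ALT idea only (this is not LaTeX2HTML's own code), a small Python sketch of emitting an equation image whose alt text carries the LaTeX source. The function name and the file name img1.png are assumptions.

```python
from html import escape


def equation_img(latex_source: str, image_path: str) -> str:
    """Return an <img> element whose alt attribute holds the LaTeX source.

    html.escape is needed because LaTeX math often contains characters
    such as < and & that are special in HTML attribute values.
    """
    alt = escape(latex_source, quote=True)
    src = escape(image_path, quote=True)
    return f'<img src="{src}" alt="{alt}">'


print(equation_img(r"\int_0^1 x^2 \, dx < \frac{1}{2}", "img1.png"))
```

A screen reader or a text-only browser then gets the equation source instead of nothing.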

My source is standard LaTeX, though I use some minor personal extensions
to customize the web page headers and footers. You seem to say that you
want to *produce* the LaTeX source, however. (Which I take to mean from
an XML source or so.) Maybe you could consider instead creating the
LaTeX source yourself, like I do, and trying to find a converter that
converts it to XML. I would think that is a converter someone would
already have written, LaTeX being such a well structured language.

Leon
j***@gmail.com
2017-07-09 21:21:15 UTC
Have a look at MathBookXML, at https://mathbook.pugetsound.edu/ and https://groups.google.com/forum/#!forum/mathbook-xml-support.
Tim
2017-07-11 13:25:07 UTC
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
Has anyone developed any tools that go in this direction? I can develop my own tools, but I would like to see what exists already first.
Thanks sincerely.
You have a lot of great responses, but I'll chime in too. If you don't have a lot of tables, I think markdown with latex math is a great solution.

If you do have complicated tables or other complicated document elements that are too much for Markdown, I second Peter's suggestion of authoring in XML (I have a preference for DocBook); you can then go to PDF, HTML, etc. from there.

I support writers who use a restricted set of LaTeX as the source and use plasTeX (a Python parser/tokenizer) to create DocBook XML. So I use the LaTeX to get the PDF version and the DocBook XML to create the various HTML flavors (including EPUB).

http://tiarno.github.io/plastex/

good luck,
--Tim
l***@magic.ms
2017-08-01 15:10:52 UTC
Post by David T. Ashley
a)LaTeX source code.
b)HTML output (so the book can be viewed in a browser as a set of web pages).
c)E-reader source code or output (this is a complex topic, because other tools are involved, and there is more than one way to do it).
Has anyone developed any tools that go in this direction? I can develop my own tools, but I would like to see what exists already first.
Thanks sincerely.
No one has yet dared to mention that TeX was designed for creating publications to be printed on sheets of paper. Its fundamentals involve fixed dimensions: page width, page depth, line width, font height, et cetera. When publishing to HTML or to e-books, these dimensions become fluid. Someone reading pages in HTML or in an e-book expects to be able to change the type size at will, and expects the line width and the page length to change instantly to reflect the new type size.

The author of an HTML page can conceive of writing a page which extends (almost) infinitely in the vertical direction.

Since TeX doesn't do its page layouts in that manner, any conversion will either demand extensive modification by hand afterwards or require the author to have a high tolerance for awful-looking pages.

I will state my preference: TeX is the most sophisticated tool for creating documents published on paper. It will do a reasonable job on HTML pages and e-book pages, because both of those offer less sophistication. But the most æsthetically pleasing results are achieved when an HTML page or an e-book page is coerced into displaying the PDF output of a TeX document.