Coherent Spin Ratchets

Transport and Noise Properties

by Manuel Strehl

The diploma thesis is available at /thesis/thesis.xhtml. It is in XHTML format, which current versions of Internet Explorer cannot display. If you use that browser, consider switching to Mozilla Firefox, Google Chrome, Opera or any other modern browser.

How the Conversion from TeX Was Done

The original source was written in TeX to produce the final PDF for thesis submission. However, since I had been playing with the SVG export functionality of various scientific plotting tools, all the graphs were already available as SVG (although in quite varying quality).

For import into TeX they were batch-converted to PNG, because experiments with converting SVG to EPS gave unsatisfactory results. This was mainly due to the missing support in PostScript for the gradients that some of the diagrams used.

Once the LaTeX source was finished, it was converted with TeX4ht into an XHTML+MathML document that still linked to the PNGs from the original source.

After manually cleaning up the generated markup, which can be quite time-consuming, I started to paste the SVG graphics inline into the XHTML file. This is not fully finished, since Grace’s SVG output in particular needs heavy reworking before it displays properly.
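
The pasting step could in principle be scripted. The following is only a minimal sketch in Python, not the procedure used for the thesis, where the graphics were pasted by hand. It assumes that the XHTML references each graphic through an img element and that the matching SVG source is available as a string. The function names inline_svg and duplicate_ids are hypothetical and exist only for this illustration. Since all inlined graphics end up in one XML document, the second function reports id values that occur more than once.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Serialize with readable prefixes instead of the generated ns0, ns1, ...
ET.register_namespace("", XHTML_NS)
ET.register_namespace("svg", SVG_NS)


def inline_svg(xhtml_root, svg_sources):
    """Replace <img src="..."/> elements by inline SVG.

    svg_sources maps the value of the src attribute to SVG markup
    (hypothetical input format). Images without an entry are left
    untouched. Returns the number of replaced images.
    """
    replaced = 0
    # Take a snapshot of all elements so the tree can be modified safely.
    for parent in list(xhtml_root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != f"{{{XHTML_NS}}}img":
                continue
            svg_text = svg_sources.get(child.get("src"))
            if svg_text is None:
                continue
            svg_root = ET.fromstring(svg_text)
            svg_root.tail = child.tail  # keep the text that followed the image
            parent[index] = svg_root
            replaced += 1
    return replaced


def duplicate_ids(root):
    """Return a sorted list of id values used more than once in the tree."""
    counts = Counter(
        element.get("id") for element in root.iter()
        if element.get("id") is not None
    )
    return sorted(value for value, count in counts.items() if count > 1)
```

A typical use would be to parse the XHTML file with ET.parse, call inline_svg on its root element, check that duplicate_ids returns an empty list, and write the tree back. Font sizes and coordinate transformations inside the SVG still have to be reviewed by hand.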

Problems with the Conversion

When trying to combine these two worlds, TeX and XML, you will run into several problems. Some of those I encountered are listed here.

  • SVG output in Grace and gnuplot: That’s a story of its own. It is improving (some 4.1.X version of gnuplot actually produces nice SVG output), but you will still have to review the generated SVG to get it to display smoothly when inlined in XHTML. Font sizes in particular vary heavily depending on where the SVG is viewed. Reviewing font sizes and transformations of the user coordinate system is therefore quite important when inlining graphics.
  • Handcrafted and “Inkscape’d” graphics: These work quite well, but you have to make sure that IDs do not conflict. Keep in mind that they have to be unique throughout the whole final document.
  • TeX4ht: This tool converts LaTeX to XHTML+MathML. It’s quite nice, but it has some quirks that can really drive you nuts. One very annoying one is that you have to make sure every sub- and superscript is enclosed in braces. So instead of simply writing A_b you will find yourself correcting it to A_{b} a thousand times. This is where a sound knowledge of regular expressions comes in handy.
  • Style: Semantically, the output of TeX4ht is quite a mess. That is not the fault of TeX4ht but a consequence of trying to force the styling marked up in LaTeX into some semantic shape. I fixed that by spending hours cleaning up the code. In the end I came up with quite a nice stylesheet, which also implements some of LaTeX’s features, such as automatic numbering of figures and equations, using CSS counters (the counter-reset and counter-increment properties).
  • Browsers: There is only one browser out there that displays this really nicely: Firefox > 1.5. Opera catches up with v9.5, but still has an ugly MathML rendering. Safari skips MathML completely. Ah yes, there is this other browser whose name I just can’t recall. On my system it offers to open XHTML files in Firefox. That’s nice.
  • Fonts: For nice display of MathML in Firefox you will need to install some fonts. Do it; Computer Modern in particular should be on your system. (Perhaps that’s the reason why Opera’s rendering is not as nice?) Related to this topic is the use of ligatures in the text. I decided to keep the hardcoded Unicode ligatures that TeX4ht produced. The big disadvantage is that finding text can be a real problem. Example: the text is “define” (marked up as de&#xFB01;ne). It looks good but will not be found by typing “define” into a search field. Perhaps this should rather be done via scripting afterwards.
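
Two of the points above, the braces around sub- and superscripts and the hardcoded ligatures, lend themselves to small scripts. The following Python sketch is an illustration under stated assumptions, not necessarily what was used for the thesis. The names brace_scripts and expand_ligatures are hypothetical. The regular expression only handles a single bare letter or digit after _ or ^ and leaves control sequences such as x_\alpha alone. It also assumes that it is applied to math-mode source only, because underscores in file names or labels would be changed as well.

```python
import re

# Matches _ or ^ (not preceded by a backslash) followed by one bare letter
# or digit. Scripts that are already braced (A_{b}) and control sequences
# (x_\alpha) do not match and are left unchanged.
BARE_SCRIPT = re.compile(r"(?<!\\)([_^])([A-Za-z0-9])")


def brace_scripts(math_source):
    """Turn A_b into A_{b} and x^2 into x^{2}.

    Assumption: math_source contains math-mode LaTeX only.
    """
    return BARE_SCRIPT.sub(r"\1{\2}", math_source)


# Latin ligatures from the Unicode block "Alphabetic Presentation Forms".
LIGATURES = str.maketrans({
    "\uFB00": "ff",
    "\uFB01": "fi",
    "\uFB02": "fl",
    "\uFB03": "ffi",
    "\uFB04": "ffl",
})


def expand_ligatures(text):
    """Replace ligature characters by their plain letter sequences."""
    return text.translate(LIGATURES)
```

For example, brace_scripts("A_b + x^2") gives "A_{b} + x^{2}", and expand_ligatures turns the string containing U+FB01 from the example above back into the searchable “define”. Note that expand_ligatures works on decoded text; a raw character reference such as &#xFB01; in the file has to be resolved by an XML parser first. This removes the ligatures from the source for good, which is a different trade-off from adding them by script in the browser.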

Other Remarks

  • To embed maths afterwards (say you forgot one really important equation), you can write LaTeX formulas directly in the HTML and have them converted live by LaTeXMathML.js.
  • Just a thought: provide a nice XSL-FO+SVG stylesheet to create a PDF from the XHTML+MathML+SVG. The problem here is clearly the maths. At the moment, print output is quite an issue. Hopefully you didn’t throw away your LaTeX source!
  • Some of the reworking of SVG sources can be done nicely in Inkscape. I opened some of the plots from Grace, removed all groupings, set new font sizes and saved them again. That cleaned up the code quite nicely and also added things such as namespace declarations that were missing, e.g., in the Grace output.

Conclusion

Exporting TeX, which is more or less the standard for publishing in physics, to XHTML+MathML+SVG still requires a lot of knowledge of both worlds if you want useful results.

But for publishing scientific data on the web, the combined format could be a huge opportunity. Instead of having PDFs all over the place, viewing papers directly and natively inside browsers has the potential to be a breakthrough in scientific document markup and retrieval. This has to be seen in conjunction with the rise of the Open Access idea. Only for publishing in traditional scientific journals is it no alternative, because controlling downloads of PDFs is much simpler than controlling downloads of plain-text XML files.

A very important point is that two of the three useful browsers support all the features needed for simple, cleanly marked-up scientific documents. WebKit/Safari, however, could implement MathML support when needed, or one could use a stylesheet such as pMML2SVG. The only major browser really incapable of displaying anything reasonable is IE, because it lacks support for the application/xml MIME type and cannot handle anything but proprietary XML extensions to XHTML. But looking at the distribution of browsers in the scientific community, this should not be a problem.

To conclude, displaying XHTML+MathML+SVG works very well in the scientific field today. What is missing, though, are authoring tools. TeX is still the tool of choice, and the XML community would be well advised to contribute easy-to-use tools to get a foot in the scientific door.

Manuel Strehl
August 2008