In my experience, when exporting to html, you don't lose your formatting; you get your formatting so fubarred nothing can make it look decent after...
The only thing I've seen export to html right is latex, and I know we don't all use latex!
And even that requires certain rules in your markup for it to export right!
Latex is so sexy
So you're a LaTex user poo?
BTW, I just looked up your website www.tomchu.com on Netcraft, and it reports your site as running on Linux. If you view the detailed site report, it shows the site running on Linux and FreeBSD hosts but never on a Windows host. Please explain. Are you not true to your beliefs, or do you just troll for the fun of it? Are you going to change your hosting provider?
Wow. How are you any less of a troll?
No, I was feeding the troll. I know you shouldn't do it, but sometimes it is irresistible.
Secondly, and more seriously, if a regular poster behaves in a way that demonstrates consistent hypocrisy and double standards, and one discovers evidence to support this, then one has a moral responsibility to bring it to the forum's attention. So I repeat myself:
Linux Is Poo, I just looked up your website www.tomchu.com on Netcraft, and it reports your site as running on Linux. If you view the detailed site report, it shows the site running on Linux and FreeBSD hosts but never on a Windows host. Please explain. Are you not true to your beliefs, or do you just troll for the fun of it? Are you going to change your hosting provider?
Edited 2005-11-17 12:40
Thanks, Christian, for the epiphany I experienced reading your article. A while back, there was an OSNews blurb about how Sun was touting that ODF would transform the Internet. I could see how it would provide many things, but transforming the Internet was not one of them.
You very clearly explain why that makes sense, and I am very excited about this possibility. Great job.
A common misconception about Acrobat is that it lets you directly alter text in a PDF file. This really isn't true. I think the Pro version might give you options for exporting a PDF into DOC format, but you can't necessarily do live, direct editing on the file, AFAIK.
I have something to say here.
PDF is a presentational format. Therefore, it's not suited to be edited (ever tried to edit some text in Acrobat? it doesn't reflow, so you get your text screwed up).
It's a very good format for publishing things. The way you created it (the way it looks) is the way it will look on every platform, today or in 15 years.
But it's not suitable for being edited. Even though PDFs can be tagged (so the PDF viewer knows that all those lines of text indeed belong to the same paragraph), text reflow (and re-layout) can (will) break things.
Just my .02
NachoKB
"Export to xsl and make templates"
Me izz Mr. Wise Guy tonight... XSL (resp. XSLT) is a language in which you can write templates to transform XML code - so no need to "export to XSL"
Okay, here's the serious part of my answer: Doesn't make sense - does it? Using XSL you can transform the XML Code of the OpenDocument format into something else. But you'd still need some new capability in the browser so that "something else"(tm) is displayed with the layout etc. of the original file.
Edited 2005-11-16 19:42
Ok, I wrote my comment really fast because the boss was coming ;-)
Seriously, I think people focus too much on the form of the document, and not enough about its content.
An OpenDocument application should generate an XSL template for PDF generation (XSL-FO, for example) or HTML generation, along with the generated document (or there could be 2 standard XSL templates for this).
This way, a document exported from OpenDocument to XML could be rendered directly in browsers that support XSL Transformations.
Does it sound better now?
Some while ago i was thinking about a very evil plan about this topic:
I think the perfect content manager(tm) should store its pages in as rich a form as possible, probably XML (and the better/cooler choice now is OpenDocument), and should transform the pages with XSL server-side (obviously with aggressive caching of the generated documents so as not to eat too much CPU).
And then when the user accesses the site, (s)he can specify the preferred format, for example:
-www.ubercoolsite.com/index.html will be the default, normal html page
-www.ubercoolsite.com/index.odt will be opendocument and so on
in this way there will be a consistent way to generate html, opendocument, pdf and tons of differently formatted documents, with very little effort.
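A rough sketch of that idea, for what it's worth: one rich source per page, a renderer picked from the extension in the URL, and a cache in front of it. Everything here (the `PAGES` store, the `<page>` markup, the `serve` function, the `.txt` renderer) is made up for illustration; an `.odt` or `.pdf` renderer would presumably be registered in `RENDERERS` the same way.

```python
from functools import lru_cache
from html import escape
from pathlib import PurePosixPath
from xml.etree import ElementTree as ET

# Hypothetical page store: one rich XML source per page name.
PAGES = {"index": "<page><title>Home</title><para>Welcome.</para></page>"}


def render_html(source: str) -> bytes:
    """Turn the stored XML into a very plain HTML page."""
    root = ET.fromstring(source)
    title = escape(root.findtext("title", default=""))
    body = "".join(f"<p>{escape(p.text or '')}</p>" for p in root.iter("para"))
    page = f"<html><head><title>{title}</title></head><body>{body}</body></html>"
    return page.encode("utf-8")


def render_text(source: str) -> bytes:
    """Turn the same stored XML into plain text."""
    root = ET.fromstring(source)
    parts = [root.findtext("title", default="")]
    parts += [p.text or "" for p in root.iter("para")]
    return "\n\n".join(parts).encode("utf-8")


# The extension in the URL selects the output format.
RENDERERS = {".html": render_html, ".txt": render_text}


@lru_cache(maxsize=256)
def serve(path: str) -> bytes:
    """Render a page in the requested format; repeated requests hit the cache."""
    requested = PurePosixPath(path)
    extension = requested.suffix or ".html"  # HTML is the default format
    if requested.stem not in PAGES or extension not in RENDERERS:
        raise KeyError(path)
    return RENDERERS[extension](PAGES[requested.stem])
```

Calling `serve("/index.html")` and `serve("/index.txt")` renders the same stored page two ways; a second call with the same path comes straight out of the cache.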
... possible to export OpenDocument to HTML without losing a lot.
This is possible for other file formats and of course also for ODF.
Besides that... There is no reason to export ODF to HTML. All you need is a PDF-like plugin for ODF. Or to put it another way: An ODF-Viewer plugin.
You don't export PDF to HTML when viewing PDF files in Firefox. Nor should it be necessary to export ODF to HTML.
We "just" need the plugin-viewer 
Me again.
As I posted in [1], there's a big difference between PDF and ODF, as I see them. On one hand, PDF is a presentational format, meaning it brings the highest (visual) output fidelity possible, at the expense of semantics (for example, it doesn't know it's a title, it just knows it's bold, big and with font xxx); on the other, ODF aims to preserve documents structurally (XML is perfect for that), at the expense of output fidelity (I've known some people who freaked out when Word changed the kerning by 0.01% or something like that on a different system, so a 100+ page document gained a few more lines because of reflow; of course, that reflow broke the pixel-aligned layout they had carefully crafted in a WYSIWYG view... pity).
So, a PDF-viewer plugin has to know how to draw exactly what the file says. On the other hand, an ODF-viewer plugin should generate a layout, and it probably will be different than that of the app which created it. This alone is not bad, but...
Furthermore, as the ODF file preserves structure information (practically speaking a DOM), it makes sense to leverage all of the browser technologies (stylesheets, scripting, anything else). It's just that ODF and HTML are somewhat alike.
So, with ODF you got a document, while with PDF you got a snapshot of it.
[1] http://osnews.com/read_thread.php?news_id=12685&comment_id=6091...
Not just you. Me too (sorry if I said otherwise).
I don't want yet another standard. There are two arguments to the contrary:
1- W3C is really underperforming... its standards are obsolete even before they get released, and are a mess to implement (or even understand; I mean, why on earth is it so damn difficult to center something, horizontally or vertically, with HTML/CSS? what about CSS3? isn't it already 3+ yrs obsolete?)
2- there's information inside the ODF which it would be a shame not to expose to the rest of the browser (structure blablabla).
Anyway, I prefer a slow, ineffective standards body to another one which accepts (and even enacts) patented standards.
KB
That's exactly a kind of vision Gary Edwards had in his article "OpenOffice.org 2.0 leaping over legacy lockdown with clean XML", i recommend reading it:
http://madpenguin.org/cms/html/62/5304.html
There have been many rich formats before. Why didn't the browser makers rush to display them, but overcome all kinds of problems with poor HTML, Javascript, DHTML and such things?
Maybe now with broadband Internet more available, people will be ready to use rich documents?
You wrote about plugins, and I think it's up to the OOo developers to make plugins for different browsers. That will speed things up.
I remember the old days when MS also created things like PowerPoint viewers for those who didn't have MS Office.
Wouldn't it just be better to support XSL-FO? http://www.w3schools.com/xslfo/xslfo_intro.asp IIRC it's supposed to be W3C's XML version of PDF. It is supposed to be powerful enough for typesetting books, so I assume it should be good enough for Word/ODF. And since ODF is also XML-based, all that is needed should be an XSLT transformation from one to the other (I believe that there are compilers that will take an XSLT and spit out the C code to do it, to improve performance). It seems more logical to stay within W3C's suite of tools.
Apparently Apache's FOP project is an open-source XSL-FO processor... perhaps this could be used as a base. http://xmlgraphics.apache.org/fop/
---
Edit: It now seems to me that XSL-FO is the XSL the previous post mentions. I originally understood it as meaning just making an ODF -> XHTML XSLT
Edited 2005-11-16 19:45
XSL-FO is a 'read-only' format. You have to go through a 'compilation' stage (e.g. with FOP) in order to get something viewable.
XSL-FO is designed to be a print medium, i.e. to allow publishers to store their books in an 'intermediate' format that can be exported to a printable binary file (pdf, ps etc.)
In short, XSL-FO is _not_ going to replace opendoc any time soon.
OTOH, it would be nice to be able to easily convert between opendoc and XSL-FO - you could do this with an XSL stylesheet.
--Robin
I started reading thinking that it would be a nice idea, but then it struck me: If the "biggest" desktop browsers supported ODF rendering directly, many pages could become simply a collection of ODF files, odf replacing normal html, and if that doesn't really matter for desktops (as they would have support for seeing them), for pdas, smartphones and the like it would cut them off. Of course they could have odf rendering too, but we have html for a reason, and if we start having other page formats it's going to make the internet an even more confusing mass of information.
Just my 2 (euro) cents.
There are already several plugins to show rich formats in the browser, but they're not replacing HTML, nor XML on the websites.
So I don't think you have to worry. Supporting ODF in the browser is not different than supporting DOC,PPT,RTF and whatever. It's just another plug-in.
There are already several plugins to show rich formats in the browser, but they're not replacing HTML, nor XML on the websites.
It doesn't need to replace XML. ODF is a form of XML, so someone would just need to create a means of rendering that XML in browsers.
This could be done a few different ways, but it seems to me that the guy has a point. We've made browsers to render XML (including HTML). Now, we just have another form of XML. Why not make it so browsers can render that, too?
XML files really aren't much more than simple text files. Some binary data could be included with XML such as pictures or sound files. ODF is pretty much a JAR compressed XML with formatting.
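To make the "compressed XML" point concrete, here is a small sketch that builds a stripped-down ODF-style package in memory and reads the text back out with nothing but a ZIP reader and an XML parser. The sample package is my own toy (a real .odt also carries a manifest, styles and more), and the function names are made up; the `content.xml` entry name and the namespace URIs are the ones ODF uses.

```python
import io
import zipfile
from xml.etree import ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def build_sample_package() -> bytes:
    """Build a toy ODF-like ZIP in memory (not a complete, valid .odt)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Hello from ODF.</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # The mimetype entry is stored uncompressed as the first entry.
        package.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        package.writestr("content.xml", content)
    return buffer.getvalue()


def read_paragraphs(package_bytes: bytes) -> list[str]:
    """Open the ZIP, parse content.xml, and return the text of each paragraph."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
```

`read_paragraphs(build_sample_package())` gives back the paragraph text, which is more or less the first step any viewer plugin would have to take.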
ODF isn't common on the internet now. If it did become common, by that time portable internet devices should have the capability of rendering the files.
I think it shouldn't support that. At least for now.
Again, PDF is presentational, while ODF is structural.
BTW, ODF and HTML certainly will overlap. At first thought, that's a bad thing (stop relying on an inefficient standards body to start relying on another, whose efficiency is not yet known, but which accepts patented standards mmmmmmmmmmm bad idea), but on the other hand, HTML is a mess, and W3C only knows how to create confusion. Perhaps THIS time we could get it right? Nah, I'm dreaming...
Whatever...
NachoKB
You are right, if you don't have the fonts on your computer, you won't get the same result. On Windows Times New Roman is the default font in OOo, so Windows users usually use this font. If a Linux user opens such documents, the font is translated to another one. Therefore, this whole idea is not applicable.
same thing happens to webpages already and i don't see much complaining happening from the penguins and daemons...
mostly it's a question of what you're used to, and where the files come from...
files made by openoffice on windows and viewed in firefox on windows will show no difference. other combos though may produce other results.
this however has more to do with copyrighted fonts than anything else
same thing happens to webpages already and i don't see much complaining happening from the penguins and daemons...
Webpages are not page-oriented, so there is no problem, if the fonts don't match. In OOo documents it might happen that the page structure (text/objects moving to the next/previous page) will be changed if another font is used. Worse if you use manual page breaks in your document.
First off, HTML, when used with CSS, lets you specify multiple fonts, in order of preference, for a piece of text. This means you can ensure the appearance on a variety of platforms.
Secondly, HTML is not page orientated, and when designed with modern techniques allows for fluid page layouts, so it can deal with different fonts and font-sizes (not everyone can see as well as you, you know).
This means good HTML/CSS will reflect the designer's wishes on all platforms (note: market-share for Apple is around 10%, market share for Linux is around 10%, market share for Windows is around 85-95%, but as the first two figures show, it's not used all the time). Further, HTML/CSS offers far more aids for people with disabilities: it is a scandal that the web, which should have been the greatest gift ever to the blind and those with chronic eye difficulties, has remained broadly off-limits.
At the end of the day, if you just want your users to read something, instead of entirely re-format it, HTML/CSS is superior, and that's why it's the language that web uses for publication, and rightly so.
opendocument has the concept of paging as does pdf. browsers don't do this natively because web pages themselves aren't meant to be printed. i'm sure one could implement your idea but you would lose all the paging of the original document (i imagine). I think you completely ignored a lot of issues when envisioning this.
I've looked into print formatting for css and it does help printing on the web but doesn't help us define a printable document. I don't think there is any mention of page breaks etc.
Regardless, extend my argument to other features of opendocument like the spreadsheet, how to display multiple work sheets etc...
it just seems to me that these types of non web documents will be handled by specific plugins (how they are handled now).
The point is, ODF is just XML. OpenOffice interprets this XML one way, and web browsers could be made to interpret the same XML to display in a web browser. It really isn't far fetched.
In fact, HTML can be set up to display one way on your screen, and to display differently when printed.
Basically, the article suggests browsers render ODF documents like they do HTML today.
I don't like the idea of having to kludge ODF support into the browser I demand instant responsiveness from.
So I think rather than tax the HTML rendering engine, I prefer to just pass the document to an external handler as you won't be browsing the web through the document anyway.
I can open a browser, surf to Google, and spell check a word in less time than I can do the same in Word or OO.o from my PC. That is sad.
If only my browser was as lightning fast as OO.o I might just have to kill myself.
Edited 2005-11-16 20:36
I like it. If it's feasible, it boosts the utility value of OpenDoc wonderfully and solidifies the case for OpenDoc over NotOpenDoc ;-). As for implementation, if a browser plugin that provides efficient, transparent support for OpenDoc is feasible, I'm O.K. with it, but I suspect that native support would be better. I don't really hate PDF, but I think Adobe's browser support is poor. Personally, I don't like to view .pdf's on the web. While Acrobat works well enough as a vehicle for publishing documents, it makes a slow, klunky appendage to the web. Acrobat docs are really a different animal than web docs. If OpenDoc cannot be smoothly integrated into the web such that OpenDoc documents are web documents, it, unfortunately, will have the same problems.
Edited for grammar.
Edited 2005-11-16 20:38
I think the OpenDocument format will open a bright new window of possibilities.
Take content management systems for example. Today you have all these dirty hacks to support rich text format editing in your browser (and it's a hassle to import documents properly).
With the ODF format you could just upload your document and it would be automatically converted to XHTML using XSL templates. Need to edit the text? Download the document (could be converted from XHTML back to ODF on the fly), edit it and upload it again. In combination with WebDAV it would be even easier. And you wouldn't need to upload the images separately. You could just embed them in your document.
The guy has no idea of what he's talking about.
Point 1: It is possible to edit PDFs, there just aren't many PDF editors out there (KWord currently features basic PDF editing support)
Point 2: Quote: "I imagine that implementing OpenDocument rendering capabilities into browsers should be fairly easy". Nonsense. Web-Browsers are built around an entirely different paradigm. Building in support would be an enormous challenge, the disconnect between the XML in OpenDocument and XHTML is huge; it's not just a matter of lashing together a couple of XSLT stylesheets.
Point 3: The web is for viewing documents. If you want to distribute documents to other people for editing, fair enough, use OpenDocument (or Word, or RTF). But if you just want people to read your documents, HTML is far better: it's more lightweight, saving bandwidth; a multitude of accessibility tools, from text-mode browsers to screen-readers already exist for it; and a multitude of applications already support it.
Point 4: It is perfectly possible to export to HTML without losing any significant graphical detail. It is also possible to paste that HTML into a document and retain most of the formatting. The only problem is that HTML itself is not immediately editable, but the web is about publishing things that people can read, not providing documents that they can edit and send back.
The biggest problem is an internal contradiction in the article. The author is trying to make an argument for "Why Browsers Should Be Able to Display OpenDocument", but then goes on to say that "OpenOffice.org Writer, Calc, Abiword, Gnumeric, or the KOffice suite (just to name a few) are much better tools to create documents".
The crux of the matter, as I said in point 4, is that the web is a publishing medium: authors make content available for others to read, not to perpetually edit. If you want users to be able to edit documents, then they should surely use editors like OO.o Writer to do that. Further, things like GMail, Blogger and Wikipedia show that where a certain level of editing is required, HTML is up to the job. Ultimately it's a format just like OpenDocument, but optimised for publication, not editing.
ad Point 1: fair enough, it is possible to edit PDF files with some tools, but it's a lot easier to download, open and edit an OpenDoc file.
ad Point 2: really, i don't know how difficult it is to build rendering capabilities for OpenDocument into browsers. maybe you know better, maybe an expert on this matter should clear this up.
ad Point 3: "the web is for viewing documents" - i certainly think that statement will not hold true 10 years from now. let's wait and see. i don't buy the "lightweight, saving bandwidth" argument: while i know how to write very efficient HTML, i think that OpenDocument files are really small enough to be put on the web. the difference between HTML and ODF doesn't eat nearly as much bandwidth as your next flash ad. also, accessibility tools could be used for ODF easily enough: it's XML, it's human readable plain text after all.
ad Point 4: one of my main points was that this importing/exporting thing is useless. it's just an additional step we don't need. ODF on the web could also solve the problem with printing we have on the web nowadays: most webdesigners don't know how to use print stylesheets, and a lot of documents look like crap when you send them to the printer (not, of course, when you get a PDF).
The biggest problem is an internal contradiction in the article. The author is trying to make an argument for "Why Browsers Should Be Able to Display OpenDocument", but then goes on to say that "OpenOffice.org Writer, Calc, Abiword, Gnumeric, or the KOffice suite (just to name a few) are much better tools to create documents".
i said: "better than dreamweaver". you certainly don't want to tell me that you would write your next spreadsheet with table, tr and td tags in a webeditor/text editor when you can use OpenOffice Calc/Excel/Gnumeric. use the right tool for the job!
and as soon as you're done, put it on the web. people love their office suites. everybody can use word, excel and powerpoint. who can write html? not even 95% of the people who call themselves "webdesigner". so let the people use their favourite tools to create documents that are google-friendly, printer-friendly and user-friendly...
regards,
christian
It is possible to edit PDFs
Not always. That depends if the PDF actually has text in it or if it's image based, which many times, scanned documents are. Also, password protected documents aren't editable. Many people use PDF not only for formatting but BECAUSE they don't want the documents edited.
Web-Browsers are built around an entirely different paradigm. Building in support would be an enormous challenge, the disconnect between the XML in OpenDocument and XHTML is huge; it's not just a matter of lashing together a couple of XSLT stylesheets.
Partly right. Building in standard and consistent support would be a challenge, but it could easily be done. I would bank on Opera being the first to pull it off, like they did with voice support. It is something that a company that controlled their own rendering engine could add client side, the same way that Firefox can parse RSS in a certain manner.
The web is for viewing documents
I hate to argue, but isn't that awfully narrow-minded? Because you want to keep things status quo, no one should innovate or extend? Yikes.
Point 4 is solid, I suppose, but it also relies on the idea that the web should be a read-only playground.
I would bank on Opera being the first to pull it off, like they did with voice support. It is something that a company that controlled their own rendering engine could add client side ...
IMHO a corporation (e.g. Novell, Sun or IBM) will sponsor/ develop a Firefox extension (using the XTF technology) in a similar way it was done with the XFORMS extension
If you want to edit docs, install OOo or another ODF-ready office suite. No need to pack those 80 MB of OOo into a browser plugin.
For docs not meant for editing, HTML and PDF do a much better job than any office document format will ever do. Yes, PDF needs Acrobat or another viewer, but it's more realistic to expect that than a full office suite.
Since the major browsers now support both CSS and in-browser XSLT, you can already display ODF directly in them. You just need a stylesheet (CSS) or XSLT transformation to XHTML. Presumably an office package can format documents in ways that exceed the current capabilities of XHTML or CSS, but CSS will develop. Of course, a plugin supporting XSL-FO would allow you to convert directly to PDF for display, as others have noted.
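For anyone wondering what such a transformation boils down to, here is a minimal sketch. It is plain Python standing in for an XSLT stylesheet (the standard library has no XSLT processor), it only handles headings and paragraphs, and the `SAMPLE` document and `odf_to_xhtml` name are invented for illustration. A real conversion would also have to deal with styles, lists, tables, images and so on.

```python
from xml.etree import ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, heavily trimmed content.xml with one heading and one paragraph.
SAMPLE = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    '<text:h text:outline-level="1">Title</text:h>'
    "<text:p>First paragraph.</text:p>"
    "</office:text></office:body>"
    "</office:document-content>"
)


def odf_to_xhtml(content_xml: str) -> str:
    """Map ODF headings and paragraphs onto XHTML h1..h6 and p elements."""
    root = ET.fromstring(content_xml)
    html = ET.Element("html", {"xmlns": XHTML_NS})
    body = ET.SubElement(html, "body")

    office_text = root.find(f".//{{{OFFICE_NS}}}text")
    children = list(office_text) if office_text is not None else []
    for node in children:
        if node.tag == f"{{{TEXT_NS}}}h":
            # ODF stores the heading depth in text:outline-level.
            level = int(node.get(f"{{{TEXT_NS}}}outline-level", "1"))
            target = ET.SubElement(body, f"h{max(1, min(level, 6))}")
        elif node.tag == f"{{{TEXT_NS}}}p":
            target = ET.SubElement(body, "p")
        else:
            continue  # anything else is ignored in this sketch
        target.text = "".join(node.itertext())
    return ET.tostring(html, encoding="unicode")
```

`odf_to_xhtml(SAMPLE)` produces an XHTML string containing an `h1` and a `p`. The same element-by-element mapping is what an XSLT template per ODF element would express declaratively.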
this:
1 renamed zip file.
1 xml based file containing data and markups, like say the cell content of a spreadsheet or the text of a document.
any number of images and other stuff that's embedded into the file. mostly relevant for text documents and presentations.
only problem i can see is stuffing a spreadsheet into a browser window. why? the cells that contain formulas rather than raw data. these have to be calculated by the browser's plugin or similar...
i recall a comparison that has been done on filesizes of the exact same document saved in odf and in ms doc formats. i think the difference was something like either 5:1 or 10:1 in favor of the odf format.
All MS Office programs (Word, Visio, Excel, PowerPoint) have a plugin for IE to *display only* those docs, and these plugins are very small to download, very easy to install, and freely available. It would be a shame for OOo to lack this. Is it very hard to implement?





