Data Storage

Vendor Neutral File Formats? 83

Posted by Cliff
from the no-more-lock-in dept.
timmyv asks: "I have recently been tasked with developing a corporate-wide policy that will standardize all employee-created documents on vendor-neutral file formats. OASIS is good in theory, but I haven't been able to locate enough concrete examples of policies or implementation schemes that work at a corporate level. Does anyone work at a company where documents can only be saved as RTF, HTML, etc., or have any experience with this type of problem?"
This discussion has been archived. No new comments can be posted.

  • by ilithiiri (836229) on Friday December 31, 2004 @11:54AM (#11227989) Homepage Journal
    and we, unfortunately, use _all_ the formats known to the world.

    I've already tried to encourage the adoption of hassle-free formats (RTF, HTML, TXT, whatever)... they don't pass.

    It seems that people simply can't get it.
    Unfortunately.
  • by GoofyBoy (44399) on Friday December 31, 2004 @12:10PM (#11228079) Journal

    There could be a huge number of different files you need. CAD files, images, PowerPoint presentations, and complex spreadsheets will all mess up any format you can come up with (e.g. HTML). How would you even edit some of these things?

    Even OpenOffice formats are not vendor neutral; only one product out there really uses them.
  • by Alpha27 (211269) on Friday December 31, 2004 @12:21PM (#11228134)
    Switching applications for people can be a task no one wants to undertake, for two main reasons.

    Comfort level:
    It's like having designers switch from Photoshop to The GIMP, or MS Word to OO Writer. Granted, the apps accomplish the same thing, but it's not the *same* program. People will resist the change because they know how to use the first program, and the reason for the change isn't a concern for them.

    Dominance:
    Going vendor neutral when the major players still use vendor-specific formats requires you to check whether your users rely on vendor-specific features that are not available in the neutral format. If those features aren't there, then what do you do? Write code to compensate for the feature, or get plugins, or do nothing if there's nothing you can do. Are there tools that can do as good a job as the old tools in this neutral environment?

    It would help more if you stated your case in more detail.
  • Re:Hmmm. (Score:3, Insightful)

    by fm6 (162816) on Friday December 31, 2004 @12:50PM (#11228354) Homepage Journal
    XML isn't a format. It's a language for creating formats. Saying "we'll use XML" is like saying "we'll use an SQL database". It's a step, but only a small one. The big decisions remain.
  • Re:OpenOffice (Score:4, Insightful)

    by fm6 (162816) on Friday December 31, 2004 @12:53PM (#11228380) Homepage Journal
    Well, that's not exactly "vendor neutral", since only one vendor supports it. Of course, that one vendor is an open-source project, and the format is well-documented XML. So if you want to break out of the Microsoft orbit, it's the obvious first choice.
  • Re:Hmmm. (Score:4, Insightful)

    by pauljlucas (529435) on Friday December 31, 2004 @12:55PM (#11228396) Homepage Journal
    XML maybe?
    XML without a schema (and applications that can understand it) is useless. One needs something like DocBook [oasis-open.org].
  • by moreati (119629) <alex@moreati.org.uk> on Friday December 31, 2004 @12:59PM (#11228417) Homepage
    Avoiding vendor lock-in is of course A Good Thing. However, as others have said, there is no format that is completely vendor neutral: each platform has its own set of unique features that don't translate directly and must be stored somewhere in an extension or custom tag. I'm certain the OASIS/OOo format has a few StarOfficeisms in it.

    What matters is that the data you own is readily transformable into a Fully Open and documented format independent of your chosen platform; normally (but not necessarily) this will mean your native format is Fully Open and documented. This includes all data, styling, formatting, metadata and interrelationships. Basically you should be able to quickly jump ship, even if your vendor has been wiped off the earth or there are legal/technical issues preventing you from running the original platform, without loss or 'damage' of any information. There must be at least one other clear route to all your information, completely bypassing the original platform.

    As an example, .doc would be unsuitable since the format is undocumented and you would be reliant on the correct version of Office to correctly and completely read/export it; hence you would depend on Microsoft.

    Similarly, prior to its release as open source software, and even immediately after, .sxw would have been unsuitable (even though it was 'just zipped xml'), since OOo/StarOffice were the only way of performing any completely trustworthy export. Now that the format is formally documented and independent tools exist, it is suitable.

    There are grey areas such as databases, which have no common datafile format but do expose Fully Open interfaces such as ODBC or JDBC.

    With this in mind I would argue that forcing everyone to save documents in 'basic' formats such as HTML and RTF is counterproductive: they lack wide support for features such as styling and precise page layout. Any format will do as long as you can readily, fully & demonstrably extract all your information, independently of the platform that created it.

    Alex
  • by GoofyBoy (44399) on Friday December 31, 2004 @01:09PM (#11228480) Journal
    Umm... you are moving from a vendor-specific system to an in-house, expertise-specific system.

  • by fm6 (162816) on Friday December 31, 2004 @01:12PM (#11228499) Homepage Journal
    You think RTF is "vendor neutral"? It's simply a 7-bit-safe version of Word's native format. There are lots of third-party tools that read and write RTF, but the same is true of Word native. Either way, you run up against all the formatting issues you always get when you're importing and exporting unstructured formats.

    HTML is only vendor neutral if you don't use any vendor-specific extensions. So you can't just say, "Everybody save your files as HTML". You also have to forbid anybody using apps (such as Word) that save to a non-standard HTML.

    In theory, you can create an XML-based format that looks the same in Word, OpenOffice, FrameMaker, and any other XML-aware app. But doing so means designing a schema in extreme nit-picking detail, and writing a lot of transformations to get that XML in and out of all the apps that need to read or write it. It's a lot of work, and nobody does it unless they have a specific application that requires highly-structured information. Like if you have a huge set of technical documentation that you need to update a lot. (I was involved in just such a project -- and the politics of converting all those documents to XML cost me my job.) Or if you have invoices or similar business documents that need to go into or out of a web services app.

    But for the big mass of unstructured documents, there just isn't a vendor-neutral solution, and nobody has any real incentive to create one. The solution remains the same: standardize on certain specific applications. Which boils down to using OpenOffice if you hate giving money to Bill and/or want a platform-neutral solution. Otherwise you standardize on Microsoft Office, because it's what everybody knows how to use.
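The point above about non-standard HTML can be checked mechanically, at least roughly. The following is a minimal sketch, assuming that Office-specific namespaces, `mso-` style properties, `o:`/`v:`/`w:` tags, and conditional comments are reasonable tell-tales of vendor-specific HTML. The marker list and the name `vendor_markers` are illustrative choices, not an exhaustive or official test.

```python
import re

# Hypothetical marker list: patterns commonly seen in HTML exported by Word.
# A clean result does not prove the file is standards-compliant.
VENDOR_MARKERS = {
    "Office XML namespace": re.compile(r"urn:schemas-microsoft-com:", re.I),
    "mso- style property": re.compile(r"\bmso-[a-z-]+\s*:", re.I),
    "Office namespaced tag": re.compile(r"</?[ovw]:[a-z]+", re.I),
    "conditional comment": re.compile(r"<!--\[if\b", re.I),
}


def vendor_markers(html):
    """Return the names of all vendor-specific markers found in the HTML text."""
    return [name for name, pattern in VENDOR_MARKERS.items() if pattern.search(html)]


if __name__ == "__main__":
    sample = (
        '<html xmlns:o="urn:schemas-microsoft-com:office:office"><body>'
        '<p class=MsoNormal style="mso-margin-top-alt:auto">Hi<o:p></o:p></p>'
        "</body></html>"
    )
    for name in vendor_markers(sample):
        print("found:", name)
```

A policy of "save as HTML" could run a check like this on submitted files, though it only catches the markers it knows about.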

  • by abb3w (696381) on Friday December 31, 2004 @02:36PM (#11229026) Journal
    The first question is not what, or how; the first question is WHY. As in, why do this? And therefore, is there a better way to achieve this goal?

    Are they doing this to save money? to clamp down on the uppity workers? because the CEO got emailed an AppleWorks attachment with no file extension from some Mac user? to avoid the risks of single vendor lock-in?

    Many document formats can be converted back and forth with some degree of effectiveness. Yes, if you open a WordPerfect document in Microsoft Office, the word spacing may change a little, but that also happens if you move from a machine connected to an HP4000 printer to one with an HP2100. Some formats, however, offer different feature sets; saving from DOC to RTF will cause (as an example) tables to shift about a bit. TXT format is readable by most anything, but the formatting capabilities are nigh nonexistent. (Ooh! Tabs!) While WordPerfect and Word will each open the other's documents, they aren't so good at saving in open formats.

    What formats are currently used? Why are they needed? Will everyone need to be able to write to them, or are pay-writer/free-reader combos acceptable? And, *ARE* there any "vendor neutral" formats out there? (For desktop publishing, the real answer is "no". Publisher is a joke, and while Adobe and Quark maintain some import compatibilities, the formats AREN'T neutral.)

    For myself, working in a small department, "Let a thousand flowers bloom" is just fine. I accept that I will occasionally get forwarded an e-mail with an attachment that the user can't figure out how to open -- usually Mac/PC file extension issues, solved easily by renaming. Once in a blue moon I have to explain to someone that no, not everyone has FooBarBaz market research organizer, since for most people the $800 license cost would be more beneficially used for other things, and they will probably need to examine such data files once in their career, if that.

    Perhaps a list of universally accepted formats-- that is, formats that must be used for wide distribution-- would be more appropriate, after considering what features are needed in said formats. After all, Photoshop .PSD documents are harder to view outside Photoshop, but far more useful for subtle graphics work than JPEGs.

    I suspect you are being sent out on a project inadequately considered. Depending on the pointy-hairyness of the person who assigned it to you, you may find some substantial benefit to reconsidering the ground assumptions.
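For the extension-less attachments mentioned above, the leading bytes of a file usually give the format away. Here is a minimal sketch, assuming a short hand-picked signature table. The table and the name `guess_extension` are hypothetical, and a real tool would need many more entries.

```python
# Hand-picked signatures (an assumption, not a complete registry).
SIGNATURES = [
    (b"%PDF-", ".pdf"),
    (b"{\\rtf", ".rtf"),
    (b"PK\x03\x04", ".zip"),  # plain ZIP; OpenOffice.org packages start the same way
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", ".doc"),  # OLE2 container; could equally be .xls or .ppt
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
]


def guess_extension(head):
    """Guess a file extension from the first bytes of a file, or return None."""
    for magic, extension in SIGNATURES:
        if head.startswith(magic):
            return extension
    return None


if __name__ == "__main__":
    print(guess_extension(b"%PDF-1.4\n"))
    print(guess_extension(b"just some text"))
```

In practice one would read the first dozen or so bytes of the attachment and pass them in. Plain text has no signature at all, which is why the function can return `None`.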

  • LaTeX (Score:2, Insightful)

    by KivlE (547859) on Friday December 31, 2004 @04:06PM (#11229580)
    Hmm, I'd say LaTeX would be a good alternative? There are interpreters for most platforms, the source files are plain text, and it can output a variety of readable formats (pdf,ps,html etc).
  • Bad Assignment (Score:3, Insightful)

    by salesgeek (263995) on Friday December 31, 2004 @04:08PM (#11229604) Homepage
    I'd recommend you find a way to get out of the assignment. You will not find what you seek: it is one of the holy grails of computing, something that should exist but does not, and does not for good reason (money).
  • by Anonymous Coward on Friday December 31, 2004 @08:32PM (#11231198)
    And this isn't a mystery?

    No. It's a matter of researching documentation.
  • by michaela (31955) on Friday December 31, 2004 @11:04PM (#11231834) Homepage
    Yep. Just use unzip and you'll get several XML files, among them: content.xml is the document itself, meta.xml is the property sheet info, styles.xml is the stylesheet(s) in use when the document was saved.

    After that, you can use your favorite XML widget, such as the XML::Parser [cpan.org] Perl module, to turn it into HTML or other things of your choosing.

    Or create an XSLT file and use something like Xalan [apache.org] to format it on the fly.

    Gotta love OOo and those open formats!
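The unzip-then-parse route described above can be sketched in a few lines. This is a minimal illustration in Python rather than Perl, assuming the paragraphs in content.xml are elements whose local name is `p`. The in-memory sample archive, its namespace URIs, and all function names are stand-ins made up for the example, not a real saved document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from html import escape


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def paragraphs_from_package(data):
    """Return the text of every paragraph element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(el.itertext()) for el in root.iter() if local_name(el.tag) == "p"]


def to_html(paragraphs):
    """Wrap each paragraph in <p> tags, escaping markup characters."""
    return "\n".join("<p>{}</p>".format(escape(p)) for p in paragraphs)


def build_sample_package():
    """Build a tiny stand-in archive in memory (hypothetical content)."""
    content = (
        "<office:document-content "
        'xmlns:office="http://openoffice.org/2000/office" '
        'xmlns:text="http://openoffice.org/2000/text">'
        "<office:body>"
        "<text:p>Hello, <text:span>open</text:span> formats.</text:p>"
        "<text:p>Fish &amp; chips</text:p>"
        "</office:body></office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
        package.writestr("meta.xml", "<meta/>")
        package.writestr("styles.xml", "<styles/>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_html(paragraphs_from_package(build_sample_package())))
```

For a real file, reading its bytes with `open(path, "rb").read()` and passing them to `paragraphs_from_package` should work the same way. Styling and metadata are ignored here, so this recovers text only.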
