Trying to read files from the past…

In the deeper directories of my storage device I still have some files from the old times, when my general-purpose computer was still partially locked in by proprietary (operating) systems. Nevertheless, Corel Draw, for example, was worth the lock-in and the money back then: I loved that program and what it enabled me to produce.

Those products are still around, but for me they are basically binary blobs now, their content not readable by the software I use. Blame that software 🙂 Well, but also blame the old software, i.e. Corel Draw, and its makers, for using a storage format which seems to be unpublished and possibly only available through something like a Technology Partner program, meaning lawyers and businessmen, not fun. But the content of those files is mine, and I thoroughly dislike that the format binds me to a certain piece of software.

Searching for FLOSS code that can understand files in Corel Draw’s CDR file format, I discovered UniConvertor from the sK1 project and some initial code for LibreOffice, which is even being actively developed at the moment, but nothing for Karbon from the Calligra Suite.

Seeing with the hex editor Okteta (yay!) that the CDR file format is based on RIFF, and finding that format completely described on Wikipedia, I somehow got tempted to try to develop a CDR import filter for Karbon, to finally free my content from those binary blobs. And there was quickly some initial success, so I may try to stick with that puzzle game of decoding bytes.

At least the by-product libkoralle, a Qt-based library for parsing RIFF, is already useful.

I look forward to others joining the efforts on the Karbon CDR import filter. I am mainly interested in reading files of versions 4 and 5, so people with newer versions will need to take care of support for those themselves 🙂 I also still need to contact the developers of UniConvertor and libcdr (if you are one, check your inbox in the next few days 😉 ).

You can find the current state in the branch “filters-karbon-cdr” in the official Calligra repo (updated January 28, 2012; it was previously in the branch “CDRImport” in my clone of the Calligra repo).

Initial release of libkoralle, a simple Qt-based RIFF parser

There are Qt-based solutions for parsing tree-structured container formats like XML and JSON, but when, a few days ago, I came across a format based on the Resource Interchange File Format (RIFF), a quick search for a Qt-based parser yielded nothing for me… but a sigh and also the result of a mkdir command.

There is some nice documentation about RIFF on the English Wikipedia. This container format dates back to 1991, and its ancestors are even older. The WAV and AVI formats are based on it, as are younger formats like Google’s WebP.
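
For readers who have not looked at RIFF before, the core of the format is small enough to sketch in a few lines: each chunk starts with a four-character id and a 32-bit little-endian payload size, and the payload is padded to an even length. What follows is a minimal standalone C++ sketch of reading one such header, using only the standard library. The names `ChunkHeader`, `readUInt32LE` and `readChunkHeader` are made up for this illustration and are not part of libkoralle or any other library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper type for illustration only (not libkoralle API).
struct ChunkHeader {
    std::string id;          // four-character chunk id, e.g. "RIFF" or "LIST"
    std::uint32_t size;      // payload size in bytes, without header and padding
    std::size_t dataOffset;  // offset of the first payload byte
    std::size_t nextOffset;  // offset where the following sibling chunk would start
};

// Decode a 32-bit little-endian integer starting at bytes[offset].
inline std::uint32_t readUInt32LE(const std::vector<std::uint8_t>& bytes, std::size_t offset)
{
    assert(offset + 4 <= bytes.size());
    return  static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}

// Read the 8-byte chunk header at the given offset.
// Returns false if the header or the announced payload does not fit into bytes.
inline bool readChunkHeader(const std::vector<std::uint8_t>& bytes, std::size_t offset,
                            ChunkHeader& header)
{
    if (offset > bytes.size() || bytes.size() - offset < 8)
        return false;
    header.id.assign(reinterpret_cast<const char*>(bytes.data() + offset), 4);
    header.size = readUInt32LE(bytes, offset + 4);
    header.dataOffset = offset + 8;
    if (header.size > bytes.size() - header.dataOffset)
        return false;
    // Payloads of odd length are followed by one padding byte.
    header.nextOffset = header.dataOffset + header.size + (header.size & 1u);
    return true;
}
```

Calling `readChunkHeader(bytes, 0, header)` on a buffer holding a whole file would typically report the outer `RIFF` chunk, whose payload starts with the four-character form type followed by the nested chunks.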

So that other developers searching for a Qt-based RIFF parser find something, and to follow the release-early-and-often mantra, please find on the KDE FTP servers now (thanks again to the KDE admins for their quick, less-than-a-day support!) what came after that mkdir command and what already serves me quite well in the parsing code for the format I deal with:

release 0.1.0 of libkoralle, a lib for parsing (and hopefully soon also writing) data in Resource Interchange File Format (RIFF) based formats.

How to use libkoralle:
Given a format based on RIFF with a structure like this:

RIFF id='XMPL'
  'VRSN'
  LIST id='DATT'
    'DATA'
    'DATA'

your code would, assuming the stream is well-formed, look like this with version 0.1:

#include <Koralle0/RiffStreamReader>
...
Koralle0::RiffStreamReader reader(device);
reader.readNextChunkHeader();
// reader.chunkId(): Koralle0::FourCharCode('X','M','P','L')
// reader.isFileChunk(): true
// reader.isListChunk(): true
reader.openList(); // needs matching closeList();
  reader.readNextChunkHeader();
  // reader.chunkId(): Koralle0::FourCharCode('V','R','S','N')
  // reader.isFileChunk()/isListChunk(): false
  // reader.chunkData()/chunkSize(): data of the content
  reader.readNextChunkHeader();
  // reader.chunkId(): Koralle0::FourCharCode('D','A','T','T')
  // reader.isFileChunk(): false
  // reader.isListChunk(): true
  reader.openList();
  while(reader.readNextChunkHeader())
  {
    // reader.chunkId(): Koralle0::FourCharCode('D','A','T','A')
    // reader.isFileChunk()/isListChunk(): false
    // reader.chunkData()/chunkSize(): data of the content
  }
  reader.closeList();
reader.closeList();
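
To make clearer what such a reader has to do under the hood, here is a small standalone sketch that walks a RIFF byte buffer recursively and records one outline line per chunk, in the same notation as the example structure above. It deliberately does not use libkoralle or Qt; the namespace `riffsketch` and all function names in it are assumptions made for this illustration, and the handling is simplified (only `RIFF` and `LIST` are treated as containers).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

namespace riffsketch {  // hypothetical namespace, unrelated to Koralle0

typedef std::vector<std::uint8_t> Bytes;

// Decode a 32-bit little-endian integer starting at bytes[offset].
inline std::uint32_t readUInt32LE(const Bytes& bytes, std::size_t offset)
{
    assert(offset + 4 <= bytes.size());
    return  static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}

// Read four bytes starting at bytes[offset] as a four-character code.
inline std::string readFourCC(const Bytes& bytes, std::size_t offset)
{
    assert(offset + 4 <= bytes.size());
    return std::string(reinterpret_cast<const char*>(bytes.data() + offset), 4);
}

// Walk all sibling chunks in [begin, end) and append one line per chunk to outline.
// Returns false as soon as a chunk does not fit into its parent.
inline bool walkChunks(const Bytes& bytes, std::size_t begin, std::size_t end,
                       int depth, std::vector<std::string>& outline)
{
    std::size_t offset = begin;
    while (offset < end) {
        if (end - offset < 8)
            return false;  // not enough room for id + size
        const std::string id = readFourCC(bytes, offset);
        const std::uint32_t size = readUInt32LE(bytes, offset + 4);
        const std::size_t dataBegin = offset + 8;
        if (size > end - dataBegin)
            return false;  // payload would leave the parent chunk
        const std::string indent(static_cast<std::size_t>(depth) * 2, ' ');
        if (id == "RIFF" || id == "LIST") {
            // Container chunks start with a four-character type, then nested chunks.
            if (size < 4)
                return false;
            outline.push_back(indent + id + " id='" + readFourCC(bytes, dataBegin) + "'");
            if (!walkChunks(bytes, dataBegin + 4, dataBegin + size, depth + 1, outline))
                return false;
        } else {
            outline.push_back(indent + "'" + id + "'");
        }
        // Skip the payload plus the padding byte after odd-sized payloads.
        offset = dataBegin + size + (size & 1u);
    }
    return true;
}

}  // namespace riffsketch
```

Calling `riffsketch::walkChunks(bytes, 0, bytes.size(), 0, outline)` on a buffer with the example structure should fill `outline` with the five lines of the listing above. The `openList()`/`closeList()` pairs of the stream reader play roughly the role that the recursion plays here.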

Future versions might support related container formats (IFF, RIFX, …) and allow passing custom parsers for the data chunk content, to avoid the temporary QByteArray copy, as well as provide default parsers for standard chunk types like “INFO”.
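
As a rough idea of what supporting those related formats involves: RIFF stores chunk sizes in little-endian byte order, while RIFX and the older IFF store them big-endian, so a reader mostly has to pick the byte order from the outermost chunk id. The sketch below shows that single step in plain C++. `ByteOrder`, `byteOrderForContainer` and `decodeSize` are hypothetical names for this illustration and say nothing about how libkoralle will actually do it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names for illustration only (not libkoralle API).
enum class ByteOrder { LittleEndian, BigEndian };

// Choose the byte order of size fields from the id of the outermost chunk:
// "RIFX" (big-endian RIFF) and "FORM" (IFF) use big-endian sizes,
// everything else is treated as little-endian RIFF in this sketch.
inline ByteOrder byteOrderForContainer(const std::string& outerId)
{
    if (outerId == "RIFX" || outerId == "FORM")
        return ByteOrder::BigEndian;
    return ByteOrder::LittleEndian;
}

// Decode the four raw bytes of a chunk size field in the given byte order.
inline std::uint32_t decodeSize(const std::uint8_t* raw, ByteOrder order)
{
    assert(raw != nullptr);
    if (order == ByteOrder::BigEndian) {
        return (static_cast<std::uint32_t>(raw[0]) << 24)
             | (static_cast<std::uint32_t>(raw[1]) << 16)
             | (static_cast<std::uint32_t>(raw[2]) << 8)
             |  static_cast<std::uint32_t>(raw[3]);
    }
    return  static_cast<std::uint32_t>(raw[0])
         | (static_cast<std::uint32_t>(raw[1]) << 8)
         | (static_cast<std::uint32_t>(raw[2]) << 16)
         | (static_cast<std::uint32_t>(raw[3]) << 24);
}
```

Full IFF support would need more than this, since its group chunks have other ids than RIFF's; the byte order is only the most obvious difference.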

See also Snorkel, a simple RIFF structure viewer that I wrote, also using libkoralle.

Contributors and feedback are of course welcome! 🙂