[DUG] Validating CDS files
Matthew Comb
matt at ferndigital.com
Mon Jan 17 19:19:39 NZDT 2011
Paul,
Thanks for your thoughts. I was tending towards reverse engineering the
format; I could not see any obvious tokens at the footer, and wondered
if someone had beaten me to it.
I actually prefer the simplicity of your idea of compressing/encrypting
the xml file. That's a tidier solution for now, until we can get to the
hashing.
Not sure why we haven't thought of that already....
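
For what it's worth, here is a minimal sketch of the compress-the-xml
idea, in Python rather than Delphi purely for brevity. It assumes the
dataset has already been saved as xml bytes, and the function names are
made up for illustration. The gzip container carries a CRC-32 and length
trailer, so a truncated or damaged file should fail on load instead of
being read silently.

```python
import gzip
import zlib


def pack_xml(xml_bytes: bytes) -> bytes:
    """Wrap XML bytes in a gzip container (adds a CRC-32 + length trailer)."""
    return gzip.compress(xml_bytes)


def unpack_xml(blob: bytes) -> bytes:
    """Return the original XML, or raise ValueError if the blob is damaged."""
    try:
        return gzip.decompress(blob)
    except (EOFError, OSError, zlib.error) as exc:
        # EOFError: the stream ended early (e.g. an interrupted transfer)
        # OSError (includes gzip.BadGzipFile): bad header or CRC/length mismatch
        # zlib.error: the compressed data itself is corrupt
        raise ValueError(f"file failed integrity check: {exc}") from exc
```

This only detects damage; it cannot repair it, so the caller would still
need to fetch the data again when the check fails.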
Cheers,
Matt.
> Matthew wrote:
>
>> I wasn't suggesting that wireless was changing byte
>> structure, but if you are streaming data, and your datastream
>> gets disconnected, then you could end up with an incomplete transfer.
>>
>> I'm not 100% sure that midas catches all scenarios when
>> working off a remote data instance?
>>
>> Note we use dbx4mysql + midas.
>>
>> Note also that I cannot rule out the drivers either, and it
>> could also be the data out of the db server. It's a tricky one to
>> track down, as you basically have a black box from db ->
>> dbxmysql + midas...
>>
>> What I do know is that it's very rare, e.g. maybe 1 in 10,000
>> usages corrupts the file, and it has occurred in more than one
>> location, so it does not appear to be station specific.
>>
>> Logically, with these stats, I can only put that down to a
>> flaky connection; otherwise the error rate would be higher.
>
> If you can't go the hash route in the short term, it sounds to me like
> you'll need to reverse engineer the cds format to determine whether
> there is an examinable file tail.
>
> I don't imagine the format will be _that_ hard to reverse engineer but
> then I've done a fair bit of binary reverse engineering so maybe I'm
> being a bit cavalier. If you make a number of simple and small cds files
> and compare and contrast them, you can probably work your way up in
> determining the encoding.
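
As a rough illustration of the compare-and-contrast step above, here is
a minimal Python sketch (Python purely for brevity; nothing in it is
specific to the cds format). It takes the raw bytes of several sample
files and reports the bytes they share at the start and at the end. The
helper names are hypothetical, and a shared suffix would only hint at a
trailer, not prove one.

```python
def common_prefix(blobs: list[bytes]) -> bytes:
    """Longest run of leading bytes shared by every blob (list must be non-empty)."""
    shortest = min(blobs, key=len)
    for i, byte in enumerate(shortest):
        # Stop at the first position where any blob disagrees
        if any(blob[i] != byte for blob in blobs):
            return shortest[:i]
    return shortest


def common_suffix(blobs: list[bytes]) -> bytes:
    """Longest run of trailing bytes shared by every blob."""
    # Reverse each blob, reuse the prefix logic, then reverse the result
    return common_prefix([blob[::-1] for blob in blobs])[::-1]
```

Feeding it a handful of small files saved from datasets that differ in
only one field or one row at a time should narrow down which bytes are
fixed header or footer and which vary with the data.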
>
> Assuming it was implemented using some form of streaming class model,
> there may well be a common interface and then a cds outputter and an xml
> outputter. If so, there is probably going to be a fairly 1-to-1 mapping
> between the binary format and the xml format in terms of logical schema.
>
> These xml-inspired compressed binary schemas are often all pretty
> similar - they replace the text tags and attributes with binary tokens
> taken from a dictionary to avoid repetition and reduce the size.
>
> As an aside, often you're better off just zipping the xml rather than
> implementing and debugging your own tokenized 'binary xml' engine but
> that's a whole different argument :-)
>
> Anyway, you might get lucky and quickly determine there is a common
> metadata trailer or end of data stream signature in the binary schema,
> similar to how zip files replicate their metadata directory and place it
> at the end of the file.
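
To make the zip comparison concrete: a zip file ends with an "end of
central directory" record that starts with the signature bytes
PK\x05\x06, is at least 22 bytes long, and may be followed by a comment
of up to 65535 bytes. A minimal Python sketch of a tail check along
those lines follows. The function name is made up, and whether cds files
have anything comparable is still an open question.

```python
EOCD_SIGNATURE = b"PK\x05\x06"  # end of central directory signature
EOCD_MIN_SIZE = 22              # fixed part of the record, without comment
MAX_COMMENT = 0xFFFF            # comment length is stored in 16 bits


def has_zip_trailer(data: bytes) -> bool:
    """True if data ends with a complete end-of-central-directory record."""
    # The record can only start within the last 22 + 65535 bytes
    tail = data[-(EOCD_MIN_SIZE + MAX_COMMENT):]
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or len(tail) - pos < EOCD_MIN_SIZE:
        return False  # no signature, or the record itself is cut short
    # The last two bytes of the fixed part give the comment length
    comment_len = int.from_bytes(tail[pos + 20:pos + 22], "little")
    return pos + EOCD_MIN_SIZE + comment_len == len(tail)
```

A check like this catches a file that was cut off before the trailer was
written, but not damage earlier in the file, which is where a hash or
CRC would still be needed.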
>
> This may seem overkill for your current problem since you're aiming for
> a short term solution, but having a cds stream viewer which can load
> into a virtual tree view might be a useful in-house debugging tool down
> the track.
>
> Cheers,
> Paul.