[DUG] Validating CDS files

Jolyon Smith jsmith at deltics.co.nz
Tue Jan 18 08:43:46 NZDT 2011


Which takes me right back to the very first software development course I
ever attended.  One half of a day was spent learning the most important
lesson I ever learned in this business...

    Be cautious or even suspicious when a specification/question is provided
    in the form of a request for a specific technical solution.


The first thing you should do is obtain an understanding of the *problem*
that the solution is supposed to solve.  Anyone that can ask for a specific
technical solution should be able to provide that solution themselves.  If
they have to ask for help in building the solution, then they are most
likely simply not equipped to identify it as the solution in the first
place, and so they should be guided back to the problem itself.

(Of course, it may be that even with complete understanding the same
technical solution is, after all, arrived at - never rule out blind luck!
LOL)


An example might be someone asking for help in building a flotilla of reed
boats.

   Option 1: Show them how/Help them to build a boat out of reeds

   Option 2: Ask them why they need such a seemingly odd flotilla of boats ...


Take option 2 and imagine they reply:-

   "Because I need to get a few hundred people across a river, and 
    where we are there are no trees nearby so I have no wood with 
    which to build regular boats.  There are plenty of reeds though,
    and I heard about this guy that built a boat of reeds and so I 
    figured I'd do that but I need someone to tell me how to do it"

Then ask them where this river is and where they are.  Then imagine that,
once you learn where they are, you realise that there is already a
foot-bridge across the river, just around a bend, that they apparently
didn't know about.  TA DA!


They presented a request for a solution to a problem without fully
understanding their own problem domain.  With a proper understanding of the
problem domain, the asked-for solution is readily identified as misguided,
at best, and a cheaper, faster, more appropriate solution can be found.


In my experience, when a request comes in the form of a specific solution, 9
times out of 10 there is a better, more appropriate solution.  Indeed, quite
often it becomes apparent that the asked-for solution would not actually
have been a solution at all.

The species of reed at hand may not be suitable for building boats out of,
so your flotilla would simply have sunk and you still would not have made it
across the river.

:)


Your responses to questions intended to discover more about the problem
domain have been puzzling in some cases, and flat-out contradictory in
others, which led me to wonder if you perhaps weren't trying to build a
flotilla of reed boats when there was a foot-bridge just out of sight.

Pardon me for trying to help.

Carry on as you were.



-----Original Message-----
From: delphi-bounces at delphi.org.nz [mailto:delphi-bounces at delphi.org.nz] On
Behalf Of Matthew Comb
Sent: Monday, 17 January 2011 22:34
To: NZ Borland Developers Group - Delphi List
Cc: 'NZ Borland Developers Group - Delphi List'
Subject: Re: [DUG] Validating CDS files

Hi Jolyon,

I almost wish I hadn't asked this question now :)

I'm well aware of the options, and setting up a webservice to retrieve a
hashed/checksummed payload set is our preferred approach, yes. This is a
legacy piece of software and doesn't adhere to our architectural patterns
and practices.
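
For anyone following the thread, a rough sketch of what the hashed/checksummed
payload idea amounts to. It is written in Python rather than Delphi purely for
brevity, and every name in it (package, verify, the dictionary keys) is
hypothetical and not part of any existing service. SHA-256 is used here; MD5,
as suggested elsewhere in the thread, would serve equally well for catching
accidental corruption. The check is format-agnostic: it detects a truncated or
altered payload but says nothing about the internal CDS structure.

```python
import hashlib


def digest(payload: bytes) -> str:
    """Hex digest of the payload bytes (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(payload).hexdigest()


def package(payload: bytes) -> dict:
    """Hypothetical server side: send the data with its length and digest."""
    return {"length": len(payload), "sha256": digest(payload), "data": payload}


def verify(message: dict) -> bool:
    """Hypothetical client side: reject truncated or altered payloads."""
    data = message["data"]
    return len(data) == message["length"] and digest(data) == message["sha256"]


# Stand-in bytes for a dataset file; the real content is irrelevant here.
payload = b"stand-in for the bytes of a dataset file" * 100
message = package(payload)

print(verify(message))                             # True
print(verify(dict(message, data=payload[:-10])))   # False (truncated transfer)
```

The length check is redundant given the digest, but it makes the common
"transfer cut short" case cheap to spot and easy to report separately.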

Knowing and acknowledging this is the better solution, I asked only one
very simple question.

Is there a known way to validate a binary CDS file?

All other solutions and suggestions are appreciated but really not
necessary. I may have led the thread off track. Sorry for that.

Cheers,

Matt.



> I am thoroughly confused now...
>
> If you are talking about a *stream* that you theorise is being corrupted
> in mid-flow over the wire, then I don't see how XML really helps ensuring
> the integrity of that stream, or at least not how it is any more helpful
> than other techniques ... you can't do anything with XML until the document
> is complete as an incomplete document is not valid no matter how valid it
> may be when eventually complete.
>
> So you can't do any validation until your potentially interrupted stream
> has finished streaming and you have a complete file to work with.
>
> That being the case, if XML is a viable way to circumvent your apparent
> problem then why can you not simply eliminate the streaming part entirely?
> Dump your output into a file on the server side of your communication, then
> transmit the entire file, thereby leveraging the integrity checks of the
> connection to ensure reliable transmission.
>
> Hashing files in this way is a trivially simple and relatively efficient
> matter using MD5, for example.
>
>
> I am not entirely convinced that you are solving the right problem here.
>
>
> The infrequency of the apparent corruption could surely be as much due to
> infrequent access to some data that is already corrupt on the server as to
> some sporadic wireless network corruption, no?
>
>
> -----Original Message-----
> From: delphi-bounces at delphi.org.nz [mailto:delphi-bounces at delphi.org.nz]
> On
> Behalf Of Matthew Comb
> Sent: Monday, 17 January 2011 19:20
> To: NZ Borland Developers Group - Delphi List
> Subject: Re: [DUG] Validating CDS files
>
> Paul,
>
> Thanks for your thoughts. I was tending towards reverse engineering the
> format; I could not see any obvious tokens at the footer, and wondered if
> someone had beaten me to it.
>
> I actually prefer the simplicity of your idea of compressing/encrypting
> the xml file. That's a tidier solution for now, until we can get to the
> hashing.
>
> Not sure why we haven't thought of that already....
>
> Cheers,
>
> Matt.
>
>
>> Matthew wrote:
>>
>>> I wasn't suggesting that wireless was changing byte
>>> structure, but if you are streaming data, and your datastream
>>> gets disconnected, then you could end up with an incomplete transfer.
>>>
>>> I'm not 100% sure that midas catches all scenarios when
>>> working off a remote data instance ?
>>>
>>> Note we use dbx4mysql + midas.
>>>
>>> Note also that I cannot rule out the drivers either, and it could
>>> also be the data out of the db server. It's a tricky one to
>>> track down, as you basically have a black box from db ->
>>> dbxmysql + midas...
>>>
>>> What I do know is that it's very rare, e.g. maybe 1 in 10,000
>>> usages corrupts the file, and it has occurred in more than 1
>>> location, so it does not appear to be station specific.
>>>
>>> Logically, with these stats, I can only put that down to a
>>> flaky connection; otherwise errors would occur more often.
>>
>> If you can't go the hash route in the short term, it sounds to me like
>> you'll need to reverse engineer the cds format to determine whether
>> there is an examinable file tail.
>>
>> I don't imagine the format will be _that_ hard to reverse engineer but
>> then I've done a fair bit of binary reverse engineering so maybe I'm
>> being a bit cavalier. If you make a number of simple and small cds files
>> and compare and contrast them, you can probably work your way up in
>> determining the encoding.
>>
>> Assuming it was implemented using some form of streaming class model,
>> there may well be a common interface and then a cds outputter and an xml
>> outputter. If so, there is probably going to be a fairly 1-to-1 mapping
>> between the binary format and the xml format in terms of logical schema.
>>
>> These xml inspired compressed binary schemas are often all pretty
>> similar - they replace the text tags and attributes with binary tokens
>> taken from a dictionary to avoid repetition and reduce the size.
>>
>> As an aside, often you're better off just zipping the xml rather than
>> implementing and debugging your own tokenized 'binary xml' engine but
>> that's a whole different argument :-)
>>
>> Anyway, you might get lucky and quickly determine there is a common
>> metadata trailer or end of data stream signature in the binary schema,
>> similar to how zip files replicate their metadata directory and place it
>> at the end of the file.
>>
>> This may seem overkill for your current problem since you're aiming for
>> a short term solution, but having a cds stream viewer which can load
>> into a virtual tree view might be a useful in-house debugging tool down
>> the track.
>>
>> Cheers,
>>   Paul.
>>
>> _______________________________________________
>> NZ Borland Developers Group - Delphi mailing list
>> Post: delphi at delphi.org.nz
>> Admin: http://delphi.org.nz/mailman/listinfo/delphi
>> Unsubscribe: send an email to delphi-request at delphi.org.nz with Subject:
>> unsubscribe
>>

