
Re: Getting maximum input speed from BinaryStream :-< (Real Studio network user group Mailinglist archive)



Getting maximum input speed from BinaryStream - Best practice?   -   Stefan Pantke
  Re: Getting maximum input speed from BinaryStream :-<   -   Stefan Pantke
    Re: Getting maximum input speed from BinaryStream :-<   -   Dennis Birch
    Re: Getting maximum input speed from BinaryStream :-<   -   Walter Purvis
    Re: Getting maximum input speed from BinaryStream :-<   -   Joseph J. Strout
    Re: Getting maximum input speed from BinaryStream :-<   -   Norman Palardy
   Re: Getting maximum input speed from BinaryStream :-<   -   Theodore H. Smith
    Re: Getting maximum input speed from BinaryStream :-<   -   Stefan Pantke
    Re: Getting maximum input speed from BinaryStream :-<   -   Norman Palardy
    Re: Getting maximum input speed from BinaryStream :-<   -   Stefan Pantke
    Re: Getting maximum input speed from BinaryStream :-<   -   Stefan Pantke

Re: Getting maximum input speed from BinaryStream :-<
Date: 18.08.05 18:58 (Thu, 18 Aug 2005 19:58:18 +0200)
From: Stefan Pantke

On 18.08.2005 at 15:35, Dennis Birch wrote:

> At 1:09 PM +0200 8/18/05, Stefan Pantke wrote:
>
>> One thing remains: A good chunk size for the amount of
>> data to be read in one step.
>>
>> I do this:
>>
>> nextChunk = myCurrentInStream.Read( <howMuch?> * 1024 )
>>
>> If I vary <howMuch?> from 50 to 1000, the overall speed changes -
>> as expected.
>
> Have you tried reading in the whole file at once?

I'll check this. But since the files range from MBytes to GBytes, they
might be too big.

I feel that there is some kind of penalty when too much data is
read at once.

Here is why:

The throughput of the app shrinks to 1/3 when reading the whole file
compared to reading chunks of size ( 40 * 1024 ). The input data is
around 4-9 MByte per file.

10 MByte? Where is the problem here? I really can't see why the
app can't handle this amount of data...

Meanwhile, I believe there is a serious bug somewhere in RB's runtime.

And - once again - I really hate hunting this one. I already spent
days optimizing to get the app as fast as it is now...

Kind regards
_______________________________________________
Unsubscribe or switch delivery mode:
<http://www.realsoftware.com/support/listmanager/>

Search the archives of this list here:
<http://support.realsoftware.com/listarchives/lists.html>

Re: Getting maximum input speed from BinaryStream :-<
Date: 18.08.05 19:13 (Thu, 18 Aug 2005 11:13:25 -0700)
From: Dennis Birch
At 7:58 PM +0200 8/18/05, Stefan Pantke wrote:
>On 18.08.2005 at 15:35, Dennis Birch wrote:
>
>>At 1:09 PM +0200 8/18/05, Stefan Pantke wrote:
>>
>>>One thing remains: A good chunk size for the amount of
>>>data to be read in one step.
>>>
>>>I do this:
>>>
>>> nextChunk = myCurrentInStream.Read( <howMuch?> * 1024 )
>>>
>>>If I vary <howMuch?> from 50 to 1000, the overall speed changes -
>>>as expected.
>>
>>Have you tried reading in the whole file at once?
>
>I'll check this. But since the files range from MBytes to GBytes, they
>might be too big.
>
>I feel that there is some kind of penalty when too much data is
>read at once.
>
>Here is why:
>
> The throughput of the app shrinks to 1/3 when reading the whole file
> compared to reading chunks of size ( 40 * 1024 ). The input data is
> around 4-9 MByte per file.
>
>10 MByte? Where is the problem here? I really can't see why the
>app can't handle this amount of data...
>
>Meanwhile, I believe there is a heavy bug somewhere in RB's runtime.
>
>And - once again - I really hate hunting this one. I already optimized
>days to get the app as fast as it is now...

I'm pretty sure that BinaryStream operations are buffered, so you
might be running into an issue with that. Of course that could have
changed since I implanted that little nugget into my brain. But I
seem to remember one of the RS engineers recommending not doing your
own buffering because of that, and just letting RB handle it instead.
I guess if you're seeing evidence that this is not the best course,
that should probably be the deciding factor.

Re: Getting maximum input speed from BinaryStream :-<
Date: 18.08.05 19:54 (Thu, 18 Aug 2005 14:54:54 -0400)
From: Walter Purvis
Well, (a) Don't change the name of a thread on a mailing list for no
good reason.

(b) I'm not sure, but it seems as if you're reading in a binary stream and
assigning the chunks to a string variable, then comparing against
another string variable, using string functions to extract bits of the
data, etc.

It would probably be faster to read the binary stream into a memory
block and do simple byte comparisons, rolling your own split, rightb,
and so on.

In my experience using memory blocks is far faster for intensive text
processing than using string functions.
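Walter's approach - scan the raw bytes once and slice, instead of calling string functions repeatedly - might look like this, sketched in Python (`split_on_byte` and `right_b` are made-up stand-ins for a hand-rolled Split and RightB):

```python
def split_on_byte(buf, sep):
    # One linear scan over the raw buffer: record field
    # boundaries at each separator byte and slice them out,
    # instead of making repeated string-function calls.
    fields = []
    start = 0
    for i, b in enumerate(buf):
        if b == sep:
            fields.append(buf[start:i])
            start = i + 1
    fields.append(buf[start:])
    return fields

def right_b(buf, n):
    # A hand-rolled RightB: the last n bytes, no string conversion.
    return buf[-n:] if n > 0 else b""
```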

Re: Getting maximum input speed from BinaryStream :-<
Date: 18.08.05 20:06 (Thu, 18 Aug 2005 13:06:52 -0600)
From: Joseph J. Strout
At 7:58 PM +0200 8/18/05, Stefan Pantke wrote:

> The throughput of the app shrinks to 1/3 when reading the whole file
> compared to reading chunks of size ( 40 * 1024 ). The input data is
> around 4-9 MByte per file.
>
>10 MByte? Where is the problem here? I really can't see why the
>app can't handle this amount of data...
>
>Meanwhile, I believe there is a heavy bug somewhere in RB's runtime.

Not to be defensive, but I haven't seen any evidence of that. More
likely, you're learning interesting details of how modern disk
systems (which are quite complex) work with regard to performance.

Best,
- Joe

Re: Getting maximum input speed from BinaryStream :-<
Date: 18.08.05 20:33 (Thu, 18 Aug 2005 13:33:09 -0600)
From: Norman Palardy

On Aug 18, 2005, at 1:06 PM, Joseph J. Strout wrote:

> At 7:58 PM +0200 8/18/05, Stefan Pantke wrote:
>
>> The throughput of the app shrinks to 1/3 when reading the whole file
>> compared to reading chunks of size ( 40 * 1024 ). The input data is
>> around 4-9 MByte per file.
>>
>> 10 MByte? Where is the problem here? I really can't see why the
>> app can't handle this amount of data...
>>
>> Meanwhile, I believe there is a heavy bug somewhere in RB's runtime.
>
> Not to be defensive, but I haven't seen any evidence of that. More
> likely, you're learning interesting details of how modern disk systems
> (which are quite complex) work with regard to performance.
>
Can be quite tricky - like don't try to write more than a single buffer
full of data to the disk at one time.
Writing 16 MB when the disk has an 8 MB buffer can be a tad slower than
two 8 MB writes.

And reading is similar, and depends heavily on the metrics of the given
drive.


Re: Getting maximum input speed from BinaryStream :-<
Date: 18.08.05 21:27 (Thu, 18 Aug 2005 21:27:49 +0100)
From: Theodore H. Smith
Stefan,

This is a common question, no matter what language you are using.
Even C coders need to look this up.

The long answer is: It varies.
The short answer is: 64KB

I usually just read in 64KB blocks if I need speed :)

Re: Getting maximum input speed from BinaryStream :-<
Date: 19.08.05 04:36 (Fri, 19 Aug 2005 05:36:05 +0200)
From: Stefan Pantke

On 18.08.2005 at 21:33, Norman Palardy wrote:

>
> On Aug 18, 2005, at 1:06 PM, Joseph J. Strout wrote:
>
>> At 7:58 PM +0200 8/18/05, Stefan Pantke wrote:
>>
>>> The throughput of the app shrinks to 1/3 when reading the whole file
>>> compared to reading chunks of size ( 40 * 1024 ). The input data is
>>> around 4-9 MByte per file.
>>>
>>> 10 MByte? Where is the problem here? I really can't see why the
>>> app can't handle this amount of data...
>>>
>>> Meanwhile, I believe there is a heavy bug somewhere in RB's runtime.
>>>
>> Not to be defensive, but I haven't seen any evidence of that.
>> More likely, you're learning interesting details of how modern
>> disk systems (which are quite complex) work with regard to
>> performance.

I'll write a bit of C/ObjC code to check what's up. I would
bet that this is 5-10 times faster...

> Can be quite tricky - like don't try to write more than a single
> buffer full of data to the disk at one time.
> Writing 16 MB when the disk has an 8 MB buffer can be a tad slower
> than two 8 MB writes.

I don't even write a single bit. The app just reads a binary ASCII
data file sequentially, moving the read position forward only.


Re: Getting maximum input speed from BinaryStream :-<
Date: 19.08.05 05:27 (Thu, 18 Aug 2005 22:27:00 -0600)
From: Norman Palardy

On Aug 18, 2005, at 9:36 PM, Stefan Pantke wrote:
>
> I don't even write a single bit. The app just reads a binary ASCII
> data file sequentially, moving the read position forward only.
>
Reading in chunks that are just the buffer size of the disk _should_
help.

But I know of no way to get the buffer size programmatically.

And I suspect there is some buffering going on in the RB framework as
well, but I can't say how much this might help/hurt.


Re: Getting maximum input speed from BinaryStream :-<
Date: 19.08.05 17:57 (Fri, 19 Aug 2005 18:57:01 +0200)
From: Stefan Pantke
On 19.08.2005 at 06:27, Norman Palardy wrote:

> On Aug 18, 2005, at 9:36 PM, Stefan Pantke wrote:
>
>> I don't even write a single bit. The app just reads a binary ASCII
>> data file sequentially, moving the read position forward only.
>>
> Reading in chunks that are just the buffer size of the disk
> _should_ help
>
> But I know of no way to get the buffer size programmatically
>
> And I suspect there is some buffering going on in the RB framework
> as well but I can't say how much this might help/hurt.

Yeah, I'm aware of buffering at different levels.

But if I read a 4 MByte file in one single step,
the throughput drops to 30% of what I get reading
the data in chunks. You know what I mean: a 4 MByte
file is small compared to the hundreds of MBytes of
RAM available today.

I'll run some tests using Sampler, ObjectAlloc or BigTop,
but I suppose I won't get much detail. In fact, even if
I did get detail, I would most likely be unable to change
anything - in the RB runtime...

Well, once again a real drawback of RB's string handling.

Switching to a MemoryBlock might or might not help, since
I not only need to check certain parts of the input data,
but also need to extract parts of it for further processing.
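For what it's worth, "check a part, then extract it for further processing" doesn't have to mean copying on every check: in Python terms, slicing a memoryview gives a zero-copy window into the buffer, roughly what working directly on a MemoryBlock would buy. A sketch (the function name and the offset/length record format are invented for illustration):

```python
def extract_records(buf, offsets):
    # Wrap the raw data once; each slice of the memoryview is a
    # zero-copy window, so inspecting a region costs nothing extra.
    view = memoryview(buf)
    records = []
    for start, length in offsets:
        rec = view[start:start + length]
        if len(rec) == length:           # simple validity check
            records.append(bytes(rec))   # copy only what is kept
    return records
```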



Re: Getting maximum input speed from BinaryStream :-<
Date: 19.08.05 18:00 (Fri, 19 Aug 2005 19:00:55 +0200)
From: Stefan Pantke

On 19.08.2005 at 06:27, Norman Palardy wrote:

>
> On Aug 18, 2005, at 9:36 PM, Stefan Pantke wrote:
>
>>
>> I don't even write a single bit. The app just reads a binary ASCII
>> data file sequentially, moving the read position forward only.
>>
> Reading in chunks that are just the buffer size of the disk
> _should_ help
>
> But I know of no way to get the buffer size programmatically
>
> And I suspect there is some buffering going on in the RB framework
> as well but I can't say how much this might help/hurt.

Ah, I forgot to mention this:

Processing the data takes roughly 30% of total CPU, while
reading and scanning the input data - very simple scanning -
takes 70%.

Unbelievably bad :-(