Re: "I feel the need for speed" (string) (Real Studio network user group Mailinglist archive)

Re: "I feel the need for speed" (string)
Date: 08.05.02 23:22 (Wed, 08 May 2002 18:22:50 -0400)
From: Kevin Ballard
On 5/8/02 3:42 PM, "Jan Erik Moström" <<email address removed>> wrote:

> A couple of questions about strings. The background is that I'm writing
> an application that reads pretty large text files, parses the text,
> reformats it and writes it to a couple of other files.
>
> Since I want the program to be as fast as possible I wonder how I should
> use strings.
>
> 1 If I have a string and am adding lines to that string, what is
> the fastest way of doing this? I assume there is some overhead
> in doing
>
> strAll = strAll + str1
>
> I'm doing this over and over, reading a line at a time from the
> input file (later I might consider reading larger chunks and
> then read from memory ... but right now this is enough). So my
> question is: what is the fastest way of concatenating two
> strings?

Someone has a FastString plugin, I forget who, which should speed it up. If
you know the total size of the string at the end, you can make a memoryblock
of that size and simply use .StringValue to set the parts of the memoryblock
to your string.

> 2 When writing text items (most of them small chunks of 10-40
> bytes, some larger, 2 K or so) is it faster to write a chunk at
> a time or do
>
> o.write chunk1 + chunk2 + chunk3 + chunk4 + ...

o.write chunk1 + chunk2 + chunk3 + chunk4 + ... is faster. Each time you
concatenate two strings, RB has to make a third string and copy both
original strings over. If you concatenate all strings in one line, RB is
smart enough to make one über-string to encompass all the substrings, and
copy them all in. If you concatenate a chunk at a time, RB has to go through
the overhead of making a new string each time.

> 3 If I have a string that contains a number of lines, and I want
> to loop through these lines one-by-one and do certain things on
> each line. Something like this
>
> for i = 1 to nrLines
> curLine = nthField(source,sep,i)
>
> do different stuff with the line
>
> output = output + curLine
> next
>
> What is the fastest way of doing this?

AFAIK, use CountFields and nthField to grab lines. One thing to note is,
when dealing with large values for CountFields, set the CountFields result
to a variable and use that in the For loop instead of CountFields, as that
will avoid the overhead of calling the function each iteration through the
loop.

--

Re: "I feel the need for speed" (string)
Date: 08.05.02 23:55 (Wed, 8 May 2002 18:55:27 -0400)
From: Charles Yeomans

On Wednesday, May 8, 2002, at 06:22 PM, Kevin Ballard wrote:

> On 5/8/02 3:42 PM, "Jan Erik Moström" <<email address removed>> wrote:
[snip]
>> 3 If I have a string that contains a number of lines, and I want
>> to loop through these lines one-by-one and do certain things on
>> each line. Something like this
>>
>> for i = 1 to nrLines
>> curLine = nthField(source,sep,i)
>>
>> do different stuff with the line
>>
>> output = output + curLine
>> next
>>
>> What is the fastest way of doing this?
>
> AFAIK, use CountFields and nthField to grab lines. One thing to note is,
> when dealing with large values for CountFields, set the CountFields result
> to a variable and use that in the For loop instead of CountFields, as that
> will avoid the overhead of calling the function each iteration through the
> loop.
>

Actually, NthField is quite slow for this sort of thing. I wrote a
TextFieldIterator class that is optimized for iteration. It's available
at <http://www.quantum-meruit.com/RB/TextFieldIterator.sit.hqx>.

Charles Yeomans


---
Subscribe to the digest:
<mailto:<email address removed>>
Unsubscribe:
<mailto:<email address removed>>

Re: "I feel the need for speed" (string)
Date: 09.05.02 00:46 (Wed, 08 May 2002 19:46:39 -0400)
From: Kevin Ballard
On 5/8/02 6:55 PM, "Charles Yeomans" <<email address removed>> wrote:

> Actually, NthField is quite slow for this sort of thing. I wrote a
> TextFieldIterator class that is optimized for iteration. It's available
> at <http://www.quantum-meruit.com/RB/TextFieldIterator.sit.hqx>.

And what does it do instead of NthField?

Re: "I feel the need for speed" (string)
Date: 09.05.02 01:03 (Wed, 8 May 2002 20:03:20 -0400)
From: Noah Desch

On Wednesday, May 8, 2002, at 07:46 PM, Kevin Ballard wrote:

> On 5/8/02 6:55 PM, "Charles Yeomans" <<email address removed>> wrote:
>
>> Actually, NthField is quite slow for this sort of thing. I wrote a
>> TextFieldIterator class that is optimized for iteration. It's available
>> at <http://www.quantum-meruit.com/RB/TextFieldIterator.sit.hqx>.
>
> And what does it do instead of NthField?

Offhand I'd guess it searches the entire string for the field breaks at
once. NthField, by contrast, gets progressively slower as you access
higher fields, because it has to find all the previous fields again on
each call.

-Noah Desch
Wireframe Software
http://wireframe.virtualave.net

A closed mind indeed is he who sees only his own path.


Re: "I feel the need for speed" (string)
Date: 09.05.02 01:17 (Wed, 08 May 2002 20:17:10 -0400)
From: Chris Little
on 5/8/02 7:46 PM, Kevin Ballard at kevin@sb.org wrote:

> On 5/8/02 6:55 PM, "Charles Yeomans" <<email address removed>> wrote:
>
>> Actually, NthField is quite slow for this sort of thing. I wrote a
>> TextFieldIterator class that is optimized for iteration. It's available
>> at <http://www.quantum-meruit.com/RB/TextFieldIterator.sit.hqx>.
>
> And what does it do instead of NthField?

I would assume that it tracks where the last field ended so that it doesn't
have to start at the beginning of the string each time.

Chris


Re: "I feel the need for speed" (string)
Date: 09.05.02 16:27 (Thu, 9 May 2002 11:27:37 -0400)
From: Charles Yeomans

On Wednesday, May 8, 2002, at 08:17 PM, Chris Little wrote:

> on 5/8/02 7:46 PM, Kevin Ballard at kevin@sb.org wrote:
>
>> On 5/8/02 6:55 PM, "Charles Yeomans" <<email address removed>> wrote:
>>
>>> Actually, NthField is quite slow for this sort of thing. I wrote a
>>> TextFieldIterator class that is optimized for iteration. It's available
>>> at <http://www.quantum-meruit.com/RB/TextFieldIterator.sit.hqx>.
>>
>> And what does it do instead of NthField?
>
> I would assume that it tracks where the last field ended so that it doesn't
> have to start at the beginning of the string each time.
>

Yup. I just use InStr and Mid, and keep track of the current location.
As they say, it's not rocket science.

Charles Yeomans
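
The InStr/Mid approach described above can be sketched roughly as follows. This is a Python illustration of the idea, not the TextFieldIterator source (which is not shown in the thread): `iter_fields` is a hypothetical name, `str.find` with a start offset stands in for InStr, and slicing stands in for Mid.

```python
def iter_fields(source, sep):
    """Yield each field of source in order, scanning the string only once."""
    if not sep:
        raise ValueError("separator must not be empty")
    start = 0  # position where the next field begins
    while True:
        # Search for the next separator from the remembered position,
        # instead of rescanning from the beginning as an NthField-style call would.
        end = source.find(sep, start)
        if end == -1:
            yield source[start:]  # last field: everything after the final separator
            return
        yield source[start:end]
        start = end + len(sep)


lines = list(iter_fields("alpha\nbeta\ngamma", "\n"))
print(lines)  # ['alpha', 'beta', 'gamma']
```

Because the start position only moves forward, the total scanning work grows with the length of the string rather than with the number of fields times the length.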



Re: "I feel the need for speed" (string)
Date: 09.05.02 01:06 (Wed, 8 May 2002 17:06:49 -0700)
From: Joseph J. Strout
At 6:22 PM -0400 5/8/02, Kevin Ballard wrote:

> > o.write chunk1 + chunk2 + chunk3 + chunk4 + ...
>
>o.write chunk1 + chunk2 + chunk3 + chunk4 + ... is faster. Each time you
>concatenate two strings, RB has to make a third string and copy both
>original strings over. If you concatenate all strings in one line, RB is
>smart enough to make one über-string to encompass all the substrings, and
>copy them all in.

But if you o.write each string separately, then it doesn't have to
concatenate anything or create any new strings at all. So that
should be fastest.

> > What is the fastest way of doing this?
>
>AFAIK, use CountFields and nthField to grab lines.

No, not if you're just going to iterate over the fields sequentially.
Use InStr and Mid instead.

Cheers,
- Joe

-

Re: "I feel the need for speed" (string)
Date: 09.05.02 01:01 (Wed, 8 May 2002 17:01:33 -0700)
From: Joseph J. Strout
At 9:42 PM +0200 5/8/02, Jan Erik Moström wrote:

>A couple of questions about strings. The background is that I'm writing
>an application that reads pretty large text files, parses the text,
>reformats it and writes it to a couple of other files.

Since this is text, I assume you want to properly handle
international characters? This is a very important first decision.

>Since I want the program to be as fast as possible I wonder how I should
>use strings.
>
>1 If I have a string and am adding lines to that string, what is
> the fastest way of doing this?

Allocate a big MemoryBlock and stuff your strings into it (by
assigning to m.StringValue). Then when done, get out one big string
(by reading m.StringValue).

> I assume there is some overhead
> in doing
>
> strAll = strAll + str1

Yes, there is.

> So my question is: what is the fastest way of concatenating two
> strings?

For two strings, a = b+c is fastest, but if you're doing something
else (like concatenating many strings), consider something like the
MemoryBlock technique above.

>2 When writing text items (most of them small chunks of 10-40
> bytes, some larger, 2 K or so) is it faster to write a chunk at
> a time or do
>
> o.write chunk1 + chunk2 + chunk3 + chunk4 + ...

It's probably faster to write them one at a time. But why don't you
try it and tell us?

>3 If I have a string that contains a number of lines, and I want
> to loop through these lines one-by-one and do certain things on
> each line. Something like this
>
> for i = 1 to nrLines
> curLine = nthField(source,sep,i)
>
> do different stuff with the line
>
> output = output + curLine
> next
>
> What is the fastest way of doing this?

Don't use NthField; use InStr to find the position of the next
separator, and Mid to extract that field. For grabbing a single
field NthField is fine, but for iterating over many or all fields,
InStr/Mid is more efficient.

Cheers,
- Joe
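
The MemoryBlock technique from answer 1 can be sketched like this. It is a Python stand-in, not REALbasic: a `bytearray` plays the role of the MemoryBlock, slice assignment plays the role of assigning to m.StringValue, and `join_into_buffer` is a hypothetical name. UTF-8 is an assumption made here; the point about international characters matters because the buffer size is counted in bytes, not characters.

```python
def join_into_buffer(pieces, total_size):
    """Copy each piece into one preallocated buffer, then read it out once."""
    buf = bytearray(total_size)  # allocated once, up front
    offset = 0
    for piece in pieces:
        data = piece.encode("utf-8")
        if offset + len(data) > total_size:
            raise ValueError("buffer too small for the given pieces")
        buf[offset:offset + len(data)] = data  # write in place, no new string
        offset += len(data)
    # One final conversion back to a string, using only the bytes written.
    return bytes(buf[:offset]).decode("utf-8")


pieces = ["line one\n", "line two\n"]
size = sum(len(p.encode("utf-8")) for p in pieces)  # size in bytes
print(join_into_buffer(pieces, size), end="")
```

Each piece is copied exactly once, so the cost stays proportional to the final size however many pieces there are.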

-

Re: "I feel the need for speed" (string)
Date: 23.03.01 11:32 (Thu, 9 May 2002 08:40:44 +0200)
From: Jan Erik Moström
2002-05-08 17:01: Joseph J. Strout <<email address removed>> is believed to
have typed:

Thanks for the advice.

> >2 When writing text items (most of them small chunks of 10-40
> > bytes, some larger, 2 K or so) is it faster to write a chunk at
> > a time or do
> >
> > o.write chunk1 + chunk2 + chunk3 + chunk4 + ...
>
> It's probably faster to write them one at a time. But why don't you
> try it and tell us?

I will ... but not right now, this is a case of the famous "my customer's
customer wanted stuff last week and my customer wants the program last
month" (no, I'm not the one that's late ... I've just known about this
the last 24 hours or so 8-)

> Don't use NthField; use InStr to find the position of the next
> separator, and Mid to extract that field. For grabbing a single
> field NthField is fine, but for iterating over many or all fields,
> InStr/Mid is more efficient.

This is really going to help speed things up for me !!

jem

Re: "I feel the need for speed" (string)
Date: 09.05.02 23:45 (Thu, 09 May 2002 17:45:17 -0500)
From: Harry Hooie

Make sure you add something like this if your resultant string might wind up
in the thousands of characters:

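for i = 1 to nrLines
  curLine = "whatever"
  // collect lines in a small temporary string first
  tempString = tempString + curLine
  j = j + 1
  if j = 500 then
    // every 500 lines, append the whole batch to the big string
    j = 0
    output = output + tempString
    tempString = ""
  end if
next
// append whatever is left over from the last partial batch
output = output + tempString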

That j = 500 is just a value that worked well for me - I guess you could
optimize a little if you knew the approximate size of the strings you are
adding.

Adding only this sped up my data import routines and listbox cut and drag
routines by more than 10x !
______
Harry Hooie
<email address removed>
http://www.harryhooie.com/
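
A rough way to see why batching helps is to count characters copied, assuming (as stated earlier in the thread) that each `a + b` builds a new string and copies both operands. The Python sketch below only does that arithmetic. It is a cost model, not a benchmark, and `naive_cost` and `batched_cost` are hypothetical names.

```python
def naive_cost(n_lines, line_len):
    """Characters copied by output = output + curLine, repeated n_lines times."""
    # After the i-th append the new string holds i * line_len characters,
    # all of which were copied to build it.
    return sum(i * line_len for i in range(1, n_lines + 1))


def batched_cost(n_lines, line_len, batch):
    """Characters copied when lines go to a temp string flushed every `batch` lines."""
    copied = 0
    output_len = 0
    temp_len = 0
    pending = 0
    for _ in range(n_lines):
        temp_len += line_len
        copied += temp_len            # tempString = tempString + curLine
        pending += 1
        if pending == batch:
            output_len += temp_len
            copied += output_len      # output = output + tempString
            temp_len = 0
            pending = 0
    output_len += temp_len
    copied += output_len              # final flush after the loop
    return copied


print(naive_cost(10000, 20))          # 1000100000
print(batched_cost(10000, 20, 500))   # 52400000
```

Under this model, 10,000 lines of 20 characters cost roughly 19 times less with a batch size of 500, which is in the same range as the speedup reported above. Real timings depend on the runtime.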
