
Re: Allocate more than 2GB to a process? Or use a DB? help! (Real Studio network user group mailing list archive)



  Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Gino Deblauwe
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Phillip Zedalis
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Phillip Zedalis
    Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Jon Ogden
    Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Aaron Andrew Hunt
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Norman Palardy
    Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Aaron Andrew Hunt
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Norman Palardy
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Daniel L. Taylor
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Aaron Andrew Hunt
   Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Aaron Andrew Hunt
    Allocate more than 2GB to a process? Or use a DB? help!   -   Aaron Andrew Hunt
     Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Michael Diehr
     Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Brad Hutchings
     Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Norman Palardy
     Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Norman Palardy
     Re: Allocate more than 2GB to a process? Or use a DB? help!   -   Aaron Andrew Hunt

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 01.08.12 10:28 (Wed, 1 Aug 2012 05:28:17 -0400)
From: Gino Deblauwe
I don't know exactly what you want to achieve, but searching a lot of
data (> 2GB) means either a lot of RAM in a 64-bit environment or a
database, which is designed for data of this size. Inserting it will be
quick enough and much more scalable.
For searching you can do the following: break the SQL that fetches your
data into an ID, a SELECT, a FROM (with all joins), a WHERE and an
ORDER BY clause.
Then build an array of BigInts (a recordset built from your ID, FROM,
WHERE and ORDER BY). The records you want to show on screen you fetch
with your SELECT, FROM, and a WHERE clause listing just the IDs you need.
That way, whether you have 5 million results or 500, your application's
memory usage will not differ much, and it will always be the least you
need.
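
In REALbasic terms the two-pass idea might look like the sketch below
(untested; the records table, its columns, the filter, and the connected
REALSQLDatabase named db are placeholders):

// Pass 1: run the full FROM/WHERE/ORDER BY once, but fetch only row IDs.
Dim ids() As Int64
Dim rs As RecordSet = db.SQLSelect("SELECT id FROM records WHERE size > 100 ORDER BY name")
If rs <> Nil Then
  While Not rs.EOF
    ids.Append(rs.Field("id").Int64Value)
    rs.MoveNext
  Wend
End If

// Pass 2: fetch complete rows only for the slice currently on screen.
Dim firstVisible As Integer = 0
Dim lastVisible As Integer = Min(24, ids.Ubound)
Dim idList As String
For i As Integer = firstVisible To lastVisible
  If idList <> "" Then idList = idList + ","
  idList = idList + Str(ids(i))
Next
rs = db.SQLSelect("SELECT * FROM records WHERE id IN (" + idList + ")")

Whether pass 1 finds 500 hits or 5 million, the resident cost is one Int64
per hit; the heavy rows are only ever materialized a screenful at a time.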

Kind regards
Deblauwe Gino
Use It Group NV



----------------------------------------
From: "Aaron Andrew Hunt" <<email address removed>>
Sent: Tuesday, July 31, 2012 7:24 PM
To: <email address removed>
Subject: Re: Allocate more than 2GB to a process? Or use a DB? help!

[...]

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 01.08.12 13:03 (Wed, 01 Aug 2012 05:03:49 -0700)
From: Phillip Zedalis
Yeah seems like it haha.

Phillip

On Aug 1, 2012, at 4:53 AM, Jon Ogden <<email address removed>> wrote:

> I think you were answering a question on my thread on Aaron's thread! :)
[...]

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 01.08.12 10:53 (Wed, 01 Aug 2012 02:53:38 -0700)
From: Phillip Zedalis
What exactly are you searching for? You aren't digging into the movie data itself to search it, I assume. I mean, I assume you are extracting/adding any metadata to separate SQL columns and conducting the searches on those.

So why combine the video with the metadata? I know some people shy away from keeping the path to the file separate from the database, for ease of use, but what about two databases or something? Stuffing gigabytes into databases should be avoided unless you have a very specific problem you are trying to solve, and in my opinion wanting to make "moving the data easier" is not really a valid reason. Especially considering the database could grow so large that it's not practical to burn to DVD, and moving it across a network could take a while. Then you start dealing with splitting the data anyway. At that point, what was the advantage of a database over a fragmented zip file or the file system itself?

File systems were designed to handle these large amounts of data; databases are merely capable of it. If you look at recent Microsoft SQL Server releases, they have FILESTREAM support, which abstracts the file bits away from you: it looks like you are adding a blob to the db, but it's actually offloaded to the file system, because frankly that's best for every use case I can imagine.
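
In REALbasic terms the path-in-the-database approach might look like this
sketch (untested; the movies table, its columns, and the connected
REALSQLDatabase named db are invented for illustration):

// Searchable metadata lives in ordinary columns; the multi-GB video file
// stays on disk, and only its path goes into the database.
db.SQLExecute("CREATE TABLE IF NOT EXISTS movies " _
  + "(id INTEGER PRIMARY KEY, title TEXT, seconds INTEGER, path TEXT)")

Dim f As FolderItem = GetFolderItem("MyMovie.mov")
db.SQLExecute("INSERT INTO movies (title, seconds, path) VALUES " _
  + "('My Movie', 5400, '" + f.AbsolutePath + "')") // real code should escape quotes

// Searches touch only the small metadata columns ...
Dim rs As RecordSet = db.SQLSelect("SELECT path FROM movies WHERE seconds > 3600")
// ... and a path is turned back into a FolderItem when the file is needed.
If rs <> Nil And Not rs.EOF Then
  Dim movie As FolderItem = GetFolderItem(rs.Field("path").StringValue, FolderItem.PathTypeAbsolute)
End If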

Phillip

On Aug 1, 2012, at 2:28 AM, Gino Deblauwe <<email address removed>> wrote:

> I don't know exactly what you want to achieve, but searching a lot of
> data (> 2GB) means either a lot of RAM in a 64-bit environment or a
> database, which is designed for data of this size. [...]

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 01.08.12 12:53 (Wed, 01 Aug 2012 06:53:28 -0500)
From: Jon Ogden
I think you were answering a question on my thread on Aaron's thread! :)

Sent from my iPad

On Aug 1, 2012, at 4:53 AM, Phillip Zedalis <pjzedalis@me.com> wrote:

> What exactly are you searching for? You aren't digging into the movie data itself to search it, I assume. [...]

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 01.08.12 09:13 (Wed, 1 Aug 2012 09:13:33 +0100)
From: Aaron Andrew Hunt
Thanks for all the help. We'll optimize the DB at its current limit and wait for 64-bit compiling from RS (started another thread).
Aaron

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 19:17 (Tue, 31 Jul 2012 12:17:59 -0600)
From: Norman Palardy

On 2012-07-31, at 11:33 AM, Daniel L. Taylor wrote:

> If your solution allows you to split the dictionaries across separate processes, then this will let you get around the problem while keeping the data in RAM. [...]
>
> Dictionaries use hashes internally. [...] You might be able to store more data in RAM using a half-interval (binary) search algorithm to maintain and search a sorted array.

Ends up being something like a sharding approach, but with processes on the same machine rather than across machines.
Basically the main app creates all the data and sends non-overlapping ranges of keys to particular processes via UDP.
As long as the keys are reasonably well distributed, none of the individual processes would exceed the memory limit, and with enough "shards" you could probably max out any machine's memory.

i.e.:

  generate key / value
  if key < 1000 then send key & value to process 1
  elseif key < 2000 then send key & value to process 2
  etc.

To find a given key you go to the correct shard using much the same algorithm.
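
A rough REALbasic sketch of that routing, assuming the built-in
UDPSocket/Datagram classes with shard processes already listening on ports
9000, 9001, ... (the wire format and port numbers are invented):

Sub SendToShard(sock As UDPSocket, key As Int64, value As String)
  // sock is assumed to be bound already (sock.Port = 0, then sock.Connect).
  Const kShardSize = 1000   // width of each non-overlapping key range
  Const kBasePort = 9000    // shard n listens on kBasePort + n

  Dim shard As Integer = key \ kShardSize    // keys 0-999 -> shard 0, etc.
  Dim packet As New Datagram
  packet.Address = "127.0.0.1"               // all shards on this machine
  packet.Port = kBasePort + shard
  packet.Data = Str(key) + Chr(9) + value    // tab-separated key/value pair
  sock.Write(packet)
End Sub

A lookup computes the shard number from the key the same way and asks only
that one process.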

But if you're just loading them into a DB, I really would try to optimize that, as in the long term it will let you generate ALL keys from one process and avoid all this busy work just to get the data generated.
And if your DB engine is already 64-bit, then part of the pipeline already has the advantages of being 64-bit.

Norman Palardy




Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 02.08.12 11:18 (Thu, 2 Aug 2012 11:18:20 +0100)
From: Aaron Andrew Hunt
That makes very good sense, and being a novice with databases, I had not thought of that.
Thank you kindly!
Aaron

> I don't know exactly what you want to achieve, but searching a lot of
> data (> 2GB) means either a lot of RAM in a 64-bit environment or a
> database, which is designed for data of this size. [...]
>
> Kind regards
> Deblauwe Gino
> Use It Group NV


Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:34 (Tue, 31 Jul 2012 11:34:44 -0600)
From: Norman Palardy

On 2012-07-31, at 11:21 AM, Aaron Andrew Hunt wrote:

> We're just stuffing the data into a database. We then have a complex search interface for the data...

Stuffing them into a db should be relatively fast IF you don't try to commit on every insert.
If you do, it will be gawd-awfully slow.
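
For example, batching many inserts per transaction rather than committing
each one (a sketch; db is a connected REALSQLDatabase and the output table
is hypothetical):

db.SQLExecute("BEGIN TRANSACTION")
For i As Integer = 1 To 5000000
  db.SQLExecute("INSERT INTO output (id, payload) VALUES (" + Str(i) + ", 'x')")
  If i Mod 10000 = 0 Then
    db.Commit                            // one commit per 10,000 rows ...
    db.SQLExecute("BEGIN TRANSACTION")   // ... not one per insert
  End If
Next
db.Commit                                // flush whatever remains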

> And I just realized that the 32-bit limit is going to limit how we can search in this database. I should have realized all this months ago... but I've obviously never done a project this big, and simply did not know about these limits.
>
> On a 32-bit system it will be impossible to retrieve recordsets that exceed the 2GB limit, right?

Only if you try to grab them all at once.
There are ways to get subsets depending on what DB you're using.
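
With SQLite, for instance, LIMIT/OFFSET pages through an arbitrarily large
result set in fixed-size chunks, so no single RecordSet ever approaches the
process limit (a sketch; db and the output table are placeholders):

Dim offset As Integer = 0
Do
  Dim rs As RecordSet = db.SQLSelect("SELECT * FROM output LIMIT 10000 OFFSET " + Str(offset))
  If rs = Nil Or rs.EOF Then Exit   // no more rows
  While Not rs.EOF
    // process one row here, then move on
    rs.MoveNext
  Wend
  offset = offset + 10000
Loop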

> If so, 'nuff said. We definitely need the 64-bit compile if it gets us 8TB.

More like 18446744073709551616 (over 18 quintillion) bytes addressable (in theory)

> Once we have that we don't need a multi-process app in the first place. Makes much more sense to me. So, I think we just wait until REAL is 64-bit for version 2 of this project. Or do I misunderstand what 64-bit compiling will allow us to do?

64-bit really amounts to a HUGE address space for 99.9% of all use cases

Norman Palardy




Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:33 (Tue, 31 Jul 2012 10:33:53 -0700)
From: Daniel L. Taylor
> Thought of another idea - spawning another process which is just an IPC socket that keeps data in another dictionary. [...] Keeping the dictionaries in sync would take some doing but I think it should be possible, right?

If your solution allows you to split the dictionaries across separate processes, then this will let you get around the problem while keeping the data in RAM. There will be some overhead for the interprocess communication, and may be some overhead for any synchronization issues you have. But then again, this will also allow Dictionary searches to occur on multiple cores simultaneously.

Dictionaries use hashes internally. I'm not sure what the memory implications of this are, or what the odds of a collision are. You might be able to store more data in RAM by using a half-interval (binary) search algorithm to maintain and search a sorted array.
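
A half-interval (binary) search over a sorted REALbasic string array could
look like this sketch; StrComp with mode 0 gives the stable byte-order
comparison the sorted array must also be built with:

Function BinaryFind(keys() As String, target As String) As Integer
  // Returns the index of target in the sorted array keys(), or -1 if absent.
  Dim lo, hi, m, cmp As Integer
  lo = 0
  hi = keys.Ubound
  While lo <= hi
    m = (lo + hi) \ 2
    cmp = StrComp(keys(m), target, 0)   // mode 0 = case-sensitive byte order
    If cmp = 0 Then
      Return m
    ElseIf cmp < 0 Then
      lo = m + 1                        // target sorts after keys(m)
    Else
      hi = m - 1                        // target sorts before keys(m)
    End If
  Wend
  Return -1
End Function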

> Is that a better solution? Only problem I see there is that IPC sockets can be tricky to initiate and connect, and if a connection is lost it's impossible to reconnect it, leading to disaster.

Use TCP or UDP.

> Also I am not sure how to spawn more than one of them dynamically without keeping an army of them inside the app bundle.

http://osxdaily.com/2011/05/11/multiple-instances-application-mac/

Daniel L. Taylor
Taylor Design
Computer Consulting & Software Development
<email address removed>
www.taylor-design.com


Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:25 (Tue, 31 Jul 2012 18:25:25 +0100)
From: Aaron Andrew Hunt
On Jul 31, 2012, at 6:01 PM, <email address removed> wrote:
> If you just need "millions" of records, consider storing a hash of each value in memory rather than the value itself -- if the hash is considerably shorter. The potential downside is a hash collision. sha1 might be a good candidate. sha1 + md5 should protect you from collisions better, at the expense of more bytes per record.
>
> -Brad

Clever suggestion. I guess it would mean a speed hit, but it could save quite a bit of RAM, which is exactly what I was asking for, so I may try that in order to push our current version to its limits.

Aaron

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:21 (Tue, 31 Jul 2012 18:21:54 +0100)
From: Aaron Andrew Hunt
On Jul 31, 2012, at 6:01 PM, <email address removed> wrote:
> 64-bit apps will have a limit that is 64 bits (2^64-1 bytes, which is a simply enormous amount)

Ah, of course. Around 8TB. That's what we need. I know this has been discussed ad nauseam - in which projected version will REAL compile 64-bit, after the switch to LLVM?

> ...
> I guess "what's best" might be a function of "how do you use this data" ?
> I'd suppose you could use a file but then I don't know he generating algorithm or what you do & why you hold on to values in a dictionary.
> Or it could be a combination of a dictionary & external database (the DB acting as the dictionary)
>
> Norman Palardy

We're just stuffing the data into a database. We then have a complex search interface for the data...

And I just realized that the 32-bit limit is going to limit how we can search in this database. I should have realized all this months ago... but I've obviously never done a project this big, and simply did not know about these limits.

On a 32-bit system it will be impossible to retrieve recordsets that exceed the 2GB limit, right?

If so, 'nuff said. We definitely need the 64-bit compile if it gets us 8TB. Once we have that we don't need a multi-process app in the first place. Makes much more sense to me. So, I think we just wait until REAL is 64-bit for version 2 of this project. Or do I misunderstand what 64-bit compiling will allow us to do?

Thanks!
Aaron

Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 17:01 (Tue, 31 Jul 2012 17:01:29 +0100)
From: Aaron Andrew Hunt
//=========//
// PROBLEM //
//=========//

My app needs more than 2GB of RAM. It keeps crashing when it reaches this limit. I am assuming the only way to deal with this is to split up my app into multiple processes. BUT, I do not know how to do that when I have a generating algorithm that needs access to a dictionary, which is what is bumping up against the RAM limit ...

//=========//
// DETAILS //
//=========//

The app creates data arrays, writing the data immediately to a database. It retains one piece of data in a dictionary for each written record, to compare to new records it creates, as part of the generating algorithm. Otherwise virtually nothing is being stored (piling up) in memory...

There are varying sizes of output arrays based on initial variables, but the largest arrays reach millions of records. Tracking memory usage in Apple's Activity Monitor and viewing crash reports shows the app bumping up against the 2GB-per-process limit of a 32-bit system.
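
For reference, the pattern just described is roughly the following sketch
(seen, db, the key/row strings and the output table are stand-ins for
whatever the app actually uses):

// seen (a Dictionary) and db are assumed to be properties set up elsewhere.
Sub WriteRecordOnce(key As String, row As String)
  If seen.HasKey(key) Then Return   // this key was already generated: skip
  db.SQLExecute("INSERT INTO output (payload) VALUES ('" + row + "')")
  seen.Value(key) = True            // only the small key stays in RAM
End Sub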

//============//
// SOLUTIONS? //
//============//

The only solution I can think of to get across the limit without breaking into multiple processes is to store the values I need to compare in a second database, so that nothing is done in RAM, and it's all done on the disk. The problem I foresee with this is that it will be a death knell for our now speedy algorithm. I have been pushing this app for speed, and have for example improved it from an initial 14 hours for generating one large array to under 15 minutes. I fear moving away from RAM is going to force us into waiting for days for the thing to generate what's needed.

//=========//
// ADVICE? //
//=========//

Ideas? Is it possible to open up more RAM? Is there a 4GB limit in 64-bit? Googling this stuff doesn't quite convince me of any correct answers on the net, though I see some agreement on the 2GB limit, and something about a 3GB switch on Windows. This is an app primarily for Mac, but also for Windows, so the solution preferably needs to work on both platforms.

Thanks in advance as always for whatever help you can offer. You guys are the best.
Aaron

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:13 (Tue, 31 Jul 2012 10:13:54 -0700)
From: Michael Diehr
Check out MBS's large file and file mapping classes?
http://www.monkeybreadsoftware.net/class-filemappingviewmbs.shtml

On Jul 31, 2012, at 9:01 AM, Aaron Andrew Hunt wrote:

> //=========//
> // PROBLEM //
> //=========//
>
> My app needs more than 2GB of RAM. It keeps crashing when it reaches this limit. I am assuming the only way to deal with this is to split up my app into multiple processes. BUT, I do not know how to do that when I have a generating algorithm that needs access to a dictionary, which is what is bumping up against the RAM limit ...
[...]

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:00 (Tue, 31 Jul 2012 10:00:59 -0700)
From: Brad Hutchings
If you just need "millions" of records, consider storing a hash of each value in memory rather than the value itself -- if the hash is considerably shorter. The potential downside is a hash collision. sha1 might be a good candidate. sha1 + md5 should protect you from collisions better, at the expense of more bytes per record.
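
In REALbasic that could be as small as the sketch below, using the built-in
MD5 function (every digest is 16 bytes, however large the original value);
whether one hash is collision-safe enough is the trade-off described above:

// seen is a Dictionary property initialized elsewhere.
Sub RememberValue(value As String)
  Dim digest As String = MD5(value)   // 16-byte digest instead of the value
  seen.Value(digest) = True           // the full value never stays in RAM
End Sub

Function AlreadySeen(value As String) As Boolean
  Return seen.HasKey(MD5(value))      // a collision would misreport True
End Function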

-Brad

On Jul 31, 2012, at 9:01 AM, Aaron Andrew Hunt wrote:

> My app needs more than 2GB of RAM. It keeps crashing when it reaches this limit. [...]

Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 18:00 (Tue, 31 Jul 2012 11:00:28 -0600)
From: Norman Palardy

On 2012-07-31, at 10:40 AM, Aaron Andrew Hunt wrote:

> Thought of another idea - spawning another process which is just an IPC socket that keeps data in another dictionary. [...]
>
> Is that a better solution?

That really depends on HOW you use the resulting data more than anything.
If it all gets fed to another process via a file, then I'd use files and set it up so things get written as fast as possible, so it does not kill your processing time.

But at this point I certainly don't feel I know enough to even nudge you in any "better" direction.

Norman Palardy




Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 17:52 (Tue, 31 Jul 2012 10:52:03 -0600)
From: Norman Palardy

On 2012-07-31, at 10:01 AM, Aaron Andrew Hunt wrote:

> [...]
>
> Ideas? Is it possible to open up more RAM? Is there a 4GB limit in 64-bit?

64-bit apps will have a limit that is 64 bits (2^64-1 bytes, which is a simply enormous amount)

> Googling this stuff doesn't quite convince me of any correct answers on the net, though I see some agreement on the 2GB limit, and something about a 3GB switch on Windows. This is an app primarily for Mac, but also for Windows, so the solution preferably needs to work on both platforms.
>
> Thanks in advance as always for whatever help you can offer. You guys are the best.
> Aaron

I guess "what's best" might be a function of "how do you use this data" ?
I'd suppose you could use a file but then I don't know he generating algorithm or what you do & why you hold on to values in a dictionary.
Or it could be a combination of a dictionary & external database (the DB acting as the dictionary)

Norman Palardy




Re: Allocate more than 2GB to a process? Or use a DB? help!
Date: 31.07.12 17:40 (Tue, 31 Jul 2012 17:40:46 +0100)
From: Aaron Andrew Hunt
Thought of another idea - spawning another process which is just an IPC socket that keeps data in another dictionary. When the first dictionary bumps up against its limit, the second process does the work over IPC, and I can just keep doing that for each new barrier encountered. Keeping the dictionaries in sync would take some doing but I think it should be possible, right?

Is that a better solution? The only problem I see there is that IPC sockets can be tricky to initiate and connect, and if a connection is lost it's impossible to reconnect, which leads to disaster. Also, I am not sure how to spawn more than one of them dynamically without keeping an army of them inside the app bundle. Debugging multi-process apps is also a frustrating task, in my experience.

Thanks,
Aaron

On Jul 31, 2012, at 5:01 PM, Aaron Andrew Hunt wrote:
> The only solution I can think of to get across the limit without breaking into multiple processes is to store the values I need to compare in a second database, so that nothing is done in RAM, and it's all done on the disk. [...]
