|
List Info
Thread: "ocaml_beginners"::[] Deal with large lists
|
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 04:39:41 |
|
Hi,
I'm trying to write an SPH code in Ocaml. SPH stands for Smooth
Particle Hydrodynamics and is a particle based code to solve the
hydrodynamical equations. I'm still at draft stage and am currently
wondering about how to represent the particles. One thing that I have
been considering is setting up lists of tuples where each tuple
represents one particle. However, what I am a bit worried about is that
over time the number of particles could easily grow to 1e5 to 1e6 and
I'm just not sure if a recursive approach like you'd use with lists
would not break then. Wouldn't this exhaust the stack? Is this the
completely wrong approach? Are lists a good idea at all? The point is
that order doesn't matter. All I care about is neighbourhood of
particles but if they have an index (like in an array representation
this index is usually almost meaningless. In general I also care the
most about speed but not much about memory. Any hints on this?
Cheers,
Christian
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 09:10:38 |
|
On Mon, Nov 20, 2006 at 03:39:41PM +1100, Christian Lerrahn wrote:
> I'm trying to write an SPH code in Ocaml. SPH stands for Smooth
> Particle Hydrodynamics and is a particle based code to solve the
> hydrodynamical equations. I'm still at draft stage and am currently
> wondering about how to represent the particles. One thing that I have
> been considering is setting up lists of tuples where each tuple
> represents one particle. However, what I am a bit worried about is that
> over time the number of particles could easily grow to 1e5 to 1e6 and
> I'm just not sure if a recursive approach like you'd use with lists
> would not break then. Wouldn't this exhaust the stack? Is this the
> completely wrong approach? Are lists a good idea at all? The point is
> that order doesn't matter. All I care about is neighbourhood of
> particles but if they have an index (like in an array representation
> this index is usually almost meaningless. In general I also care the
> most about speed but not much about memory. Any hints on this?
Lists are elegant, but in the situation you have described you are
(cautiously) better off using Arrays.
It largely depends on whether you will need to resize the arrays,
since resizing for arrays is an expensive and time-consuming
operation. Can you use a fixed size array or preallocate an array?
Whether you decide to use lists or arrays, you should look at Extlib
(http://ocaml-lib.sourceforge.net/ and also a standard part of many
Linux distributions). This library has two things which will help
you: (1) tail recursive versions of the common List functions, and (2)
a variable sized array.
BTW, it's better to use a struct rather than a tuple, for both speed
and memory reasons. type my_struct = {x : float; y : float; z : float}
is one object stored in 32 bytes, whereas (x, y, z) is four objects
stored in 64 bytes.
Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Internet Marketing and AdWords courses - http://merjis.com/courses - NEW!
Merjis blog - http://blog.merjis.com - NEW!
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 09:30:35 |
|
On Monday 20 November 2006 04:39, Christian Lerrahn wrote:
> I'm trying to write an SPH code in Ocaml. SPH stands for Smooth
> Particle Hydrodynamics and is a particle based code to solve the
> hydrodynamical equations. I'm still at draft stage and am currently
> wondering about how to represent the particles. One thing that I have
> been considering is setting up lists of tuples where each tuple
> represents one particle. However, what I am a bit worried about is that
> over time the number of particles could easily grow to 1e5 to 1e6 and
> I'm just not sure if a recursive approach like you'd use with lists
> would not break then. Wouldn't this exhaust the stack? Is this the
> completely wrong approach? Are lists a good idea at all? The point is
> that order doesn't matter. All I care about is neighbourhood of
> particles but if they have an index (like in an array representation
> this index is usually almost meaningless. In general I also care the
> most about speed but not much about memory. Any hints on this?
There are many ways to do this. Which one is fastest depends what operations
you will be doing the most (i.e. where is the bottleneck). The most important
thing to do for now is factor the problem into a hierarchy of functions where
only the bottom-most functions care about your actual representations.
Higher-order functions will be instrumental in doing this.
You can start with an array or a list to prototype your program (if you use
lists then use tail-recursive functions like rev_map and fold_left instead of
map and fold_right). If you write it well then transferring to a more
efficient data structure will be painless.
I suspect you will want to hierarchically decompose the space that contains
your particles, in which case you'll end up wanting a tree (e.g. a k-D tree).
There is a discussion of this approach and its implementation in OCaml in my
book "OCaml for Scientists". From a figure in the book, this is 1e3x faster
with an error of 1e-6 for 1e6 particles.
Will Farr wrote an interesting blog entry about spatial partitioning tree in
OCaml.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 14:25:57 |
|
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 14:08:21 |
|
Hi Jon,
> > I'm trying to write an SPH code in Ocaml. SPH stands for Smooth
> > Particle Hydrodynamics and is a particle based code to solve the
> > hydrodynamical equations. I'm still at draft stage and am currently
> > wondering about how to represent the particles. One thing that I
> > have been considering is setting up lists of tuples where each tuple
> > represents one particle. However, what I am a bit worried about is
> > that over time the number of particles could easily grow to 1e5 to
> > 1e6 and I'm just not sure if a recursive approach like you'd use
> > with lists would not break then. Wouldn't this exhaust the stack?
> > Is this the completely wrong approach? Are lists a good idea at
> > all? The point is that order doesn't matter. All I care about is
> > neighbourhood of particles but if they have an index (like in an
> > array representation this index is usually almost meaningless. In
> > general I also care the most about speed but not much about memory.
> > Any hints on this?
>
> There are many ways to do this. Which one is fastest depends what
> operations you will be doing the most (i.e. where is the bottleneck).
> The most important thing to do for now is factor the problem into a
> hierarchy of functions where only the bottom-most functions care
> about your actual representations. Higher-order functions will be
> instrumental in doing this.
SPH basically solves all its equations by integrating over the space of
neighbours in n dimensions. The most time consuming is probably finding
the neighbours (which are defined by a maximum distance). This is
currently done in several different ways, among which are link lists
and tree codes. The equations themselves usually involve all the
properties of a neighbouring particle in one way or another plus the
relative vector.
> You can start with an array or a list to prototype your program (if
> you use lists then use tail-recursive functions like rev_map and
> fold_left instead of map and fold_right). If you write it well then
> transferring to a more efficient data structure will be painless.
Thank you for this suggestion. I'll try to heed this.
> I suspect you will want to hierarchically decompose the space that
> contains your particles, in which case you'll end up wanting a tree
> (e.g. a k-D tree). There is a discussion of this approach and its
> implementation in OCaml in my book "OCaml for Scientists". From a
> figure in the book, this is 1e3x faster with an error of 1e-6 for 1e6
> particles.
Yes, a tree structure might be a thing to go for. It will probably be
helpful with finding neighbours. I actually have been considering
getting your book but it's not easy to afford for a student. :( It
looks like a good investment, however.
> Will Farr wrote an interesting blog entry about spatial partitioning
> tree in OCaml.
I'll try to have a look at this as well. Thanks for the hint.
Cheers,
Christian
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 14:13:04 |
|
Hi Richard,
> > I'm trying to write an SPH code in Ocaml. SPH stands for Smooth
> > Particle Hydrodynamics and is a particle based code to solve the
> > hydrodynamical equations. I'm still at draft stage and am currently
> > wondering about how to represent the particles. One thing that I
> > have been considering is setting up lists of tuples where each tuple
> > represents one particle. However, what I am a bit worried about is
> > that over time the number of particles could easily grow to 1e5 to
> > 1e6 and I'm just not sure if a recursive approach like you'd use
> > with lists would not break then. Wouldn't this exhaust the stack?
> > Is this the completely wrong approach? Are lists a good idea at
> > all? The point is that order doesn't matter. All I care about is
> > neighbourhood of particles but if they have an index (like in an
> > array representation this index is usually almost meaningless. In
> > general I also care the most about speed but not much about memory.
> > Any hints on this?
>
> Lists are elegant, but in the situation you have described you are
> (cautiously) better off using Arrays.
>
> It largely depends on whether you will need to resize the arrays,
> since resizing for arrays is an expensive and time-consuming
> operation. Can you use a fixed size array or preallocate an array?
I can to a certain degree. The problem is that I have ghost particles
because I'm working on a periodic space. So if I want to keep arrays
fixed (as for example the existing Fortran 77 codes have to do), I have
to allocate n*3 times the number of particles to be safe (where n is
the number of dimensions).
> Whether you decide to use lists or arrays, you should look at Extlib
> (http://ocaml-lib.sourceforge.net/ and also a standard part of many
> Linux distributions). This library has two things which will help
> you: (1) tail recursive versions of the common List functions, and (2)
> a variable sized array.
Thanks for this hint. I'll have a look at the library.
> BTW, it's better to use a struct rather than a tuple, for both speed
> and memory reasons. type my_struct = {x : float; y : float; z :
> float} is one object stored in 32 bytes, whereas (x, y, z) is four
> objects stored in 64 bytes.
Ok, this is helpful to know. The tuples were what came to my mind
first. But the structs might also make the code more readable.
Cheers,
Christian
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 16:30:51 |
|
On Tue, Nov 21, 2006 at 01:13:04AM +1100, Christian Lerrahn wrote:
> Hi Richard,
[...]
> > It largely depends on whether you will need to resize the arrays,
> > since resizing for arrays is an expensive and time-consuming
> > operation. Can you use a fixed size array or preallocate an array?
>
> I can to a certain degree. The problem is that I have ghost particles
> because I'm working on a periodic space. So if I want to keep arrays
> fixed (as for example the existing Fortran 77 codes have to do), I have
> to allocate n*3 times the number of particles to be safe (where n is
> the number of dimensions).
I think over all you're probably better to follow Jon Harrop's advice
here. My 2 cents is that you can start with an array of a certain
size, and if your code exceeds that array size, reallocate the array
to be twice as large, (and so on). This is what the Extlib VarArray
basically does, but it also includes some trickery to avoid having to
initialize the unused extra elements of the array which to explain
properly would involve a rather detailed discussion of the internals
of OCaml and the garbage collector.
> > BTW, it's better to use a struct rather than a tuple, for both speed
> > and memory reasons. type my_struct = {x : float; y : float; z :
> > float} is one object stored in 32 bytes, whereas (x, y, z) is four
> > objects stored in 64 bytes.
>
> Ok, this is helpful to know. The tuples were what came to my mind
> first. But the structs might also make the code more readable.
I think it's true to say that structs will always be of equal of
greater efficiency than tuples. Note that the only real advantage of
tuples - namely easy pattern matching - is almost as easy with
structures. For example:
let x, y, _ = my_tuple;;
vs:
let { x = x; y = y } = my_struct;;
Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Internet Marketing and AdWords courses - http://merjis.com/courses - NEW!
Merjis blog - http://blog.merjis.com - NEW!
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Deal
with large lists |

|
2006-11-20 20:55:27 |
|
On Monday 20 November 2006 16:30, Richard Jones wrote:
> On Tue, Nov 21, 2006 at 01:13:04AM +1100, Christian Lerrahn wrote:
> > I can to a certain degree. The problem is that I have ghost particles
> > because I'm working on a periodic space. So if I want to keep arrays
> > fixed (as for example the existing Fortran 77 codes have to do), I have
> > to allocate n*3 times the number of particles to be safe (where n is
> > the number of dimensions).
I would not duplicate data for the ghost particles. I would make them implicit
and have your functions transparently indirect real particles to get the
effect of ghost particles. Again, this is explained in my book in the context
of finding nth-nearest neighbours in a periodic network. Have a look at the
freely available source code:
http://www.ffconsultancy.com/products/ocaml_for_scientists/complete
> > Ok, this is helpful to know. The tuples were what came to my mind
> > first. But the structs might also make the code more readable.
>
> I think it's true to say that structs will always be of equal of
> greater efficiency than tuples.
Tuples might be better optimised for argument passing, i.e. when used instead
of currying.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists
__._,_.___
.
__,_._,___
|
| "ocaml_beginners"::[] Re: Deal
with large lists |

|
2006-11-21 07:28:05 |
|
--- In ocaml_beginners%40yahoogroups.com">ocaml_beginners yahoogroups.com, Christian Lerrahn <ocaml ...>
wrote:
>
> Hi,
> I'm trying to write an SPH code in Ocaml. SPH stands for Smooth
> Particle Hydrodynamics and is a particle based code to solve the
> hydrodynamical equations. I'm still at draft stage and am currently
> wondering about how to represent the particles.
<cut>
Hi Christian,
I'm thinking that a graph data structure might me useful. That will
make it easy to find the neighbour of any given particle.
Now most graph algoritms are mostly imperative, but recently some
algoritms for functional style of programming using graphs have been
developed.
http://web.engr.oregonstate.edu/~erwig/fgl/
I do not know how it will work with regards to efficiency, since I do
not have enough experience with that, but the bookkeeping should be
easier.
Just a thought.
Cheers
Anders
__._,_.___
.
__,_._,___
|
[1-9]
|
|