List Info

Thread: file grouping answer.




file grouping answer.
user name
2006-03-29 12:51:32

I was going to ask:-

I've got a load of files I want to efficently sort into
groups of 4.2Gb(1)
for burning to DVD.  How do I go about doing this?

I did some digging and found the answer and thought I'd
save time by
posting the answer as well.

The hard part was finding out what the problem was called I
guessed it was
going to be some varient on the 'Travelling salesman
problem' (2) which
ment that any any home rolled solution would be to say the
least sub
optimal....

I started out by Googling on algorithm group, algorithm sort
and perl
algorithm but came up with little of use, finally I tried
just algorithm
which took me to the wikipedia Algorithm page (3).  I
flicked through it
and finally noticed a reference to Approximation algorithms
(4).  The
Travelling salesman is NP-Complete (5), however Nicholas
Clark YAPC::Braga
talk 'When Perl is not quite fast enough' made the point
that even if a
problem is NP-complete there are faster approximation
algorithms which
will get a nearly optimal solution (6).  I was in luck it
mentioned the
'bin packing problem' (7)

Search CPAN on bin packing got me almost straight to:-

Algorithm::BinPack and via BinPack to Algorithm::Bucketizer
(which called
in it bin-packing and was hence missed in future I think I
may just google
site:http://search.cpan.org)

(1) when a disk starts to fail IME it often starts from the
outer edge,
    not filling it provides a buffer zone
(2) <
http://en.wikipedia.org/wiki/Travelling_salesman>
(3) <http://en.
wikipedia.org/wiki/Algorithm>
(4) <http://en.wikipedia.org/wiki/Approximation_algorithms&g
t;
(5) ie the time to get an exact answer  increases
exponentaily with the
    number of variables
(6) if your wavering about going to YAPC::Birmingham this is
proof it will
    save you time! <http://w
ww.ccl4.org/~nick/P/Fast_Enough/>
(7) <
http://en.wikipedia.org/wiki/Bin_packing_problem>


--
Michael

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~
Michael John Lush PhD			Tel:44-20-7679-5027
Nomenclature Bioinformatician		Fax:44-20-7387-3496
HUGO Gene Nomenclature Committee	Email:  nomegalton.ucl.ac.uk
The Galton Laboratory
University College bond, UK
URL: http://www.ge
ne.ucl.ac.uk/nomenclature/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~
[1]

about | contact  Other archives ( Real Estate discussion Medical topics )