Jonathan Feally wrote:
> Consider the following tables:
>
> CREATE TABLE `dbmail_messageblks` (
> `messageblk_idnr` bigint(21) NOT NULL auto_increment,
> `physmessage_id` bigint(21) NOT NULL default '0',
> `messagepart_order` int NOT NULL,
> `messagepart_idnr` bigint(21) NOT NULL,
> `mime_header` tinyint(1) NOT NULL DEFAULT '0',
> PRIMARY KEY (`messageblk_idnr`),
> UNIQUE KEY `messageblk_idnr_message_part`
> (`messageblk_idnr`,`messagepart_idnr`),
> KEY `physmessage_id_index` (`physmessage_id`),
> KEY `messagepart_id_index` (`messagepart_idnr`),
> CONSTRAINT `dbmail_messageblks_ibfk_1` FOREIGN KEY (`physmessage_id`)
> REFERENCES `dbmail_physmessage` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> CREATE TABLE `dbmail_messageparts` (
> `messagepart_idnr` bigint(20) NOT NULL auto_increment,
> `message` longblob,
> `size` bigint(20) NOT NULL,
> `hash` varchar(64) NOT NULL,
> PRIMARY KEY (`messagepart_idnr`),
> UNIQUE KEY `hash_size` (`hash`,`size`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> and a message constructed as follows:
>
> ->Message
>   -Non Mime Message or Mime Warning "This is a multi-part message in MIME format."
>   ->Plain Text
>   ->HTML
>   ->JPEG
>   ->Attached Message 1
>     -Non Mime Message or Mime Warning "This is a multi-part message in MIME format."
>     ->Plain Text
>     ->HTML
>     ->MPEG
>   ->Attached Message 2
>     -Non Mime Message or Mime Warning "This is a multi-part message in MIME format."
>     ->Plain Text
>     ->HTML
>     ->GIF
>
> would become these message parts:
>
> begin
> message_part=0 Root Message Headers
> message_part=1 Root Non-Mime Message
> message_part=2 Boundary and headers ("Plain Text")
> message_part=3 Body ("The Plain Text")
> message_part=4 Boundary and headers ("HTML")
> message_part=5 Body ("The HTML")
> message_part=6 Boundary and headers ("JPEG Attachment")
> message_part=7 Body ("Base64 of the JPEG")
> message_part=8 Boundary and headers ("Attached Message 1")
> message_part=9 Attached message 1 Headers
> message_part=10 Attached message 1 Non-Mime
> message_part=11 Attached message 1 Boundary and headers ("Plain Text")
> message_part=12 Attached message 1 Body ("Plain Text")
> message_part=13 Attached message 1 Boundary and headers ("HTML")
> message_part=14 Attached message 1 Body ("HTML")
> message_part=15 Attached message 1 Boundary and headers ("MPEG")
> message_part=16 Attached message 1 Body ("MPEG")
> message_part=17 Attached message 1 Closing boundary
> message_part=18 Boundary and headers ("Attached Message 2")
> message_part=19 Attached message 2 Headers
> message_part=20 Attached message 2 Non-Mime
> message_part=21 Attached message 2 Boundary and headers ("Plain Text")
> message_part=22 Attached message 2 Body ("Plain Text")
> message_part=23 Attached message 2 Boundary and headers ("HTML")
> message_part=24 Attached message 2 Body ("HTML")
> message_part=25 Attached message 2 Boundary and headers ("GIF")
> message_part=26 Attached message 2 Body ("GIF")
> message_part=27 Attached message 2 Closing boundary
> message_part=28 Root Message Closing boundary
> end
>
> While a simple non-mime message would be
> begin
> message_part=0 Message Headers
> message_part=1 Message Body
> end
>
>
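To make the numbering above concrete, here is a rough Python sketch of how a message might be cut into ordered chunks. It is not dbmail code: the function name `split_message` is made up for illustration, it assumes bare "\n" line endings, and it only handles a single level of multipart (an attached message would need the same routine applied recursively to its body). It also treats the line break before a boundary as part of the preceding body, which is looser than the MIME rules.

```python
import re


def split_message(raw):
    """Split a message into ordered chunks: headers, non-MIME preamble,
    then (boundary + part headers, part body) pairs, then the closing
    boundary. Hypothetical helper, single-level multipart only."""
    headers, sep, body = raw.partition("\n\n")
    parts = [headers + sep]  # message_part=0: the message headers
    match = re.search(r'boundary="?([^";\s]+)"?', headers, re.IGNORECASE)
    if match is None:
        parts.append(body)  # simple non-MIME message: part 1 is the body
        return parts
    delim = "--" + match.group(1)
    chunk = ""
    in_headers = False
    for line in body.splitlines(keepends=True):
        stripped = line.rstrip("\r\n")
        if stripped == delim:
            parts.append(chunk)  # preamble, or the previous part's body
            chunk = line
            in_headers = True
        elif stripped == delim + "--":
            parts.append(chunk)  # body of the last part
            chunk = line  # closing boundary (plus any epilogue)
            in_headers = False
        elif in_headers and stripped == "":
            parts.append(chunk + line)  # boundary line + part headers
            chunk = ""
            in_headers = False
        else:
            chunk += line
    parts.append(chunk)
    return parts


raw = (
    "From: a@example.org\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "This is a multi-part message in MIME format.\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "plain body\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>html body</p>\n"
    "--XYZ--\n"
)
for order, part in enumerate(split_message(raw)):
    print("message_part=%d %r" % (order, part))
```

Joining the chunks in order gives back the original message byte for byte, which is the property the `messagepart_order` column would have to preserve.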
> I propose using SHA-256, since its 64-character hex digest is less likely
> to collide than SHA-1's 40 characters, and hashing only the first 8 or
> 16 MB. I can't think of any real data files that would have the same size
> and the same SHA-256 over the first 8 MB. The limit could also be
> configurable in dbmail.conf, set to whatever size the admin chooses. It
> could be changed later, but that would require shutting dbmail down and
> running dbmail-util to update every hash in the database to use the new
> value.
>
So if I add one line at the end of a Word doc, it will be considered the
same file unless the file format includes something unique within the first
8 MB of the file?
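
As a rough check of that question, here is a small Python sketch. It assumes the lookup would use both columns of the `hash_size` unique key from the quoted schema, i.e. the SHA-256 of the first N bytes together with the total size. The names `part_key` and `PREFIX_BYTES` are made up for illustration, and the demo uses a 32-byte prefix instead of 8 MB so the effect shows up with tiny inputs.

```python
import hashlib

# Hypothetical setting; the proposal suggests 8 or 16 MB.
PREFIX_BYTES = 8 * 1024 * 1024


def part_key(data, prefix_bytes=PREFIX_BYTES):
    """Return (hex SHA-256 of the first prefix_bytes, total size)."""
    digest = hashlib.sha256(data[:prefix_bytes]).hexdigest()
    return digest, len(data)


original = b"A" * 32 + b"tail-1"
appended = original + b"\n"        # one extra line at the end
same_size = b"A" * 32 + b"tail-2"  # same length, differs only past the prefix

k_orig = part_key(original, 32)
k_app = part_key(appended, 32)
k_same = part_key(same_size, 32)

# Appending changes the size, so the (hash, size) pair differs even though
# the prefix hash is identical.
print(k_orig[0] == k_app[0], k_orig == k_app)
# A same-size change beyond the prefix yields an identical pair.
print(k_orig == k_same)
```

Under that assumption, an appended line is told apart by the size alone, while an edit of the same length that falls entirely after the hashed prefix would not be.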