Thread: Binary -> *ML




Binary -> *ML
user name
2006-08-25 17:30:35
> No harm compressing it?

No harm in principle of course.

The nastier issue wrt rpm is that if you
sign compressed blobs, you can never change
compression without voiding the signature.

Which is one reason why header+payload signatures need
to be phased out, as the signature is applied to an
uncompressed header and a zlib compressed payload.
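
A minimal sketch of that problem, assuming Python's zlib and hashlib as stand-ins: a SHA-256 digest plays the part of the digest that gets signed, and the payload bytes are made up. This is not rpm's signing code, only an illustration of why a digest over the compressed bytes is tied to the compressor settings.

```python
import hashlib
import zlib

# Hypothetical payload; any byte string works for the illustration.
payload = b"".join(b"file %d: some package content\n" % i for i in range(2000))

def digest(data: bytes) -> str:
    """SHA-256 hex digest, standing in for the digest that gets signed."""
    return hashlib.sha256(data).hexdigest()

# Same plaintext, two different compression settings.
fast = zlib.compress(payload, 1)
best = zlib.compress(payload, 9)

# A digest over the compressed bytes changes with the compressor settings...
compressed_digests_match = digest(fast) == digest(best)

# ...while a digest over the plaintext survives recompression.
plain_digests_match = digest(zlib.decompress(fast)) == digest(zlib.decompress(best))

print("compressed digests match:", compressed_digests_match)
print("plaintext digests match: ", plain_digests_match)
```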

And the vendor signature on *.rpm packages is "branding" for
OSS packages, since all OSS vendors are basically using
the same source code.

Meanwhile, comparing the output of --xml and --yaml on
the time package, I see that uncompressed YAML is like 60%
of the size of uncompressed XML, with the 2 existing
(totally untuned) implementations in rpm.

No matter what you are compressing, a smaller plaintext will
usually compress better than a larger one, for sufficiently
similar plaintexts.

And compressed YAML is 10% smaller than the equivalent XML compressed.

73 de Jeff
Binary -> *ML
user name
2006-08-25 19:33:36
Jeff Johnson wrote:

> The nastier issue wrt rpm is that if you
> sign compressed blobs, you can never change
> compression without voiding the signature.
> 
> Which is one reason why header+payload signatures need
> to be phased out, as the signature is applied to an
> uncompressed header and a zlib compressed payload.

The concept of signing is to show that there were no changes, to
compression action or otherwise.  If the payload becomes compressed some
other way by the originator then the signature applied at creation-time
will reflect that.  There is surely no scenario where someone other than
the originator decides to recompress the content of an rpm on another
algorithm but legitimately feels the original sig should still work.

Still: since yum has the concept of having the headers separate from the
payload, the best way would seem to be to sign the header and the
payload separately.  AIUI unless there is some better check at the
moment via yum I could exploit a buffer overflow in rpmlib header
processing by perverting a repo without worrying about signatures.

> And the vendor signature on *.rpm packages is "branding" for
> OSS packages, since all OSS vendors are basically using
> the same source code.

Well no, because, say, Fedora is using repo mirrors.  If a mirror is
compromised, and in truth the security level of the individual mirrors
is unknown, then without originator sigs there would be no way to detect
and reject that a package had been jerked to have a %post of rm -rf /.

> Meanwhile, comparing the output of --xml and --yaml on
> the time package, I see that uncompressed YAML is like 60%
> of the size of uncompressed XML, with the 2 existing
> (totally untuned) implementations in rpm.

Well I am thinking about package bloat, I am agnostic if the sword comes
down on XML or YAML.

> No matter what you are compressing, a smaller plaintext will
> usually compress better than a larger one, for sufficiently
> similar plaintexts.

But surely that is not true, you will get better compression on War and
Peace than on a few random sentences of it:

http://www.gutenberg.org/dirs/etext01/wrnpc11.txt

$ ll wrnpc11.txt
-rw-r--r-- 1 agreen agreen 3284807 Aug 25 20:24 wrnpc11.txt
$ gzip -9 wrnpc11.txt
$ ll wrnpc11.txt.gz
-rw-r--r-- 1 agreen agreen 1209808 Aug 25 20:24 wrnpc11.txt.gz

compressed = 36% of original

First 33 lines of content from Gutenberg version -->

$ ll warsample.txt
-rw-r--r-- 1 agreen agreen 1863 Aug 25 20:26 warsample.txt
$ gzip warsample.txt
$ ll warsample.txt.gz
-rw-r--r-- 1 agreen agreen 1056 Aug 25 20:26 warsample.txt.gz

compressed = 56% of original
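
The same effect can be reproduced without the Gutenberg file. The sketch below is only an approximation of the test above: it assumes Python's zlib at level 9 in place of gzip -9, and the "novel" is pseudo-text drawn from a small made-up vocabulary, so the percentages it prints will not match the ones quoted here. What it shows is the ratio (compressed size over original size) for a whole text against a short excerpt of it.

```python
import random
import zlib

# Hypothetical stand-in for the novel: pseudo-text drawn from a small vocabulary.
WORDS = ("war peace prince army winter moscow count letter horse battle "
         "emperor soldier evening daughter general river smoke field").split()
rng = random.Random(0)  # fixed seed so the run is repeatable
whole = " ".join(rng.choice(WORDS) for _ in range(50_000)).encode()
sample = whole[:2000]  # a short excerpt, like the first 33 lines

def ratio(data: bytes) -> float:
    """Compressed size as a fraction of the original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

print(f"whole : {len(whole):>7} bytes -> {ratio(whole):.0%} of original")
print(f"sample: {len(sample):>7} bytes -> {ratio(sample):.0%} of original")
```

Note this measures the ratio, not the absolute compressed size; the excerpt is still far smaller in bytes once compressed.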

-Andy
_______________________________________________
Rpm-devel mailing list
Rpm-devellists.dulug.duke.edu
https://lists.dulug.duke.edu/mailman/listinfo/rpm-devel
Binary -> *ML
user name
2006-08-25 19:37:32
On Friday, 25 August 2006, at 20:33:36 (+0100),
Andy Green wrote:

> >No matter what you are compressing, a smaller plaintext will
> >usually compress better than a larger one, for sufficiently
> >similar plaintexts.                         ^^^^^^^^^^^^^^^^
>  ^^^^^^^^^^^^^^^^^^
> But surely that is not true, you will get better compression on War
> and Peace than on a few random sentences of it:

I think you missed a bit of that.

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <mejkainx.org>
n + 1, Inc., http://www.nplus1.net/       Author, Eterm (www.eterm.org)
-----------------------------------------------------------------------
 "There are so many things that are incredible about me.  The most
  amazing is my humility."                              -- Will Smith
Binary -> *ML
user name
2006-08-25 19:44:16
On Fri, 2006-08-25 at 20:33 +0100, Andy Green wrote:
> Jeff Johnson wrote:

> The concept of signing is to show that there were no changes, to
> compression action or otherwise.  If the payload becomes compressed some
> other way by the originator then the signature applied at creation-time
> will reflect that.  There is surely no scenario where someone other than
> the originator decides to recompress the content of an rpm on another
> algorithm but legitimately feels the original sig should still work.
>
> Still: since yum has the concept of having the headers separate from the
> payload, the best way would seem to be to sign the header and the
> payload separately.  AIUI unless there is some better check at the
> moment via yum I could exploit a buffer overflow in rpmlib header
> processing by perverting a repo without worrying about signatures.


yum doesn't split the headers out anymore. It hasn't for a couple of
years, now.
It does read the headers from the beginning of the rpms using http
byte-ranges but it doesn't rely on anything in that header for trust.
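
For anyone unfamiliar with byte-ranges, here is a rough sketch of such a request using Python's urllib. It is not yum's or urlgrabber's code; the URL is hypothetical and the 4096-byte figure is an arbitrary guess, since the length of an rpm header varies per package. The request is only built here, not sent.

```python
import urllib.request

def header_range_request(url: str, nbytes: int = 4096) -> urllib.request.Request:
    """Build (but do not send) a GET that asks only for the first nbytes of a file."""
    req = urllib.request.Request(url)
    # HTTP byte ranges are inclusive, so 0..nbytes-1 covers the first nbytes bytes.
    req.add_header("Range", f"bytes=0-{nbytes - 1}")
    return req

# Hypothetical URL, for illustration only.
req = header_range_request("http://mirror.example.org/repo/time.rpm")
print(req.get_header("Range"))
```

A server that honors the range answers with 206 Partial Content and only those bytes; one that does not will send the whole file with a 200.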


-sv


Binary -> *ML
user name
2006-08-25 19:46:02
seth vidal wrote:

> It does read the headers from the beginning of the rpms using http
> byte-ranges but it doesn't rely on anything in that header for trust.

Yeah but it does trust it not to exploit any buffer overflow, sans
signature check, Python being in the way excepted.

-Andy
Binary -> *ML
user name
2006-08-25 19:48:30
Michael Jennings wrote:
> On Friday, 25 August 2006, at 20:33:36 (+0100),
> Andy Green wrote:
> 
>>> No matter what you are compressing, a smaller plaintext will
>>> usually compress better than a larger one, for sufficiently
>>> similar plaintexts.                         ^^^^^^^^^^^^^^^^
>>  ^^^^^^^^^^^^^^^^^^
>> But surely that is not true, you will get better compression on War
>> and Peace than on a few random sentences of it:
> 
> I think you missed a bit of that.

I didn't miss it, but quite possibly I am not evolved enough to
understand it.  Are the first 33 lines of War and Peace not
representative enough to make the comparison to the whole work?

He he I just scraped by with sang froid intact on this one:

$ dd if=wrnpc11.txt of=warhalf.txt bs=1000 count=1600
1600+0 records in
1600+0 records out
1600000 bytes (1.6 MB) copied, 0.0395766 seconds, 40.4 MB/s
$ ll warhalf.txt
-rw-r--r-- 1 agreen agreen 1600000 Aug 25 20:43 warhalf.txt
$ gzip -9 warhalf.txt
$ ll warhalf.txt.gz
-rw-r--r-- 1 agreen agreen 592500 Aug 25 20:43 warhalf.txt.gz

37% of the original is still worse than 36% of the original.

-Andy
Binary -> *ML
user name
2006-08-25 20:08:15
On Aug 25, 2006, at 3:33 PM, Andy Green wrote:

> Jeff Johnson wrote:
>
>> The nastier issue wrt rpm is that if you
>> sign compressed blobs, you can never change
>> compression without voiding the signature.
>> Which is one reason why header+payload signatures need
>> to be phased out, as the signature is applied to an
>> uncompressed header and a zlib compressed payload.
>
> The concept of signing is to show that there were no changes, to
> compression action or otherwise.  If the payload becomes compressed
> some other way by the originator then the signature applied at
> creation-time will reflect that.  There is surely no scenario where
> someone other than the originator decides to recompress the content
> of an rpm on another algorithm but legitimately feels the original
> sig should still work.
>
> Still: since yum has the concept of having the headers separate
> from the payload, the best way would seem to be to sign the header
> and the payload separately.  AIUI unless there is some better check
> at the moment via yum I could exploit a buffer overflow in rpmlib
> header processing by perverting a repo without worrying about
> signatures.
>

Yep, header-only signatures were implemented in rpm-4.1, and all RH
packages since RHL 7.3 have had both header-only and header+payload
signatures.

And there is a class of possible exploits that can be prevented by
signing compressed rather than compressing signed plaintext. I'll leave
it to security experts to figure the risk factor, but I do know that
users often want to change the compression *AND* preserve the "branding"
signature on packages. So compressing signed plaintext is what's needed
to permit that functionality.

Phasing out header+payload signatures is the core issue, as certain
vendors -- IBM, Sun, HP -- are still producing packages with rpm-3.0.x.

No matter what, compressing the signed header, rather than signing
the compressed header, is preferred for existing *.rpm package use.

What remains to do in rpm is to compress the signed header.

>> And the vendor signature on *.rpm packages is "branding" for
>> OSS packages, since all OSS vendors are basically using
>> the same source code.
>
> Well no, because, say, Fedora is using repo mirrors.  If a mirror
> is compromised, and in truth the security level of the individual
> mirrors is unknown, then without originator sigs there would be no
> way to detect and reject that a package had been jerked to have a
> %post of rm -rf /.
>
>> Meanwhile, comparing the output of --xml and --yaml on
>> the time package, I see that uncompressed YAML is like 60%
>> of the size of uncompressed XML, with the 2 existing
>> (totally untuned) implementations in rpm.
>
> Well I am thinking about package bloat, I am agnostic if the sword
> comes down on XML or YAML.
>
>> No matter what you are compressing, a smaller plaintext will
>> usually compress better than a larger one, for sufficiently
>> similar plaintexts.
>
> But surely that is not true, you will get better compression on War
> and Peace than on a few random sentences of it:
>
> http://www.gutenberg.org/dirs/etext01/wrnpc11.txt
>
> $ ll wrnpc11.txt
> -rw-r--r-- 1 agreen agreen 3284807 Aug 25 20:24 wrnpc11.txt
> $ gzip -9 wrnpc11.txt
> $ ll wrnpc11.txt.gz
> -rw-r--r-- 1 agreen agreen 1209808 Aug 25 20:24 wrnpc11.txt.gz
>
> compressed = 36% of original
>
> First 33 lines of content from Gutenberg version -->
>
> $ ll warsample.txt
> -rw-r--r-- 1 agreen agreen 1863 Aug 25 20:26 warsample.txt
> $ gzip warsample.txt
> $ ll warsample.txt.gz
> -rw-r--r-- 1 agreen agreen 1056 Aug 25 20:26 warsample.txt.gz
>
> compressed = 56% of original
>

I was speaking generally, and in absolute terms, like compressing 2
copies of the same plaintext will be larger (generally) than
compressing 1 copy.

But yes, there's lots and lots more to compression and entropy and
whatever that I'm not doing justice to.

My test was simply
     rpm -q --yaml time | gzip -9 > yaml
     rpm -q --xml time | gzip -9 > xml
and comparing the file sizes.

Not even close to rocket science, nor does it have any value other
than a rude and crude estimate of costs of choosing XML or YAML as markup.
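
For anyone without rpm handy, the same sort of crude estimate can be sketched in Python. Everything here is an assumption for illustration: the file list is made up, the two serializers are hand-rolled and are not rpm's --xml or --yaml output, and zlib level 9 stands in for gzip -9. The numbers it prints say nothing about the time package.

```python
import zlib

# Made-up file list; the real test used `rpm -q --yaml time` and `rpm -q --xml time`.
files = [(f"/usr/share/doc/time/file{i}.txt", 1000 + 37 * i) for i in range(200)]

# Hand-rolled serializers for illustration only, not rpm's actual output formats.
xml_text = "<files>\n" + "".join(
    f"  <file><name>{name}</name><size>{size}</size></file>\n" for name, size in files
) + "</files>\n"
yaml_text = "files:\n" + "".join(
    f"- name: {name}\n  size: {size}\n" for name, size in files
)

def gz9(text: str) -> int:
    """Size in bytes after zlib compression at level 9 (comparable to gzip -9)."""
    return len(zlib.compress(text.encode(), 9))

for label, text in (("xml", xml_text), ("yaml", yaml_text)):
    print(f"{label:>4}: plain {len(text):>6} bytes, compressed {gz9(text):>5} bytes")
```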

73 de Jeff

Binary -> *ML
user name
2006-08-25 20:13:35
On Aug 25, 2006, at 3:44 PM, seth vidal wrote:

> On Fri, 2006-08-25 at 20:33 +0100, Andy Green wrote:
>> Jeff Johnson wrote:
>
>> The concept of signing is to show that there were no changes, to
>> compression action or otherwise.  If the payload becomes compressed some
>> other way by the originator then the signature applied at creation-time
>> will reflect that.  There is surely no scenario where someone other than
>> the originator decides to recompress the content of an rpm on another
>> algorithm but legitimately feels the original sig should still work.
>>
>> Still: since yum has the concept of having the headers separate from the
>> payload, the best way would seem to be to sign the header and the
>> payload separately.  AIUI unless there is some better check at the
>> moment via yum I could exploit a buffer overflow in rpmlib header
>> processing by perverting a repo without worrying about signatures.
>
>
> yum doesn't split the headers out anymore. It hasn't for a couple of
> years, now.
> It does read the headers from the beginning of the rpms using http
> byte-ranges but it doesn't rely on anything in that header for trust.
>

In all fairness, yum transport is hopeless.

First the metadata for the repo is downloaded.

Then -- for packages chosen -- headers are downloaded, containing
a great deal of redundant information with the original rpm-metadata
download.

Finally the package itself is downloaded which contains exactly
the same header as previously downloaded.

That's a lot of unnecessary downloading afaict.

And yes, the reason for this bogosity is that yum *must* pass
header in ts.addInstall().

So go ahead and blame rpm for your yum woes, yum transport
is poorly designed no matter how sexy http with byte ranges using
urlgrabber is.

73 de Jeff

Binary -> *ML
user name
2006-08-25 20:16:27
On Friday, 25 August 2006, at 16:08:15 (-0400),
Jeff Johnson wrote:

> No matter what, compressing the signed header, rather than signing
> the compressed header, is preferred for existing *.rpm package use.
> 
> What remains to do in rpm is to compress the signed header.

I'm no security expert either, but it seems pretty simple to me:

The data of value is the uncompressed header and the uncompressed
payload.  As long as those are unchanged, the package contents and
their functionality are also unchanged.  Signing header or signing
header+payload, both uncompressed, makes sense because you're
protecting the data they contain.  Signing compressed data is merely
signing an alternative representation of the data, not the data
itself, and gains nothing in terms of the security of the data.
Signing compressed data guarantees the security of the compression
only.

The only advantage of signing the compressed data is that you
guarantee the compressed data is unaltered.  As the data in question
is only useful in its uncompressed form, this seems pointless.

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <mejkainx.org>
n + 1, Inc., http://www.nplus1.net/       Author, Eterm (www.eterm.org)
-----------------------------------------------------------------------
 "From lost and not found, to run and not hide, my hand inside Your
  hand.  Losing my grip, falling so far, my hand inside Your hand."
                                               -- Jars of Clay, "Hand"
Binary -> *ML
user name
2006-08-25 20:21:09
Michael Jennings wrote:

> The only advantage of signing the compressed data is that you
> guarantee the compressed data is unaltered.  As the data in question
> is only useful in its uncompressed form, this seems pointless.

It's not pointless if the compressed data has been messed with to
exploit a buffer overflow in the decompression code that you execute in
order to determine that you just got hacked.
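
A minimal sketch of that ordering, with loud assumptions: an HMAC with a made-up shared key stands in for a real public-key signature (rpm does not work this way), and zlib stands in for the payload compressor. The only point is that the check covers the compressed bytes and runs before any of them reach the decompressor.

```python
import hashlib
import hmac
import zlib

KEY = b"demo-key"  # hypothetical shared secret; rpm uses public-key signatures instead

def seal(plaintext: bytes) -> tuple:
    """Compress first, then authenticate the compressed bytes."""
    blob = zlib.compress(plaintext, 9)
    tag = hmac.new(KEY, blob, hashlib.sha256).digest()
    return blob, tag

def open_sealed(blob: bytes, tag: bytes) -> bytes:
    """Check the tag before any byte reaches the decompressor."""
    expected = hmac.new(KEY, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("compressed blob was altered; refusing to decompress")
    return zlib.decompress(blob)

blob, tag = seal(b"payload contents " * 100)
tampered = bytes([blob[0] ^ 0xFF]) + blob[1:]  # flip bits in the first byte

try:
    open_sealed(tampered, tag)
    rejected = False
except ValueError:
    rejected = True

print("tampered blob rejected before decompression:", rejected)
```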

-Andy