From kragen@dnaco.net Mon Sep 28 10:52:47 1998
Date: Mon, 28 Sep 1998 10:52:46 -0400 (EDT)
From: Kragen <kragen@dnaco.net>
To: "'talkback@threepoint.com'" <talkback@threepoint.com>
cc: linux-news-security@threepoint.com, 
    "'tarreau@aemiaif.lip6.fr'" <tarreau@aemiaif.lip6.fr>
Subject: RE: A good reason not to un-tar files as root
In-Reply-To: <51B2270AE372D0118D3D00A024969E8D96C9F0@exchange-iss.lancs.ac.uk>
Message-ID: <Pine.SUN.3.96.980928103947.21177V-100000@picard.dnaco.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Fri, 25 Sep 1998, Bennett, Steve wrote:
> Presumably the 'right' thing to do would be to keep a list of symlinks that
> have been generated by the tar file, and to subsequently refuse to extract
> directories that contain those dirs as an element. In normal use you will get
> no symlinks like this being created at all, so the performance impact would be
> minimal. 

I do not believe that this is correct.

1. To do unlimited damage, it is sufficient to extract a file whose
pathname contains the symlink; it is not necessary to extract a
directory whose pathname contains the symlink.
2. The proposed symlink list is vulnerable to the vagaries of your
filesystem.  Some examples:
garbage(link/ -> /etc
garbage/profile

'(' is an invalid character on NTFS filesystems, as well as lots of
others Linux supports.  I think the above example will result in
overwriting /etc/profile.

Garbage/ -> /etc
garbage/profile

(case-insensitive filesystems)

garbage/ -> /etc
./garbage/profile

(.)

./garbage/ -> /etc
.//garbage/profile

(//)

garbage/ -> /etc
legit/
legit/../garbage/profile

(..)

Now, you could put all the above special cases into your tar checking
code, and close *those particular* holes.  However, any time you try to
write one piece of code that predicts what another complex piece of
code will do, and refuses to pass along inputs to that second piece of
code that it thinks will do bad things, you are very likely to have
bugs in your 'guardian' code.  (In this case, you're proposing
incorporating code into tar to predict what the filename-lookup code in
the filesystem will do.)
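
To make that concrete, here is a rough Python sketch of such a
string-based guardian.  It is only an illustration: the function name
naive_is_blocked is made up, and no real tar is claimed to work this
way.  It remembers symlink names as strings and refuses any member
whose leading path matches one of them, and it misses three of the
four spellings above.

```python
def naive_is_blocked(member_name, symlink_names):
    """Hypothetical string-based guard: refuse a member whose leading
    path is the name of a symlink extracted earlier."""
    parts = member_name.split("/")
    # Every leading prefix of the member path: "a", "a/b", ...
    prefixes = {"/".join(parts[:i]) for i in range(1, len(parts))}
    return any(link in prefixes for link in symlink_names)


links = {"garbage"}  # the archive already created garbage -> /etc

for name in ["garbage/profile",
             "./garbage/profile",
             ".//garbage/profile",
             "legit/../garbage/profile"]:
    verdict = "blocked" if naive_is_blocked(name, links) else "MISSED"
    print(name, verdict)
```

Each missed spelling could be patched with its own normalization rule,
but every such rule is another guess about what the filesystem will do.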

The thing to do is to *ask* the filename-lookup code in the filesystem:
what file does this filename go to?  That is, do lstat() or readlink()
on each element of the file path to determine if it's a symlink or
not.
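
A minimal Python sketch of that idea follows.  The function name
first_symlink_component and the extraction root argument are
assumptions for illustration, and the sketch assumes it is called just
before a member is written, after any missing parent directories have
been created.  It hands every leading piece of the path to os.lstat()
and lets the filesystem decide what '.', '//' and '..' mean.

```python
import os
import stat


def first_symlink_component(root, member_name):
    """Return the first directory component of member_name (looked up
    under root) that is a symlink on disk, or None if there is none.
    Only the directories leading to the member are checked, not the
    member itself."""
    current = root
    # Empty and "." components do not change the lookup, so skip them.
    parts = [p for p in member_name.split("/") if p not in ("", ".")]
    for part in parts[:-1]:
        current = os.path.join(current, part)
        try:
            st = os.lstat(current)   # lstat() does not follow the link
        except FileNotFoundError:
            # This component is missing, so nothing past it exists either.
            return None
        if stat.S_ISLNK(st.st_mode):
            return current
    return None
```

With garbage -> /etc and a real directory legit/ in place, this should
flag garbage/profile, ./garbage/profile, .//garbage/profile and
legit/../garbage/profile alike, because in each case the kernel ends up
looking at the same symlink.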

To keep this cheap, we could reasonably assume that the answer does not
change during extraction and lstat() each unique pathname only once,
though that assumption does open us to race conditions.
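
The once-per-pathname idea might look like the following Python sketch.
The class LstatCache and its forget() method are hypothetical names.
Missing paths are deliberately not cached, on the assumption that the
extractor itself may create them later, and the extractor is assumed to
call forget() whenever it creates or replaces a path.

```python
import os
import stat


class LstatCache:
    """Remember, per literal pathname, whether it was a symlink the
    first time we asked.  Assumption: nothing outside the extractor
    modifies the tree meanwhile; if something does, the cached answer
    can be stale (this is the race condition)."""

    def __init__(self):
        self._is_link = {}

    def is_symlink(self, path):
        if path in self._is_link:
            return self._is_link[path]
        try:
            st = os.lstat(path)
        except FileNotFoundError:
            return False  # not cached: the path may be created later
        result = stat.S_ISLNK(st.st_mode)
        self._is_link[path] = result
        return result

    def forget(self, path):
        """Drop the cached answer after the extractor changes path."""
        self._is_link.pop(path, None)
```

The cache is keyed on the pathname exactly as spelled, so two spellings
of the same file cost two lstat() calls, but each answer still comes
from the filesystem rather than from string comparison.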

3. It's possible to reject a legitimate tar file this way:
x/
y -> x
y/z
. . . but I think that's OK.
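
The false positive can be reproduced with a few lines of Python.  This
is only a sketch: has_symlink_parent is a made-up helper, and the
archive layout is rebuilt by hand in a temporary directory rather than
read from a real tar file.

```python
import os
import tempfile


def has_symlink_parent(root, member_name):
    """True if any directory component of member_name, looked up under
    root, is currently a symlink (os.path.islink uses lstat)."""
    current = root
    for part in member_name.split("/")[:-1]:
        current = os.path.join(current, part)
        if os.path.islink(current):
            return True
    return False


root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "x"))          # x/
os.symlink("x", os.path.join(root, "y"))   # y -> x
print(has_symlink_parent(root, "y/z"))     # y is a symlink, so y/z is refused
print(has_symlink_parent(root, "x/z"))     # same file, spelled without the link
```

The link here points inside the archive's own tree and is harmless, yet
a check that looks only at whether a component is a symlink cannot tell
it apart from garbage -> /etc.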

BTW, der Mouse's tar avoids a lot of these problems.

Kragen

-- 
<kragen@pobox.com>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
The sages do not believe that making no mistakes is a blessing. They believe, 
rather, that the great virtue of man lies in his ability to correct his 
mistakes and continually make a new man of himself.  -- Wang Yang-Ming


