EUPS distrib daily tags (d_YYYY_MM_DD) will now be retired after 30 days

When the d_YYYY_MM_DD eups distrib tags were introduced last year, they were never intended to be long-lived, as there is no corresponding git tag. While they have been manually cleaned up once or twice, they had accumulated to the point that `eups distrib install ...` was spending an irritating amount of time downloading tag files serially. As of today, daily tag cleanup has been automated: tags are now retired by being moved into a sub-directory named old_tags. Cleanup happens across all eups.lsst.codes hosted eups package roots, including the “tarballs”.

E.g.: https://eups.lsst.codes/stack/src/tags/old_tags/
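
For illustration only, here is a minimal sketch of what such an automated cleanup might look like. It assumes the tag list files live in a directory like stack/src/tags and are named d_YYYY_MM_DD.list; the actual cleanup tooling and path layout may differ.

```python
"""Retire daily eups distrib tag files older than 30 days.

Hypothetical sketch; the tags directory path and .list naming are assumptions.
"""
import os
import re
import shutil
from datetime import datetime, timedelta

TAG_DIR = "/var/www/eups/stack/src/tags"        # assumed package-root tags directory
OLD_DIR = os.path.join(TAG_DIR, "old_tags")     # retired tags are moved, not deleted
MAX_AGE = timedelta(days=30)
DAILY_RE = re.compile(r"^d_(\d{4})_(\d{2})_(\d{2})\.list$")


def retire_old_daily_tags(tag_dir=TAG_DIR, old_dir=OLD_DIR, now=None):
    now = now or datetime.utcnow()
    os.makedirs(old_dir, exist_ok=True)
    for name in os.listdir(tag_dir):
        m = DAILY_RE.match(name)
        if not m:
            continue  # leave weeklies, releases, and anything else untouched
        tag_date = datetime(*(int(g) for g in m.groups()))
        if now - tag_date > MAX_AGE:
            shutil.move(os.path.join(tag_dir, name), os.path.join(old_dir, name))


if __name__ == "__main__":
    retire_old_daily_tags()
```

Moving the files into old_tags rather than deleting them means a retired daily tag can still be recovered by hand if something turns out to depend on it.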

That’s because of the robots.txt file. Use -U to avoid this (as I’ve said before).

This is the first I’ve heard of a robots.txt in relation to eups distrib, but there is not, nor has there ever been, a robots.txt on eups.lsst.codes.

$ curl -I  https://eups.lsst.codes/robots.txt
HTTP/1.1 404 Not Found
Server: nginx/1.13.3
Date: Wed, 02 May 2018 17:15:04 GMT
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive

$ curl -I  https://eups.lsst.codes/stack/robots.txt
HTTP/1.1 404 Not Found
Server: nginx/1.13.3
Date: Wed, 02 May 2018 17:15:08 GMT
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive

$ curl -I  https://eups.lsst.codes/stack/src/robots.txt
HTTP/1.1 404 Not Found
Server: nginx/1.13.3
Date: Wed, 02 May 2018 17:15:11 GMT
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive

$ curl -I  https://eups.lsst.codes/stack/src/tags/robots.txt
HTTP/1.1 404 Not Found
Server: nginx/1.13.3
Date: Wed, 02 May 2018 17:15:14 GMT
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive

To quote from #dm-square on 2017-06-21:

Robert Lupton [1:48 PM]
Who can remove/fix the robots.txt file on `https://sw.lsstcorp.org`?

…

Joshua Hoblitt [1:57 PM]
@rhl that is ncsa infrastructure

So the problem was brought to square’s attention then.

I don’t recall that discussion. Looking at the history, there is no problem description or mention of -U. Could you describe what problem -U and/or removal of robots.txt is supposed to resolve?

It was also made clear on 2017-06-21 that there is no robots.txt on eups.lsst.codes:

Joshua Hoblitt [10:36 AM]
@rhl  I don't have administrative control of https://sw.lsstcorp.org/.  I want to make https://eups.lsst.codes canonical but it looks like I failed to open an RFC on that.

there is no robots.txt on eups.lsst.codes

I think the problem is downloading a large number of tag list files, possibly repeatedly, during an installation. If the downloads were performed by a mechanism that respected the robots.txt Crawl-delay directive, and such a directive were present, they could be very slow. However, it appears that the current code does not respect this directive, and the file does not exist anyway.
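
To make that failure mode concrete, here is a rough sketch (not the actual eups distrib code) of how a serial tag-list fetcher that honoured Crawl-delay would behave. The package root URL is taken from the curl checks above; the tag file names and everything else are assumptions.

```python
"""Illustrative only: a serial tag-list fetcher that honours Crawl-delay.

With N tag files and a Crawl-delay of D seconds, fetching them one at a
time would take at least N * D seconds, which is how a robots.txt could in
principle make an install very slow. In practice the file does not exist
on eups.lsst.codes, and eups does not appear to consult it anyway.
"""
import time
import urllib.request
import urllib.robotparser

BASE = "https://eups.lsst.codes/stack/src"           # package root from above
TAGS = ["d_2018_04_30.list", "d_2018_05_01.list"]    # hypothetical tag files

rp = urllib.robotparser.RobotFileParser("https://eups.lsst.codes/robots.txt")
rp.read()                          # a 404 here is treated as "no restrictions"
delay = rp.crawl_delay("*") or 0   # Crawl-delay in seconds, None if absent

for name in TAGS:
    with urllib.request.urlopen(f"{BASE}/tags/{name}") as resp:
        resp.read()
    time.sleep(delay)              # the per-request penalty, if any
```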

-U avoids the problem by not attempting to apply the tags in the first place and thus not needing to download any tag list files.