Thread: generate only pages changed since last static site build?
|
|
| generate only pages changed since last static site build? |

|
2007-08-11 16:50:02 |
Forresters,
Is it possible to generate only the pages in a Forrest site whose
corresponding source files have been changed since the last static
site build? Or at least prevent updating the last modification time
on generated, static pages that have not been changed since the last
static site build?
I suppose this might be more of a Cocoon question than Forrest. In
any case, the practice of generating every page, regardless of whether
it has been changed, causes problems for publishing the static site
build because all the local files have a newer modification time than
the remote files on the Web host's FTP server.
I am using lftp's reverse mirror function to synchronise the local
site build with the public Web site. I have to tell lftp to ignore
file timestamps when determining which local files need to be
uploaded. If I do not, lftp will unnecessarily upload all local files
because of the newer local timestamps.
Having lftp ignore file timestamps works until a local file has been
changed without changing the file size. In this case, lftp does not
check whether the local file has been changed. Instead, lftp skips
uploading this local file because it has the same size as the remote
file.
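The same-size blind spot can be illustrated with a short sketch. Python is used purely for illustration, the function names are hypothetical, and the comparison is between two local files rather than against an FTP server, so this is not how lftp itself works.

```python
import hashlib
import os
import tempfile


def same_size(path_a, path_b):
    """Size-only comparison: the check that remains once timestamps are ignored."""
    return os.path.getsize(path_a) == os.path.getsize(path_b)


def same_content(path_a, path_b):
    """Compare SHA-256 digests, which differ whenever the bytes differ."""
    def digest(path):
        with open(path, "rb") as handle:
            return hashlib.sha256(handle.read()).hexdigest()
    return digest(path_a) == digest(path_b)


def demo():
    """Write two different pages of equal length and compare them both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        old_page = os.path.join(tmp, "old.html")
        new_page = os.path.join(tmp, "new.html")
        with open(old_page, "w") as handle:
            handle.write("<p>Open 9 to 5</p>")
        with open(new_page, "w") as handle:
            handle.write("<p>Open 8 to 4</p>")
        return same_size(old_page, new_page), same_content(old_page, new_page)
```

Calling `demo()` reports that the sizes match while the contents differ, which is exactly the case a size-only check lets through.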
If the last modification time was changed only for pages that have
been changed, this would permit lftp to use the local timestamps to
determine which local files need to be uploaded. This also means that
the same file size problem is avoided.
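One way to approximate this outside Forrest is sketched below. It is not a Forrest or Cocoon feature. It assumes a copy of the previous build is kept in a separate directory (a hypothetical layout), and it restores the old modification time on every regenerated file whose bytes are unchanged. The function name is made up for this example.

```python
import filecmp
import os


def restore_unchanged_mtimes(previous_dir, current_dir):
    """Give unchanged files in current_dir the mtime of their previous copy.

    Returns the sorted relative paths whose timestamps were restored.
    """
    restored = []
    for root, _dirs, files in os.walk(current_dir):
        for name in files:
            current = os.path.join(root, name)
            relative = os.path.relpath(current, current_dir)
            previous = os.path.join(previous_dir, relative)
            if not os.path.isfile(previous):
                continue  # new page: keep its fresh timestamp
            # shallow=False forces a comparison of the file contents
            if filecmp.cmp(previous, current, shallow=False):
                info = os.stat(previous)
                os.utime(current, (info.st_atime, info.st_mtime))
                restored.append(relative)
    return sorted(restored)
```

After such a pass, only pages whose content really changed carry a new timestamp, so a timestamp-based upload would pick up just those.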
Thanks,
Brolin
|
|
| Re: generate only pages changed since last static site build? |

|
2007-08-12 07:17:28 |
On 8/11/07, Brolin Empey <brolin.empey gmail.com> wrote:
> Forresters,
>
> Is it possible to generate only the pages in a Forrest site whose
> corresponding source files have been changed since the last static
> site build? Or at least prevent updating the last modification time
> on generated, static pages that have not been changed since the last
> static site build?
>
> I suppose this might be more of a Cocoon question than Forrest. In
> any case, the practice of generating every page, regardless of whether
> it has been changed, causes problems for publishing the static site
> build because all the local files have a newer modification time than
> the remote files on the Web host's FTP server.
>
> I am using lftp's reverse mirror function to synchronise the local
> site build with the public Web site. I have to tell lftp to ignore
> file timestamps when determining which local files need to be
> uploaded. If I do not, lftp will unnecessarily upload all local files
> because of the newer local timestamps.
>
> Having lftp ignore file timestamps works until a local file has been
> changed without changing the file size. In this case, lftp does not
> check whether the local file has been changed. Instead, lftp skips
> uploading this local file because it has the same size as the remote
> file.
>
> If the last modification time was changed only for pages that have
> been changed, this would permit lftp to use the local timestamps to
> determine which local files need to be uploaded. This also means that
> the same file size problem is avoided.
>
> Thanks,
> Brolin
>
Hi Brolin,
This doesn't answer the question you've asked but it may help with
only uploading the changed stuff. Unfortunately, we don't have a way
to only generate files that have changed right now.
http://forrest.apache.org/docs_0_80/faq.html#checksums
--tim
|
|
| Re: generate only pages changed since last static site build? |

|
2007-08-12 08:12:05 |
On Sat, 2007-08-11 at 14:50 -0700, Brolin Empey wrote:
> I am using lftp's reverse mirror function to synchronise the local
> site build with the public Web site. I have to tell lftp to ignore
> file timestamps when determining which local files need to be
> uploaded. If I do not, lftp will unnecessarily upload all local files
> because of the newer local timestamps.
I use rsync to upload local files. Although it also recognizes all files
as updated, at least the rsync protocol is designed so that only the
(typically very small) delta between local and remote file needs to be
transmitted - this is so fast that the difference compared with not
touching local files would be almost irrelevant.
I find the feature to not rebuild if the source files have not changed
valuable for a different reason: it would allow a visitor to the page
to see when it was really modified. Currently only "last published" is
available.
--
Bye, Patrick Ohly
--
Patrick.Ohly gmx.de
http://www.estamos.de/
|
|
| Re: generate only pages changed since last static site build? |

|
2007-08-12 14:58:09 |
Tim:
The Cocoon checksum feature sounds like it may work. I will try it
when I return to work and will let you know the results.
Patrick:
I cannot use rsync because I have access to the remote host only via
FTP, not SSH. I could use curlftpfs, but I seem to remember not
having good results with using rsync on a curlftpfs mounted file
system.
With the help of this list, I hacked the pelt skin to disable the
"last published" date and insert the Subversion $Id$ keyword in its
place. This way, the reader (Web site visitor) can see when the last
change to the page they are reading was committed. I have been
intending to get this change committed to the Forrest SVN repository,
but have not yet got there. The revision ID in my hacked pelt skin is
taken from the result of the SVN keyword substitution of the $Id$
keyword in a metadata property in the head section of the source XDocs
file. This means that any version control system with support for
keyword substitution can be used, not only SVN.
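For readers unfamiliar with the keyword, the sketch below shows what an expanded Subversion $Id$ string contains and how its fields could be pulled apart. It is written in Python for illustration only and is not the pelt skin change described above. The function name and the sample file name, revision and author in the usage note are hypothetical, and file names containing spaces are not handled.

```python
import re

# Layout of an expanded Subversion $Id$ keyword:
#   $Id: <file> <revision> <date> <time>Z <author> $
ID_PATTERN = re.compile(
    r"\$Id: (?P<file>\S+) (?P<revision>\d+) "
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})Z "
    r"(?P<author>\S+) \$"
)


def parse_id_keyword(text):
    """Return the fields of an expanded $Id$ keyword, or None if unexpanded."""
    match = ID_PATTERN.search(text)
    return match.groupdict() if match else None
```

Given a made-up value such as `$Id: index.xml 1234 2007-08-12 21:58:09Z brolin $`, the function returns the file name, revision, date, time and author as separate strings. An unexpanded `$Id$` yields `None`.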
Thanks,
Brolin
|
|
| Re: generate only pages changed since last static site build? |

|
2007-08-12 20:56:10 |
Tim Williams wrote:
>
> This doesn't answer the question you've asked but it may help with
> only uploading the changed stuff. Unfortunately, we don't have a way
> to only generate files that have changed right now.
>
> http://forrest.apache.org/docs_0_80/faq.html#checksums
I use that checksum technique to very good effect to publish
one of my websites, in conjunction with Forrestbot's "deploy via scp"
method.
-David
|
|
| Re: generate only pages changed since last static site build? |

|
2007-08-31 18:43:19 |
Brolin Empey <brolin.empey gmail.com> wrote:
> Tim:
>
> The Cocoon checksum feature sounds like it may work. I will try it
> when I return to work and will let you know the results.
The Cocoon checksum feature seems to work, but I have a configuration
problem with specifying checksums-uri in cli.xconf.
I need checksums-uri to be relative to e.g. the root of my Forrest
project (the directory containing forrest.properties and src/) so that
I can use a relative path to a checksums file in my build/site/
directory.
I do not want to hard-code an absolute path because this path will
break if I move the SVN working copy of my Forrest site, or if someone
else checks out a copy of my Forrest site.
I tried setting checksums-uri to just "checksums", but then the build
process hangs when I run "forrest" to rebuild the static site because
the JVM encounters a Java IO exception when attempting to write the
checksums file. This happens because I do not have write permission
for whichever directory checksums-uri is relative to. I tried
enabling both verbose and debug output for both Forrest and the JVM,
but when the JVM encounters the Java IO exception it still does not
say which path it is trying to write to.
So, my questions are:
1. Which directory is checksums-uri relative to?
2. Is there any way to get the path of the file that could not be
written when the JVM encounters a Java IO exception? I am using Sun's
JDK on Linux:
$ java -version
java version "1.5.0_10"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_10-b03)
Java HotSpot(TM) Client VM (build 1.5.0_10-b03, mixed mode, sharing)
3. Can I make checksums-uri relative to e.g. the root of my Forrest
project? I notice I can set e.g. a context directory in cli.xconf,
but I do not know what else is affected by this setting.
If possible, I would like to avoid having to specify an absolute path
to a temporary location, such as /tmp/checksums, and copy the
checksums file back into my build/site/ directory.
Thanks,
Brolin
|
|
| Re: generate only pages changed since last static site build? |

|
2007-09-21 00:50:54 |
Brolin Empey <brolin.empey gmail.com> wrote:
> So, my questions are:
> <snip>
Can anyone answer at least one of my questions?
If not, it looks like I may have to settle for the kludge described in
the last paragraph of my previous message.
Brolin
|
|
| Re: generate only pages changed since last static site build? |

|
2007-09-21 02:06:22 |
On Fri, 2007-08-31 at 16:43 -0700, Brolin Empey wrote:
> Brolin Empey <brolin.empey gmail.com> wrote:
> > Tim:
> >
> > The Cocoon checksum feature sounds like it may work. I will try it
> > when I return to work and will let you know the results.
>
> The Cocoon checksum feature seems to work, but I have a configuration
> problem with specifying checksums-uri in cli.xconf.
>
> I need checksums-uri to be relative to e.g. the root of my Forrest
> project (the directory containing forrest.properties and src/) so that
> I can use a relative path to a checksums file in my build/site/
> directory.
>
> I do not want to hard-code an absolute path because this path will
> break if I move the SVN working copy of my Forrest site, or if someone
> else checks out a copy of my Forrest site.
>
> I tried setting checksums-uri to just "checksums", but then the build
> process hangs when I run "forrest" to rebuild the static site because
> the JVM encounters a Java IO exception when attempting to write the
> checksums file. This happens because I do not have write permission
> for whichever directory checksums-uri is relative to. I tried
> enabling both verbose and debug output for both Forrest and the JVM,
> but when the JVM encounters the Java IO exception it still does not
> say which path it is trying to write to.
>
Word of warning: I have not yet used this feature, so I am fishing in
the dark!
> So, my questions are:
>
> 1. Which directory is checksums-uri relative to?
I reckon either the project.home or place.where.cli.file.exist.
>
> 2. Is there any way to get the path of the file that could not be
> written when the JVM encounters a Java IO exception? I am using Sun's
> JDK on Linux:
Have you looked in the logs? Normally an IO exception will produce a
stack trace identifying which file could not be found. See the logs in
build/webapp/WEB-INF/logs/*.log
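Scanning those logs can be automated with a small sketch like the one below. It is written in Python for illustration, the function name is hypothetical, and the marker strings are an assumption about what a failed write leaves in the logs rather than documented Cocoon output.

```python
import glob
import os

# Strings assumed to appear in a Java stack trace for a failed file write
MARKERS = ("IOException", "FileNotFoundException", "Permission denied")


def find_io_errors(log_dir):
    """Return (log file name, line) pairs for lines containing any marker."""
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*.log"))):
        with open(path, errors="replace") as handle:
            for line in handle:
                if any(marker in line for marker in MARKERS):
                    hits.append((os.path.basename(path), line.rstrip()))
    return hits
```

Running `find_io_errors("build/webapp/WEB-INF/logs")` from the project root would list candidate lines, which may include the path that could not be written.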
> $ java -version
> java version "1.5.0_10"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_10-b03)
> Java HotSpot(TM) Client VM (build 1.5.0_10-b03, mixed mode, sharing)
>
> 3. Can I make checksums-uri relative to e.g. the root of my Forrest
> project? I notice I can set e.g. a context directory in cli.xconf,
> but I do not know what else is affected by this setting.
>
Did you try?
http://marc.info/?l=forrest-dev&m=106552311203954&w=2 sounds like it can
be any path: "The checksum file can be stored on a server, e.g. using an
ftp:// URI (any URI for which a modifiable source exists)."
So you could use a "checksum server".
Another possibility is to generate the cli.xconf via a custom ant
build.xml and store the location in a build.properties file.
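The idea of generating cli.xconf at build time can be sketched as follows. Python stands in here for the Ant build suggested above, so this is an illustration of the technique rather than the suggested implementation. It assumes the setting appears in cli.xconf as a `<checksums-uri>` element, and the `${checksums_uri}` placeholder, the template and the function name are all hypothetical.

```python
import os
from string import Template


def render_cli_xconf(template_text, project_home):
    """Fill the ${checksums_uri} placeholder with an absolute path in the project."""
    checksums_path = os.path.join(
        os.path.abspath(project_home), "build", "site", "checksums"
    )
    # safe_substitute leaves any other "$" in the file untouched
    return Template(template_text).safe_substitute(checksums_uri=checksums_path)
```

A template line such as `<checksums-uri>${checksums_uri}</checksums-uri>` would then be rewritten on each machine with that checkout's own absolute path, so nothing absolute needs to be committed.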
salu2
> If possible, I would like to avoid having to specify an absolute path
> to a temporary location, such as /tmp/checksums, and copy the
> checksums file back into my build/site/ directory.
>
> Thanks,
> Brolin
--
Thorsten Scherler
thorsten.at.apache.org
Open Source Java consulting, training
and solutions
|
|
| Re: generate only pages changed since last static site build? |

|
2007-09-24 02:14:39 |
I started this reply ages ago but never got around to finishing it ...
Brolin Empey wrote:
> Brolin Empey wrote:
> >
> > The Cocoon checksum feature sounds like it may work. I will try it
> > when I return to work and will let you know the results.
>
> The Cocoon checksum feature seems to work, but I have a configuration
> problem with specifying checksums-uri in cli.xconf.
>
> I need checksums-uri to be relative to e.g. the root of my Forrest
> project (the directory containing forrest.properties and src/) so that
> I can use a relative path to a checksums file in my build/site/
> directory.
>
> I do not want to hard-code an absolute path because this path will
> break if I move the SVN working copy of my Forrest site, or if someone
> else checks out a copy of my Forrest site.
>
> I tried setting checksums-uri to just "checksums", but then the build
> process hangs when I run "forrest" to rebuild the static site because
> the JVM encounters a Java IO exception when attempting to write the
> checksums file. This happens because I do not have write permission
> for whichever directory checksums-uri is relative to. I tried
> enabling both verbose and debug output for both Forrest and the JVM,
> but when the JVM encounters the Java IO exception it still does not
> say which path it is trying to write to.
>
> So, my questions are:
>
> 1. Which directory is checksums-uri relative to?
As said in the example cli.xconf file:
"The default path is relative to the core webapp directory."
So, relative to the $FORREST_HOME/main/webapp directory.
I think that we have a problem in Forrest. Some stuff
(e.g. main sitemaps) is relative to the above directory
whereas other stuff is relative to the project.home directory.
> 2. Is there any way to get the path of the file that could not be
> written when the JVM encounters a Java IO exception? I am using Sun's
> JDK on Linux:
Did you look in the Cocoon logs?
http://forrest.apache.org/faq.html#logs
> $ java -version
> java version "1.5.0_10"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_10-b03)
> Java HotSpot(TM) Client VM (build 1.5.0_10-b03, mixed mode, sharing)
>
> 3. Can I make checksums-uri relative to e.g. the root of my Forrest
> project? I notice I can set e.g. a context directory in cli.xconf,
> but I do not know what else is affected by this setting.
Sorry, I don't know. It might not even have any effect.
Search the Forrest code. I see one mention of "webapp"
at main/targets/site.xml line 37 in trunk SVN.
Also search the Cocoon docs and code as mentioned in the
top of the default config file at
main/webapp/WEB-INF/cli.xconf
http://cocoon.apache.org/2.1/userdocs/offline/
http://wiki.apache.org/cocoon/CommandLine
More discussion should probably happen on the Forrest "dev" list
as we are getting away from user-land discussion.
-David
> If possible, I would like to avoid having to specify an absolute path
> to a temporary location, such as /tmp/checksums, and copy the
> checksums file back into my build/site/ directory.
>
> Thanks,
> Brolin
|
|
| Re: generate only pages changed since last static site build? |

|
2007-10-06 00:25:02 |
David Crossley <crossley apache.org> wrote:
> I started this reply ages ago but never got around to finishing it ...
>
> Brolin Empey wrote:
> > If possible, I would like to avoid having to specify an absolute path
> > to a temporary location, such as /tmp/checksums, and copy the
> > checksums file back into my build/site/ directory.
I ended up settling on this kludge. My Forrest sites' cli.xconf files are
hard-coded to use /tmp/checksums, which is copied and moved by pre-
and post-build functions in my forrest.sh wrapper/launcher and utility
script.
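The shape of that workaround can be sketched as below. This is not the forrest.sh script itself, which is not shown in this thread. It is a Python illustration with hypothetical function names, and it assumes the checksums file is kept at build/site/checksums inside the project.

```python
import os
import shutil
import subprocess

TMP_CHECKSUMS = "/tmp/checksums"  # the path hard-coded in cli.xconf


def pre_build(project_checksums, tmp_checksums=TMP_CHECKSUMS):
    """Copy the project's saved checksums to where the build expects them."""
    if os.path.isfile(project_checksums):
        shutil.copy2(project_checksums, tmp_checksums)


def post_build(project_checksums, tmp_checksums=TMP_CHECKSUMS):
    """Move the updated checksums file back into the project."""
    if os.path.isfile(tmp_checksums):
        os.makedirs(os.path.dirname(project_checksums) or ".", exist_ok=True)
        shutil.move(tmp_checksums, project_checksums)


def build(project_home, command=("forrest",)):
    """Run the build between the two steps and return its exit status."""
    project_checksums = os.path.join(project_home, "build", "site", "checksums")
    pre_build(project_checksums)
    status = subprocess.call(list(command), cwd=project_home)
    post_build(project_checksums)
    return status
```

Calling `build(".")` from the project root would run `forrest` with the checksums file staged in /tmp and then put the updated file back. Two builds running at once would collide on the shared /tmp path, which is one reason this remains a kludge.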
I think it would make more sense to have paths such as checksums-uri
be relative to project.home. This allows specification of files
within the user's project, where Forrest is much more likely to have
write permission when run as a regular user. At least, for users like
me who install Forrest under /usr/local/, to which regular users do
not have write permission. Furthermore, it makes more sense to keep
files such as "checksums" as part of the user's Forrest project so
that they can be copied with the project and possibly kept under
version control.
Brolin
|
|
|
|