List Info

Thread: geshi-src/geshi/languages/c NOTES,NONE,1.1




geshi-src/geshi/languages/c NOTES,NONE,1.1
user name
2006-08-08 02:19:32
Update of /cvsroot/geshi/geshi-src/geshi/languages/c
In directory sc8-pr-cvs2.sourceforge.net:/tmp/cvs-serv9926

Added Files:
	NOTES 
Log Message:
Created this file by migrating some extended comments from
c.php because the amount of space taken up in that file by
comments was becoming excessive.

--- NEW FILE: NOTES ---
Here are several notes on C highlighting as originally
contained as extended
comments within c.php.  Mostly this relates to
preprocessor-context
highlighting and the situations in which the C parser
function
GeSHiCCodeParser::parseToken() adjusts it.

== (Un)Highlighted keywords in the preprocessor context ==

It might seem questionable at first whether
declarator/type/qualifier keywords,
standard functions and standard macros or objects will ever
occur, thus
requiring highlighting, within some preprocessor directives
- namely #(el)if,
#ifdef, #ifndef and #undef.  They can and do occur in
practice though because
these directives can be used to test whether at preprocessor
level the keyword,
type or function in question has been subverted (or for a
function, whether
it's been legitimately defined as a macro), and/or to undo
or change that
subversion; for #if/#elif, sizeof should be highlighted in
any case - it's been
categorised as a standard function for GeSHi's purposes.

For #(el)if, a type might also appear as the subject of
sizeof.

It's also debatable whether such tokens should be
highlighted within #error and
#pragma directives - it seems most appropriate that they are
not, because  within
those directives their occurrence can be likened to their
appearance within a
comment.  GeSHiCCodeParser::parseToken() therefore adjusts
those contexts;
their highlighting when the parser is disabled is tolerable
as a minor glitch.

It's less debatable that within a #include filename, these
keywords should not
be highlighted.  That's handled in
GeSHiCCodeParser::parseToken() for <>
includes - quoted includes are already protected by the
string_literal context
(which parseToken() reclassifies).  It's borderline
tolerable that this
incorrect highlighting will appear when the parser is
disabled.

Within a #include where the filename is specified by a
macro, the only keywords
that should be highlighted out of the list at the top of
this section are:
standard macros (because they might be used in a stringising
macro "call"), any
standard functions that are implementable as macros (for the
same reason),
"sizeof" (because it might be used to generate
an argument for a macro "call")
and types (but not qualifiers) where they appear as the
subject of sizeof.  The
remainder have no meaning in preprocessor
macro-"call" context.  Separating out
"implementable-as-a-macro" from the other
standard functions is a longer-term
future task to complete alongside comprehensively filling
out what's missing
from the keyword lists.  Separating qualifiers from types is
another task to
consider.  To start with, GeSHiCCodeParser::parseToken()
disables highlighting
for the context 'declarator-keyword' within #include:s
where the filename is
specified by a macro, and /all/ highlighting is disabled for
the macro name
itself - i.e. highlighting applies only to macro arguments.

The same reasoning of the above paragraph can be applied to
the #line directive
where its "arguments" are specified by a macro:
GeSHiCCodeParser::parseToken()
similarly disables highlighting in that situation.

== Symbols in C preprocessor directives ==

Not all of the symbols added by the call:
   
$context->addSymbolGroup(geshi_c_get_standard_symbols(),
      'c/c/preprocessor/symbol');
have meaning for all preprocessor directives and in some
directives they are
illegal.  This GeSHi C module assumes well-formed input code
so illegal
occurrences need not concern it.

In #(el)if directives, any symbol except the semicolon can
legally occur.
At first it might seem that & has no place either
because at preprocessing stage
no objects exist to take an address of, but & can also
act as a bitwise
operator or be part of the logical && operator.  Due
to the lack of objects it
might also at first seem that [] has no use, however it can
be applied to
string literals for esoteric uses in a preprocessor constant
such as this
expression equating to 1:
"abcd"[1] == 'b'
A semicolon though is only used to end single statements in
code - this can't
apply to a constant preprocessor expression.

In #include and #line directives, the header filename and
new effective source
file name (respectively) may be specified by a macro.  A
macro may take a
constant preprocessor expression as an argument, so by this
reasoning it can be
seen that within #include and #line directives the same set
of symbols can
occur as within an #(el)if directive - namely, anything
except a semicolon.

In a #define, even a semicolon can occur because the macro
can substitute for
code.

#ifdef, #ifndef, #undef, #endif and #else do not allow any
symbol except by
proxy for comments and line continuation slashes.

Likewise for #error and #pragma except that any symbol could
occur as part of
the subsequent (unquoted) freeform text.  These should not
be highlighted, and
thus GeSHiCCodeParser::parseToken() recontextualises them so
that they aren't
highlighted.  Their highlighting when the parser is disabled
is tolerable as a
minor glitch.


------------------------------------------------------------
-------------
Using Tomcat but need to do more? Need to support web
services, security?
Get stuff done quickly with pre-integrated technology to
make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on
Apache Geronimo
http://sel.as-us.falkag.net/
sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
geshi-cvs mailing list
geshi-cvslists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geshi-cvs

[1]

about | contact  Other archives ( Real Estate discussion Medical topics )