I just want to provide an update on the parser.
I have looked at the code I wrote a few months ago, and it seems that
I still need to do several things there. I have worked on the changes
for the last few days, and my goal is to finish the major changes this
week.
The status of the project is as follows:
1. There is a Wiki Object Model (WOM), greatly inspired by the XLinq
object model, which represents the parsed result. It is also used as
the intermediate representation for the grammar. WOM consists of the
following classes:
1.a. WomDocument - this class represents the whole document and has
methods to read and write WOM to/from XML.
1.b. WomElement - equivalent to XmlElement.
1.c. WomElementCollection - collection to represent the child elements
of a WomElement. I think I will delete this class in favor of a linked
list of WomElement instances.
1.d. WomName - a new class, which I have not checked in yet, that
represents an individual atomic string in WomNameTable.
1.e. WomNameTable - class inherited from XmlNameTable, so it can be
used in XmlReader and XmlWriter. It represents the list of atomic
strings, stored as WomName objects. It is a thread-safe class which can
be shared across multiple threads, and it can significantly reduce
memory use. It is implemented as a singleton.
1.f. WomProperty - name/value pair to represent a property (an
attribute in XML terms) of a WomElement.
1.g. WomPropertyCollection - collection of WomProperty instances used
in WomElement.
1.h. WomXPathNavigator - implements the IXPathNavigable interface and
allows XPath queries or XSLT transforms to be executed against WOM.
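
To illustrate the idea behind WomName and WomNameTable (items 1.d and
1.e), here is a minimal sketch in Python rather than the project's own
language. The names Name, NameTable, shared and add are hypothetical
and are not the actual WOM API; the sketch only assumes what is stated
above: each distinct string is stored once, the table is guarded by a
lock, and one shared instance is handed out.

```python
import threading


class Name:
    """Hypothetical stand-in for WomName: one object per distinct string."""
    __slots__ = ("text",)

    def __init__(self, text):
        self.text = text


class NameTable:
    """Hypothetical stand-in for WomNameTable (not the real API)."""
    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._names = {}                # string -> its single Name object
        self._lock = threading.Lock()   # guards _names across threads

    @classmethod
    def shared(cls):
        # Singleton access: every caller receives the same table.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def add(self, text):
        # Return the existing Name for this string, creating it only once.
        with self._lock:
            name = self._names.get(text)
            if name is None:
                name = Name(text)
                self._names[text] = name
            return name

    def __len__(self):
        with self._lock:
            return len(self._names)
```

Because equal strings map to the same object, element and property
names can be compared by identity instead of character by character,
and a name that occurs thousands of times in a document is stored only
once.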
2. Parser classes. While WOM can be considered almost complete, I
would estimate the parser classes to be about 60% done. The parser
aims to implement something similar to a "Parsing Expression Grammar"
(PEG) (see Wikipedia), which is essentially regular expressions with
non-terminals (rules). It is very powerful and can cover a broader
range of grammars compared with LL or LR, but the price you pay is
higher memory utilization and slower speed, which I believe is not
such a significant issue today compared with the situation 20-30 years
ago. The implementation I am writing is inspired by the regular
expression implementation in Rotor.
2.a. ParserCharSet - represents a set of characters. It can be made up
of either character ranges or Unicode character classes.
2.b. ParserEngine - after the redesign it will contain all the tables
required to parse source text and a method to initiate parsing. It
will have such tables as the instruction list, the rule list and the
list of char sets.
2.c. ParserEngineBuilder - creates an instance of ParserEngine from
the grammar WOM.
2.d. ParserEngineProcessor - an instance of this class will be created
every time we parse a new string. It will use the ParserEngine tables
to parse text. The process is very similar to a regular expression
engine, which executes instruction by instruction and backtracks if it
cannot match at a certain step. It will preserve any intermediate rule
results to provide optimal performance, as is done in PEG.
2.e. ParserExpressionReader - reads parser expressions and produces a
WOM of the expression, which can be translated by ParserEngineBuilder
to ParserEngine tables.
2.f. ParserInstruction - individual instruction used for parsing. It
consists of an instruction code and two arguments.
2.g. ParserInstructionCode - all instruction codes and methods to
print them for debugging.
2.h. ParserRule - individual rule, which consists of a rule name and a
start position in the instruction table.
2.i. ParserXmlGrammarReader - should read the Wiki grammar written in
XML and transform it to the parser WOM. It can use
ParserExpressionReader to read expressions written inside the rules.
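
To make the table-driven design in item 2 more concrete, here is a
rough sketch, again in Python rather than the project's own language.
Everything in it is an assumption made for illustration: the
instruction codes (SET, CALL, CHOICE, JUMP, RET), the class names
Instr, Rule, Engine and Processor, and the tiny hand-assembled grammar
are hypothetical and are not the real ParserInstructionCode values or
ParserEngine layout. It only mirrors what is described above:
instructions with a code and two arguments, rules that point at a
start position in the instruction table, char sets as ranges,
backtracking on failure, and remembered rule results per position.

```python
from typing import NamedTuple

# Hypothetical instruction codes, chosen only for this sketch.
SET, CALL, CHOICE, JUMP, RET = "SET", "CALL", "CHOICE", "JUMP", "RET"


class Instr(NamedTuple):
    code: str
    a: int = 0  # first argument
    b: int = 0  # second argument


class Rule(NamedTuple):
    name: str
    start: int  # start position in the instruction table


class Engine(NamedTuple):
    instructions: list
    rules: list
    charsets: list  # each set is a list of inclusive (low, high) ranges


class Processor:
    """Created once per input string; owns the per-parse memo table."""

    def __init__(self, engine, text):
        self.engine, self.text = engine, text
        self.memo = {}  # (rule index, position) -> end position or None

    def call(self, rule, pos):
        # Run a rule at most once per position and remember the result.
        key = (rule, pos)
        if key not in self.memo:
            self.memo[key] = self.run(self.engine.rules[rule].start, pos)
        return self.memo[key]

    def run(self, pc, pos):
        # Execute instructions from pc until RET; None means "no match".
        text = self.text
        while True:
            ins = self.engine.instructions[pc]
            if ins.code == SET:
                ranges = self.engine.charsets[ins.a]
                if pos >= len(text) or not any(
                        lo <= text[pos] <= hi for lo, hi in ranges):
                    return None
                pos, pc = pos + 1, pc + 1
            elif ins.code == CALL:
                end = self.call(ins.a, pos)
                if end is None:
                    return None
                pos, pc = end, pc + 1
            elif ins.code == CHOICE:
                end = self.run(ins.a, pos)   # try the first alternative
                if end is not None:
                    return end
                pc = ins.b                   # backtrack: same pos, other branch
            elif ins.code == JUMP:
                pc = ins.a
            else:                            # RET: the rule matched up to pos
                return pos


# Hand-assembled tables for:  number <- [0-9]+   list <- number (',' number)*
ENGINE = Engine(
    instructions=[
        Instr(SET, 0),         # 0: number: one digit
        Instr(CHOICE, 2, 4),   # 1: another digit, or finish
        Instr(SET, 0),         # 2
        Instr(JUMP, 1),        # 3
        Instr(RET),            # 4
        Instr(CALL, 0),        # 5: list: number
        Instr(CHOICE, 7, 10),  # 6: ',' number again, or finish
        Instr(SET, 1),         # 7
        Instr(CALL, 0),        # 8
        Instr(JUMP, 6),        # 9
        Instr(RET),            # 10
    ],
    rules=[Rule("number", 0), Rule("list", 5)],
    charsets=[[("0", "9")], [(",", ",")]],
)


def parse(text, rule=1):
    # True only if the rule consumes the entire input.
    return Processor(ENGINE, text).call(rule, 0) == len(text)
```

In this sketch parse("12,345") succeeds, while parse("12,") fails: the
trailing comma branch is abandoned and the match ends at position 2,
short of the end of the input. Note that CHOICE here retries the rest
of the current rule the way a regular expression engine would, so it
is not strict PEG semantics; only the per-position rule results follow
the PEG approach.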
3. A number of NUnit tests to test the behavior of the classes listed
above. I have implemented an extension to NUnit that allows tests to
be provided in XML.
So, this is an overview of the current state of the parser. I am
currently refactoring/implementing ParserEngineProcessor. After that I
will need to implement the ParserXmlGrammarReader class and create an
XSLT to translate WOM to XHTML.
Thank you,
Vlad
_______________________________________________
Flexwiki-users mailing list
Flexwiki-users lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flexwiki-users