Thread: Starting with smalltalk




Starting with smalltalk
user name
2006-07-06 15:57:20
> I'll get a book on Smalltalk, take some time to read-up on the syntax
> and try Squeak to see the difference between GNU and Squeak.
You can try the tutorial that comes with GNU Smalltalk.

The differences are mostly conceptual.  Plus, Squeak has a huge (and
sometimes very poorly designed) class library for graphics and much more.
> Then, I'll get back to you all.
No need to wait.  We're here to help and to understand where you have
problems.
> Nice looking commandline parser by the way. I don't understand it all
> yet, but I'll get there. In the end I'll try to make a commandline
> arguments parser and post it somewhere.
Heh... I wanted to see how far off I was from my (purposely
exaggerated) 30-minute estimate of the time to make one.  So I did it.

Here it is.  220 lines in ~2 hours, slightly less actually, including 30
minutes for testing (I didn't have time to write SUnit tests, so they're
just commands at the end of the file).  No comments for now; I will add
them when I commit.  :-P

Paolo
"======================================================================
|
|   Smalltalk command-line parser
|
|
 ======================================================================"


"======================================================================
|
| Copyright 2006 Free Software Foundation, Inc.
| Written by Paolo Bonzini.
|
| This file is part of the GNU Smalltalk class library.
|
| The GNU Smalltalk class library is free software; you can redistribute it
| and/or modify it under the terms of the GNU Lesser General Public License
| as published by the Free Software Foundation; either version 2.1, or (at
| your option) any later version.
| 
| The GNU Smalltalk class library is distributed in the hope that it will be
| useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser
| General Public License for more details.
| 
| You should have received a copy of the GNU Lesser General Public License
| along with the GNU Smalltalk class library; see the file COPYING.LIB.
| If not, write to the Free Software Foundation, 59 Temple Place - Suite
| 330, Boston, MA 02110-1301, USA.  
|
 ======================================================================"


Object subclass: #Getopt
		  instanceVariableNames: 'options longOptions prefixes args currentArg actionBlock errorBlock'
		  classVariableNames: ''
		  poolDictionaries: ''
		  category: 'Language-Data types'
!

Getopt comment: 
'My instances parse command-line arguments according to a pattern that
lists the accepted short and long options.' !


!Getopt class methodsFor: 'instance creation'!

test: args with: pattern
    args do: [ :each |
	self
	    parse: each subStrings 
	    with: pattern
	    do: [ :x :y | (x->y) printNl ]
	    ifError: [ (each->'error') displayNl ].
	Transcript nl ]!
    
parse: args with: pattern do: actionBlock
    ^self new
	parsePattern: pattern;
	actionBlock: actionBlock;
	errorBlock: [ ^nil ];
	parse: args!

parse: args with: pattern do: actionBlock ifError: errorBlock
    ^self new
	parsePattern: pattern;
	actionBlock: actionBlock;
	errorBlock: [ ^errorBlock value ];
	parse: args!

!Getopt methodsFor: 'parsing'!

fullOptionName: aString
    (prefixes includes: aString) ifFalse: [ errorBlock value ].
    longOptions do: [ :k |
	(k startsWith: aString) ifTrue: [ ^k ] ].
    self halt!

optionKind: aString
    | kindOrString |
    kindOrString := options at: aString ifAbsent: [ errorBlock value ].
    ^kindOrString isSymbol
	ifTrue: [ kindOrString ]
	ifFalse: [ options at: kindOrString ]!

optionName: aString
    | kindOrString |
    kindOrString := options at: aString ifAbsent: [ errorBlock value ].
    ^kindOrString isSymbol
	ifTrue: [ aString ]
	ifFalse: [ kindOrString ]!

parseRemainingArguments
    [ args atEnd ] whileFalse: [
	actionBlock value: nil value: args next ]!

parseOption: name kind: kind with: arg
    | theArg |
    theArg := arg.
    (kind = #mandatoryArg and: [ arg isNil ])
	ifTrue: [
	    args atEnd ifTrue: [ errorBlock value ].
	    theArg := args next ].
    (kind = #noArg and: [ theArg notNil ])
	ifTrue: [ errorBlock value ].

    actionBlock value: name value: theArg!
    
parseLongOption: argStream
    | name kind haveArg arg |
    name := argStream upTo: $=.
    argStream skip: -1.

    name := self fullOptionName: name.
    name := self optionName: name.
    kind := self optionKind: name.
    haveArg := argStream nextMatchFor: $=.
    arg := haveArg ifTrue: [ argStream upToEnd ] ifFalse: [ nil ].
    self parseOption: name kind: kind with: arg!

parseShortOptions: argStream
    | name kind ch haveArg arg |
    [ argStream atEnd ] whileFalse: [
	ch := argStream next.
	name := self optionName: ch.
	kind := self optionKind: ch.
	haveArg := kind ~~ #noArg and: [ argStream atEnd not ].
	arg := haveArg ifTrue: [ argStream upToEnd ] ifFalse: [ nil ].
	self parseOption: name kind: kind with: arg ]!

parseOneArgument
    | arg argStream |
    arg := args next.
    arg = '--' ifTrue: [ ^self parseRemainingArguments ].

    (arg isEmpty or: [ arg first ~= $- ])
	ifTrue: [ ^actionBlock value: nil value: arg ].

    argStream := arg readStream.
    (arg at: 2) = $-
	ifTrue: [ argStream next: 2. self parseLongOption: argStream ]
	ifFalse: [ argStream next. self parseShortOptions: argStream ]!

parse
    [ args atEnd ] whileFalse: [ self parseOneArgument ]!
  
!Getopt methodsFor: 'initializing'!

addPrefixes: option
    longOptions add: option.
    1 to: option size do: [ :length |
	prefixes add: (option copyFrom: 1 to: length) ]!

rejectBadPrefixes
    longOptions := longOptions asSortedCollection: [ :a :b | a size <= b size ].

    prefixes := prefixes select: [ :each | (prefixes occurrencesOf: each) == 1 ].
    prefixes := prefixes asSet.
    prefixes addAll: longOptions!

initialize
    options := Dictionary new.
    longOptions := Set new.
    prefixes := Bag new!

checkSynonyms: synonyms
    (synonyms allSatisfy: [ :each | each startsWith: '-' ])
	ifFalse: [ ^self error: 'expected -' ].

    (synonyms anySatisfy: [ :each | each size < 2 or: [ each = '--' ] ])
	ifTrue: [ ^self error: 'expected option name' ].

    synonyms do: [ :each |
	((each startsWith: '--') and: [ each includes: $= ])
	    ifTrue: [ ^self error: 'unexpected = inside long option' ] ]!

colonsToKind: colons
    colons = 0 ifTrue: [ ^#noArg ].
    colons = 1 ifTrue: [ ^#mandatoryArg ].
    colons = 2 ifTrue: [ ^#optionalArg ].
    ^self error: 'too many colons, don''t know what to do with them...'!

atSynonym: synonym put: kindOrName
    | key |
    synonym size = 2
	ifTrue: [ key := synonym at: 2 ]
	ifFalse: [ key := synonym copyFrom: 3. self addPrefixes: key ].

    "Dictionary>>includes: tests values; duplicate detection needs the keys."
    (options includesKey: key) ifTrue: [ self error: 'duplicate option' ].
    options at: key put: kindOrName.
    ^key!

parseSynonyms: synonyms kind: kind
    | last |
    last := self atSynonym: synonyms last put: kind.
    synonyms from: 1 to: synonyms size - 1 do: [ :each |
	self atSynonym: each put: last ]!

parseOption: opt
    | colons optNames synonyms kind |
    optNames := opt copyWithout: $:.
    colons := opt size - optNames size.
    opt from: optNames size + 1 to: opt size do: [ :ch |
	ch = $: ifFalse: [ ^self error: 'invalid pattern, colons are hosed' ] ].

    kind := self colonsToKind: colons.
    synonyms := optNames subStrings: $|.
    self checkSynonyms: synonyms.
    self parseSynonyms: synonyms kind: kind!

parsePattern: pattern
    self initialize.
    pattern subStrings do: [ :opt | self parseOption: opt ].
    self rejectBadPrefixes!

actionBlock: aBlock
    actionBlock := aBlock!
	    
errorBlock: aBlock
    errorBlock := aBlock!
	    
parse: argsArray
    args := argsArray readStream.
    self parse.
    ^args contents!

!SystemDictionary methodsFor: 'command-line'!

arguments: pattern do: actionBlock
    ^Getopt
	parse: self arguments
	with: pattern
	do: actionBlock!

arguments: pattern do: actionBlock ifError: errorBlock
    ^Getopt
	parse: self arguments
	with: pattern
	do: actionBlock
	ifError: errorBlock! !

"Getopt new parsePattern: '-B'"
"Getopt new parsePattern: '--long'"
"Getopt new parsePattern: '--longish --longer'"
"Getopt new parsePattern: '--long --longer'"
"Getopt new parsePattern: '-B:'"
"Getopt new parsePattern: '-B::'"
"Getopt new parsePattern: '-a|-b'"
"Getopt new parsePattern: '-a|--long'"
"Getopt new parsePattern: '-a|--very-long|--long'"
"Getopt test: #('-a' '-b' '-ab' '-a -b') with: '-a -b'"
"Getopt test: #('-a' '-b' '-ab' '-a -b') with: '-a: -b'"
"Getopt test: #('-a' '-b' '-ab' '-a -b') with: '-a:: -b'"
"Getopt test: #('--longish' '--longer' '--longi' '--longe' '--lo' '-longer')
    with: '--longish --longer'"
"Getopt test: #('--lo' '--long' '--longe' '--longer') with: '--long --longer'"
"Getopt test: #('--noarg' '--mandatory' '--mandatory foo' '--mandatory='
    '--mandatory=foo' '--optional' '--optional foo')
    with: '--noarg --mandatory: --optional::'"
"Getopt test: #('-a' '-b') with: '-a|-b'"
"Getopt test: #('--long' '-b') with: '-b|--long'"
"Getopt test: #('--long=x' '-bx') with: '-b|--long:'"
"Getopt test: #('-b' '--long' '--very-long') with: '-b|--very-long|--long'"
"Getopt test: #('--long=x' '--very-long x' '-bx')
    with: '-b|--very-long|--long:'"
"Getopt test: #('-b -- -b' '-- -b' '-- -b -b') with: '-b'"
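
The way addPrefixes: and rejectBadPrefixes let a long option be abbreviated can be sketched outside Smalltalk. The following is a rough Python illustration, not part of the attached file: the function name build_prefix_table and the dictionary it returns are assumptions made for the example. A Counter plays the role of the Bag of prefixes, and sorting by length mirrors the sorted longOptions that fullOptionName: walks.

```python
from collections import Counter


def build_prefix_table(long_options):
    """Map each accepted spelling to the full long-option name.

    Hypothetical helper: it mirrors the idea of addPrefixes: and
    rejectBadPrefixes, not their exact Smalltalk implementation.
    """
    # Count every leading substring of every option (the role of the Bag).
    counts = Counter(
        name[:length]
        for name in long_options
        for length in range(1, len(name) + 1)
    )
    # Try shorter names first so an exact name wins over a longer sibling.
    by_length = sorted(long_options, key=len)
    table = {}
    for prefix, count in counts.items():
        # A prefix is usable if it is unambiguous or is itself a full name.
        if count == 1 or prefix in long_options:
            table[prefix] = next(n for n in by_length if n.startswith(prefix))
    return table


table = build_prefix_table(["long", "longer"])
print(table.get("longe"))  # longer
print(table.get("long"))   # long
print(table.get("lo"))     # None (ambiguous, so rejected)
```

With the pattern '--long --longer', '--lo' is therefore an error, while '--long' still resolves to itself even though it is also a prefix of '--longer'.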
_______________________________________________
help-smalltalk mailing list
help-smalltalkgnu.org

http://lists.gnu.org/mailman/listinfo/help-smalltalk
Unicode String?
user name
2006-07-07 01:18:36
Hi,

I've tried GNU Smalltalk and it seems good to me. But I have a
problem: the current implementation does not support Unicode. It seems
that it supports single-byte characters only. I've also tried Squeak,
which seems slower than GNU Smalltalk (I'm not sure about this, it
might not be correct) but has a Unicode-compatible string
implementation, and I think this kind of approach is good. Is there
any chance of having a Unicode-compatible string implementation in the
next version of GNU Smalltalk?

Thanks in advance.


Unicode String?
user name
2006-07-07 07:03:26
Chun Sungjin wrote:
> Hi,
>
> I've tried GNU Smalltalk and it seems good to me. But I have a
> problem: the current implementation does not support Unicode. It seems
> that it supports single-byte characters only. I've also tried Squeak,
> which seems slower than GNU Smalltalk (I'm not sure about this, it
> might not be correct) but has a Unicode-compatible string
> implementation, and I think this kind of approach is good. Is there
> any chance of having a Unicode-compatible string implementation in the
> next version of GNU Smalltalk?
What do you need exactly?  The main missing thing is support for
Character objects with values above 256.  However, if you are content
with multibyte character sets like UTF-8, or with Unicode character
codes, that's fine.

For character set translation, if you load the I18N package, GNU
Smalltalk gets an iconv wrapper.  The main method you need is
EncodedStream>>#on:from:to: (e.g. on: 'abc' from: 'UTF-8' to: 'UCS-4').

To extract Unicode character codes from a UCS-4LE encoded string, you
can use (ByteStream on: x asByteArray) and send nextLong.  For
big-endian, there is no class, but I was thinking of adding a #bigEndian
method to ByteStream for the next version.
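
The two steps above (transcode to UCS-4LE, then read one 32-bit little-endian integer per character) can be sketched in a language-neutral way. This is Python rather than GNU Smalltalk, and code_points_from_ucs4le is a name made up for the example; it only illustrates the byte layout, not the EncodedStream or ByteStream API.

```python
import struct


def code_points_from_ucs4le(data):
    """Read consecutive 32-bit little-endian integers as code points."""
    if len(data) % 4 != 0:
        raise ValueError("UCS-4 data must be a multiple of 4 bytes")
    count = len(data) // 4
    # "<" selects little-endian; "I" is an unsigned 32-bit integer.
    return list(struct.unpack("<%dI" % count, data))


text = "a\u00e9\u4e2d"                # 'a', e-acute, one CJK character
utf8_bytes = text.encode("utf-8")      # multibyte input, 6 bytes
# Transcode UTF-8 to UCS-4LE (Python calls the codec "utf-32-le").
ucs4_bytes = utf8_bytes.decode("utf-8").encode("utf-32-le")
print(code_points_from_ucs4le(ucs4_bytes))  # [97, 233, 20013]
```

For big-endian data the only change in this sketch would be the ">" byte-order prefix in the struct format.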

Things that could be useful are
    Integer>>#asUTF8String
    String class>>#utf8FromCodepoint: (same as above)
    String>>#utf8Stream
    UTF8Stream (returns Unicode character codes)
    ... (tell me what you need) ...

Paolo


Unicode String?
user name
2006-07-07 08:05:18
Hi,

The main problem is that, for example, if I create an instance of a
string like this:

a := 'Some MultiByte Encoded String'.

then

a size

does not answer the correct length of the string.

However, I will try what you said, thank you.

On Jul 7, 2006, at 4:03 PM, Paolo Bonzini wrote:

> Chun Sungjin wrote:
>> Hi,
>>
>> I've tried GNU Smalltalk and it seems good to me. But I have a
>> problem: the current implementation does not support Unicode. It seems
>> that it supports single-byte characters only. I've also tried Squeak,
>> which seems slower than GNU Smalltalk (I'm not sure about this, it
>> might not be correct) but has a Unicode-compatible string
>> implementation, and I think this kind of approach is good. Is there
>> any chance of having a Unicode-compatible string implementation in the
>> next version of GNU Smalltalk?
> What do you need exactly?  The main missing thing is support for
> Character objects with values above 256.  However, if you are content
> with multibyte character sets like UTF-8, or with Unicode character
> codes, that's fine.
>
> For character set translation, if you load the I18N package, GNU
> Smalltalk gets an iconv wrapper.  The main method you need is
> EncodedStream>>#on:from:to: (e.g. on: 'abc' from: 'UTF-8' to: 'UCS-4').
>
> To extract Unicode character codes from a UCS-4LE encoded string, you
> can use (ByteStream on: x asByteArray) and send nextLong.  For
> big-endian, there is no class, but I was thinking of adding a #bigEndian
> method to ByteStream for the next version.
>
> Things that could be useful are
>    Integer>>#asUTF8String
>    String class>>#utf8FromCodepoint: (same as above)
>    String>>#utf8Stream
>    UTF8Stream (returns Unicode character codes)
>    ... (tell me what you need) ...
>
> Paolo



Re: Unicode String?
user name
2006-07-07 09:17:07
Chun Sungjin wrote:
> Hi,
>
> The main problem is that, for example, if I create an instance of a
> string like this:
>
> a := 'Some MultiByte Encoded String'.
>
> then
>
> a size
>
> does not answer the correct length of the string.
Well, strlen does not do that in C either.  You need mbrlen, and #size
is more like strlen than mbrlen.

Also, the result heavily depends on the chosen character set.  If we
want to have #utf8Size, that's fine.  But #size should be the number of
*bytes*, not of characters.

I'm seeing now if I can add an EncodedStream method that extracts
Unicode characters.  Then what you wanted would be something like

    (EncodedStream wordsOn: 'some string') contents size

for which, of course, we can add a utility method.
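
The distinction between the number of bytes and the number of characters, and its dependence on the chosen character set, can be checked with a small language-neutral sketch. It is written in Python, not GNU Smalltalk, and byte_and_char_counts is a hypothetical helper named only for this example.

```python
def byte_and_char_counts(text, encoding="utf-8"):
    """Return (number of encoded bytes, number of characters)."""
    encoded = text.encode(encoding)
    return len(encoded), len(text)


sample = "na\u00efve \u4e2d\u6587"   # 8 characters, 3 of them non-ASCII
print(byte_and_char_counts(sample))               # (13, 8)
print(byte_and_char_counts(sample, "utf-16-le"))  # (16, 8)
```

The character count stays at 8, while the byte count changes with the encoding, which is why a byte-oriented #size cannot answer both questions at once.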

Paolo

