
Hi all,

I've used Parsec to "tokenize" data from a text file. It was actually
quite easy, everything is correctly identified.

So now I have a list/stream of self defined "Tokens" and now I'm stuck.
Because now I need to write my own parsec-token-parsers to parse this
token stream in a context-sensitive way.

Uhm, how do I do that then?

Günther

a Token is something like:

data Token = ZE String
            | OPS
            | OPSShort String
            | OPSLong String
            | Other String
            | ZECd String
              deriving Show


Günther Schmidt Mon, 11 Jan 2010 16:35:52 -0800

Hi Günther, you could write functions that pattern-match on various
sequences of tokens in a list; for an example, have a look at the file
Evaluator.hs in my Scheme interpreter haskeem. Or you could build up
more complex data structures entirely within Parsec; for this I would
point you at the file Parser.hs in my accounting program umm. Both are
on Hackage. Undoubtedly there are many more and probably better
examples, but I think these are at least a start...
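
A minimal sketch of the first suggestion (pattern matching on a token
list), not taken from haskeem. The Token type is the one from the
original post (with Eq added); the Record type, the function names and
the assumed layout "ZE, then OPS, then any number of OPSShort/OPSLong"
are invented for illustration, since the thread does not describe the
real grammar.

```haskell
-- Token type copied from the original post (Eq added for comparisons).
data Token = ZE String
           | OPS
           | OPSShort String
           | OPSLong String
           | Other String
           | ZECd String
             deriving (Show, Eq)

-- Hypothetical result type: a ZE name plus the OPS entries that follow it.
data Record = Record String [String]
  deriving (Show, Eq)

-- Group a flat token list into records by pattern matching on its head.
-- The layout (ZE, OPS, then OPSShort/OPSLong entries) is an assumption.
records :: [Token] -> Either String [Record]
records []                   = Right []
records (ZE name : OPS : ts) =
  let (entries, rest) = spanOps ts
  in  (Record name entries :) <$> records rest
records (Other _ : ts)       = records ts   -- skip tokens we do not care about
records (t : _)              = Left ("unexpected token: " ++ show t)

-- Collect the leading OPSShort/OPSLong payloads and return the remainder.
spanOps :: [Token] -> ([String], [Token])
spanOps (OPSShort s : ts) = let (xs, rest) = spanOps ts in (s : xs, rest)
spanOps (OPSLong  s : ts) = let (xs, rest) = spanOps ts in (s : xs, rest)
spanOps ts                = ([], ts)
```

This works well for small, regular formats; once the patterns need
backtracking or good error messages, a Parsec parser over the token
stream is probably the better fit.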

regards, Uwe


Uwe Hollerbach Mon, 11 Jan 2010 19:21:07 -0800

Hi Günther

Get the Parsec manual from Daan Leijen's home page then see the
section '2.11 Advanced: Seperate scanners'.

Though mentioned rarely, Parsec in its regular mode is a scannerless
parser. Unless you have complex formatting problems (e.g. indentation
sensitivity, as in Python's or Haskell's syntax), scannerless parsers are
often much more convenient than parsers with separate lexers (see the
grammar formalism SDF for many examples). For Parsec, if you want a separate
scanner there's quite a lot of boilerplate you need to manufacture if
you want to use the technique in section 2.11. Usually I can get by
with the Token and Language modules or do a few tricks with the
'symbol' parser instead.
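
A minimal sketch of the scannerless style with the Token and Language
modules, assuming Parsec 3 (Text.Parsec). The "name = integer" input
format and the names lexer, assignment and parseAssignment are made up
for illustration; they are not from the thread.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as Tok

-- A lexeme-level helper set built from the default (empty) language
-- definition; each helper skips trailing whitespace for us.
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef

-- Hypothetical format: an identifier, an '=' sign, an integer.
assignment :: Parser (String, Integer)
assignment = do
  name  <- Tok.identifier lexer
  _     <- Tok.symbol lexer "="
  value <- Tok.integer lexer
  return (name, value)

-- Skip leading whitespace, parse one assignment, require end of input.
parseAssignment :: String -> Either ParseError (String, Integer)
parseAssignment = parse (Tok.whiteSpace lexer *> assignment <* eof) "<input>"
```

Here no separate token list is ever built: the parser works directly on
the characters, and the lexeme helpers take care of whitespace.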

Parsec is monadic, so (>>=) allows you to write context-sensitive
parsers; see section '3.1. Parsec Prim' for a discussion and example.
Again, writing a context-sensitive parser can often be more trouble
than studying the format of the input and working out a context-free
grammar (if there is one).
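
A small sketch of what (>>=) buys you, assuming Parsec 3. The
length-prefixed format ("3:abc") and the name lengthPrefixed are
invented for illustration; the point is only that the result of the
first parser decides what the second one does, which a context-free
grammar cannot express in general.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Parse a decimal length, a colon, and then exactly that many characters.
lengthPrefixed :: Parser String
lengthPrefixed =
  (read <$> many1 digit) >>= \n ->  -- the parsed length ...
    char ':' >>
    count n anyChar                 -- ... controls how much is read next

-- Convenience wrapper around 'parse' for trying it out.
runLengthPrefixed :: String -> Either ParseError String
runLengthPrefixed = parse lengthPrefixed "<input>"
```

With this definition, runLengthPrefixed "3:abcdef" consumes only "abc",
while runLengthPrefixed "5:abc" fails because the input ends too early.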

Best wishes

Stephen


Stephen Tetley Tue, 12 Jan 2010 00:42:38 -0800

Maybe this can be of help (though it's for Parsec 2): http://therning.org/magnus/archives/367

It's not the only example of this either, tagsoup-parsec is available
on Hackage.

/M

--
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus＠therning．org          Jabber: magnus＠thernig．org
http://therning.org/magnus         identi.ca|twitter: magthe


Magnus Therning Tue, 12 Jan 2010 01:47:48 -0800

In a message of 12 January 2010 03:35:10, Günther Schmidt wrote:

That's pretty easy actually. You can use the function `token' to define
your own primitive parsers. It's defined in Parsec.Prim, if I remember
correctly.

Also, you may want to add information about the position in the source
code to your lexemes. Here is some code to illustrate usage:
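
The code attached to this message is not preserved here. What follows
is a minimal sketch, not Alexey's original, assuming Parsec 3
(Text.Parsec) and the Token type from the first post. The names
PosToken, TokParser, satisfyTok, record and numbered, and the
"ZE, OPS, entries" grammar, are hypothetical.

```haskell
import Text.Parsec
import Text.Parsec.Pos (newPos)

-- Token type copied from the original post (Eq added).
data Token = ZE String | OPS | OPSShort String | OPSLong String
           | Other String | ZECd String
             deriving (Show, Eq)

-- A lexeme paired with the position where the scanner found it.
type PosToken = (SourcePos, Token)

-- Parsers that consume a list of positioned tokens instead of characters.
type TokParser a = Parsec [PosToken] () a

-- Build a primitive parser with 'token': how to show a token in error
-- messages, where it came from, and whether it is accepted.
satisfyTok :: (Token -> Maybe a) -> TokParser a
satisfyTok f = token (show . snd) fst (f . snd)

ze :: TokParser String
ze = satisfyTok $ \t -> case t of
  ZE s -> Just s
  _    -> Nothing

ops :: TokParser ()
ops = satisfyTok $ \t -> case t of
  OPS -> Just ()
  _   -> Nothing

opsEntry :: TokParser String
opsEntry = satisfyTok $ \t -> case t of
  OPSShort s -> Just s
  OPSLong  s -> Just s
  _          -> Nothing

-- Assumed grammar: ZE, then OPS, then any number of OPSShort/OPSLong.
record :: TokParser (String, [String])
record = do
  name    <- ze
  ops
  entries <- many opsEntry
  return (name, entries)

-- Stand-in for a real scanner: give each token a fake column number.
numbered :: [Token] -> [PosToken]
numbered = zipWith (\col t -> (newPos "tokens" 1 col, t)) [1 ..]
```

Something like parse record "" (numbered [ZE "a", OPS, OPSShort "x"])
should then run the token-level parser; in real use the positions would
come from the first (character-level) Parsec pass, e.g. via getPosition.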

Haskell-Cafe mailing list
Haskell-Cafe*******/mailman/listinfo/haskell-cafe


Khudyakov Alexey Tue, 12 Jan 2010 10:33:24 -0800


