
Flex++

First of all, let me state that flex is not very hard to learn. It takes a bit of getting used to, but it's fairly simple stuff. I have written this page mostly for my own benefit, as a reminder for the next time I need to use flex.

Flex++ Errors

I am actually using flex++ because C++ is the language I am using for The Banana Tree (TBT) search engine. This was all done on Debian sarge, which caused a bit of a problem for me because there are two packages, "flex-old" and "flex". The former is the one I had to use to get this working. When I tried the latter I always got the following error.

      bin/compile_flex.sh: line 2: flex++: command not found
      lex.yy.cc: In member function `virtual int yyFlexLexer::yylex()':
      lex.yy.cc:585: `yy_current_buffer' undeclared (first use this function)
      lex.yy.cc:585: (Each undeclared identifier is reported only once for each 
      function it appears in.)
      

The following error was generated while using the flex-old package.

      lex.yy.cc: In member function `virtual int yyFlexLexer::yylex()':
      lex.yy.cc:569: `cin' undeclared (first use this function)
      lex.yy.cc:569: (Each undeclared identifier is reported only once for each 
      function it appears in.)
      lex.yy.cc:572: `cout' undeclared (first use this function)
      lex.yy.cc: In member function `virtual void yyFlexLexer::LexerError(const 
      char*)':
      lex.yy.cc:1427: `cerr' undeclared (first use this function)
      

This was caused by my wild assumption that flex++ would put "using namespace std;" somewhere in the generated lex.yy.cc file. That was of course a bad assumption, and it can be corrected by putting the following line in the declarations part of your lex file, inside the %{ ... %} block.

        using namespace std;
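
The underlying C++ rule can be shown outside of flex. This is only a minimal sketch, not anything taken from the generated scanner: the standard stream and string names live in namespace std, so they need either an explicit std:: prefix or a using directive before they can be written bare. The function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: every standard name carries an explicit std:: prefix,
// so this compiles with no using directive in scope.
std::string format_qualified(int lines) {
    assert(lines >= 0);
    std::ostringstream out;
    out << "lines: " << lines << std::endl;
    return out.str();
}

// The same directive that goes into the declarations part of the lex file.
// From here on, unqualified names such as cout, cerr and endl resolve.
using namespace std;

string format_unqualified(int lines) {
    assert(lines >= 0);
    ostringstream out;
    out << "lines: " << lines << endl;
    return out.str();
}

// Unqualified cout is fine here only because of the directive above.
void print_report(int lines) {
    cout << format_unqualified(lines);
}
```

Both formatting functions produce the same text. The only difference is how the compiler finds the names, which is exactly what the generated lex.yy.cc was tripping over.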
      

The next set of errors I encountered was

      /tmp/ccAVHrg4.o(.text+0x461): In function `yyFlexLexer::yylex()':
      : undefined reference to `yywrap'
      /tmp/ccAVHrg4.o(.text+0xf9a): In function `yyFlexLexer::yyinput()':
      : undefined reference to `yywrap'
      collect2: ld returned 1 exit status
      

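This was also quite easy to fix by adding an option at the top of the declarations part of the lex file. With noyywrap the scanner does not call yywrap at end of input and simply assumes there are no more files to scan, so the missing function is never referenced.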

        %option noyywrap
        %{
        #include <stdlib.h>
        using namespace std;
        int num_lines = 0;
        int num_chars = 0;
        int num_words = 0;
        %}
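
The three counters suggest a word-count style scanner, although the rules section is not shown here. As a rough sketch of what such rules would compute, assuming that reading is right, the same bookkeeping looks like this in plain C++. This is not flex output, and the Counts struct and count_text function are hypothetical names used only for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical container mirroring num_lines, num_words and num_chars.
struct Counts {
    int lines = 0;
    int words = 0;
    int chars = 0;
};

// Walk the text once: every character counts, every '\n' ends a line,
// and a word starts whenever a non-space character follows whitespace
// (or the start of the text).
Counts count_text(const std::string& text) {
    Counts c;
    bool in_word = false;
    for (unsigned char ch : text) {
        ++c.chars;
        if (ch == '\n') {
            ++c.lines;
        }
        if (std::isspace(ch)) {
            in_word = false;
        } else if (!in_word) {
            in_word = true;
            ++c.words;
        }
    }
    return c;
}
```

In a real lex file the same effect would come from pattern rules that increment the counters as tokens are matched. The loop above just makes the intended totals explicit.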
      

The flex++ HTML parser is working with:
g++ version 3.2
flex++ version 2.5.4
It is a very naive implementation and work on it is far from finished, but it's starting to come together. I would like it to be able to parse most of the HTML in the wild, but this is a bit of a dream considering most HTML in circulation is pretty shit.